Poring over security logs for hours and hours is no way to spend an afternoon, but it’s the only way we have to really understand what happened during a security incident. Logging into one device at a time is the only way we have to find why and how traffic was blocked somewhere in the network, but I wouldn’t trust the results.
Log information, network discovery, and a real-time understanding of network state are so important to network security that we collect vast amounts of data from many of our systems in an effort to track and memorialize everything that goes on in our networks. Collecting it is one thing, but making it useful is another matter entirely.
Configuring SNMP traps and syslog servers isn’t terribly difficult. However, being able to find the proverbial needle in the haystack can be so tedious and expensive that IT departments often forgo network forensics altogether. We want extensive logs, and often we’re required to have extensive logs. But just as often, those logs sit unused, untapped, and therefore meaningless.
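To make the needle-in-the-haystack problem a little more concrete, here is a minimal Python sketch that pulls ACL deny events out of a pile of syslog text and totals them by device, ACL, source, and destination. The sample lines are illustrative only, modeled on the Cisco IOS `%SEC-6-IPACCESSLOGP` message. The exact line layout (hostname immediately before the message tag), the device names, and the helper name `summarize_denies` are all assumptions for this example, and other platforms will log differently.

```python
import re
from collections import Counter

# Illustrative syslog lines (assumed layout, hypothetical devices and ACLs).
SAMPLE_LOG = """\
Mar  3 14:02:11 edge-rtr1 %SEC-6-IPACCESSLOGP: list EDGE-IN denied tcp 192.0.2.10(51514) -> 10.10.10.25(443), 1 packet
Mar  3 14:02:12 edge-rtr1 %LINK-3-UPDOWN: Interface GigabitEthernet0/2, changed state to down
Mar  3 14:02:15 edge-rtr1 %SEC-6-IPACCESSLOGP: list EDGE-IN denied tcp 192.0.2.10(51515) -> 10.10.10.25(443), 1 packet
Mar  3 14:03:40 core-sw1 %SEC-6-IPACCESSLOGP: list 110 permitted udp 10.10.10.25(53211) -> 10.10.20.5(53), 1 packet
Mar  3 14:05:02 core-sw1 %SEC-6-IPACCESSLOGP: list 110 denied udp 10.10.20.77(40000) -> 10.10.10.25(161), 3 packets
"""

# Matches the assumed message layout; anything else in the log is ignored.
ACL_LINE = re.compile(
    r"(?P<host>\S+) %SEC-6-IPACCESSLOGP: list (?P<acl>\S+) "
    r"(?P<action>denied|permitted) (?P<proto>\S+) "
    r"(?P<src>[\d.]+)\(\d+\) -> (?P<dst>[\d.]+)\((?P<dport>\d+)\), "
    r"(?P<count>\d+) packets?"
)


def summarize_denies(log_text):
    """Total denied packets per (device, ACL, source, destination, dest port)."""
    totals = Counter()
    for line in log_text.splitlines():
        match = ACL_LINE.search(line)
        if match and match.group("action") == "denied":
            key = (
                match.group("host"),
                match.group("acl"),
                match.group("src"),
                match.group("dst"),
                int(match.group("dport")),
            )
            totals[key] += int(match.group("count"))
    return totals


if __name__ == "__main__":
    for key, packets in summarize_denies(SAMPLE_LOG).most_common():
        print(packets, key)
```

Even a toy summary like this answers "which ACL on which device dropped this traffic?" faster than reading raw logs, though a real deployment would have to cope with many message formats at once.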
A typical workflow for a security incident might be to create the incident in the ticketing system, attach the relevant email chain to the ticket, copy in a link to the log server, and then close the ticket. There’s little else anyone can do and little time to do anything anyway. At least this way it’s in the system and the IT department is in compliance.
Network forensics is extremely important, but it’s extremely costly because of the skill needed and time involved. How do we collect specific data from every single device regardless of platform? How do we ensure that it’s accurate information? And what do we do with all that data once we’ve collected it? Network security is important, but network security isn’t a switch configuration or a firewall model. It’s an entire process, and it’s not easy.
I remember a vendor asking me who led our IT security team. I gave him a blank stare. In my experience, only very large organizations had dedicated IT security personnel, let alone entire teams of them. I responded that we had a department of a few dozen people, and security was managed mostly by the network folks since security, to us, was about firewalls, IPS appliances, 802.1X, and other networking technologies.
But we know that network security is much more than that. It’s about visibility. It’s about the ability to identify anomalous behavior, and it’s about the ability to quickly track where and how traffic is blocked in a production network. Network engineers may know all the commands to configure logging on every platform in their environment, and we may know how to configure an ACL in our sleep. However, the process to make any use of this knowledge and information at a large scale can be more costly than the security incident itself.
Here we have a very practical example of how current trends in network automation can help the bottom line of a typical IT department struggling with budgets, staff, and day-to-day firefighting. I know a little basic Linux scripting and a little Python. That would help, and those are certainly good skills for anyone to learn. But the reality is that I’ve always been a router jockey. I’m no DevOps guru, and I suspect most network engineers aren’t either. So how can we easily leverage aspects of the DevOps paradigm such as the automation of processes, faster mean time to recovery, quality assurance, and continuous delivery?
The easiest and fastest route to get there is to abstract scripting and automation processes behind intuitive software. In that way, network engineers with minimal scripting knowledge can automate the collection of all sorts of information that often would require logging into devices individually and running the appropriate commands.
For example, for a network security analysis for a customer, I’d want to know exactly what types of ACLs are deployed throughout the entire network. That means that I’d have to look at every layer 3 interface they have. In a large network there are likely many devices with layer 3 terminations, so doing this by hand would take so long and be so prone to error that the results would be unreliable.
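As a rough idea of what such software does under the hood, the sketch below walks a saved configuration and lists every interface that has an IP address along with the ACLs applied to it. It assumes Cisco IOS-style syntax (`interface`, `ip address`, `ip access-group`); the sample configuration and the function name `l3_interface_acls` are hypothetical, and a real tool would need a parser per platform.

```python
import re

# Hypothetical configuration snippet in Cisco IOS-style syntax.
SAMPLE_CONFIG = """\
interface GigabitEthernet0/1
 description Uplink to core
 ip address 10.0.0.1 255.255.255.252
 ip access-group EDGE-IN in
!
interface GigabitEthernet0/2
 switchport mode access
!
interface Vlan10
 ip address 10.10.10.1 255.255.255.0
 ip access-group 110 in
 ip access-group USERS-OUT out
!
interface Vlan20
 ip address 10.10.20.1 255.255.255.0
!
"""


def l3_interface_acls(config):
    """Map each interface that has an IP address to its ACLs by direction."""
    interfaces = {}
    current = None
    for line in config.splitlines():
        if line.startswith("interface "):
            # Start of a new interface block.
            current = {"l3": False, "acls": {}}
            interfaces[line.split(None, 1)[1]] = current
        elif not line.startswith(" "):
            # Any other unindented line ends the interface block.
            current = None
        elif current is not None:
            stripped = line.strip()
            if re.match(r"ip address \S+", stripped):
                current["l3"] = True
            acl = re.match(r"ip access-group (\S+) (in|out)$", stripped)
            if acl:
                current["acls"][acl.group(2)] = acl.group(1)
    # Keep only layer 3 interfaces; an empty dict means no ACL is applied.
    return {name: data["acls"] for name, data in interfaces.items() if data["l3"]}


if __name__ == "__main__":
    for name, acls in l3_interface_acls(SAMPLE_CONFIG).items():
        print(name, acls or "NO ACL APPLIED")
```

Run against the saved configurations of every device, a report like this also surfaces the layer 3 interfaces with no ACL at all, which is often the more interesting finding in an audit.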
Mature software that abstracts that process would be an incredible tool in the hands of any security engineer. So long as it’s somewhat customizable and works with a variety of major platforms, this would decrease the time and cost it takes to perform a security audit and post-mortem analysis. Mapping out all the layer 3 interfaces and the ACLs configured on them would be a matter of a few clicks of the mouse.
In addition to being able to dynamically map layer 3 topologies, this type of software needs to visually map layer 2 in order to identify ports blocked by Spanning Tree and scour ARP and MAC tables. This is incredibly tedious and time-consuming to do manually, and unless a network team has the luxury of time and several DevOps engineers, it probably wouldn’t even be attempted in typical network security forensics.
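The layer 2 side can be sketched the same way. The example below assumes the ARP, MAC address, and Spanning Tree tables have already been collected from the devices and parsed into simple records; the record types, device names, and addresses are all made up for illustration. It then answers two questions: on which switch ports has a given IP address been learned, and which ports is Spanning Tree blocking?

```python
from typing import List, NamedTuple


class ArpEntry(NamedTuple):
    ip: str
    mac: str


class MacEntry(NamedTuple):
    switch: str
    mac: str
    vlan: int
    port: str


class StpPort(NamedTuple):
    switch: str
    vlan: int
    port: str
    state: str  # assumed values: "forwarding" or "blocking"


def normalize_mac(mac: str) -> str:
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")


def locate_ip(ip: str, arp: List[ArpEntry], macs: List[MacEntry]) -> List[MacEntry]:
    """Return every MAC-table entry whose MAC matches an ARP entry for ip."""
    wanted = {normalize_mac(entry.mac) for entry in arp if entry.ip == ip}
    return [entry for entry in macs if normalize_mac(entry.mac) in wanted]


def blocked_ports(stp: List[StpPort]) -> List[StpPort]:
    """Return the ports that Spanning Tree currently reports as blocking."""
    return [port for port in stp if port.state == "blocking"]


# Hypothetical data, as if already gathered from two switches.
ARP = [
    ArpEntry("10.10.10.25", "0011.2233.4455"),
    ArpEntry("10.10.10.30", "00:aa:bb:cc:dd:ee"),
]
MACS = [
    MacEntry("access-sw1", "00:11:22:33:44:55", 10, "Gi1/0/7"),
    MacEntry("dist-sw1", "0011.2233.4455", 10, "Te1/1/1"),
    MacEntry("access-sw1", "00aa.bbcc.ddee", 10, "Gi1/0/9"),
]
STP = [
    StpPort("access-sw1", 10, "Te1/1/1", "forwarding"),
    StpPort("access-sw1", 10, "Te1/1/2", "blocking"),
]

if __name__ == "__main__":
    for hit in locate_ip("10.10.10.25", ARP, MACS):
        print(hit.switch, hit.port)
    for port in blocked_ports(STP):
        print("blocked:", port.switch, port.port)
```

The lookup itself is trivial. The expensive part is collecting and normalizing those tables from every switch, which is exactly the work worth handing off to software.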
Poring over security logs and manually logging into every device on a network is no way to spend an afternoon, but there’s a better way. Software that abstracts all the hard parts of network automation can quickly turn vast amounts of log files into meaningful, actionable data and dynamically pinpoint the problem areas in a sea of Ethernet. Today, the proverbial needle in the haystack is becoming that much easier to find without bankrupting your IT budget for next year.