Chapter 3. Evidence Acquisition

“Some things are hurrying into existence, and others are hurrying out of it; and of that which is coming into existence part is already extinguished . . . In this flowing stream then, on which there is no abiding, what is there of the things which hurry by on which a man would set a high price?”

—The Meditations, by Marcus Aurelius¹

1. Thomas Bushnell, “The Meditations,” 1994, http://classics.mit.edu/Antoninus/meditations.mb.txt.

Ideally, we would like to obtain perfect-fidelity evidence, with zero impact on the environment. For copper wires, this would mean only observing changes in voltages without ever modifying them. For fiber cables, this would mean observing the quanta without ever injecting any. For radio frequency, this would mean observing RF waves without ever emitting any. In the real world, this would be equivalent to a murder investigator collecting evidence from a crime scene without leaving any new footprints.

Obviously, we don’t live in a perfect world, and we can never achieve “zero footprint.” Detectives analyzing a murder scene still cannot avoid walking on the same floor as the killer. However, network investigators can minimize the impact.

Network forensic investigators often refer to “passive” versus “active” evidence acquisition. Passive evidence acquisition is the practice of gathering forensic-quality evidence from networks without emitting data at Layer 2 and above. Traffic acquisition is often classified as passive evidence acquisition. Active or interactive evidence acquisition is the practice of collecting evidence by interacting with stations on the network. This may include logging onto network devices via the console or through a network interface, or even scanning the network ports to determine the current state.

Although the terms “passive” and “active” imply that there is a clear distinction between two categories, in reality, the impact of evidence acquisition on the environment is a continuous spectrum.

In this chapter, we discuss the types of physical media that can be leveraged to passively acquire network-based evidence and delve into popular tools and techniques for acquiring network traffic. Next, we review common interfaces used to interact with network devices. Finally, we discuss strategies for minimizing your footprint when conducting active evidence acquisition.

3.1 Physical Interception

It is possible to obtain network traffic without sending or modifying any data frames on the network. While it is never possible to have absolutely zero impact on the environment, the process of capturing (or sniffing) traffic can often be conducted with very little impact.

There are many ways to transmit data over physical media, and just as many ways to intercept it. The simplest case is a station connected to another station over a physical conduit, such as a copper cable. Voltage on copper can easily be amplified and redistributed in a one-to-many configuration. Hubs and switches are designed to extend the physical media in order to share the baseband with additional stations.

Forensic investigators can passively acquire network traffic by intercepting it as it is transmitted across cables, through the air, or through network equipment such as hubs and switches.

IP networks can be built upon a wide variety of physical media. For example, RFC 1149, “Standard for the transmission of IP datagrams on avian carriers” (subsequently extended in RFC 2549), describes the transmission of IP packets over avian carriers, as follows:²

2. D. Waitzman, “RFC 1149—Standard for the Transmission of IP Datagrams on Avian Carriers,” IETF, April 1, 1990, http://rfc-editor.org/rfc/rfc1149.txt.

Avian carriers can provide high delay, low throughput, and low altitude service. The connection topology is limited to a single point-to-point path for each carrier, used with standard carriers, but many carriers can be used without significant interference with each other, outside of early spring. [. . .]

Forensic investigators have not standardized on a method of passively sniffing traffic transmitted over avian carriers. Avian networks are fairly resistant to passive interception. Active interception techniques do exist, and typically involve large packages of breadcrumbs, as well as support staff to transcribe IP packets. The chief difficulty is route determination and interception, since dropped packets may not be recoverable. Consult your local health officials for any areas of avian influenza.

3.1.1 Cables

Cables allow for point-to-point connections between stations. The most common materials for cables are copper and fiber. Each of these can be sniffed, although the equipment and side effects vary based on the physical media.

3.1.1.1 Copper

The two most widely used types of copper cabling in the modern era are coaxial cable and twisted pair.

• Coaxial Coaxial cable, or “coax,” consists of a single copper wire core wrapped in insulation and covered with a copper shield. This package is then sealed with an outer insulation. Since the transmission medium is the single copper core, all stations on the network must negotiate the transmission and reception of signals. The benefit is that the copper core is shielded from electromagnetic interference.

In most cases where coax is used, if you can tap the single copper core, you can access the traffic to and from all stations that share the physical medium.

• Twisted Pair (TP) TP cables contain multiple pairs of copper wires. Unlike coaxial cable, where the single copper core is shielded against electromagnetic interference by a tubular Faraday cage, in TP each pair of copper wires is twisted together in order to negate electromagnetic interference. TP wires are typically deployed in a star topology. For example, in a large enterprise, end stations are commonly connected via unshielded twisted pair (UTP) (typically CAT5) to an aggregator such as an edge switch. This means that by tapping one pair of TP wires on a switched network, you may receive traffic relating to only one end station.

If you put a commercial TP network tap inline, it can capture all voltages for all twisted pairs in the cable.

3.1.1.2 Optical

Fiber optic cables consist of thin strands of glass (or sometimes plastic) which are bundled together in order to transmit signals across a distance. Light is transmitted into the fiber at one end and travels along an optic fiber, reflecting constantly against the walls until it reaches an optical receiver at the other end. The light naturally degrades during travel, and depending on the length of the fiber optic cable, an optical regenerator may be used to amplify the light signal in transit.³

3. “Howstuffworks,” http://communication.howstuffworks.com/fiber-optic-communications/fiber-optic1.htm.

3.1.1.3 Intercepting Traffic in Cables

There are a variety of tools available for intercepting traffic in cables, including inline network taps, “vampire” taps, induction coils, and fiber optic taps. We discuss each of these in turn.

• Inline Network Taps An inline network tap is a Layer 1 device, which can be inserted inline between two physically connected network devices. The network tap will pass along the packets and also physically replicate copies to a separate port (or multiple ports). Network taps commonly have four ports: two connected inline to facilitate normal traffic, and two sniffing ports, which mirror that traffic (one for each direction). Insertion of an inline network tap typically causes a brief disruption, since the cable must be separated in order to connect the network tap inline.

Many network taps use hardware to replicate data, which allows for extremely high-fidelity packet captures. Network taps are commonly designed to require no power for passively passing packets. This reduces the risk of a network outage caused by the tap. It is possible to pass traffic to a monitoring port without using power, although most often power is required for monitoring.

Sophisticated taps exist that can do load-balancing for intrusion detection. These taps tend to be more expensive. Recently, commercial vendors have been offering filtering taps, which can filter traffic prior to replication based on VLAN tag, protocol, port, etc.

Network forensic analysts should keep in mind that every additional break in a cable is a potential point of failure; therefore, inline insertion of network taps necessarily increases the risk of network disruption.

• Vampire Taps “Vampire taps” are devices that pierce the shielding of copper wires in order to provide access to the signal within. Unlike inline network taps, the cable does not need to be severed in order for a vampire tap to be installed. However, investigators should take caution: As noted by security researcher Erik Hjelmvik, “[i]nserting a vampire tap, even if done correctly, can bring down the link on a TP cable since the characteristics of the required balanced communication will be affected negatively.”⁴

4. Personal correspondence with Sherri Davidoff and Jonathan Ham.

Telecommunications engineers should be familiar with vampire tap technology, as it is standard issue with “butt kits” (telephone circuit testing equipment, typically worn on the toolbelts of telephone repair technicians). These handsets are used by inside-wiring technicians to tap into the punchdown blocks in wiring closets in order to sort through the large numbers of twisted pairs that so often go unnumbered and unlabeled.

Butt kits typically come with a combination of punchdown block adaptors and vampire taps so that you can either test a circuit the way it was wired on the block or just pierce the sheathing of any random set of wires to sample the signal. Get yours at your local hardware store!

• Induction Coils All wires conducting voltages emit various electromagnetic signals outside of the intended channel. Such electromagnetic radiation is more pronounced in unshielded wires, such as UTP, because plastic sheathing alone provides little electromagnetic shielding. As a consequence, it is theoretically possible to introduce what is called an “induction coil” alongside such wiring in order to translate the laterally emitted signals into their original digital form.

Induction coils are devices that essentially transform the magnetism of weak signals to induce a much stronger signal in an external system. Such a device could potentially capture the throughput of a cable without the detection of the users, administrators, or owners of the wires. However, such devices are not commercially available in a way that the public can acquire in order to surreptitiously tap Cat5e and Cat6. That’s not to say that serious hobbyists couldn’t build one, or that a dedicated investigator couldn’t acquire one.

This potential attack or surveillance mechanism (take your pick) tends to get more hype than it is probably due.

• Fiber Optic Taps Inline network taps work similarly for fiber optic cables and copper cables. To place a network tap inline on a fiber optic cable, network technicians splice the optic cable and connect it to each port of a tap. This causes a network disruption.

Inline optical taps may cause noticeable signal degradation. Network engineers often use tools called optical time-domain reflectometers (OTDR) to analyze and troubleshoot fiber optic cable signals. OTDRs can also be used to locate breaks in the cable, including splices inserted for taps. With OTDRs, technicians can create a baseline of the normal signal profile of a fiber optic cable, and potentially detect not only when the profile changes but where on the cable the disruption has likely occurred.

It is much more difficult to surreptitiously tap a fiber optic cable than a copper cable. Vampire taps for copper cables can pierce the insulation and physically connect with the copper wires to detect the changes in voltage within. Monitoring light traveling through glass is not so easy. Bend couplers can be used to capture stray photons and gain some information about the signal within a fiber optic cable without cutting the glass. Theoretically, similar devices might exist that can fully reconstruct the optical signal without requiring investigators or technicians to physically splice the optic cable. However, at the time of this writing, noninterference fiber optic taps do not appear to be commercially available to the public.⁵

5. “Undersea Optical Cable Cuts,” February 10, 2008, http://cryptome.info/cable-cuts.htm.

In early 2008, a series of undersea cable disruptions in the Middle East were reported. These caused Internet and voice outages and slowdowns in India, Pakistan, Egypt, Qatar, Saudi Arabia, and several other countries. After three separate incidents in which five cables were damaged, many people began speculating that the cuts were not a coincidence, but rather deliberately caused to induce economic disruptions, to cause Internet rerouting, or to cover the installation of surreptitious fiber optic network taps.

Security professional Steve Bellovin wrote:

Four failures in less than a week. Coincidence? Or enemy action? If so, who’s the enemy, and what are the enemy’s goals? You can’t have that many failures in one place—especially such a politically sensitive place—without people getting suspicious . . .

Now—the US certainly has the ability to tap undersea cables. After all, they did just that to the Soviets several decades ago. . . . That said, I don’t think it’s an NSA or Mossad operation. . . . Four failures at once will raise suspicions, and that’s the last thing you want when you’re eavesdropping on people.

If it wasn’t a direct attempt at eavesdropping, perhaps it was indirect. Several years ago, a colleague and I wrote about link-cutting attacks. In these, you cut some cables, to force traffic past a link you’re monitoring. Link-cutting for such purposes isn’t new; at the start of World War I, the British cut Germany’s overseas telegraph cable to force them to use easily-monitored links.⁶,⁷,⁸

6. Bruce Schneier, “Schneier on Security: Fourth Undersea Cable Failure in Middle East,” February 5, 2008, http://www.schneier.com/blog/archives/2008/02/fourth_undersea.html.

7. “The Internet: Of Cables and Conspiracies,” The Economist, February 7, 2008, http://www.economist.com/node/10653963?story_id=10653963.

8. Steven M. Bellovin, “Underwater Fiber Cuts in the Middle East,” February 4, 2008, http://www.cs.columbia.edu/smb/blog/2008-02/2008-02-04.html.

3.1.2 Radio Frequency

Since the late 1990s, radio frequency has become an increasingly popular medium for transmission of packetized data and Internet connectivity. The Institute of Electrical and Electronics Engineers (IEEE) published a series of international standards (“802.11”) for wireless local area network (WLAN) communication. These standards specify protocols for WLAN traffic in the 2.4, 3.7, and 5 GHz frequency ranges. The term “Wi-Fi” is used to refer to certain types of RF traffic, which include the IEEE 802.11 standards.⁹,¹⁰,¹¹

9. IEEE, “Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment 3: 3650–3700 MHz Operation in USA,” IEEE Standards Association (June 12, 2007), http://standards.ieee.org/getieee802/download/802.11-2007.pdf.

10. IEEE, “Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment 3: 3650–3700 MHz Operation in USA” (November 6, 2008), http://standards.ieee.org/getieee802/download/802.11y-2008.pdf.

11. “fcc.gov,” http://wireless.fcc.gov/services/index.htm?job=service_home&id=3650_3700.

RF waves travel through the air, which is by nature a shared medium. As a result, WLAN traffic cannot be physically segmented in the way that switches segment traffic on a wired LAN. Because of physical media limitations, all WLAN transmissions may be observed and intercepted by all stations within range. Stations can capture the RF traffic, regardless of whether they participate in the link. This attribute makes passive acquisition of WLAN traffic very easy—both for investigators and attackers.

In the United States, the Federal Communications Commission has placed limits on the strength of emissions for stations operating in 802.11 frequency ranges, and on the gain of antennae.¹²,¹³ As a result, there are practical limitations on the distances over which stations can legally capture and receive data over 802.11 networks in the United States. However, directional transceivers can be constructed from off-the-shelf components, and these can dramatically increase the effective range. Such devices may be illegal but are difficult to detect and prevent. If a suspect is already engaged in illicit activities, there is no reason to assume they won’t go further in their criminal activities and eavesdrop or transmit at distances that exceed FCC specifications. (In South America, researcher Ermanno Pietrosemoli announced that his team had successfully transferred data via Wi-Fi over a distance of 237 miles, or 382 kilometers.)¹⁴,¹⁵

12. “EIRP Limitations for 802.11 WLANs,” http://www.wi-fiplanet.com/tutorials/article.php/1428941/EIRP-Limitations-for-80211-WLANs.htm.

13. “CFR,” 2005, http://edocket.access.gpo.gov/cfr_2005/octqtr/47cfr15.249.htm.

14. Michael Kanellos, “New Wi-Fi Distance Record: 382 Kilometers,” CNET News, June 18, 2007, http://news.cnet.com/8301-10784_3-9730708-7.html?part=rss&subj=news&tag=2547-1_3-0-5.

15. David Becker, “New Wifi Record: 237 Miles,” Wired, June 2007, http://www.wired.com/gadgetlab/2007/06/w_wifi_record_2.

Why does this matter for the investigator? First, investigators should keep in mind that the target of investigation may be able to access the WLAN from a long distance, far outside the physical perimeter of a facility. Second, investigators should remember that when you connect over wireless links, your activity can potentially be monitored from a great distance.

Even when the Wi-Fi traffic is encrypted, there is commonly a single pre-shared key (PSK) for all stations. In this case, anyone who gains access to the encryption key can listen to all traffic relating to all stations (as with physical hubs). For investigators, this is helpful because local IT staff can provide authentication credentials, which facilitate monitoring. Furthermore, there are well-known flaws in common 802.11 encryption algorithms such as Wired Equivalent Privacy (WEP), which can allow investigators to circumvent or crack unknown encryption keys. Complicating matters, wireless access points (WAPs) employ a range of different standards for encryption and authentication. Some devices are even based on draft standards that have not yet been finalized (as of the time of this writing, that includes 802.11n).

It is possible to passively capture encrypted Wi-Fi traffic and decrypt it offline later using the encryption keys. Once an investigator has gained full access to unencrypted 802.11x traffic contents, this data can be analyzed in the same manner as any other unencrypted network traffic.

Regardless of whether or not Wi-Fi traffic is encrypted, investigators can gain a great deal of information by capturing and analyzing 802.11 management traffic. This information commonly includes:

• Broadcast SSIDs (and sometimes even nonbroadcast ones)

• WAP MAC addresses

• Supported encryption/authentication algorithms

• Associated client MAC addresses

• In many cases, the full Layer 3+ packet contents

In order to capture wireless traffic, forensic investigators must first have the necessary hardware. Many standard 802.11 network adapters and drivers do not include support for monitor mode, which allows the user to capture all packets on a network, not just packets destined for the host. The network adapter must also support the specific 802.11 protocol in use (e.g., 802.11a/b/g cards do not necessarily support 802.11n). Check your 802.11 network adapter’s model and read about the corresponding drivers for your operating system to find out if your adapter can be used for 802.11 passive evidence acquisition.

There are commercially available 802.11 network adapters that are specifically designed for capturing packets. These adapters include very handy features for forensic investigators, such as the ability to operate completely passively (so the investigator does not have to worry about accidentally transmitting data), connectors for extra antennae, and portable form factors such as USB. One popular model is the AirPcap USB adapter, manufactured by Riverbed Technology.¹⁶ Riverbed Technology only supports the AirPcap device on Windows operating systems, though wireless security professional Josh Wright has engineered a modified BackTrack Linux distribution with modified drivers that work with some models. (Please see Chapter 6 for more details.)

16. “Riverbed Technology—AirPcap,” Riverbed, 2010, http://www.riverbed.com/us/products/cascade/airpcap.php.

3.1.3 Hubs

A network hub is a dumb Layer 1 device that physically connects all stations on a local subnet to one circuit. A hub does not store enough state to track what is connected to it, or how. It maintains no knowledge of what devices are connected to what ports. It is merely a physical device designed to implement baseband (or “shared”) connectivity. In other words, all the devices on the local segment that the hub provides are physically connected and, therefore, share the same physical medium.

When the hub receives a frame, it retransmits it on all other ports. Therefore, every device connected to the hub physically receives all traffic destined to every other device attached to the hub. If a hub exists in the network, then you can connect to it and trivially sniff all of the traffic on the segment. A wireless access point is a special example of a hub, which we discuss in more detail in Chapter 6.

Many devices that are currently labeled as “hubs” by the manufacturer are, in fact, switches. This can be confusing to sort out. There are two indicators that can help you determine whether something labeled as a hub really is a hub. The first way is to examine the lights on the front panel. If there is an LED labeled “collision,” it is a hub. If there is no such light, it’s probably a switch. The collision light indicates that two stations have transmitted at the same time and the physical medium must reset itself. On a busy network, this can happen thousands of times per second.

The more reliable way to determine if a device is actually a hub is to connect a station to it, put the network interface into promiscuous mode, and observe the traffic with tcpdump or a similar tool. If all you see is traffic destined for your station and broadcast traffic, then the device is a switch. If it is a hub, you should see all the other traffic.
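The test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the destination MAC addresses are assumed to have been extracted already from a capture taken in promiscuous mode (for example, with tcpdump). Note that a switch may briefly flood frames for destinations it has not yet learned, so a handful of foreign unicast frames is not conclusive on its own; the threshold below is an arbitrary placeholder.

```python
def is_group_address(mac: str) -> bool:
    """Return True for broadcast and multicast MAC addresses.

    Group addresses have the low-order bit of the first octet set.
    """
    return int(mac.split(":")[0], 16) & 1 == 1


def foreign_unicast_count(own_mac: str, observed_dst_macs) -> int:
    """Count frames addressed to some other single station."""
    own = own_mac.lower()
    count = 0
    for dst in observed_dst_macs:
        dst = dst.lower()
        if dst != own and not is_group_address(dst):
            count += 1
    return count


def looks_like_hub(own_mac: str, observed_dst_macs, threshold: int = 1) -> bool:
    """Heuristic: sustained unicast traffic for other stations suggests a hub.

    The threshold is an assumption for illustration; raise it to tolerate
    the occasional frame flooded by a switch for an unknown destination.
    """
    return foreign_unicast_count(own_mac, observed_dst_macs) >= threshold
```

In practice, an investigator would likely choose a threshold well above 1 and observe over a period long enough to rule out transient flooding.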

Investigators must be careful when using hubs as traffic capture devices. The investigator sees all traffic on the segment, but so can everyone else. A compromised system could trivially act as a passive listener and eavesdrop on any data transfers or communications. Any evidence transmitted across the network, or normal traffic sent by the investigator’s operating system, may be trivially captured by anyone else on the local network. It may be appropriate to take advantage of a hub that is already installed on a network, but installing a hub for the purposes of traffic capture can introduce new risks unnecessarily.

Bottom line for the investigator: if you walk into an environment with a hub, you may want to take advantage of that. However, keep in mind that anything you can see, and anything you transmit, can be seen by anyone else on the network as well. Furthermore, the design of hubs varies greatly by manufacturer, and there is no guarantee that you are actually receiving all of the traffic you expect, or that it really is a hub. Caveat emptor.

3.1.4 Switches

Switches are the most prevalent Layer 2 devices. Like hubs, they connect multiple stations together to form a LAN. Unlike hubs, switches use software to keep track of which stations are connected to which ports, storing this mapping in a CAM table. When a switch receives a packet, it forwards it only to the destination station’s port. Individual stations do not physically receive each other’s traffic. This means that every port on a switch is its own collision domain. Switches operate at Layer 2 (the data-link layer), and sometimes Layer 3 (the network layer).

Even the simplest switch maintains a CAM table, which stores MAC addresses with corresponding switch ports. A MAC address is an identifier assigned to each station’s network card. The purpose of the CAM table is to allow the switch to isolate traffic on a port-by-port basis so that each individual station only receives traffic that is destined for it, and not traffic destined for other computers.

Switches populate the CAM table by listening to arriving traffic. When a switch receives a frame from a device, it looks at the source MAC address and remembers the port associated with that MAC address. Later, when the switch receives a packet destined for that device, it looks up the MAC address and corresponding port in the CAM table. It then sends the packet only to the appropriate port, encapsulated with the correct Layer 2 Ethernet address.

In this way, a switch segments the traffic endpoint-by-endpoint, even while technically sharing the same physical medium.
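The learning behavior described above can be illustrated with a toy model. This is a minimal sketch, not a description of any vendor's implementation: the class name is hypothetical, ports are plain integers, and the model assumes that a frame for a destination not yet in the CAM table is flooded to every port except the one it arrived on.

```python
class LearningSwitch:
    """Toy model of a switch that populates a CAM table from arriving frames."""

    def __init__(self, num_ports: int):
        self.ports = list(range(num_ports))
        self.cam = {}  # maps MAC address -> port number

    def receive(self, in_port: int, src_mac: str, dst_mac: str):
        """Learn the sender's port, then return the list of output ports."""
        # Learning step: remember which port the source MAC was seen on.
        self.cam[src_mac] = in_port

        # Forwarding step: look up the destination in the CAM table.
        out_port = self.cam.get(dst_mac)
        if out_port is None:
            # Unknown destination (assumed behavior): flood to all other ports.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination lives on the arrival port; nothing to forward.
            return []
        return [out_port]


switch = LearningSwitch(num_ports=4)
first = switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb")   # flooded
reply = switch.receive(1, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa")   # port 0 only
again = switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb")   # port 1 only
```

Once both stations have transmitted at least once, a sniffer on port 2 or 3 no longer sees their unicast conversation, which is why a switch cannot simply be sniffed the way a hub can.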

3.1.4.1 Obtaining Traffic from Switches

Investigators can, and often do, capture network traffic using switches. Even though by default switches only send traffic to the destination port indicated in the frame, switches with sufficient software capabilities can be configured to replicate traffic from one or more ports to some other port for aggregation and analysis. Different vendors have different terminology for this capability—probably the most common term is Cisco’s Switched Port Analyzer (SPAN) and Remote Switched Port Analyzer (RSPAN). The most vendor-neutral term for this is “port mirroring.”

Switches have varying port mirroring capabilities, depending on their make and model. It is common to encounter a switch that is capable of some port mirroring, but for that capability to be limited by the number of ports that can be mirrored, or by the number of ports that can be used for aggregation and analysis.

Port mirroring is inherently limited by the physical capacity of the switch itself. For example, let’s say you have a 100Mbps switch and you attempt to mirror four ports, each passing an average of 50Mbps, to a single SPAN port. The total amount of traffic from all four ports adds up to 200Mbps, which is twice what any one port can receive. The result is “oversubscription,” and packets will be dropped by the switch.
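The arithmetic in this example is simple enough to check before configuring a mirror. The helper below is a hypothetical sketch that works from average rates only; real traffic is bursty, so drops can occur even when the averages appear to fit.

```python
def mirror_oversubscription(mirrored_mbps, span_capacity_mbps):
    """Return (total offered load, excess load) for a single mirror port.

    mirrored_mbps: average rate of each mirrored port, in Mbps.
    span_capacity_mbps: line rate of the port receiving the copies.
    """
    total = sum(mirrored_mbps)
    excess = max(0, total - span_capacity_mbps)
    return total, excess


# The scenario from the text: four ports at 50Mbps into one 100Mbps port.
total, excess = mirror_oversubscription([50, 50, 50, 50], 100)
dropped_fraction = excess / total if total else 0.0
print(total, excess, dropped_fraction)  # 200 100 0.5
```

In this scenario, at least half of the mirrored traffic cannot be delivered to the monitoring station.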

You need administrative access to the switch’s operating system in order to configure port mirroring. Once you have mirrored the ports of interest, you can connect a sniffer to the mirroring port and capture all of the traffic.

If you don’t have administrative access, it is still possible to sniff traffic from a switch. Let’s study the methods that attackers use. In rare cases, such as when the network administrators themselves are not trusted, an investigator may need to use the same techniques as attackers. These are not the safest methods, as they cause the switch to operate outside normal parameters, but they can work.

To sniff traffic from a switch, attackers use one of two common methods. First, the attacker can flood the switch with bogus information for the CAM table by sending it many Ethernet packets with different MAC addresses. This attack is referred to as “MAC flooding.” Once the CAM table is filled, many switches by default will “fail open,” and send all traffic for systems not in the CAM table out to every port.

Second, an attacker can conduct an “ARP spoofing” attack. Normally, the Address Resolution Protocol (ARP) is used by stations on a LAN to dynamically map IP addresses (Layer 3) to corresponding MAC addresses (see Chapter 9 for details). In an ARP spoofing attack, the attacker broadcasts bogus ARP packets, which link the attacker’s MAC address to the victim’s IP address. Other stations on the LAN add this bogus information to their ARP tables, and send traffic for the victim’s IP address to the attacker’s MAC address instead. This causes all IP packets destined for the victim station to be sent instead to the attacker (who can then copy, modify, and/or forward them on to the victim).

To receive outbound communications, the attacker would again use ARP spoofing to link the attacker’s MAC address to the IP address used by the victim’s gateway. Any packets from the victim destined for the local gateway would be sent instead to the attacker, who could copy or modify them and then forward them along to the gateway.

Note that MAC flooding is an attack on the switch itself, whereas ARP spoofing or poisoning is an attempt to poison the ARP caches of all the systems on the LAN.
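Because ARP spoofing works by associating one IP address with an unexpected MAC address, a capture that contains ARP traffic can be checked for IP addresses claimed by more than one MAC. The sketch below is an illustration only: the function name is hypothetical, and the (IP, MAC) pairs are assumed to have been extracted from ARP replies already. A conflict is a lead rather than proof, since hardware replacement, failover, or address reassignment can produce the same pattern.

```python
from collections import defaultdict


def find_arp_conflicts(observations):
    """Return {ip: sorted list of MACs} for IPs claimed by multiple MACs.

    observations: iterable of (ip_address, mac_address) pairs taken from
    ARP replies seen on the wire.
    """
    claims = defaultdict(set)
    for ip, mac in observations:
        claims[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in claims.items() if len(macs) > 1}


# Hypothetical observations: the gateway address is claimed by two MACs.
seen = [
    ("10.0.0.1", "00:aa:00:00:00:01"),
    ("10.0.0.5", "00:aa:00:00:00:05"),
    ("10.0.0.1", "DE:AD:BE:EF:00:01"),
]
conflicts = find_arp_conflicts(seen)
```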

The safest way to obtain traffic from a switch is to coordinate with a network administrator to configure “port mirroring,” in which traffic from ports of interest is mirrored to a port that is used by the investigator.

Switches can also be attacked in several ways to try to facilitate sniffing. The most common are:

• MAC flooding (which attacks the switch’s CAM table directly)

• ARP spoofing (which attacks the ARP tables of the hosts on the LAN)

It would be hard to argue that either of these methods is really “passive,” since they require an attacker to send extensive and continuing traffic on the network. However, these are methods for facilitating traffic capture on switched networks when port mirroring or tapping a cable is not an option.
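To see why MAC flooding exposes traffic, it can help to model a switch whose CAM table has a fixed capacity. The toy model below is a simplified sketch under stated assumptions: the class name is hypothetical, entries never age out, and the switch “fails open” by flooding any frame whose destination is not in the table. Real switches differ by vendor and configuration. The model sends nothing on a network; it only simulates the table.

```python
class BoundedCamSwitch:
    """Toy switch with a fixed-size CAM table that floods unknown destinations."""

    def __init__(self, num_ports: int, cam_capacity: int):
        self.ports = list(range(num_ports))
        self.capacity = cam_capacity
        self.cam = {}  # maps MAC address -> port number

    def receive(self, in_port: int, src_mac: str, dst_mac: str):
        """Learn the source if there is room, then return the output ports."""
        if src_mac in self.cam or len(self.cam) < self.capacity:
            self.cam[src_mac] = in_port
        out_port = self.cam.get(dst_mac)
        if out_port is None:
            # Fail-open assumption: unknown destinations go to every other port.
            return [p for p in self.ports if p != in_port]
        return [out_port]


switch = BoundedCamSwitch(num_ports=3, cam_capacity=2)

# A station on port 2 fills the table with bogus source addresses.
switch.receive(2, "bogus-1", "nobody")
switch.receive(2, "bogus-2", "nobody")

# Legitimate stations on ports 0 and 1 can no longer be learned, so
# their unicast frames are flooded and reach port 2 as well.
to_gateway = switch.receive(0, "victim", "gateway")
to_victim = switch.receive(1, "gateway", "victim")
```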

Configuration of port mirroring with switches is vendor- and device-specific. The following example shows how to configure port mirroring for a Cisco ASA 5500 (IOS 8.3), which has the hostname “ant-fw.” In this case, traffic on Ethernet ports 2, 3, 4, 5, and 6 is copied to port 7.

ant-fw(config)# interface ethernet 0/7
ant-fw(config-if)# switchport monitor ethernet 0/6
ant-fw(config-if)# switchport monitor ethernet 0/5
ant-fw(config-if)# switchport monitor ethernet 0/4
ant-fw(config-if)# switchport monitor ethernet 0/3
ant-fw(config-if)# switchport monitor ethernet 0/2

Once the SPAN port is set up, an investigator can simply plug a forensic monitoring station into port 7 and sniff traffic copied from ports 2 through 6.

3.2 Traffic Acquisition Software

Once you gain physical access to network traffic, you need software to record it. In this section, we explore the most common software libraries used for recording, parsing, and analyzing captured packet data: libpcap and WinPcap. We also review tools based on these libraries, including tcpdump, Wireshark, and more. We delve into methods for filtering traffic during and after capture, since the importance of filtering has increased in recent years in proportion to the rising volume of network traffic.

3.2.1 libpcap and WinPcap

Libpcap is a UNIX C library that provides an API for capturing and filtering data link-layer frames from arbitrary network interfaces. It was originally developed at the Lawrence Berkeley National Laboratory (LBNL),¹⁷ and initially released to the public in June 1994.¹⁸ Different UNIX systems have different architectures for processing link-layer frames. Consequently, programmers writing a utility on UNIX to inspect or manipulate link-layer frames originally had to write operating system–specific routines for accessing them. The purpose of libpcap was to provide a layer of abstraction so that programmers could design portable packet capture and analysis tools.

17. “LBNL’s Network Research Group,” August 2009, http://ee.lbl.gov/.

18. “Libpcap,” http://www.tcpdump.org/release/libpcap-0.5.tar.gz.

In 1999, the Computer Networks Group (NetGroup) in the Politecnico di Torino published WinPcap, a library based on libpcap that was designed for Windows systems. Since then, many people and companies have contributed to the WinPcap project. The code is now hosted at a site maintained by Riverbed Technology (also the commercial sponsor of Wireshark).

The most popular packet sniffing and analysis tools today are based on the libpcap libraries. These include tcpdump, Wireshark, Snort, nmap, ngrep, and many others. Consequently, these tools are interoperable in the sense that packets captured with one tool can be read and analyzed with another. A quintessential feature of libpcap-based utilities is that they can capture packets at Layer 2 from just about any network interface device and store them in a file for later analysis. Other tools can then read in these “packet capture” or “pcap” files while filtering the traffic based on specific protocol information, perhaps writing out a more refined packet capture for yet further analysis.
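To make the interoperability point concrete, here is a minimal Python sketch of the classic pcap file layout that these tools exchange: a 24-byte global header followed by one 16-byte record header per captured frame. The helper names (build_pcap, parse_pcap) are hypothetical and exist only for illustration. The sketch assumes the classic microsecond-resolution format and does not handle the nanosecond variant or pcapng.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4      # classic pcap magic number (microsecond timestamps)
GLOBAL_HEADER = "IHHiIII"    # magic, version major/minor, thiszone, sigfigs, snaplen, linktype
RECORD_HEADER = "IIII"       # ts_sec, ts_usec, captured length, original length


def build_pcap(frames, snaplen=65535, linktype=1):
    """Serialize (ts_sec, ts_usec, frame_bytes) tuples; linktype 1 is Ethernet."""
    out = struct.pack("<" + GLOBAL_HEADER, PCAP_MAGIC, 2, 4, 0, 0, snaplen, linktype)
    for ts_sec, ts_usec, frame in frames:
        captured = frame[:snaplen]  # anything past snaplen is never stored
        out += struct.pack("<" + RECORD_HEADER, ts_sec, ts_usec, len(captured), len(frame))
        out += captured
    return out


def parse_pcap(data):
    """Return (snaplen, linktype, [(ts_sec, ts_usec, orig_len, captured_bytes), ...])."""
    magic = struct.unpack("<I", data[:4])[0]
    if magic == PCAP_MAGIC:
        endian = "<"
    elif magic == 0xD4C3B2A1:    # same magic, written by a big-endian host
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    fields = struct.unpack(endian + GLOBAL_HEADER, data[:24])
    snaplen, linktype = fields[5], fields[6]
    records, offset = [], 24
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(
            endian + RECORD_HEADER, data[offset:offset + 16])
        offset += 16
        records.append((ts_sec, ts_usec, orig_len, data[offset:offset + incl_len]))
        offset += incl_len
    return snaplen, linktype, records
```

Because every record stores both the captured length and the original length, a reader can tell when a frame was truncated at capture time.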

Many tools based on libpcap also include specialized functionality, such as the ability to merge packet captures (e.g., mergecap), to split a capture up by TCP streams (e.g., tcpflow), or to conduct regular expression searches on packet contents (e.g., ngrep). There are, in fact, so many libpcap-based tools that we can’t hope to cover them all here. Instead, we focus on the tools that are most commonly used.

Both libpcap and WinPcap are free software released under the “BSD license,” which has been approved by the Open Source Initiative.¹⁹

19. “Open Source Licenses—Open Source Initiative,” 2011, http://www.opensource.org/licenses.

3.2.2 The Berkeley Packet Filter (BPF) Language

In the modern era, the volume of data that flows across networks has become so huge that it is very important for investigators to know how to filter it during both capture and analysis.

Libpcap includes an extremely powerful filtering language called the “Berkeley Packet Filter” (BPF) syntax.²⁰ Using BPF filters, you can decide which traffic to capture and inspect and which traffic to ignore. BPF allows you to filter traffic based on value comparisons in fields for Layer 2, 3, and 4 protocols. It includes built-in references called “primitives” for many commonly used protocol fields. BPF invocations can be extremely simple, constructed from primitives such as “host” and “port” specifications, or very arcane constructions involving specific field values by offset (even down to individual bits). BPF filters can also consist of elaborate conditional chains, nesting logical ANDs and ORs.

20. “TCPDUMP/LIBPCAP public repository,” 2010, http://www.tcpdump.org.

BPF syntax is so widely used and supported by traffic capture and analysis tools that every network investigator should be familiar with it. In this section, we review basic elements of the BPF syntax.

3.2.2.1 BPF Primitives

By far, the easiest way to construct a BPF filter is to use BPF primitives to refer to specific protocols, protocol elements, or qualities of a packet capture. Primitives, as defined by the “pcap-filter” manual page, “usually consist of an id (name or number) preceded by one or more qualifiers.” The list of available primitives seems to grow with every revision of libpcap/tcpdump. The manual specifies three different kinds of qualifiers:²¹

21. “Man Page for pcap- (freebsd Section 0)—The UNIX and Linux Forums,” January 6, 2008, http://www.unix.com/man-page/freebsd/7/pcap-filter.

• type qualifiers say what kind of thing the id name or number refers to. Possible types are host, net, port and portrange.

• dir qualifiers specify a particular transfer direction to and/or from id. Possible directions are src, dst, src or dst, src and dst, addr1, addr2, addr3, and addr4.

• proto qualifiers restrict the match to a particular protocol. Possible protos are ether, fddi, tr, wlan, ip, ip6, arp, rarp, decnet, tcp and udp.

There are many other types of primitives available. Check the manual pages for your version of libpcap or libpcap-based tool.

Perhaps the most commonly used BPF primitive is “host id,” which is used to filter traffic involving a host, where id is set to an IP address or name. This can be further restricted by directionality using the primitives “dst host id” or “src host id” to specify that the IP address in question must be either the destination or source, respectively. The same principles apply for the “net,” “ether host,” and “port” classes of primitives. Filtering can also be done based on protocol, using primitives such as “ip” (to filter for IPv4 packets), “ether” (to filter for Ethernet frames), “tcp” (to filter for TCP segments), or “ip proto protocol” (to filter for IP packets that encapsulate the specified protocol). (Please see the tcpdump or “pcap-filter” manual page for a full list of options.) Typing something as simple as ‘tcp and host 10.10.10.10’ produces output that only includes TCP traffic to and from 10.10.10.10, with all other frames filtered out.

For instance, suppose we want to see only the traffic in which a computer with the IP address 192.168.0.1 communicates with any other system except 10.1.1.1 over ports 138, 139, or 445.

Here’s a BPF filter that would accomplish this:

'host 192.168.0.1 and not host 10.1.1.1 and (port 138 or port 139 or port 445)'
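The logic of this filter can be read as an ordinary boolean predicate. The following Python sketch is only an illustration of how the three clauses combine, not how libpcap evaluates filters; the function name and its arguments are hypothetical. As in BPF, an unqualified “host” or “port” matches either the source or the destination.

```python
def matches_filter(src_ip, dst_ip, src_port, dst_port):
    """Mirror 'host 192.168.0.1 and not host 10.1.1.1 and (port 138 or port 139 or port 445)'."""
    hosts = {src_ip, dst_ip}      # "host" matches source or destination address
    ports = {src_port, dst_port}  # "port" matches source or destination port
    return (
        "192.168.0.1" in hosts
        and "10.1.1.1" not in hosts
        and bool(ports & {138, 139, 445})
    )


# 192.168.0.1 talking to a file server on port 445 is kept.
print(matches_filter("192.168.0.1", "10.2.2.2", 50000, 445))
# The same conversation with the excluded host 10.1.1.1 is discarded.
print(matches_filter("192.168.0.1", "10.1.1.1", 50000, 445))
```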

Commonly used BPF primitives include:²²

22. “tcpdump(8): dump traffic on network—Linux man page,” 2011, http://linux.die.net/man/8/tcpdump.

• host id, dst host id, src host id

• net id, dst net id, src net id

• ether host id, ether dst host id, ether src host id

• port id, dst port id, src port id

• gateway id, ip proto id, ether proto id

• tcp, udp, icmp, arp

• vlan id

There are many other BPF primitives. Please see the tcpdump or “pcap-filter” manual page for a full list.

3.2.2.2 Filtering Packets by Byte Value

In addition to primitive comparisons, the BPF language can be used to compare the values of any byte-sized (or multibyte-sized) fields within a frame. The BPF language provides syntax to specify the byte offset relative to the beginning of common Layer 2, 3, and 4 protocols. Important: Byte offsets are counted starting from 0! For example, the first byte in a structure is at offset “0,” and is referred to as the zero-byte offset. The seventh byte in a structure is at offset “6,” and is referred to as the sixth-byte offset.

Here are some examples:

• ip[8] < 64

This filter would match all packets in which the single-byte field at byte offset 8 of the IP header is less than 64. This field is called the “Time to live,” or “TTL.” The default starting TTL in most Windows systems is 128, so this would probably weed out traffic from Windows systems on the LAN. Linux systems usually have a default starting TTL of 64, so their traffic matches once it has crossed at least one router (each hop decrements the TTL). A Linux host on the same segment still shows a TTL of exactly 64, which “< 64” does not match.

• ip[9] != 1

This filter would match frames whose single-byte field at the ninth byte offset of the IP header does not equal “1.” Since the ninth byte offset of the IP header specifies the embedded protocol, and “1” represents “ICMP,” this would match any IPv4 packet where the embedded protocol is not ICMP. For IPv4 traffic, this expression has the same effect as the primitive “not icmp” (which, unlike the byte comparison, also matches non-IP frames such as ARP).

• icmp[0] = 3 and icmp[1] = 0

This construct isolates all ICMP traffic where the one-byte field at the 0 byte offset of the ICMP header is equal to 3, and where the one-byte field at the first byte offset is equal to 0. In other words, this filter matches only ICMP traffic where the packet is of ICMP type 3 code 0: “Network Unreachable” messages.

• tcp[0:2] = 31337

This construct introduces the notation for specifying a multibyte field (2 bytes) at the 0 byte offset of the TCP header (the TCP source port). In this case, it would be equivalent to using the BPF primitive “tcp src port 31337.”

• ip[12:4] = 0xC0A80101

Here we see a 4-byte comparison of the contents of the IP header beginning at the 12th byte offset: the source IP address. Notice that the comparison is made in hexadecimal notation. Converted to decimal notation, this is equivalent to 192.168.1.1 (0xC0 = 192, 0xA8 = 168, 0x01 = 1). This filter matches all traffic where the source IP address is 192.168.1.1.
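To see how these offsets line up with real header bytes, the following Python sketch packs a 20-byte IPv4 header by hand and reads fields from it the way “ip[offset:size]” does. The helper names and the sample addresses are assumptions made for illustration, and the checksum is left at zero because it is not needed here.

```python
import socket
import struct


def build_ipv4_header(src, dst, ttl=64, proto=6):
    """Pack a minimal 20-byte IPv4 header (no options, checksum left at zero)."""
    version_ihl = (4 << 4) | 5           # version 4, header length 5 words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, 20, 0, 0,         # version/IHL, TOS, total length, ID, flags/fragment
        ttl, proto, 0,                    # byte 8 = TTL, byte 9 = protocol, bytes 10-11 = checksum
        socket.inet_aton(src),            # bytes 12-15 = source address
        socket.inet_aton(dst),            # bytes 16-19 = destination address
    )


def field(header, offset, size=1):
    """Mimic BPF's proto[offset:size]: an unsigned big-endian integer."""
    return int.from_bytes(header[offset:offset + size], "big")


hdr = build_ipv4_header("192.168.1.1", "10.0.0.5", ttl=57, proto=17)
print(field(hdr, 8) < 64)                 # ip[8] < 64
print(field(hdr, 9) != 1)                 # ip[9] != 1
print(field(hdr, 12, 4) == 0xC0A80101)    # ip[12:4] = 0xC0A80101
```

All three comparisons are true for this sample header: the TTL is 57, the protocol is 17 (UDP), and the source address is 192.168.1.1.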

3.2.2.3 Filtering Packets by Bit Value

Of course, not every protocol header field is a whole number of bytes in size, nor does every field begin and end precisely on a byte boundary. Fortunately, the BPF language includes a way to specify smaller or offset fields (although it may seem complicated). To do this, we cite a specific byte or bytes (as explained above), and then compare them bit-by-bit to some value that we hope to find. This is called “bitmasking.” Essentially, we must identify one or more byte-sized chunks of data that contain the bits we are interested in, and then specify the particular bits of interest using a binary representation known as a “bitmask.” In the bitmask, “1” represents a bit of interest and “0” represents a bit we choose to ignore.

For example, if we’re only interested in the four lowest-order bits of a byte (the lowest-order “nibble”), we can represent this using a bitmask of “00001111” (in binary), which can also be represented as the value “0x0F” (in hexadecimal).

The IP header is a minimum of 20 bytes long, by specification. It is possible to include optional header fields which increase the IP header length. However, IP options are not commonly used in practice (for a while, it was vogue for attackers to use IP options to configure source routing, which facilitated man-in-the-middle attacks).

Let’s suppose that we’d like to filter for packets where IP options are set (in other words, the IP header length is greater than 20 bytes). The low-order nibble of the first byte of the IP header (byte offset 0) represents the IP header length, measured in 32-bit “words” (each word is four bytes long). To find all packets where the IP header is greater than 20 bytes in length, we need to match packets where the low-order nibble is greater than five (5 words * 4 bytes per word = 20 bytes). To accomplish this, we create a BPF filter with a bitmask of “00001111” (0x0F), which is logically “AND-ed” with the targeted value. The resulting expression is:

ip[0] & 0x0F > 0x05

Likewise, if we want to find all of the IP packets where the “Don’t Fragment” flag (a single binary bit at byte offset 6 of the IP header) is set to 1, we can create a binary bitmask of “01000000” (0x40) to denote that we’re only interested in the packets where the second-to-highest-order bit is 1, and then compare this to byte offset 6 of the IP header:

ip[6] & 0x40 != 0
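The same masking can be tried outside of BPF. This Python sketch applies both bitmasks to two hand-written sample headers; the function names and header bytes are hypothetical values chosen for illustration.

```python
def has_ip_options(ip_header):
    """Equivalent of 'ip[0] & 0x0F > 0x05': header longer than 5 words (20 bytes)."""
    return (ip_header[0] & 0x0F) > 0x05


def dont_fragment_set(ip_header):
    """Equivalent of 'ip[6] & 0x40 != 0': the Don't Fragment flag is 1."""
    return (ip_header[6] & 0x40) != 0


# First byte 0x45: version 4, header length 5 words. Byte 6 is 0x40: DF set.
plain = bytes([0x45, 0, 0, 40, 0, 0, 0x40, 0, 64, 6, 0, 0,
               192, 168, 1, 1, 10, 0, 0, 5])

# First byte 0x46: header length 6 words (24 bytes), so one word of options follows.
with_options = bytes([0x46, 0, 0, 44, 0, 0, 0x00, 0, 64, 6, 0, 0,
                      192, 168, 1, 1, 10, 0, 0, 5, 1, 1, 1, 0])

print(has_ip_options(plain), dont_fragment_set(plain))
print(has_ip_options(with_options), dont_fragment_set(with_options))
```

The mask 0x0F discards the version nibble (the “4” in 0x45), leaving only the header length to compare against 5.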

• libpcap includes the Berkeley Packet Filter (BPF) language

• Almost all libpcap-based utilities leverage this functionality

• Simple filters can be built with primitives

• Any byte-sized protocol field can be compared with a numerical value

• Bitwise filters can be built with bitmasking and value comparisons

• Complex expressions can be constructed via nested logical ANDs and ORs

3.2.3 tcpdump

Tcpdump is a tool for capturing, filtering, and analyzing network traffic. It was developed at LBNL, and first released to the public in January 1991.²³ Note that although tcpdump currently relies upon libpcap for functionality, its public release actually predates that of libpcap.

23. “Internet Archive Wayback Machine,” http://web.archive.org/web/20001001124902/http://www.tcpdump.org/release/tcpdump-3.5.tar.gz.

Tcpdump was designed as a UNIX tool. In 1999, researchers at the Politecnico di Torino ported it to Windows and released “WinDump,” a Windows version.²⁴ WinDump is now hosted at a site maintained by Riverbed Technology, along with WinPcap. The tcpdump and WinDump utilities are not completely identical, but typically produce comparable results. It can happen that a packet capture produced by WinDump is not subsequently readable by tcpdump, but such an occurrence is infrequent. The command syntax for both tools is very similar. For the purposes of this book, you can consider them interchangeable except when differences are explicitly mentioned.

24. Riverbed Technology, “WinDump—Change Log,” December 4, 2006, http://www.winpcap.org/windump/misc/changelog.htm.

The basic purpose of tcpdump is to capture network traffic and then print or store the contents for analysis. Tcpdump captures traffic bit-by-bit as it traverses any physical media suitable for conducting link-layer traffic (copper, fiber, or even air). Since tcpdump is based on libpcap, it captures at Layer 2 (the data link layer). (By default, tcpdump’s output only displays Layer 3 and above protocol details, but the “-e” flag can be used to show Layer 2 data as well.)

Beyond merely capturing packets, tcpdump can decode common Layer 2 through 4 protocols (and some higher-layer protocols as well), and then display their information to the user. The decoded packets can be displayed in hexadecimal or in the ASCII equivalents (where the data is textual), or both.

There are two ways that tcpdump is most commonly employed. First, it is used to facilitate on-the-fly analysis for troubleshooting network issues in a tactical way. This typically encompasses capture, filtering, and analysis, all performed simultaneously. However, as common as this practice is, it tends to be suitable only when a quick glance at the data will suffice.

Tcpdump is also frequently used to capture traffic of interest passing on a target segment over a longer period of time, and store it for offline analysis and perhaps even future correlation with other data. Depending on the throughput and utilization of the network segment and the amount of each packet retained, the volume of data captured can be enormous.

3.2.3.1 Fidelity

One reason that tcpdump is such a powerful tool is that it is capable of capturing traffic with high fidelity, to the degree that the resulting packet capture can constitute evidence admissible in court. However, the quality of the packet capture can be impacted by hardware limitations and configuration constraints.

For example, tcpdump’s ability to capture packets may be limited by the clock speed of the processor in the capturing workstation. Capturing packets is a CPU-intensive activity. If the CPU of the capturing station becomes too heavily utilized, either by the tcpdump process itself or by any other running process, tcpdump will “drop packets”—in other words, it will fail to capture them. When this happens, there will be packets flying by that tcpdump simply doesn’t have the resources to pick up, and it will ignore them entirely.

If you are merely interested in sampling packets, this may not matter too much. However, if your goal is to obtain a high-fidelity capture of ongoing network traffic, missing nothing, this problem would be absolutely critical.

Especially on high-traffic networks, investigators may also be limited by disk space. If the capturing workstation does not have enough disk space to store the volume of traffic that investigators hope to be able to inspect, it may be necessary to either filter traffic upon capture or provide more disk storage.

One crucial configuration option for capturing packets using tcpdump is the snapshot length, known as “snaplen.” Snaplen represents the number of bytes of each frame that tcpdump will record. It is calculated from the zero-byte offset of the data link-layer frame.

Selecting the correct snaplen for a packet capture is critical. If the chosen snaplen is too short, data will be missing from every frame and can never be recovered. If the snaplen is too long, it may cause performance degradation, limit the volume of traffic that can be stored, and perhaps cause violations of regulations such as the United States Wiretap Act, which prohibit capturing communications contents except in certain circumstances.

Tcpdump’s default snaplen varies depending on the version used. As of version 4.1.0, by default tcpdump captures the full frame automatically.²⁵ This has undoubtedly saved many forensic investigators from much pain and heartache. Older versions of tcpdump had a default snaplen of 68, meaning only the first 68 bytes of the frame were actually captured. To capture full packet contents, users had to manually specify a larger value. Woe to the investigator who forgot to specify a longer snaplen, only to discover later that every packet was truncated, the contents never to be recovered!

25. “tcpdump,” http://www.tcpdump.org/release/tcpdump-4.1.1.tar.gz.

Once upon a time, many people recommended using a snaplen of 1,514 bytes because the maximum transmission unit (MTU) of Ethernet is 1,500 bytes. Since the Ethernet header itself is 14 bytes long, this meant that the maximum frame length was 1,514 bytes. Unfortunately, many people erroneously specified a snaplen of just 1,500 bytes, forgetting to account for the 14-byte Ethernet header. As a result, they accidentally truncated every full-size frame by 14 bytes, which rendered full content reconstruction impossible.

Later versions of tcpdump allowed the user to specify “0” for the snaplen, which would tell tcpdump to automatically capture the entire frame, no matter how long it was. Modern versions of tcpdump maintain this syntax for backward compatibility.
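The arithmetic behind these snaplen choices is easy to check. The following Python sketch computes how many bytes of a frame are lost for a given snaplen, treating 0 as “capture the entire frame” as described above. The function and constant names are hypothetical.

```python
ETHERNET_HEADER_LEN = 14   # destination MAC + source MAC + EtherType
ETHERNET_MTU = 1500        # largest Layer 3 payload of a standard Ethernet frame


def bytes_lost(frame_len, snaplen):
    """Number of trailing bytes of a frame that a capture with this snaplen discards."""
    if snaplen == 0:       # 0 means "capture the whole frame"
        return 0
    return max(0, frame_len - snaplen)


full_frame = ETHERNET_HEADER_LEN + ETHERNET_MTU   # 1,514 bytes

print(bytes_lost(full_frame, 1514))   # correct snaplen: nothing lost
print(bytes_lost(full_frame, 1500))   # forgot the Ethernet header: 14 bytes lost
print(bytes_lost(full_frame, 68))     # old default: almost all of the payload lost
```

Frames shorter than the snaplen are unaffected, which is why a 1,500-byte snaplen only damages full-size frames.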

For performance and regulatory reasons, investigators may still wish to specify a smaller snaplen. As described in the tcpdump manual, “taking larger snapshots both increases the amount of time it takes to process packets and, effectively, decreases the amount of packet buffering. This may cause packets to be lost. You should limit snaplen to the smallest number that will capture the protocol information you’re interested in.”²⁶

26. “Manpage of TCPDUMP,” March 5, 2009, http://www.tcpdump.org/tcpdump_man.html.

3.2.3.2 Filtering Packets with tcpdump

Filtering during capture is very important because resources such as disk space, CPU cycles, and traffic aggregation capacity are always limited. Filtering indiscriminately, however, can cause loss of evidence, which can never be recaptured. You only get one chance to capture a frame at the moment it zips past on the wire (or through the air). Once you miss it, it’s gone forever and will never be part of your analysis.

The ability to filter is also crucial during analysis. When resources permit, it is often useful to “throw a wide net” and sift through the data offline. However, it can be very difficult to find the packets of interest within potentially huge volumes of data.

Fortunately, since tcpdump is a libpcap-based tool, it incorporates the BPF language, which investigators can use to filter traffic during capture and analysis. One good strategy for analysis of large volumes of traffic is to begin by filtering out any types of traffic that are not related to the investigation. For example, imagine that we have a packet capture that contains 75% web traffic (TCP port 80), which is fairly typical for an enterprise network. If our investigation is entirely unrelated to web traffic, we might decide to use a BPF filter of 'not (tcp and port 80)' to cut the volume by 75%. Phew!

Here’s an example that shows how tcpdump is used to display traffic from the “eth0” network interface, excluding TCP port 80 traffic:

# tcpdump -nni eth0 'not (tcp and port 80)'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
12:49:33.631163 IP 10.30.30.20.123 > 10.30.30.255.123: NTPv4, Broadcast,
    length 48
12:49:38.197072 IP 192.168.30.100.57699 > 192.168.30.30.514: SYSLOG local2.
    notice, length: 1472
12:49:38.197319 IP 192.168.30.100.57699 > 192.168.30.30.514: SYSLOG local2.
    notice, length: 1472
12:49:38.197324 IP 192.168.30.100 > 192.168.30.30: udp
12:49:38.197327 IP 192.168.30.100 > 192.168.30.30: udp
12:49:38.197568 IP 192.168.30.100.57699 > 192.168.30.30.514: SYSLOG local2.
    notice, length: 1472
12:49:38.197819 IP 192.168.30.100.57699 > 192.168.30.30.514: SYSLOG local2.
    notice, length: 1472
12:49:38.197825 IP 192.168.30.100 > 192.168.30.30: udp
12:49:38.197827 IP 192.168.30.100 > 192.168.30.30: udp
12:49:38.197829 IP 192.168.30.30.39879 > 10.30.30.20.53: 16147+ PTR?
    100.30.168.192.in-addr.arpa. (45)
10 packets captured
10 packets received by filter
0 packets dropped by kernel

In Figure 3-1, we have shown a few commonly used tcpdump command-line options. There are many more command-line options for tcpdump than are presented here, but these are some of the most important for capturing traffic and storing it in files.

Figure 3-1 Handy tcpdump command-line options. See the tcpdump manual for more details.²⁷

27. Ibid.

The -C option, used with -w, allows analysts to specify the maximum size of a packet capture file, in millions of bytes. Once the file has reached the specified size, tcpdump closes it and opens a new file that has the same name with a number appended to it (starting at 1 and incrementing). This allows investigators to keep pcap files limited to a size that is easily transferred and practical for inspection with other analysis tools such as Wireshark. In addition, using the -W option along with the -C option, analysts can specify how many of these output files should exist on the hard drive at any one time. Once the limit has been reached, tcpdump begins to overwrite the oldest file, creating a rotating store of packets that occupies a fixed amount of disk space.
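When planning a rotating capture, it helps to estimate how much disk the files will occupy and how much time they will cover. The Python sketch below does that arithmetic under simplifying assumptions: the link rate is constant, every byte on the wire is stored, and the small per-packet pcap record overhead is ignored. The function names are hypothetical.

```python
def rotation_budget_bytes(file_size_millions, file_count):
    """Approximate disk used by 'tcpdump -C <file_size_millions> -W <file_count>'."""
    return file_size_millions * 1_000_000 * file_count   # -C counts millions of bytes


def retention_seconds(budget_bytes, megabits_per_second):
    """Roughly how long traffic at a steady rate takes to fill the rotating store."""
    bytes_per_second = megabits_per_second * 1_000_000 / 8
    return budget_bytes / bytes_per_second


budget = rotation_budget_bytes(100, 10)   # ten files of about 100 million bytes each
print(budget)
print(retention_seconds(budget, 80))      # seconds of history at a steady 80 Mbit/s
```

At a sustained 80 Mbit/s, ten 100-million-byte files hold under two minutes of traffic, which illustrates why filtering or a smaller snaplen is often needed on busy segments.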

Below are five common invocations of tcpdump, which illustrate some of the basic functionality:

• tcpdump -i eth0 -w great_big_packet_dump.pcap

This is the simplest case of listening on interface eth0 and writing all of the packets out to a single monolithic file.

• tcpdump -i eth0 -s 0 -w biggest_possible_packet_dump.pcap

This instance is similar to the one above, except that by setting the snaplen to zero, we are telling tcpdump to grab the entire frame regardless of its size (rather than the first 68 bytes only). Note that specifying -s 0 is not necessary for newer versions of tcpdump, because the command functionality was updated to make this behavior the default.

• tcpdump -i eth0 -s 0 -w targeted_full_packet_dump.pcap 'host 10.10.10.10'

Here we introduce a simple BPF filter to grab and store in their entirety only those packets sent to or from the host at the address “10.10.10.10.”

• tcpdump -i eth0 -s 0 -C 100 -w rolling_split_100MB_dumps.pcap

Here we abandoned our host-based targeting, and instead we are grabbing all of every frame, but splitting the captures into multiple files no larger than 100MB each.

• tcpdump -i eth0 -s 0 -w RFC3514_evil_bits.pcap 'ip[6] & 0x80 != 0'

Finally, we introduce a more complicated BPF filter, in which we target the first byte of the IP fragmentation fields (byte offset 6). We employ a bitmask to narrow our inspection to the single highest order bit, most commonly known as the IP “reserved bit,” and we capture and store the packet only if the reserved bit is nonzero.

The original RFC 791 “Internet Protocol,” published in 1981, specified that the very first, or “high order,” bit of the sixth byte offset would be “reserved”—in other words, unused—and as such it should be zero.²⁸

28. Information Sciences Institute, USC, “RFC 791—Internet Protocol: Darpa Internet Program Protocol Specification,” September 1981, http://www.rfc-editor.org/rfc/rfc791.txt.

On April Fool’s Day in 2003, Steve Bellovin published RFC 3514, “The Security Flag in the IPv4 Header,” which proposed that this bit be used as a “security flag.” Bellovin suggested that any packet that had been built for the purpose of benign and normal IP traffic keep the bit unset, but that any packet that had been built for malicious or evil intent must set this bit to one. He reasoned that as a result, firewall vendors and those interested in intrusion detection would have a much easier time detecting which packets to disallow or to alert on.²⁹

29. S. Bellovin, “RFC 3514—The Security Flag in the IPv4 Header,” IETF, April 1, 2003, http://www.rfc-editor.org/rfc/rfc3514.txt.

The following tcpdump invocation allows us to capture only traffic with the RFC 3514 Evil Bit set, while ignoring all the rest:

tcpdump -i eth0 -s 0 -w RFC3514_evil_bits.pcap 'ip[6] & 0x80 != 0'

For those attackers who are interested in adhering to specification, Jason Mansfield has created an “evil bit changer.” This handy utility captures traffic or reads it from a file, sets the Evil Bit to “1,” recalculates the IP header checksum, and then forwards the frame along to its intended destination.³⁰

30. Jason Mansfield, “evilbitchanger—Set the evil bit on IP packets.—Google Project Hosting,” 2011, http://code.google.com/p/evilbitchanger/.
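The two header manipulations described above (setting the bit and recalculating the checksum) can be sketched in a few lines of Python. This is not Mansfield’s implementation, only an illustration of the standard IPv4 header checksum (a one’s complement sum of 16-bit words); the function names and the sample header are assumptions.

```python
import struct


def ipv4_checksum(header):
    """One's complement of the one's complement sum of the header's 16-bit words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                          # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_evil_bit(header):
    """Set the reserved bit (0x80 of byte offset 6) and rewrite the checksum."""
    hdr = bytearray(header)
    hdr[6] |= 0x80
    hdr[10:12] = b"\x00\x00"                    # checksum field must be zero while summing
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


# Sample 20-byte header: 192.168.0.1 -> 192.168.0.199, UDP, DF set, checksum 0xB861.
original = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
evil = set_evil_bit(original)
print(evil.hex())
print(ipv4_checksum(evil) == 0)   # a header with a valid checksum sums to zero
```

A packet modified this way would now match the filter 'ip[6] & 0x80 != 0' shown earlier.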

3.2.4 Wireshark

Wireshark is a graphical, open-source tool designed for capturing, filtering, and analyzing traffic. Wireshark’s easy-to-use graphical user interface makes it a great first tool for novice network forensics analysts, while its advanced packet filtering capabilities, protocol decoding features, and support for the Packet Details Markup Language (PDML) also make it extremely useful for experienced investigators.

Wireshark (originally named “Ethereal”) was initially released in 1998 by Gerald Combs. (The name was changed in 2006 when Combs moved to CACE Technologies because his previous employer maintained the trademark “Ethereal.” Subsequently, CACE Technologies was acquired by Riverbed Technology.) Over time, Wireshark has continued to mature, and there are now hundreds of contributing authors.³¹

31. “Wireshark About,” 2011, http://www.wireshark.org/about.html.

Wireshark allows you to capture packets on any system network interface, assuming you have appropriate permissions to do so and your network card supports sniffing. Wireshark can display packets as they are captured in real time. It is a very powerful protocol analyzer and uses a lot of processing power crunching through protocol data. Recall from our discussion of capture fidelity (Section 3.2.3.1) that if the CPU load is too high, packets can get dropped. It is worth carefully examining and tuning Wireshark’s options for capturing packets, especially since displaying and filtering packets while capturing can use up CPU power and result in lost packets.

Figure 3-2 shows a screenshot of Wireshark’s “Capture Options” panel.

Figure 3-2 Wireshark’s “Capture Options” screen (Wireshark version 1.2.11).

3.2.5 tshark

Tshark is a command-line network protocol analysis tool that is part of the Wireshark distribution. Like Wireshark, it is libpcap-based, and can read and save files in the same standard formats as Wireshark. In addition to analysis, you can also use tshark to capture packets.³² The example below shows tshark capturing traffic on the network interface “eth0,” filtering out all port 22 traffic, and storing the results in the file “test.pcap.”

32. “tshark—The Wireshark Network Analyzer 1.5.0,” http://www.wireshark.org/docs/man-pages/tshark.html.

# tshark -i eth0 -w test.pcap 'not port 22'
Capturing on eth0
235

Please see Section 4.1.2.3 for more discussion of tshark’s protocol analysis capabilities.

3.2.6 dumpcap

The Wireshark distribution also comes with a command-line tool, “dumpcap,” which is specifically designed to capture packets.³³ If you would like to use the Wireshark distribution to capture packets for an investigation, the dumpcap tool is probably your best choice. Since dumpcap is a specialized tool designed just for capturing packets, it takes up fewer system resources, maximizing your capture capabilities. Furthermore, many operating systems limit traffic sniffing to programs that are run with administrative privileges. Wireshark itself is a complex, multifaceted program. From a security perspective, it is safer to limit administrative access to the smaller, simpler dumpcap program.

33. “Wireshark,” http://www.wireshark.org/docs/wsug_html_chunked/AppToolsdumpcap.html.

Dumpcap automatically writes packet captures to a file. Here is an example in which we use dumpcap to capture traffic on the interface eth0, filter out all port 22 traffic, and save the results in the file “test.pcap”:

$ dumpcap -i eth0 -w test.pcap -f 'not port 22'
File: test.pcap
Packets: 12
Packets dropped: 0

In Chapter 4, we discuss Wireshark’s packet analysis and data extraction capabilities.

3.3 Active Acquisition

As we discussed in Chapter 2, evidence lives in many places throughout a network. In addition to capturing network traffic as it travels through wires and in the air, we may also choose to gather evidence from network devices, including firewalls, web proxies, logging servers, and more. In many cases, these devices cannot be removed from the production environment without causing serious damage to business operations. In other cases, the evidence stored is highly volatile and must be collected while the system is still running, perhaps even still on the network. For these reasons and others, network forensic investigators often interact with network devices that are live and on the network.

By definition, active evidence acquisition modifies the environment. Investigators should be highly aware of the various ways in which live acquisition modifies the devices and environment under investigation, and work to minimize the impact.

3.3.1 Common Interfaces

In this section, we review common ways that investigators gain access to live network-based devices, including:

• Console

• Secure Shell (SSH)

• Secure Copy (SCP) and SSH File Transfer Protocol (SFTP)

• Telnet

• Simple Network Management Protocol (SNMP)

• Trivial File Transfer Protocol (TFTP)

• Web and proprietary interfaces

In later chapters, we review evidence acquisition and analysis relating to specific devices, including firewalls, web proxies, central log servers, and more.

3.3.1.1 Console

The console is an input and display system, usually a keyboard and monitor connected to a computer. Many network devices have a serial port that you can use to connect a terminal to the console.

It is possible to connect modern laptops and desktops to the serial console of network devices using USB-to-serial adapters. Figure 3-3 shows an example of a Keyspan serial-to-USB adapter. Once your forensic workstation is connected to the serial port using a serial-to-USB adapter, you can use the Linux “screen” command to connect to the console and log your session:

$ screen -L /dev/ttyUSB0

Figure 3-3 A Keyspan serial-to-USB adapter, which can be used to connect a modern laptop or desktop to the serial console of a network device.

Whenever possible, it is best to connect directly to the console of a network device rather than connecting remotely over the network. When you connect to a device over the network, you create additional traffic and often unintentionally change the state of local networking devices (such as CAM tables, log files, etc.). When you connect directly to the console, you can dramatically reduce your footprint.

3.3.1.2 Secure Shell (SSH)

The Secure Shell protocol (SSH) is a common way for investigators to gain remote command-line access to systems containing network-based evidence. Developed as a replacement for the insecure Telnet and rlogin, SSH encrypts authentication credentials and data in transit. This means that even if the SSH traffic is intercepted, an attacker would be unable to recover the username, password, or contents of the communication. OpenSSH is a widely used implementation of SSH, which has been released free and open-source under a BSD license.

Most modern network devices now support SSH as a method for remote command-line interaction. Here is an example in which we use SSH to log into a system remotely on TCP port 4022, using the account “sherri”:

$ ssh -p 4022 sherri@remote

You can also use SSH to run commands remotely. In the following example, we execute the command “hostname” to retrieve the hostname of the remote server (which is named “remote”):

$ ssh -p 4022 <user>@<remote-host> 'hostname'
remote

3.3.1.3 Secure Copy (SCP) and SFTP

In addition to providing interactive command-line access, the SSH suite includes Secure Copy (SCP), a command-line utility that transfers files between networked systems over an SSH connection. Local files are referred to by their local paths, while remote files are specified by the username, hostname, and path to the file on the remote system, as in the following example:

$ scp -P 4022 jonathan@<remote-host>:/etc/passwd .

Here we copied the /etc/passwd file from the remote server to the current working directory on our local system, using the account “jonathan.” Note the single dot argument at the end of the line, which specifies that we want to copy the file into our current local directory. Also note the capital “P” used to specify the port (in contrast, the “ssh” command uses a lowercase “p” for this purpose, which can be confusing).
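
The difference between the two port flags is easy to get wrong in scripts. The following is a minimal sketch, assuming the OpenSSH command-line clients are installed; the helper names ssh_argv and scp_argv and the host device.example.net are hypothetical and used only for illustration.

```python
from typing import List, Optional


def ssh_argv(user: str, host: str, port: int,
             remote_cmd: Optional[str] = None) -> List[str]:
    """Build an argument list for the ssh client (lowercase -p sets the port)."""
    argv = ["ssh", "-p", str(port), f"{user}@{host}"]
    if remote_cmd is not None:
        # A trailing argument is executed on the remote system.
        argv.append(remote_cmd)
    return argv


def scp_argv(user: str, host: str, port: int,
             remote_path: str, local_path: str = ".") -> List[str]:
    """Build an argument list for scp (capital -P sets the port)."""
    return ["scp", "-P", str(port), f"{user}@{host}:{remote_path}", local_path]


# Hypothetical host name, chosen only for this example.
print(ssh_argv("sherri", "device.example.net", 4022, "hostname"))
print(scp_argv("jonathan", "device.example.net", 4022, "/etc/passwd"))
```

Passing a list like this to subprocess.run, rather than a single shell string, avoids quoting surprises and keeps the exact command you ran easy to record.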

The SSH File Transfer Protocol, or SFTP, is an alternative protocol used in conjunction with SSH for secure file transfer and manipulation. It is more portable and offers more capabilities than SCP, but file transfer tends to be slower.

3.3.1.4 Telnet (yes, Telnet)

You may still encounter network devices that can only be accessed remotely using Telnet. Telnet is a command-line remote communications interface, which was originally developed in 1969 and eventually standardized by the IETF in RFCs 854 and 855.^34,35 It became extremely widespread and is still used to this day for remote access on many network devices. In addition to connecting to Telnet servers, the Telnet client can be used to interact with a wide variety of servers, such as SMTP and HTTP.

34. J. Postel and J. Reynolds, “RFC 854—Telnet Protocol Specification,” IETF, May 1983, http://rfc-editor.org/rfc/rfc854.txt.

35. J. Postel and J. Reynolds, “RFC 855—Telnet Option Specifications,” IETF, May 1983, http://rfc-editor.org/rfc/rfc855.txt.

As was common for communications protocols developed in the early days of networking, Telnet had only limited security built in. All transactions are in plain text, so authentication credentials and data are sent unencrypted across the wire. If you are investigating a network that an attacker may be monitoring, be very careful because simply logging into a device via Telnet will expose its credentials to anyone who can capture traffic on the local segment.

Despite its serious security drawbacks, in many cases Telnet is the only option for remote access to a network device, because some devices have such limited hardware or software capacities that they cannot be upgraded to more secure remote access tools such as SSH.

Here is an example in which we use Telnet to connect to a remote HTTP server on port 80:

$ telnet lmgsecurity.com 80
Trying 204.11.246.1...
Connected to lmgsecurity.com.
Escape character is '^]'.
GET / HTTP/1.1
Host: lmgsecurity.com

HTTP/1.1 200 OK
Date: Sun, 26 Jun 2011 21:39:33 GMT
Server: Apache/2.2.9 (Debian) PHP/5.2.6-1+lenny10 with Suhosin-Patch
mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9 OpenSSL/0.9.8g mod_perl/2.0.4
Perl/v5.10.0
Last-Modified: Thu, 23 Jun 2011 22:40:55 GMT
ETag: "644284-17da-4a668c728ebc0"
Accept-Ranges: bytes
Content-Length: 6106
Content-Type: text/html
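
The same exchange can be scripted, which makes it easier to repeat and to record. The following is a rough sketch using a plain TCP socket in place of the Telnet client; the function names are hypothetical, and the "Connection: close" header is an addition that does not appear in the session above (it asks the server to close the connection when it is done).

```python
import socket
from typing import Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the request typed by hand above, plus a Connection: close header."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    # HTTP lines end in CRLF; the empty line terminates the header block.
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> Tuple[str, int, str]:
    """Split the first response line into (version, status code, reason)."""
    first = response.split(b"\r\n", 1)[0].decode("iso-8859-1")
    version, code, reason = first.split(" ", 2)
    return version, int(code), reason


def fetch_headers(host: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Connect, send the request, and return only the response headers."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host))
        data = b""
        # Read until the blank line that separates headers from the body.
        while b"\r\n\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n\r\n", 1)[0]
```

Calling fetch_headers("lmgsecurity.com") would generate the same kind of traffic as the Telnet session, so it is just as much an active technique. The parser assumes the status line includes a reason phrase, as in the example above.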

3.3.1.5 Simple Network Management Protocol (SNMP)

One of the most commonly used protocols for network device inspection and management is the Simple Network Management Protocol (SNMP). Using SNMP, you can poll networked devices from a central server, or push SNMP information from remote agents to such a central aggregation point. SNMP is frequently used as a medium for communicating and aggregating both network management information (often of interest to the forensic analyst) and security event data (usually of great interest to forensic analysts). In network forensics, SNMP is commonly used in one of two ways: event-based alerting and configuration queries.

SNMP was designed to be extensible through the definition of the “management information base” (MIB), which basically describes the database of managed information. Unfortunately, this MIB definition is based on ASN.1 notation, which has been found to have multiple flaws both in design and in implementation (parsing languages are notorious for their propensity for errors in input validation). There are also problems with the authentication model, which have been best addressed in the current version (SNMPv3).

Here’s a list of the basic SNMP operations:

• Polling: GET, GETNEXT, GETBULK

• Interrupt: TRAP, INFORM

• Control: SET

As flawed as SNMP has historically been, it is still widely used and can be a critical component in network forensic investigations. There are far too many SNMP-based tools in wide deployment to even begin to discuss them all here. However, they all work in roughly the same way, as we now discuss.

In products that poll SNMP agents, there are a few methods available: GET, GETNEXT, and GETBULK. These operations are employed to retrieve information from the managed device, including routing tables, system uptime, hostname, ARP tables, CAM tables, and more. The “GET” operation is used to retrieve one piece of information, whereas the “GETNEXT” and “GETBULK” commands are used to retrieve multiple pieces of information.
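
To make the difference between GET and GETNEXT concrete, here is a toy model of an agent's MIB held in memory. It is only a sketch: no SNMP packets are sent, the ToyAgent class is hypothetical, and the values are invented. The OIDs are the standard MIB-II identifiers for sysDescr.0, sysUpTime.0, sysName.0, and ifNumber.0.

```python
from typing import Dict, List, Optional, Tuple

Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """Convert dotted notation such as '1.3.6.1.2.1.1.5.0' to a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


class ToyAgent:
    """In-memory stand-in for an SNMP agent (hypothetical, for illustration)."""

    def __init__(self, values: Dict[str, object]) -> None:
        self._mib = {parse_oid(k): v for k, v in values.items()}
        # Tuple comparison matches the lexicographic ordering of OIDs.
        self._order = sorted(self._mib)

    def get(self, oid: str) -> Optional[object]:
        """GET: return the value stored at exactly this OID, if any."""
        return self._mib.get(parse_oid(oid))

    def get_next(self, oid: str) -> Optional[Tuple[Oid, object]]:
        """GETNEXT: return the first entry that sorts after the given OID."""
        target = parse_oid(oid)
        for candidate in self._order:
            if candidate > target:
                return candidate, self._mib[candidate]
        return None

    def walk(self, root: str) -> List[Tuple[Oid, object]]:
        """Repeat GETNEXT until the result leaves the subtree under root."""
        prefix = parse_oid(root)
        results = []
        current = root
        while True:
            nxt = self.get_next(current)
            if nxt is None or nxt[0][:len(prefix)] != prefix:
                break
            results.append(nxt)
            current = ".".join(map(str, nxt[0]))
        return results


# Invented values; only the OIDs are standard.
agent = ToyAgent({
    "1.3.6.1.2.1.1.1.0": "Example switch OS",  # sysDescr.0
    "1.3.6.1.2.1.1.3.0": 123456,               # sysUpTime.0
    "1.3.6.1.2.1.1.5.0": "core-sw1",           # sysName.0
    "1.3.6.1.2.1.2.1.0": 24,                   # ifNumber.0
})
print(agent.get("1.3.6.1.2.1.1.5.0"))
print(agent.walk("1.3.6.1.2.1.1"))
```

The walk method shows why GETNEXT is useful when you do not know in advance which entries a table contains: each reply names the OID to ask about next.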

Many devices can be configured to emit SNMP traps, which are used to communicate information about events on the device. Rather than polling, the central SNMP console can receive events via the SNMP TRAP operation from many sources as they occur and aggregate them. SNMP traps ensure timely notifications while reducing unnecessary network traffic.

SNMP can also be used to control the configuration of remote devices using the “SET” command.

SNMPv1 and SNMPv2c use “community strings” for authentication, which are sent in plain text across the network. As a result, there is significant risk of credential theft if the community strings are intercepted as they are sent across the network. Commonly, the community string “public” has read-only access to the MIB, and “private” has read-write access. SNMPv3 supports strong encryption algorithms that can be used to encrypt authentication data and packet contents, if these options are selected.
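
To see why community strings are so exposed, it helps to look at the bytes of a request. The sketch below hand-encodes a minimal SNMPv1 GET message in BER and never sends it. The helper names are hypothetical, and the encoder is deliberately limited to short lengths and small integers, so it is not a general-purpose SNMP implementation.

```python
def ber(tag: int, payload: bytes) -> bytes:
    """Encode one BER tag-length-value; short-form lengths only."""
    if len(payload) >= 128:
        raise ValueError("sketch supports payloads under 128 bytes only")
    return bytes([tag, len(payload)]) + payload


def ber_int(value: int) -> bytes:
    """Encode a small INTEGER (0 to 127) in a single content byte."""
    if not 0 <= value <= 127:
        raise ValueError("sketch supports integers from 0 to 127 only")
    return ber(0x02, bytes([value]))


def ber_oid(dotted: str) -> bytes:
    """Encode an OBJECT IDENTIFIER whose later sub-identifiers are below 128."""
    parts = [int(p) for p in dotted.split(".")]
    if any(p > 127 for p in parts[2:]):
        raise ValueError("sketch supports sub-identifiers below 128 only")
    # The first two sub-identifiers share one byte: 40 * first + second.
    return ber(0x06, bytes([parts[0] * 40 + parts[1]]) + bytes(parts[2:]))


def snmpv1_get(community: str, oid: str, request_id: int = 1) -> bytes:
    """Build an SNMPv1 GetRequest for one OID with a NULL placeholder value."""
    varbind = ber(0x30, ber_oid(oid) + ber(0x05, b""))
    # 0xA0 is the GetRequest PDU tag; error-status and error-index are 0.
    pdu = ber(0xA0, ber_int(request_id) + ber_int(0) + ber_int(0)
              + ber(0x30, varbind))
    # Version field 0 means SNMPv1; the community is a plain OCTET STRING.
    return ber(0x30, ber_int(0) + ber(0x04, community.encode("ascii")) + pdu)


packet = snmpv1_get("public", "1.3.6.1.2.1.1.3.0")  # sysUpTime.0
print(packet.hex())
print(b"public" in packet)
```

The community string sits near the start of the message as ordinary ASCII, so anyone who captures the packet can read it without any decoding effort.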

3.3.1.6 Trivial File Transfer Protocol (TFTP)

The Trivial File Transfer Protocol (TFTP) was first published in 1980, and designed as a simple, automated means of transferring files between remote systems. Like Telnet, TFTP was designed before most people were concerned about “bad actors” on the network. Consequently, it fit a useful niche: file transfer without the burden of authentication. The features of TFTP are extremely limited; one of the design goals was to keep the service very small so that it could run on systems with extremely limited storage space and memory. It runs over UDP on port 69.^36,37

36. K. R. Sollins, “RFC 783—TFTP Protocol (revision 2),” IETF, June 1981, http://rfc-editor.org/rfc/rfc783.txt.

37. K. R. Sollins, “RFC 1350—TFTP Protocol (revision 2),” IETF, July 1992, http://www.rfc-editor.org/rfc/rfc1350.txt.

Despite the lack of security, TFTP remains in widespread use today (generally restricted to internal networks). It has been incorporated into many network devices, from Voice over IP phones to firewalls to desktop BIOSs. TFTP is often used as a means through which distributed devices can download updates from a central server within an organization. On many routers and switches, it is used to back up and restore files. It was also used for payload propagation in both the CodeRed and Nimda outbreaks.

Forensic analysts may need to use TFTP to export files from a router, switch, or other device that does not support SCP or SFTP for such operations.
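
The simplicity of the protocol is visible in its packets. The following sketch builds a TFTP read request and acknowledgment and parses a data packet, following the layouts in RFC 1350. The function names and the file name startup-config are hypothetical, and nothing is sent on the network.

```python
import struct
from typing import Tuple

OP_RRQ, OP_DATA, OP_ACK = 1, 3, 4  # opcodes defined in RFC 1350


def tftp_rrq(filename: str, mode: str = "octet") -> bytes:
    """Read request: opcode, file name, zero byte, transfer mode, zero byte."""
    return (struct.pack("!H", OP_RRQ) + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def tftp_ack(block: int) -> bytes:
    """Acknowledgment: opcode followed by the 16-bit block number."""
    return struct.pack("!HH", OP_ACK, block)


def parse_data(packet: bytes) -> Tuple[int, bytes]:
    """Return (block number, payload) from a DATA packet."""
    opcode, block = struct.unpack("!HH", packet[:4])
    if opcode != OP_DATA:
        raise ValueError(f"expected DATA packet, got opcode {opcode}")
    return block, packet[4:]


# Hypothetical file name, for illustration only.
print(tftp_rrq("startup-config"))
```

Note that the request contains no username, password, or any other credential: whoever can reach UDP port 69 on the server and guess a file name can ask for the file.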

3.3.1.7 Web and Proprietary Interfaces

These days most commercial network devices, from DSL routers to wireless access points, come with a web-based management interface. Through HTTP or HTTPS, you can access configuration menus, event logs, and other data that the device contains. Web interfaces are popular because they are very portable; they do not require the user to install a special client to access the device.

Typically, web interfaces are available by default as unencrypted HTTP sessions, in which case the login credentials and any data transferred over the connection are unencrypted and easily intercepted. Many vendors also offer SSL/TLS-encrypted web interfaces, although the certificates used with these services often have errors, which cause problems with validation.

Many vendors such as Cisco and netForensics have also developed Java-based cross-platform interfaces or other types of proprietary interfaces for their devices.

For forensic investigators, one of the biggest challenges with graphical interfaces is logging forensic activities. Text-based interfaces can typically be used with “script” or other tools that log commands; with GUIs, it can be harder for an investigator to automatically log activities. Often, the best fallback is screenshots and a good notebook.

3.3.2 Inspection Without Access

In many cases, it is desirable to gain information about a device’s configuration or state without accessing the device at all via an interface. There are also times when the password to user interfaces is not available. It is possible to gather extensive information about a device’s configuration and state through external inspection, using port scanning, vulnerability scanning, and other methods.

3.3.2.1 Port Scanning

Port scanning, using a tool such as nmap, is an effective way to retrieve information about open ports and software versions of a device. Note that port scanning is an active process, meaning that you will generate network traffic and, in the process, modify the state of the targeted device.
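
To illustrate what "active" means here, the following is a minimal sketch of a TCP connect scan written with plain sockets. It is not how nmap works internally and offers none of its features; the function name connect_scan is hypothetical, and the sketch handles IPv4 only.

```python
import socket
from typing import Dict, Iterable


def connect_scan(host: str, ports: Iterable[int],
                 timeout: float = 1.0) -> Dict[int, bool]:
    """Attempt a full TCP connection to each port; True means it was accepted."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            results[port] = sock.connect_ex((host, port)) == 0
    return results
```

Every probe completes a three-way handshake when the port is open, so the target sees (and may log) a connection from your workstation. That is exactly the kind of footprint an investigator needs to anticipate and document.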

3.3.2.2 Vulnerability Scanning

Vulnerability scanning is the next level of active external inspection. In addition to port scanning, vulnerability scanners test target systems for a wide variety of known vulnerabilities. If you are concerned that your target of interest may be compromised, this can sometimes provide strong clues as to how the compromise may have occurred.

Vulnerability scanning generates network traffic and modifies the state of the targeted device. In some cases, it can even crash the targeted device. Be cautious and understand the options you have selected before running a vulnerability scanner against your target of interest.

3.3.3 Strategy

Here are some tips for collecting evidence from network devices. In general, our goal is to preserve the evidence while minimizing our footprints throughout the environment. Of course, your specific strategy will vary for each investigation.

• Refrain from rebooting or powering down the device. A lot of network-based evidence exists as volatile data in the memory of network devices. For example, ARP tables on routers change dynamically and typically would not survive a reboot. The current state of network connectivity is similarly volatile. Make sure that you capture all the volatile evidence you need before allowing the device to be rebooted. A reboot may also cause modifications to persistent logfiles stored on disk, particularly if local disk space is limited and log files are overwritten at designated intervals or when space is low.

• Connect via the console rather than over the network. It is usually preferable to connect via the system console whenever possible, rather than over the network. Connecting to a device over the network will necessarily generate network traffic and modify the state of the device’s network connections. It may also alert an attacker on the network to your presence. Connecting instead directly to the system console will minimize your network footprint.

• Record the system time. Always check to see what the time skew is between the network device you are investigating and a reliable source. Even a small time skew can make it very difficult to correlate evidence, unless you are aware of it and can compensate. Unfortunately, most forensic tools do not have easy ways of adjusting specific logs for time skew, so you may need to compare skewed records manually or using customized scripts. If you have access to a network device for an extended period of time, it is a good idea to collect the time at regular intervals because the time skew may change.

• Collect evidence according to level of volatility. This is a general rule of thumb for all digital forensic investigations: when all else is equal, collect the most volatile evidence first and work your way down to more persistent evidence. This will increase your chances of capturing all the evidence you are after. Of course, sometimes other factors may cause you to collect evidence in a different order—for example, if the most volatile evidence is very difficult to obtain or perhaps not as important. In many cases, you simply cannot collect all the evidence you want, and must pick evidence from one system over another.

• Record your investigative activities. When connecting via a command-line interface, you can often record your commands using utilities such as “screen” or “script”. These tools can help you clearly delineate what footprints were the result of your own activities and what were those of the existing system. Recording your own commands also helps you stay organized, and will help you become a more efficient investigator because you will be able to refer to your prior work in other investigations.

Many investigators are wary of recording their own commands because they are afraid of messing up and leaving a record. Don’t worry about making mistakes! All investigators fat-finger commands or have to look up proper command syntax from time to time. This is the nature of network forensics; we deal with so many different types of equipment that we must often refer to manuals or rely on local network staff for specific syntax. It’s much better to maintain a record of all your activities, fat-fingered commands and all, than to not maintain any record, or worse, to maintain an inaccurate record of your commands that does not match what you actually typed.

Graphical interfaces are challenging because it is more difficult to record your investigative activities. Whenever possible, take screen captures, photos or recordings of your graphical connections.
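
The time-skew tip above often ends up as a small custom script. Here is a minimal sketch, assuming the skew is constant over the period covered by the logs (which, as noted, may not hold over long periods). The function names and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def measure_skew(device_time: datetime, reference_time: datetime) -> timedelta:
    """Return device minus reference; positive means the device runs ahead."""
    return device_time - reference_time


def correct_timestamp(logged: datetime, skew: timedelta) -> datetime:
    """Convert a device-clock timestamp to reference time."""
    return logged - skew


# Invented readings, taken at the same moment on the device and on a
# reliable reference clock.
device_now = datetime(2011, 6, 26, 21, 42, 43, tzinfo=timezone.utc)
reference_now = datetime(2011, 6, 26, 21, 39, 33, tzinfo=timezone.utc)
skew = measure_skew(device_now, reference_now)

# An invented log entry stamped by the device clock.
logged = datetime(2011, 6, 26, 21, 15, 0, tzinfo=timezone.utc)
print(skew, correct_timestamp(logged, skew))
```

If you record the skew at several points during an investigation, you can apply the measurement closest in time to each log entry rather than a single value.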

3.4 Conclusion

As an investigator, you will always leave footprints. The more places you look for evidence, the more footprints you will leave. Fortunately, there are ways that you can minimize your impact on the environment.

In this chapter, we discussed the concepts of passive and active evidence acquisition. We reviewed common methods for gaining access to network traffic and software used for capturing, filtering, and storing packet data. We also discussed interfaces used to actively gather evidence from network devices, ranging from the system console to proprietary graphical interfaces accessed via a remote client. Finally, we talked about strategies for reducing your footprint on the network, while still obtaining the evidence you seek.
