Chapter 12. Malware Forensics

“andy; I’m just doing my job, nothing personal, sorry”

—String found within the W32/MyDoom self-mailer worm code, circa 20041

1. G. Sinclair, “Win32.Mydoom.B@mm (Win32.Novarg.B@mm) Removal Tool,” BitDefender, January 28, 2004, http://www.bitdefender.com/VIRUS-1000035-en-Win32.Mydoom.B@mm-(Win32.Novarg.B@mm).html.

Malware is big business. As computers themselves have evolved to be increasingly networked, so too has malicious software, or “malware.” Many people have remarked upon the strong analogies between malware and natural organisms, from self-reproductive techniques to the emergence of evolution. In real life, viruses, parasites, and bacteria spread by piggybacking on the normal mechanisms that hosts use to communicate and exchange resources. Similarly, as personal computers evolved from isolated word processors into complex network-oriented communications devices, the strategies and behaviors of malware have become increasingly network-oriented.

There are many goals of malware forensics, including:

• Understanding malware and associated vulnerabilities in order to produce antivirus/IDS signatures

• Detecting compromised systems on the network

• Determining the scope of a breach, after the fact

• Containing an infection

• Tracking down the source of malware

• Gathering evidence for court

In this chapter, we explore malware forensics, particularly as it relates to network forensics, and discuss how network instrumentation can be used to help identify and track malware on the network. We begin by reviewing a few of the recent trends in malware evolution, and discuss how they impact network forensic techniques. Next, we discuss propagation, command-and-control, and payload behaviors, including ways that these can be detected and addressed through network forensics. Finally, we look to the future and discuss how network forensic investigators can prepare for the malware of tomorrow.

12.1 Trends in Malware Evolution

Once upon a time, malware was spread via floppy disks by infecting the MBR upon boot or through transfer of infected files and programs (i.e., freeware utilities). While USB drives and similar physical devices are still a notable source of malware, the vast majority of malware activity occurs over the network. Even when the malware is injected via a physical storage medium, it typically still communicates over the network after infecting a new host. Network forensics and malware analysis have increasingly overlapped as malware has evolved to become more dependent on the network for propagation, control, and payload functionality.

12.1.1 Botnets

Modern botnets are the convergence of advancements in remote control, automated propagation, and hierarchical, distributed management techniques. Despite their sophistication, botnets are not anything new. Long before the “Storm” botnet compromised millions of systems worldwide,2 there were, for example, the rudimentary Tribe Flood Network and trinoo (1999), which were distributed networks of compromised hosts controlled by attackers through hierarchical, redundant command-and-control channels. The evolution of botnets from their early roots has been strongly influenced by the development of intrusion detection systems, which led malware developers to incorporate sophisticated IDS evasion and anti-forensic features.

2. Bruce Schneier, “Gathering ‘Storm’ Superworm Poses Grave Threat to PC Nets,” Wired.com, October 4, 2007, http://www.wired.com/politics/security/commentary/securitymatters/2007/10/securitymatters_1004.

12.1.1.1 Early Developments in Distributed Management

During the early to mid-1990s, attackers developed methods to remotely control compromised systems through the network. As networking became ubiquitous, there was an explosion of malware that took advantage of networked resources in order to send out email spam, conduct distributed denial-of-service (DDoS) attacks, host pirated software, and more. During this time period, compromised systems were dubbed “zombies” because they had rudimentary remote control and automation, usually limited to DDoS attacks or other simple behaviors.

During the late 1990s, attackers developed advanced systems for centrally coordinating the activities of hundreds or even thousands of compromised systems. The Tribe Flood Network (TFN), for example, emerged around 1999 in order to facilitate DDoS attacks.3 Notably, it was designed with redundant command-and-control channels (often referred to as “C&C” or “C2” channels) so that attacker systems controlled a network of compromised “client” systems, which in turn controlled an even larger network of “daemon” systems, which could be instructed to simultaneously attack victims upon command.4

3. Dave Dittrich, “Tribe Flood Network,” November 1, 1999, http://staff.washington.edu/dittrich/talks/cert/tfn.html.

4. Dave Dittrich, “The ‘Tribe Flood Network’ Distributed Denial of Service Attack Tool,” October 21, 1999, http://staff.washington.edu/dittrich/misc/tfn.analysis.

The Tribe Flood Network 2000 (TFN2K) system incorporated additional anti–network forensics techniques, such as randomized and encrypted C&C packets that made traffic filtering difficult.5

5. Jason Barlow and Woody Thrower, “TFN2k An Analysis,” Packet Storm, March 7, 2000, http://packetstormsecurity.org/files/view/10135/TFN2k_Analysis-1.3.txt.

12.1.1.2 Early Developments in Full-Featured Control

At the same time, other “black hat” and “gray hat” developers were focusing on extending the features of remote access trojans (RATs), designed to facilitate remote control of individual compromised endpoints. During the late 1990s, RATs such as the Back Orifice and Sub7 applications emerged. These provided remote attackers with a wide range of features, enabling powerful, point-and-click control of zombies. Neither Back Orifice nor Sub7 was initially designed for hierarchical management of a distributed network of agents; the goal was to provide flexible, full-featured remote control of a compromised host by a single attacker.

Ultimately, botnet authors married sophisticated endpoint control with automated propagation techniques, automated self-update mechanisms, and multilayer, hierarchical, and/or peer-to-peer C&C channels. Botnets today incorporate many of the same features as legitimate enterprise networks, including internal DNS, web, email, and software update mechanisms. In accomplishing this, botnet authors have essentially built the equivalent of enterprise systems management software that scales far better than many aboveboard commercial offerings. This all succeeds in a hostile environment where thousands of people are constantly trying to disassemble the network. No wonder botnet developers are making money!

12.1.1.3 Implications for Network Forensics

To date, the most mature and most publicized aspects of malware analysis continue to focus on reverse-engineering the behavior of samples caught “in the wild.” This is conducted in an effort to understand the nature and mechanisms of compromise and to develop reliable antivirus or IDS signatures for detecting the presence of malware. Such samples have historically been recovered by identifying a known-to-be-compromised host and extracting the malicious code from the running or powered-down system (i.e., from either memory or disk). Given the accompanying maturity of malware authors’ attempts to obfuscate or hide their code on the compromised host, this can be a Herculean, if not impossible, task.

As a result, network forensics has had to step into the breach—quite literally. Once malware began to emerge that was difficult or impossible to detect simply by monitoring host-based system activity, defenders turned to monitoring and analyzing hosts’ external network behaviors to find, track, block, and prevent malware.

12.1.2 Encryption and Obfuscation

Encryption techniques have been used in malware since the early 1990s, when the Cascade virus encrypted its own payload in order to avoid detection by antivirus software.6 Malware authors use encryption to hide functionality and create random-appearing payloads that are difficult for antivirus software and IDS systems to detect. Over time, malware has evolved increasingly sophisticated techniques to obscure decryptors and decryption keys (at times even requiring the malware to brute-force decrypt itself) in order to evade detection and forensic analysis.

6. Peter Szor, The Art of Computer Virus Research and Defense (Upper Saddle River, NJ: Addison-Wesley, 2005).

12.1.2.1 Early IDS/Antivirus Evasion

Early network-based techniques to evade detection included session splicing and fragmentation. In session splicing, the attacker chops up a string from a session and splits it across multiple packets to foil NIDS/NIPS pattern matching. The network monitoring device must reassemble the session in order to detect the string, which is processor-intensive. Similarly, fragmentation attacks are designed to split individual packets into much smaller packets. The NIDS/NIPS must reassemble the packet fragments to properly analyze them, which uses up significant resources.
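
To illustrate, here is a minimal Python sketch (the signature and request below are made up for illustration) showing why per-packet pattern matching fails against session splicing: no single segment contains the signature string, which only reappears once the monitoring device reassembles the TCP stream.

# Hypothetical NIDS content signature and a spliced request (illustrative only).
SIGNATURE = b"/etc/passwd"

# The attacker splits the request across several tiny TCP segments.
segments = [b"GET /cgi-bin/vuln?file=/et", b"c/pas", b"swd HTTP/1.0\r\n\r\n"]

# Naive per-packet inspection: no single segment matches the signature.
print(any(SIGNATURE in seg for seg in segments))    # False

# Stream reassembly (what the NIDS must do, at a processing cost): match found.
print(SIGNATURE in b"".join(segments))              # True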

12.1.2.2 Modern Web Obfuscation/Encryption

Modern malware, which is often web-based, commonly leverages obfuscation techniques to embed malicious code (e.g., JavaScript) in web pages. The obfuscation may be as simple as Base64 encoding or XOR-ing, or may involve complex, layered encoding schemes. While malware analysts can and do de-obfuscate malicious code, often using automated tools, these techniques help attackers evade web filters and NIDS/NIPS systems, which are often unable to properly de-obfuscate on the fly.7

7. “Malicious Hidden Iframes using Publicly Available Base64 encode/decode Script,” Zscaler Research, May 2, 2010, http://research.zscaler.com/2010/05/malicious-hidden-iframes-using-publicly.html.
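
As an illustration, the following Python sketch de-obfuscates a hypothetical sample in which an injected iframe has been XOR-ed with a single-byte key and then Base64-encoded (one of the simple layered schemes described above). Real samples vary widely; the key and domain here are invented.

import base64

def deobfuscate(blob, xor_key):
    """Reverse Base64 encoding, then a single-byte XOR."""
    raw = base64.b64decode(blob)
    return bytes(b ^ xor_key for b in raw).decode("ascii", errors="replace")

# Build an obfuscated sample the same way an attacker might, then recover it.
payload = '<iframe src="http://malicious.example/x.html" width=0 height=0></iframe>'
obfuscated = base64.b64encode(bytes(b ^ 0x5A for b in payload.encode())).decode()

print(deobfuscate(obfuscated, 0x5A))    # prints the original hidden iframe tag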

12.1.2.3 Hiding C&C Channels

With the rise of remote-control zombies and the maturation of C&C channels, malware began incorporating obfuscation and encryption not only to disguise the injection vectors and payloads, but also to hide the content of the C&C channel itself. This was a very important development that allowed botnets to evade detection and analysis through network forensics of packet contents, long after the initial compromise. For example, Symantec researcher Gilou Tenebro writes that the Waledac worm’s HTTP-based command-and-control messages “[go] through at least four transformations before being sent to its peer. Anybody monitoring the HTTP packets on the wire will not easily be able to comprehend the messages.”8

8. Gilou Tenebro, “W32.Waledac Threat Analysis,” Symantec, 2009, http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/W32_Waledac.pdf.

The result is that network forensic investigators must increasingly turn to statistical flow analysis rather than content-based analysis in order to efficiently detect and dissect malware on the network.

12.1.2.4 Maintaining Control

Encryption is used not just to evade detection and analysis by network defenders but also to ensure that the botherders maintain control of their networks. Some botnets require client and/or server systems to cryptographically authenticate to the C&C channel, further foiling attempts by network forensic analysts to investigate or interfere with botnet functionality.

An interesting twist, illustrated by the Storm worm, is the use of encryption for the purposes of segmenting botnets to facilitate resale. Botnets themselves have become valuable commodities, rented and sold through underground black markets for purposes such as distributing spam, coordinating DDoS attacks, and stealing financial information. In 2007, the Storm worm began to use a weak (40-byte) encryption key to encrypt communications between peer-to-peer nodes (later, this was beefed up to 64-bit RSA encryption). While this made it more difficult to recover the contents of the packets on the fly, and added significant time and effort to traffic content analysis, malware analysts could still reverse-engineer samples to recover encryption keys and decrypt the packet contents. Joe Stewart of SecureWorks commented that the encryption did ensure that “each node will only be able to communicate with nodes that use the same key. This effectively allows the Storm author to segment the Storm botnet into smaller networks. This could be a precursor to selling Storm to other spammers, as an end-to-end spam botnet system, complete with fast-flux DNS and hosting capabilities.”9

9. Joe Stewart, “The Changing Storm,” Dell SecureWorks, October 14, 2007, http://www.secureworks.com/research/blog/index.php/2007/10/15/the-changing-storm/.

Note that Waledac, known as a “new and improved version of the Storm botnet,” upgraded to AES 128-bit and RSA 1,024-bit encryption (along with Base64 encoding).10

10. G. Sinclair, C. Nunnery, and B.B.H. Kang, “The Waledac Protocol: The How and Why,” in Malicious and Unwanted Software (MALWARE), 2009 4th International Conference, 2010, 69–77, http://isr.uncc.edu/paper/WaledacProtocolHowWhyMalware2009_remediation.pdf.

12.1.3 Distributed Command-and-Control Systems

Early on, attackers realized that maintaining control of a compromised host was both challenging and also very important. With ongoing access facilitated by C&C channels, malware authors could use compromised hosts for a wide variety of purposes, and adapt on-the-fly as needed. Today, botherders have such excellent control over their networks that many are able to segment and sell botnets, easily transferring control to third parties.

In this section, we follow the evolution of C&C channels from simple, direct connections with central servers (i.e., IRC) to highly complex, distributed, redundant multilayer systems with built-in security features.

12.1.3.1 The Early Days: Internet Relay Chat (IRC)

Early botnets relied upon centralized control systems in which compromised agents communicated directly with a small group of central servers. Internet Relay Chat (IRC) has been a very common mechanism for malware command-and-control since the emergence of worms such as “Pretty Park” in 1999. Malware was hard-coded with server IP addresses or domain names, as well as IRC channel information; infected systems then connected back to the central servers to report information about themselves and/or to receive commands and updates.

12.1.3.2 Drawbacks of Centralized C&C

Of course, centralized command-and-control channels had some major drawbacks for malware developers. They made it very easy for forensic analysts and security professionals to identify and track down compromised systems. Local organizations could simply alert on connection attempts to specific IP addresses or domains and block them, crippling the botnet. ISPs could shut down central command-and-control servers on their networks. Especially to the extent that IRC—a plain-text communications protocol—was being used, it was fairly straightforward to inspect, detect, and block that channel’s abuse. (Within most organizations, IRC is really only used by botnets, with few users engaged in anything remotely legitimate via that protocol.)

The Stuxnet worm, which targets industrial control systems primarily in Iran, is an excellent example of how a sophisticated worm can be crippled due to reliance on a centralized command-and-control system. As described by researchers at Symantec:

[The Stuxnet agent] contacts the command and control server on port 80 and sends some basic information about the compromised computer to the attacker via HTTP. Two command and control servers have been used in known samples:

www[.]mypremierfutbol[.]com

www[.]todaysfutbol[.]com

The two URLs above previously pointed to servers in Malaysia and Denmark; however they have since been redirected to prevent the attackers from controlling any compromised computers.

Symantec set up sensors in July 2010 to track connection attempts back to Stuxnet command-and-control servers. Interestingly, in August 2010 they noted a sudden drop in newly infected connection attempts from Iran, as shown in Figure 12-1. “Looking at newly infected IP addresses per day, on August 22 we observed that Iran was no longer reporting new infections. This was most likely due to Iran blocking outward connections to the command-and-control servers, rather than a drop-off in infections.” This illustrates how forensic analysts and security professionals can disrupt the operations of botnets—and other forensic analysts—simply by identifying and blocking known ports, IP addresses, and domains used for central command-and-control.11

11. N. Falliere, L. Murchu, and E. Chien, “W32.Stuxnet Dossier: Version 1.4 (February 2011),” Symantec, 2011, http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/w32_stuxnet_dossier.pdf.

image

Figure 12-1 A chart from Symantec’s “W32.Stuxnet Dossier,” which illustrates a sudden drop in reported newly infected connection attempts from Iran (black). The infections were tracked based on connections to central C&C servers, which were likely blocked by Iran on August 22, 2010.12 Reprinted with permission. [Note: Image colors have been modified to print in grayscale.]

12. Ibid.

12.1.3.3 Evolution Toward Distributed C&C

Newer botnets have been moving toward a partially distributed, rather than fully centralized, command-and-control architecture. In the distributed command-and-control model, individual endpoint agents do not connect directly back to a central server. Rather, they connect to servers in a redundant, distributed network. Distributed models typically involve a multilevel hierarchy so that systems at each level communicate with each other, and at least some systems in each level can communicate with systems in the level above. Using a distributed model, attackers can send commands and updates from a relatively small number of central servers to thousands or millions of compromised hosts, and the compromised hosts can pass information back up the chain with far greater redundancy and lower risk of detection/disruption.

12.1.3.4 Advantages of Distributed C&C

Distributed C&C offers the following advantages:

• Redundancy—If a C&C server is blocked or shut down, the functionality of the botnet is not affected.

• Detection Evasion—It is more difficult for forensic analysts to detect distributed C&C traffic. Centralized C&C traffic can be captured and analyzed at the perimeter of a LAN, where enterprises tend to focus monitoring resources, whereas a far greater percentage of decentralized C&C traffic is contained within a LAN and may never reach the perimeter firewall/IDS. Furthermore, organizations that track infections Internet-wide may never see evidence of internal botnet nodes, which communicate only among themselves. The Stuxnet worm actually incorporated some distributed C&C traffic in order to enable infections and updates of compromised hosts on internal LANs. Distributed C&C traffic is also less likely to generate a spike in traffic involving a small number of systems, the kind of anomaly that commonly triggers alerts.

• Attacker Concealment—Distributed C&C channels make it far more difficult to track attacks back to central servers, since compromised endpoint nodes do not communicate with those servers directly. In the mid-2000s, authorities such as the FBI began to investigate and prosecute botherders as part of coordinated programs including “Operation: Bot Roast.” These legal actions provided greater incentives for attackers to place more layers of indirection between themselves and compromised botnet endpoints.13,14

13. C. Schiller et al, InfoSecurity 2008 Threat Analysis (Burlington, MA: Elsevier, 2008), 19.

14. Dan Goodin, “FBI Logs its Millionth Zombie Address,” The Register, June 13, 2007, http://www.theregister.co.uk/2007/06/13/millionth_botnet_address/.

• Asymmetry—Distributed C&C peers can result in a sufficiently “meshed network” that the defender’s containment and eradication efforts are reduced to a frustrating and ineffective game of “Whac-A-Mole.” Small levels of effort on the attacker’s part can necessitate a disproportionate level of effort from the responders. Depending on the scale, this can be a game-changing element.

For example, the Downadup worm (also known as Conficker), which was capable of self-updating, responded to defense mechanisms by evolving increasingly distributed and dynamic C&C systems. Initial infections of W32.Downadup.A checked for updates each day by contacting domains drawn from a daily list of 250 pseudorandomly generated names spread across five top-level domains. Later, W32.Downadup.B expanded the number of top-level domains to eight.

With the update to W32.Downadup.C, the worm significantly changed its strategy for registering and querying domain names for use when downloading new instructions. This made it far more difficult for security professionals to alert upon and track its spread. As described by Porras et al. (SRI International) in their analysis of Conficker, “C further increases Conficker’s top-level domain (TLD) spread from five TLDs in Conficker A, to eight TLDs in B, to 110 TLDs that must now be involved in coordination efforts to track and block C’s potential DNS queries. With this latest escalation in domain space manipulation, C . . . represents a significant challenge to those hoping to track its census.”15

15. Phillip Porras, Hassen Saidi, Vinod Yegneswaran, “An Analysis of Conficker C,” SRI International, March 8, 2009, http://mtc.sri.com/Conficker/addendumC/.

This was a classic example of asymmetric warfare, in that the malware author had only to recode his or her instruction set to expand the array of TLDs to be searched, and to redistribute. The result was an order-of-magnitude increase in effort required by human responders in order to identify and contain a threat that was largely automated.

News headlines in April 2009 reported that “Experts Bicker Over Conficker Numbers: Is it 4.6 million infected PCs or not?”16 The number of infected systems became increasingly difficult to estimate as the malware evolved. “The Working Group got its data by setting up ‘sinkhole’ servers at points on the Internet used by infected machines to download instructions,” reported Bob McMillan of the IDG News Service. “They did this by taking over the Internet domains that Conficker is programmed to visit to search for those instructions. . . . To complicate matters further, a new variant of Conficker was spotted last week, and this one communicates primarily using peer-to-peer techniques, which are not easily measured by the Working Group’s sinkhole servers.”

16. Robert McMillan, “Experts Bicker Over Conficker Numbers,” Techworld, April 15, 2009, http://news.techworld.com/security/114307/experts-bicker-over-conficker-numbers/.

12.1.3.5 Peer-to-Peer C&C

The Storm worm took distributed command-and-control to new extremes. Designed from the start with a distributed, multilayer model, Storm incorporated the Overnet/eDonkey peer-to-peer filesharing protocol for distributed, dynamic C&C between endpoint nodes. Joe Stewart, Director of Malware Research at Dell SecureWorks, wrote, “When Storm Worm runs, it attempts to link up with other infected hosts via peer-to-peer networking. Through this conduit it gets a URL that points to a second-stage executable, which in turn downloads additional stages onto the infected system. The protocol in use is actually the eDonkey/Overnet protocol, which has been adapted by the virus author as a means to distribute the second-stage URL without being shut down as it might be if the URL was hard-coded in the body of the virus or was downloaded from another website.”17,18

17. Joe Stewart, “Storm Worm DDoS Attack”, Dell SecureWorks, February 8, 2007, http://www.secureworks.com/research/threats/storm-worm/.

18. Joe Stewart, “Inside the Storm: Protocols and Encryption of the Storm Botnet” (Secure-Works, 2008), http://www.blackhat.com/presentations/bh-usa-08/Stewart/BH_US_08_Stewart_Protocols_of_the_Storm.pdf.

In short, the Storm botnet uses a peer-to-peer network of compromised hosts to dynamically communicate and change the locations of higher-layer C&C nodes, undercutting the effectiveness of network flow analysis. Botherders can distribute locations of many C&C servers and change them as necessary. Bruce Schneier writes, “[E]ven if a C&C node is taken down, the system doesn’t suffer. Like a hydra with many heads, Storm’s C&C structure is distributed.”19

19. Bruce Schneier, “Gathering ‘Storm’ Superworm Poses Grave Threat to PC Nets,” Wired, October 4, 2007, http://www.wired.com/politics/security/commentary/securitymatters/2007/10/securitymatters_1004.

12.1.4 Automatic Self-Updates

Malware that survives is malware that can adapt. During the late 1990s, malware emerged that was designed to automatically update its own code over the network, allowing attackers to arbitrarily change propagation strategies, payloads, and other behavior on the fly. This was an enormous leap forward. Before this time, an attacker had to reinfect or manually update systems in order to spread new code. Automatic self-updates enabled attackers to easily adapt and maintain their footholds on compromised systems by improving code and quickly distributing it to a large number of compromised systems.

12.1.4.1 Early Self-Updating Systems

The first self-updating systems were very simple. In 1999, the W95/Babylonia self-mailer worm automatically checked a web site for updates after infection. This update capability was neutralized when authorities disabled the web site.20

20. Peter Szor, The Art of Computer Virus Research and Defense, 345.

12.1.4.2 Authenticated Updates

More sophisticated update systems began to emerge the following year. For example, in late 2000, the W95/Hybris worm was released. It was a global collaborative effort that included experienced malware authors from several countries. The W95/Hybris worm was designed to check for updates from a web site and newsgroups. The designers realized that without any sort of authentication system, defenders would be able to publish their own updates and disable the worm network. To protect against this, the W95/Hybris worm used RSA public-key encryption and a 128-bit hash algorithm. The attackers distributed a public key with the virus, and then cryptographically signed updates with the corresponding private key. Before installing updates, the compromised systems checked the integrity and authenticity of the updates.21

21. Peter Szor, The Art of Computer Virus Research and Defense, 347.
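
The underlying pattern is easy to illustrate. The following Python sketch, using the third-party cryptography library, shows signed-update verification with modern primitives; it is not Hybris’s actual scheme (which used a 128-bit hash and its own RSA routines), only an illustration of why defenders could not forge updates without the author’s private key.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The "malware author's" key pair. In practice only the public key ships
# inside the agent; the private key never leaves the author.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

update = b"new payload module v2"
signature = private_key.sign(update, padding.PKCS1v15(), hashes.SHA256())

# On the compromised host: verify before installing. This raises
# InvalidSignature if a defender (or rival) tries to push a forged update.
public_key.verify(signature, update, padding.PKCS1v15(), hashes.SHA256())
print("update accepted")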

In 2004, the authors of the widespread MyDoom self-mailer worm learned the importance of authenticating updates. Once a system is compromised, MyDoom opens a backdoor on a TCP port and listens for connections. Attackers can connect to the backdoor and upload and execute arbitrary files.22 Backdoors of this type can allow worms to automatically distribute updated software. However, in the case of MyDoom, other malware authors took advantage of the backdoor and used it to spread their own worms (such as “W32/Doomjuice”) to systems previously infected with early versions of MyDoom.23

22. “Threat Description: Worm:W32/Mydoom,” F-Secure, 2009, http://www.f-secure.com/v-descs/novarg.shtml.

23. Peter Szor, The Art of Computer Virus Research and Defense, 351.

12.1.4.3 Going Meta: Updating the Updating System

Modern malware uses automatic self-updates to distribute changes not just in propagation mechanisms and payloads, but also in the command-and-control system itself. For example, the Waledac (late 2008) command-and-control network is a distributed hierarchy that automatically self-updates over HTTP. This includes a layer of reverse proxy systems (referred to by researchers as “TSL” servers, after a Windows registry entry used by the malware to store server locations). “Repeater nodes obtain the list of TSL servers’ IP addresses simply by asking a fellow repeater node for the current TSL server list,” wrote researchers Sinclair, Nunnery, and Kang of the University of North Carolina at Charlotte. “[T]he author(s) of Waledac sign TSL IP updates using a RSA signature. The signature uses a private key held by the attacker(s) to sign the entire payload of the TSL update (excluding the signature portion, of course). Each Waledac binary contains a copy of the public key in order to verify that unauthorized entities have not altered the contents of the TSL update. If Waledac detects that the signature does not match the calculated signature value, the binary discards the TSL IP update. The use of a public/private key pair to sign the TSL update prevents defenders from inserting themselves into the TSL tier as well as prevents defenders from disrupting the upper tier communication.”24

24. G. Sinclair, C. Nunnery, and B.B.H. Kang, “The Waledac Protocol: The How and Why,” in Malicious and Unwanted Software (MALWARE), 2009 4th International Conference, 2010, 6977, http://isr.uncc.edu/paper/WaledacProtocolHowWhyMalware2009_remediation.pdf.

12.1.4.4 Success and Failure

Waledac’s automatic self-update system has been an integral component to its success. Waledac spreads through email, and also uses email as a vector to advertise products. Often, the topic of the spam emails is related to timely holidays or events, such as Christmas or Valentine’s Day, as shown in Figure 12-2. Frequent updates helped make Waledac’s social engineering tactics very effective.

image

Figure 12-2 Symantec’s timeline of the Waledac worm’s propagation mechanisms. The Waledac worm uses automatic self-update mechanisms to routinely distribute new spam templates. Reprinted with permission.25

25. Gilou Tenebro, “Waledac: An Overview,” Symantec, August 12, 2010, http://www.symantec.com/connect/blogs/waledac-overview.

Waledac’s automatic self-update system has also been key to its demise. Although Waledac cryptographically signs lists of “TSL” servers, the lower-layer peer-to-peer networks of compromised Waledac systems do not cryptographically validate lists of peer IP addresses exchanged through the network. Eventually, researchers were able to infiltrate lower layers of the Waledac network by setting up their own fake Waledac bots. The IP addresses of the fake Waledac bots were propagated through the peer-to-peer exchange system.

As part of Microsoft’s “Operation b49” coordinated takedown effort in early 2010, researchers used fake Waledac bots to “poison” the lower-layer peer-to-peer command-and-control system so that “all communication to the C&C infrastructure [was] redirected to [the Operation b49] infrastructure.” This tactic, combined with an effective dismantling of the Waledac fast-flux network, cut off compromised endpoints so that they could not receive updates and commands through the Waledac C&C channels.26 Although the endpoint nodes were still infected, once cut off from the command-and-control channel, they could no longer receive updates and were outside the control of the botherders.27,28

26. Ben Stock, “If Only Botmasters Used Google Scholar . . . The Takedown of the Waledac Botnet,” March 10, 2011, http://www.enisa.europa.eu/act/res/botnets/workshop-presentations/ben-stock-presentation.

27. Nick Wingfield and Ben Worthen, “Microsoft Battles Cyber Criminals—WSJ.com,” Wall Street Journal, February 26, 2010, http://online.wsj.com/article/SB10001424052748704240004575086523786147014.html?mod=WSJ_hps_sections_business.

28. Gilou Tenebro, “Waledac: An Overview,” Symantec, August 12, 2010, http://www.symantec.com/connect/blogs/waledac-overview.

12.1.5 Metamorphic Network Behavior

As network security professionals and forensic investigators leveraged flow analysis to detect and defend against malware, malware developers began implementing countermeasures to evade network flow analysis techniques.

Early on, malware exhibited relatively static behavior on the network, using specific ports and protocols that could be identified and used by network analysts to develop antivirus and IDS signatures. For example, W32/Blaster could be detected by searching for unexpected traffic on TCP 135, TCP 4444, and/or UDP 69; W32/Witty could be found by alerting on source port UDP 4000.29 Zombie remote control software such as BackOrifice and Sub7 allowed attackers to customize ports and payloads for individual compromised systems, but frequently these daemons were installed on default ports such as TCP 31337 (BackOrifice) or TCP 27374 (Sub7).30 Generally, malware with static network behaviors can be easily blocked through additional firewall rules/router ACLs (although some defenders did not implement these until it was far too late).

29. Peter Szor, The Art of Computer Virus Research and Defense, 589.

30. Joakim von Braun, “SANS: Intrusion Detection FAQ: What port numbers do well-known trojan horses use?,” February 9, 2001, http://www.sans.org/security-resources/idfaq/oddports.php.

Over time, malware evolved to incorporate a variety of techniques that resulted in dynamically changing network behavior. Sometimes this was done on purpose to avoid network-based detection, forensics, and/or containment, and other times it was simply a by-product of added features or functionality. Types of behavior that result in dynamically changing network activity include multiple propagation strategies, variable daemon ports, and sophisticated scanning algorithms.

12.1.5.1 Multiple Propagation Strategies

Compromised systems can include multiple methods for exploiting new systems, foiling simple port-blocking, router ACLs, and IDS rules. For example, Nimda (2001) spread very quickly partly due to multiple propagation strategies, including infection of vulnerable IIS web servers, bulk emails, scanning for back doors left by other worms, open network shares, and more.31,32 This created a variety of patterns on the network, including HTTP traffic, SMTP traffic, filesharing traffic, and more.

31. Nikolai Joukov and Tzi-cker Chiueh, “Internet Worms as Internet-Wide Threat,” 2003, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.1807&rep=rep1&type=pdf.

32. “CERT Advisory CA-2001-26 Nimda Worm,” Carnegie Mellon University, September 25, 2001, http://www.cert.org/advisories/CA-2001-26.html.

12.1.5.2 Variable Daemon Ports

Since port-blocking is one of the simplest and oldest methods of attempting to contain the spread of malware, many forms of malware have been developed to dynamically change their command-and-control and malware distribution ports. For example, W32.Downadup.C (2009) is controlled using a peer-to-peer C&C system. To communicate with other compromised hosts, Microsoft reports that “[W32.Downadup.C] opens four ports on each available network interface. It opens two TCP and two UDP ports. The port numbers of the first TCP and UDP ports are calculated based on the IP address of the network interface. The second TCP and UDP ports are calculated based on the IP address of the network interface as well as the current week, leading to this second set of ports to change on a weekly basis.”33 This strategy makes it difficult for defenders to implement port-blocking and router ACLs that would interfere with the worm’s peer-to-peer communication, since the ports vary not only from one infected host to another but also change over time. It also makes it harder for forensic analysts to trace malware activity when analyzing flow data after the fact.

33. “Worm:Win32/Conficker.D,” Microsoft, April 17, 2011, http://www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Worm%3aWin32%2fConficker.D.
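
The following Python sketch illustrates the kind of scheme Microsoft describes: ports derived deterministically from the interface’s IP address, plus a second pair that also folds in the current week number. This is not Downadup’s actual algorithm (the hash function and mapping here are invented); it simply shows why static port-based blocking breaks down against such malware.

import ipaddress
import zlib
from datetime import date

def derive_ports(interface_ip):
    """Derive two static and two weekly-changing ports from an IP address."""
    ip_int = int(ipaddress.IPv4Address(interface_ip))
    week = date.today().isocalendar()[1]          # ISO week number

    def port(seed):
        # Map a seed into the unprivileged port range 1024-65535.
        return 1024 + zlib.crc32(seed.to_bytes(8, "big")) % (65536 - 1024)

    return {
        "static_tcp": port(ip_int),                           # fixed per interface
        "static_udp": port(ip_int ^ 0xFFFFFFFF),
        "weekly_tcp": port(ip_int + (week << 32)),            # changes every week
        "weekly_udp": port((ip_int ^ 0xFFFFFFFF) + (week << 32)),
    }

print(derive_ports("192.0.2.15"))

Because the ports are computable only by someone who knows both the algorithm and each host’s IP address, no single static firewall rule or router ACL covers the whole infected population.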

12.1.5.3 Sophisticated Scanning for New Targets

Since the early days of automated propagation, worms have been designed to scan the IP address space for new targets (and eventually, to identify infected peers in a distributed bot network). The variety of scanning strategies and methods acts as a malware “fingerprint” and helps researchers detect and identify the malware. Security professionals at ISS remarked as early as 2003 that “Early mapping tools were very easy to detect, and were very easy to trace back to the source. Next generation tools employed ‘stealth’ techniques in an attempt to map hosts and services by sending a storm of spoofed probes amid one legitimate probe to actually collect the response.”34

34. “‘Stumbler’ Distributed Stealth Scanning Network,” IBM Internet Security Systems, June 19, 2003, http://www.iss.net/threats/advise146.html.

You can use network forensic techniques to identify, classify, and track down certain types of malware. However, as we will discuss, several malware authors have developed sophisticated techniques to obfuscate network scanning and hide the sources.

Randomized Scanning

Some malware, such as W32/Welchia (2003), scans a list of randomly generated IP addresses limited to a subset of the IPv4 address space, depending on the size and address space of the local network. For example, on Class B networks, Welchia scans a Class B–sized address space that is the same as, or near, that of the infected system.35 You can detect the presence of the Welchia worm (and many other worms) by watching for a single host sending ARP requests for a succession of consecutive IP addresses.36 This will also help you identify the infected system itself by its MAC address and/or IP address.

35. Peter Szor, The Art of Computer Virus Research and Defense.

36. “Detecting Network Traffic That May Be Due to RPC Worms,” Internet Security Promotions, September 13, 2003, http://internetsecuritypromotions.com/malware/1749.
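
A minimal detection sketch follows, written in Python with Scapy and run against a packet capture from the local segment; the file name and alert threshold are illustrative. It flags any host that issues ARP requests for a long run of consecutive IP addresses, the sequential-scanning fingerprint described above.

from collections import defaultdict
from scapy.all import rdpcap, ARP    # pip install scapy

RUN_THRESHOLD = 20    # consecutive target IPs before we alert (tune for your LAN)

requests = defaultdict(set)           # source MAC -> set of requested IPs (as ints)
for pkt in rdpcap("lan_capture.pcap"):
    if pkt.haslayer(ARP) and pkt[ARP].op == 1:       # ARP "who-has" request
        octets = [int(o) for o in pkt[ARP].pdst.split(".")]
        requests[pkt[ARP].hwsrc].add(int.from_bytes(bytes(octets), "big"))

for mac, targets in requests.items():
    ordered = sorted(targets)
    run = best = 1
    for prev, cur in zip(ordered, ordered[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    if best >= RUN_THRESHOLD:
        print(f"{mac}: ARP requests for {best} consecutive IPs (possible scanner)")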

In contrast, SQL Slammer (2003) used a randomized IPv4 address-generation mechanism, so it had a very different network “fingerprint.” Alerting on ARP requests for consecutive IP addresses would not work for SQL Slammer.

Although SQL Slammer was very effective for its time—infecting 75,000 systems within 10 minutes37—randomized scanning is not the most efficient scanning strategy. As discussed by Staniford, Paxson, and Weaver in 2002, randomized scanning by compromised hosts leads to repetition and inefficiency, and it may not be possible for scanners to tell when all vulnerable hosts within a targeted range have been exploited. Indeed, the Slammer worm generated so much traffic that it unintentionally caused widespread denial-of-service outages due to network congestion. You can easily identify Slammer (W32.SQLExp.Worm) by alerting on its scanning/exploitation activity: high-volume, single-packet transmissions to UDP 1434 at randomized IP addresses.38

37. “SQL Slammer,” Wikipedia, June 9, 2011, http://en.wikipedia.org/wiki/SQL_Slammer.

38. Paul Boutin, “Wired 11.07: Slammed!,” Wired, July 11, 2003, http://www.wired.com/wired/archive/11.07/slammer.html.
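
For example, the following Python sketch applies that idea to flow records; the record fields, synthetic data, and threshold are illustrative rather than drawn from any particular flow export format. It flags sources emitting large numbers of single-packet UDP flows to port 1434 across many distinct destinations.

from collections import defaultdict

def flag_slammer_like(flows, min_destinations=100):
    """Return source IPs with single-packet UDP 1434 flows to many destinations."""
    fanout = defaultdict(set)    # src_ip -> distinct destination IPs
    for f in flows:
        if f["protocol"] == "udp" and f["dst_port"] == 1434 and f["packets"] == 1:
            fanout[f["src_ip"]].add(f["dst_ip"])
    return [src for src, dsts in fanout.items() if len(dsts) >= min_destinations]

# Synthetic example records: one internal host spraying 150 distinct destinations.
flows = [{"src_ip": "10.0.0.5", "dst_ip": f"198.51.100.{i}", "dst_port": 1434,
          "protocol": "udp", "packets": 1} for i in range(150)]
print(flag_slammer_like(flows))    # ['10.0.0.5']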

Permutation Scanning

Instead of purely randomized scanning, Staniford et al. suggest a permutation scan strategy in which “all worms share a common pseudo random permutation of the IP address space. . . . [Newly infected machines] start scanning just after their point in the permutation, working their way through the permutation, looking for vulnerable machines. Whenever the worm sees an already infected machine, it chooses a new, random start point and proceeds from there. . . . This has the effect of providing a self-coordinated, comprehensive scan while maintaining the benefits of random probing. Each worm looks like it is conducting a random scan, but it attempts to minimize duplication of effort.”39

39. Stuart Staniford, “How to Own the Internet in Your Spare Time,” 2002, http://www.icir.org/vern/papers/cdc-usenix-sec02/.
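
The following Python sketch illustrates the idea in a toy 16-bit address space; the permutation parameters are arbitrary. Every instance walks the same shared pseudorandom permutation from its own starting point, so the population covers the space with little duplicated effort and without explicit coordination.

SPACE = 2 ** 16              # toy address space (a real worm would use 2**32)
A, C = 40_503, 12_345        # parameters of an invertible affine permutation
                             # (A is odd, so the mapping is a bijection mod 2**16)

def permute(i):
    """Return the i-th address in the shared pseudorandom permutation."""
    return (A * i + C) % SPACE

def scan_from(start_index, already_infected, limit=10):
    """Scan forward through the permutation from this instance's start point."""
    visited = []
    for i in range(start_index, start_index + limit):
        addr = permute(i % SPACE)
        if addr in already_infected:
            break            # a real worm would pick a new random start point here
        visited.append(addr)
    return visited

# This instance starts at index 4242; one address later in its path is already taken.
print(scan_from(4242, already_infected={permute(4245)}))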

Spoofed Scanning

For attackers, one of the biggest drawbacks of scanning is that the very traffic sent out to gather information typically also reveals information about the compromised host. Forensic analysts and security professionals detect network scanning and use it to track down compromised systems.

In 2003, the Stumbler malware demonstrated an effective strategy for concealing scanner systems. The Stumbler malware was designed to passively “sniff” network traffic on the local segment using the libpcap library. It sends out TCP SYN segments to random ports on random IP addresses, with spoofed source IP address and source MAC address fields. (The segments are unusual in that the TCP window size is set to 55808.) Then, the agent passively listens for responses (which may also have been generated in response to other Stumbler agents). The collected network environment information is sent back to a central server.
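
Despite the spoofed source fields, the fixed window size gave analysts a reliable fingerprint. A minimal Python/Scapy sketch that flags such segments in a capture file follows; the file name is illustrative.

from scapy.all import rdpcap, IP, TCP    # pip install scapy

for pkt in rdpcap("perimeter_capture.pcap"):
    if pkt.haslayer(IP) and pkt.haslayer(TCP):
        tcp = pkt[TCP]
        # SYN flag set and the telltale window size of 55808
        if (int(tcp.flags) & 0x02) and tcp.window == 55808:
            print(f"Stumbler-like probe: {pkt[IP].src} -> {pkt[IP].dst}:{tcp.dport}")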

At the time, the mysterious spoofed packets were so widespread and difficult to track that the news media wrote, “As the scanning activity increased it gained more attention from the security industry, the FBI and the Department of Homeland Security. Everyone wanted to know what this scanning activity was about and where it was coming from.”40

40. Tony Bradley, “Researchers ‘Stumble’ Onto Mystery Trojan,” About, 2003, http://netsecurity.about.com/cs/virusesworms/a/aa062003.htm.

While the Stumbler software was nondestructive, it demonstrates the extent to which network forensic analysts must take evidence with a grain of salt: Many aspects of frames and packets are easy to spoof, including source IP addresses and MAC addresses. These pieces of evidence can help to build a cohesive picture of a case, but always look at them in context. If the protocol details or contents of a packet do not make sense given information about the local address space, network activity, or case timing, you may need to analyze multiple sources of evidence in order to understand the true story.

Distributed Scanning Networks

Stumbler also demonstrated in 2003 that network scanning agents can be distributed in a peer-to-peer network and used to funnel network mapping information back to central servers. While Stumbler’s techniques were rudimentary—it was trivial to identify the central server through network analysis of SSH connections from the scanner back to a central host—it also proved that the concept was valid. Agents were able to gather reconnaissance about local area networks and Internet-facing hosts using a distributed peer-to-peer arrangement. The ability of agents to listen for responses resulting from the activities of other agents provided redundancy and also dramatically increased the speed at which a scan could complete.41,42

41. Nikolai Joukov and Tzi-cker Chiueh, “Internet Worms as Internet-Wide Threat,” 2003, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.1807&rep=rep1&type=pdf.

42. “‘Stumbler’ Distributed Stealth Scanning Network,” IBM Internet Security Systems, June 19, 2003, http://www.iss.net/threats/advise146.html.

From a forensics perspective, it is important to recognize that botnet functionality can be modular to a very high degree. Malware authors can release completely separate malware designed to conduct reconnaissance and harvest target information, or simply build this functionality into a special set of agents, separate from those propagating the malware or its primary payload. Even if the scanning agents are found, they may represent only one piece of the botnet.

Dynamic Timing/Volume

Most network flow-based malware detection techniques rely on detecting unexpected or unexplainable increases in network activity. However, it is possible for malware to evade detection for extended periods of time by dynamically changing the timing and volume of its scanning activity. In 2006, researchers at the Ohio State University explored the concept of a “Camouflaging Worm” or “C-Worm” in depth, writing: “The C-Worm has the ability to camouflage its propagation by intelligently manipulating its scanning traffic volume over time so that its propagation goes undetected by the existing worm detection schemes.”43,44,45

43. W. Yu et al., “On Detecting Camouflaging Worm” (2006), http://www.oar.net/initiatives/research/PDFs/cworm_acsac06.pdf.

44. Munir Kotadia, “Smart Worm Lies Low to Evade Detection,” ZDNet UK, July 13, 2004, http://www.zdnet.co.uk/news/security-management/2004/07/13/smart-worm-lies-low-to-evade-detection-39160285/.

45. S. Singh et al., “Automated Worm Fingerprinting,” (University of California, 2004), http://cseweb.ucsd.edu/~savage/papers/OSDI04.pdf.

It may be the case that professional organizations have developed complex worms that intelligently vary IP-address scanning speeds to blend with normal network activity and avoid detection. However, these have not been widely reported, either because defenders lack the tools and capabilities to detect them or because malware developers have found that simpler tools are currently sufficient to achieve their objectives. Of course, malware is typically only as sophisticated as it needs to be to get the job done. By releasing malware, developers also risk having their code discovered, analyzed, and disseminated, reducing its effectiveness and value. It is strategic to wait and deploy new features as needed.

The Downadup worm, which was designed to automatically update itself through the network, did stop its own propagation46 with the update to W32.Downadup.C. At the same time, it also added the ability to distribute cryptographically signed updates through a robust peer-to-peer network of infected hosts. The result was that the network profile of Downadup changed dramatically. As reported by Symantec at the time, “One interesting aspect of W32.Downadup.C is the omission of a propagation routine; this coincided with public reports of a decrease in TCP port 445 activity as of March 5, 2009. The decrease in TCP port 445 activity would be expected, since W32.Downadup.A and W32.Downadup.B both had aggressive propagation routines, and W32.Downadup.C does not.” Figure 12-3 illustrates the reported drop in TCP port 445 traffic during March 2009.

46. Justin Ma, “Self-Stopping Worms” (University of California, 2005), http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.2134&rep=rep1&type=pdf.

image

Figure 12-3 Symantec’s 2009 illustration of the rise and fall in port 445 traffic. Researchers at Symantec hypothesized that the drop in TCP port 445 traffic on March 5, 2009, was due to the Downadup worm’s auto-update, which removed its own propagation mechanism. Reprinted with permission.47

47. “W32.Downadup.C Bolsters P2P,” Symantec, June 29, 2009, http://www.symantec.com/connect/blogs/w32downadupc-bolsters-p2p#A253.

At the same time, Symantec’s researchers also noted a significant increase in the amount of UDP traffic generated for ports over 1024 (see Figure 12-4). They hypothesized that “The large increase in UDP activity indicates that a significant number of systems infected with W32.Downadup.B began performing UDP P2P peer discovery to random target IPs. This is the behavior of the initial P2P setup (bootstrap) routines for W32.Downadup.C.”48

48. “W32.Downadup.C Bolsters P2P.”

image

Figure 12-4 Symantec’s 2009 illustration of the rise in UDP activity for ports greater than 1024. This coincided with the introduction of UDP peer-to-peer discovery in Downadup-infected systems. Reprinted with permission.49

49. Ibid.

These changes to the network profile of Downadup made it more difficult for forensic analysts and security professionals to detect and track the Downadup worm. Since W32.Downadup.C generated far more domain names than defenders could monitor, and because it no longer produced high volumes of TCP 445 traffic, the mechanisms already in place to detect Downadup infections ceased to be effective for the newly updated versions.

12.1.6 Blending Network Activity

Over time, as network-based malware detection and blocking techniques became more sophisticated and widely available, malware developers took steps to disguise their activity by blending in with normal network traffic. Propagation strategies and C&C traffic have evolved to blend better with legitimate traffic. This also helps to ensure more reliable propagation and command-and-control channels, since it is harder to implement defenses against malware traffic that closely resembles legitimate activity.

As of November 2010, Team Cymru reported that “Web-controlled botnets now outnumber those controlled by the traditional method of IRC channel by a factor of five.”50 The reasons are simple: With web traffic making up over half of all Internet traffic as of 2009, ports 80 and 443 are a great place to hide.51 (Did we mention that it’s a port-80 world out there?) Most organizations cannot or will not block HTTP/HTTPS traffic at the perimeter, and their users routinely make legitimate, varied HTTP/HTTPS connections to servers all over the world. “HTTP-based botnets often use ports (e.g., port 80 of course) that are unblocked on most networks and also hard to filter and easy to hide in a sea of noise,” remarked Steve Santorelli of Team Cymru.52

50. John Leyden, “IRC Botnets Dying Off,” The Register, November 16, 2010, http://www.theregister.co.uk/2010/11/16/irc_botnets_dying_off/.

51. Craig Labovitz, “Internet Traffic and Content Consolidation,” 2007, http://www.ietf.org/proceedings/77/slides/plenaryt-4.pdf.

52. John Leyden, “IRC Botnets Dying Off,” The Register, November 16, 2010, http://www.theregister.co.uk/2010/11/16/irc_botnets_dying_off/.

12.1.6.1 Storm/Waledac C&C Protocol Evolution

The evolution of the Storm botnet clearly illustrates the shift in C&C strategy. The Storm botnet’s peer-to-peer C&C system was based on the “Overnet” UDP protocol. This was easy to detect in corporate networks, where peer-to-peer filesharing is typically not allowed. In addition, as researcher Joe Stewart pointed out, Storm’s introduction of weak encryption actually made it easier to “distinguish this new Storm traffic from ‘legitimate’ (cough) Overnet P2P traffic . . . [which] makes it easier for network administrators to detect Storm nodes on networks where firewall policies normally allow P2P traffic.”53

53. Joe Stewart, “The Changing Storm,” Dell SecureWorks, October 14, 2007, http://www.secureworks.com/research/blog/index.php/2007/10/15/the-changing-storm/.

With the release of Waledac in late 2008, widely considered the next generation of the Storm botnet, malware authors scrapped the UDP Overnet-based traffic in favor of an HTTP-based C&C system. “P2P was part of the reason for Storm’s demise. It was easy to filter it,” said Jose Nazario of Arbor Networks. “With HTTP, it’s a little harder because you’ve got to know what you’re looking for.”54

54. Kelly Jackson Higgins, “Storm Botnet Makes A Comeback,” Dark Reading, January 14, 2009, http://www.darkreading.com/security/vulnerabilities/212900543/index.html.

Waledac’s HTTP communications are carefully crafted to reduce the risk of detection. “In communicating with other nodes, this malware uses HTTP POST and GET messages. Except for the headers, the contents of the HTTP messages are usually obfuscated,” writes Symantec researcher Gilou Tenebro. “Messages are sent using the HTTP protocol and the header of the HTTP requests use Mozilla as a Referer and/or User-Agent string. This is done to make it look like the W32.Waledac traffic came from a Mozilla browser. It is just another attempt to hide its presence and avoid suspicion.”55

55. Gilou Tenebro, “W32.Waledac Threat Analysis,” Symantec, 2009, http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/W32_Waledac.pdf.

12.1.6.2 Downadup C&C

The Downadup worm, released in late 2008 around the same time as Waledac, also illustrates the use of web-based C&C. Downadup is designed to use HTTP requests to download updates from C&C servers. With early releases of Downadup, every infected system made requests to a relatively small pool of synchronized pseudorandomly generated domain names each day. Defenders learned to predict the daily domain names and monitor for the suspicious HTTP requests. Eventually, Downadup evolved so that each infected system attempted connections to only a small percentage of a large pool of 50,000 possible domain names each day, making it nearly impossible for security professionals to configure detection systems to keep up. Downadup also evolved to use shorter domain names (later versions used domains that were 4–9 characters long, whereas previous versions had used domains 8–11 characters long). This made it more difficult for defenders to come up with heuristics distinguishing the worm’s C&C domains from legitimate DNS requests. Indeed, The Honeynet Project generated a list of conflicts with real domains in April 2009, and there were nearly 200 each day.56,57

56. Tillmann Werner and Felix Leder, “Know Your Enemy: Containing Conficker,” The Honeynet Project, April 7, 2009, http://www.honeynet.org/files/KYE-Conficker.pdf.

57. Tillmann Werner and Felix Leder, “Informatik 4: Containing Conficker,” Universität, January 21, 2011, http://net.cs.uni-bonn.de/wg/cs/applications/containing-conficker/.
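
The general technique is illustrated by the following Python sketch of a date-seeded domain generation algorithm. This is not Downadup’s actual generator (the hashing, label lengths, and TLD list are invented); it simply shows why defenders who can reproduce the algorithm can predict and sinkhole the day’s domains, and why enlarging the pool and shortening the names made that race far harder.

import hashlib
from datetime import date

TLDS = [".com", ".net", ".org", ".info", ".biz"]

def domains_for(day, count=250, min_len=4, max_len=9):
    """Generate the day's pseudorandom domain list from a shared date seed."""
    names = []
    for i in range(count):
        digest = hashlib.md5(f"{day.isoformat()}:{i}".encode()).hexdigest()
        length = min_len + int(digest[:2], 16) % (max_len - min_len + 1)
        # Map hex digits onto a-z to build the label.
        label = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:length])
        names.append(label + TLDS[i % len(TLDS)])
    return names

# Every infected host (and any defender who reverses the algorithm) can
# compute the same list for a given day.
print(domains_for(date(2009, 4, 1))[:5])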

Interestingly, later variants of Downadup introduced a UDP-based P2P update distribution mechanism, precisely the opposite of the Storm/Waledac evolution. This was likely due to extreme defensive measures targeted at intercepting Downadup’s original HTTP-based distribution mechanisms. Downadup’s P2P distribution mechanism made it much harder for external organizations to track the botnet, while arguably making it easier for local organizations to detect it within their internal networks through the use of Snort signatures and similar IDS/IPS techniques.

12.1.6.3 Social Networking Sites

Malware has also evolved to blend with “normal” traffic by leveraging social networking sites like Twitter and Facebook. “Usually social networks cannot be blacklisted because of the number of legitimate users,” writes Kathryn Stephens (NSCI). “Social networks are becoming the easiest, cheapest and most reliable infrastructure option for bot herders.”58 For example, the Koobface worm, which first emerged in late 2008, spreads through Facebook, MySpace, Twitter, and other social networking sites. It reportedly generated over $2 million in revenue for its operators as a result of click fraud.59

58. Kathryn Stephens, “Malware Command and Control Overview,” December 30, 2010, http://www.nsci-va.org/WhitePapers/2010-12-30-Malware%20C&C%20Overview-Stephens.pdf.

59. Nart Villeneuve, “Koobface: Inside a Crimeware Network,” Infowar Monitor, November 12, 2010, http://www.infowar-monitor.net/reports/iwm-koobface.pdf.

12.1.7 Fast-Flux DNS

“Fast-flux” service networks are designed to dynamically change and obscure central malware server IP addresses, making it harder for defenders to block malicious traffic or track down the attackers’ central servers.

One of the earliest ways that network administrators detected and blocked malware was by building and distributing blacklists of “bad” IP addresses known to be hosting malware or acting as control systems for botnets. Firewalls and IDS systems were configured to alert when an internal host attempted to contact the blacklisted IP addresses. After a network compromise, investigators could identify other infected systems by looking for additional connections to the same known “bad” IP addresses. When a malware sample was analyzed, a list of central C&C IP addresses could be retrieved and blocked.

In response, attackers shifted to using DNS records, which allowed them to change the underlying IP addresses as needed. Over time, “fast-flux” service networks emerged. In fast-flux networks, malware is designed to refer to domains rather than IP addresses. The DNS records for each domain point to many IP addresses with low TTL values, typically 5–10 IP addresses at a time, which are swapped out every 5–10 minutes, or as needed. The IP addresses listed as the “A” records for each domain are themselves usually compromised systems, which can be load-balanced according to bandwidth capabilities and removed from the network when they become unresponsive. These systems can act as proxies for higher-layer C&C and malware distribution systems, including protocols such as HTTP, SMTP, POP, IMAP, DNS, and more. In this way, an attacker can utilize thousands of compromised systems in order to provide redundancy and additional layers of protection for a botnet, malware distribution network, or other criminal network activities.60

60. “Know Your Enemy: Fast-Flux Service Networks,” The Honeynet Project, July 13, 2007, http://www.honeynet.org/book/export/html/130.

As described by The Honeynet Project, in fast-flux service networks, “front-end nodes are disposable criminal assets that can offer a layer of protection from ongoing investigative response or legal action. When a security professional is responding to an incident and attempts to track down a malicious website hosted via a fast-flux service network, they typically recover only a handful of IP addresses corresponding to disposable front-end nodes which may be spread across multiple jurisdictions, continents, regional languages and time zones, which further complicates the investigation. Because of the proxy redirection layer, an electronic crimes investigator or incident responder will often find no local evidence of the hosting of malicious content on compromised front end systems, and traffic logging is usually disabled so audit trails are also limited.”61

61. Ibid.

“Single-flux” networks dynamically change the DNS “A” records of a domain. “Double-flux” networks add an extra layer of indirection by dynamically changing both the DNS “A” records and the “NS” (nameserver) records as well. In double-flux networks, when a client system makes a request for the malware’s domain, it is typically forwarded to an authoritative DNS server, which is actually a temporarily assigned compromised host that forwards the DNS request to a higher-layer server in the malware network. This DNS server returns the latest fast-flux A records for the domain, which are relayed back to the requesting client.
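During an investigation, suspected fast-flux behavior can often be confirmed with ordinary DNS tools. As a rough sketch (the domain below is hypothetical), query the suspect domain’s A and NS records, then repeat the queries a few minutes later; answers with very short TTLs that rotate through different sets of IP addresses on each pass are a strong fast-flux indicator:

$ dig +noall +answer suspect-domain.example A
$ dig +noall +answer suspect-domain.example NS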

Fast-flux networks interfere with network forensic techniques based on source/destination IP address filtering and pattern matching. As they become increasingly common, network forensic investigators are challenged to come up with new methods for tracking malware activity, scoping breaches, and identifying compromised systems.

12.1.8 Advanced Persistent Threat (APT)

In January 2010, Google announced that it had been the victim of “a highly sophisticated and targeted attack”62 originating from China, which resulted in the theft of sensitive intellectual property. This attack, dubbed “Operation Aurora” by McAfee, is known to have resulted in the compromise of over 34 “major financial, defense and technology companies and research institutions in the United States,”63 including Symantec, Adobe, Northrop Grumman, Morgan Stanley, Dow Chemical, Rackspace, Juniper Networks, and, of course, Google.64

62. David Drummond, “A New Approach to China,” The Official Google Blog, January 12, 2010, http://googleblog.blogspot.com/2010/01/new-approach-to-china.html.

63. Ariana Eunjung Cha and Ellen Nakashima, “Google China Cyberattack Part of Vast Espionage Campaign, Experts Say,” Washington Post, January 14, 2010, http://www.washingtonpost.com/wp-dyn/content/article/2010/01/13/AR2010011300359.html.

64. “Operation Aurora,” Wikipedia, July 13, 2011, http://en.wikipedia.org/wiki/Operation_Aurora.

In the ensuing 2010 media storm, antivirus vendors, security product firms, and reporters popularized the term “advanced persistent threat” (APT), which had already been widely used in security and military circles to describe sophisticated network infiltration attacks.

12.1.8.1 Early Usage of the Term “APT”

Although the origin of the term “advanced persistent threat” is not clearly documented, there is general consensus that it emerged from military usage. Security researcher Richard Bejtlich published an article using the phrase as early as October 2007, in which he began, “This week I attended Victory in Cyberspace, an event held at the National Press Club. It centered on the release of a report written by Dr. Rebecca Grant for the Air Force Association’s Eaker Institute.”65

65. Richard Bejtlich, “Air Force Cyberspace Report,” TaoSecurity, October 12, 2007, http://taosecurity.blogspot.com/2007/10/air-force-cyberspace-report.html.

The Air Force’s 2007 “Victory in Cyberspace” report included a classification of cybersecurity threats, outlined by a “Strategic Command” (STRATCOM) official. “Tier I consisted of ‘kiddy hackers,’ talented but mostly nonpolitical individuals cracking the net. Tier II comprised operators with more advanced skills, but with capabilities that are much less imposing than those of a nation state. Tier III were the peer competitors with ‘NSA-like capabilities plus nation state resources’ behind them. The United States, Britain, Russia, China, and a smattering of European countries all fit into this Tier III compartment.”66

66. Rebecca Grant, “Victory in Cyberspace” (U.S. Air Force, October 2007), http://www.afa.org/media/reports/victorycyberspace.pdf.

The “Victory in Cyberspace” report also includes a revealing discussion of cybersecurity budgeting. “What will it take to make cyberspace secure? Several Department of Defense officials . . . report that massive efforts are underway to plan and budget for increasing cyberspace capabilities. . . . ‘We are well past the $5 billion per year mark, and I don’t know what the top end is,’ commented one STRATCOM official. ‘The $5 billion is mostly on defense. We buy huge amounts of software and people to run that, but it’s totally ineffective against Tier III cyber [advanced persistent] threats,’ this official noted.”67

67. Ibid.

12.1.8.2 Definition of APT

While there is no canonical definition for “advanced persistent threat,” it can be described as follows:

Advanced Highly sophisticated, often incorporating zero-day attacks or cutting-edge technology for which there are no widespread defensive capabilities. Note that attackers do not necessarily deploy the most advanced techniques in their arsenal for a given target; rather, they select techniques with the level of sophistication necessary to achieve their goals. APTs are typically multistage coordinated attacks that require high levels of reconnaissance and result in long-term infiltration.

Persistent Long-term, stealthy, targeted, ongoing attacks designed to achieve a high rate of success with minimal risk of detection. APTs include extensive reconnaissance (both passive and active), and are designed to result in long-term undetected compromise of the target in order to facilitate further monitoring or launch additional attacks. The targeted organization may detect one isolated instance of the compromise without recognizing the presence of an APT throughout the network.

Threat Well-trained and disciplined human attackers, working to achieve a specific goal. One of the quintessential aspects of APT is that attacks are not aimed at targets of opportunity but, rather, targets of specific interest.

12.1.8.3 Early Examples of “APT”

An early example of a “Tier III” or APT attack was Titan Rain, actually a series of thousands of attacks, conducted from 2003 through 2005, designed to infiltrate United States government and defense-related networks. Titan Rain, the code name assigned by the United States government, was suspected to have been coordinated by the People’s Republic of China, although U.S. officials also said that the attacks may have been launched by “just a bunch of disconnected hackers” leveraging the Chinese IP address space.68 According to Time magazine, compromised organizations included Lockheed Martin, Sandia, Redstone Arsenal military base, NASA, and the World Bank.69

68. Bradley Graham, “Hackers Attack Via Chinese Web Sites,” Washington Post, August 25, 2005, http://www.washingtonpost.com/wp-dyn/content/article/2005/08/24/AR2005082402318.html.

69. Nathan Thornburgh, “The Invasion of the Chinese Cyberspies,” TIME Magazine, August 29, 2005, http://www.time.com/time/magazine/article/0,9171,1098961,00.html.

Reports from security researchers indicated that the Titan Rain attackers displayed unusual levels of skill, organization, and sophistication. “[Shawn Carpenter, security analyst at Sandia National Laboratories,] had never seen hackers work so quickly, with such a sense of purpose,” reported Time magazine in 2005. “They would commandeer a hidden section of a hard drive, zip up as many files as possible and immediately transmit the data to way stations in South Korea, Hong Kong or Taiwan before sending them to mainland China. They always made a silent escape, wiping their electronic fingerprints clean and leaving behind an almost undetectable beacon allowing them to re-enter the machine at will. An entire attack took 10 to 30 minutes.”70

70. Ibid.

Piecing together news reports, it is clear that Titan Rain attacks were multistage, involving both passive and active reconnaissance. For example, Time magazine reported on a November 1, 2004, attack in which the Titan Rain attackers launched a vulnerability scanning tool which scanned “vast military networks for single computers with vulnerabilities that the attackers could exploit later. . . . After performing the scans . . . the attackers returned within a day or two and, as they had on dozens of military networks, broke into the computers to steal away as much data as possible without being detected.”71

71. N. Thornburgh, “Titan Rain: Chinese Cyberespionage?,” TIME Magazine, August 25, 2005, http://www.time.com/time/nation/article/0,8599,1098371,00.html.

The skilled “Titan Rain” attacks were right in line with expectations set by the Pentagon’s 2005 annual report to Congress, which discussed China’s development of cybersecurity attack and defense capabilities. “China’s computer network operations (CNO) include computer network attack, computer network defense, and computer network exploitation,” described the report. “The [Chinese People’s Liberation Army (PLA)] has likely established information warfare units to develop viruses to attack enemy computer systems and networks, and tactics to protect friendly computer systems and networks. The PLA has increased the role of CNO in its military exercises. Although initial training efforts focused on increasing the PLA’s proficiency in defensive measures, recent exercises have incorporated offensive operations, primarily as first strikes against enemy networks.”72,73

72. “ANNUAL REPORT TO CONGRESS: The Military Power of the People’s Republic of China 2005,” Office of the Secretary of Defense, United States of America, 2005, http://www.defenselink.mil/news/Jul2005/d20050719china.pdf.

73. Bradley Graham, “Hackers Attack Via Chinese Web Sites,” Washington Post, August 25, 2005, http://www.washingtonpost.com/wp-dyn/content/article/2005/08/24/AR2005082402318_2.html.

12.1.8.4 Evolution of APT

Operation Aurora (2009/2010) marked a shift in the attack landscape: Whereas once APTs were primarily targeted at government and military operations, defenders suddenly found that the same techniques were being used to target private sector enterprises. McAfee, in its Operation “Aurora” announcement, pointed out that “These highly customized attacks known as ‘advanced persistent threats’ (APT) were primarily seen by governments. . . . Operation Aurora is changing the cyberthreat landscape once again. These attacks have demonstrated that companies of all sectors are very lucrative targets. Many are highly vulnerable to these targeted attacks that offer loot that is extremely valuable: intellectual property.”74

74. George Kurtz, “Operation Aurora Hit Google, Others,” McAfee | Blog Central, January 17, 2010, http://blogs.mcafee.com/corporate/cto/operation-aurora-hit-google-others.

Beyond intellectual property, private enterprises are deeply intertwined with national security and, indeed, global information security. For example, IT companies targeted in the Operation Aurora attacks, such as Symantec, Juniper, Google, and Adobe, themselves have footholds deep within corporate and government networks worldwide. By compromising the private enterprises that provide IT equipment, software, security, and services, attackers can “piggyback” off the widespread access of legitimate companies and leverage their systems to gain far-reaching, deep access to networks around the globe.

The 2011 compromise of RSA Security illustrated how attackers are now leveraging APT to undermine globally deployed IT security and software solutions. On March 18, 2011, RSA announced that they had been the victim of an “extremely sophisticated cyber attack . . . in the category of an Advanced Persistent Threat (APT).” The attackers stole sensitive information from RSA’s systems that “specifically related to RSA’s SecurID two-factor authentication products. . . . [T]his information could potentially be used to reduce the effectiveness of a current two-factor authentication implementation as part of a broader attack. We are very actively communicating this situation to RSA customers and providing immediate steps for them to take to strengthen their SecurID implementations.”75,76

75. Arthur W. Coviello, Jr., “Open Letter to RSA Customers,” EMC Corporation, 2011, http://www.rsa.com/node.aspx?id=3872.

76. Kim Zetter, “Hacker Spies Hit Security Firm RSA,” Wired.com, March 17, 2011, http://www.wired.com/threatlevel/2011/03/rsa-hacked/.

The RSA SecurID two-factor authentication system is used worldwide by organizations with unusually high security requirements. While two-factor authentication using hardware tokens is currently a “best practice” for remote access to networks, the vast majority of organizations do not have the resources or incentives to invest in hardware tokens or the staff to support them. As a result, the organizations most affected by the RSA compromise are those with relatively high security requirements, and budgets to match.

Organizations that use single-factor authentication schemes (such as passwords) are easily compromised through password theft (keyloggers, phishing attacks) or brute-force password-guessing attacks. Two-factor hardware-based authentication foils these attacks, making it much harder for attackers to break into targeted systems. As a result, attackers undoubtedly have an interest in undermining popular two-factor hardware authentication schemes.

By targeting RSA’s SecurID system, the attackers have executed one step in a broader, longer-term attack plan. The larger plot almost certainly involves researching ways to bypass, circumvent, or forge RSA SecurID two-factor authentication systems in order to compromise organizations that use them for authentication. The attacker was clearly willing to spend considerable resources compromising RSA in order to gain access to information that can be used to launch attacks against two-factor authentication systems used to protect high-security environments. This is a clear indication of a well-funded, dedicated attacker employing APT as part of a long-term, planned, coordinated attack.

12.2 Network Behavior of Malware

Automated networks of propagating agents—many with sophisticated C&C channels and frameworks for extensible behavior—have become part of the Internet ecosystem. The bad news for the defender is that the authors of today’s malware are sophisticated in their strategies. The good news for the forensic analyst is that there is one thing you can count on: sooner or later, the malicious agent will create or modify network traffic.

Measurable differences exist in the network traffic local to a system before and after it becomes host to a hostile, network-active agent. With sufficiently granular historical data, state changes in system behaviors can become actionable indicators of compromise over time. A useful corollary is that in an enterprise with sufficiently rigorous image and patch management, we can expect measurable differences between the network traffic of compromised systems and that of uncompromised systems.

As we have seen, however, malware authors are improving their techniques for blending with normal traffic and piggybacking on common activities such as web surfing. In general, malware developers tend to invest only the effort needed to get the job done and introduce new evasion features in response to defensive activities. As a result, nontargeted, widespread attacks are still not very stealthy and can easily be detected using network forensics techniques. The rise of targeted, skilled, stealthy attacks, such as those used in APT, poses a greater challenge, which will grow in the years to come.

In this section, we discuss the types of network activity generated by malware propagation, C&C, and payloads. We review strategies that network forensic investigators can use to detect and analyze malware behavior on the network. Since the behavior of malware has become extremely diverse, you may need to employ network forensics techniques from any or every chapter in this book.

12.2.1 Propagation

As we have seen, malware developers are constantly inventing creative new ways to load their code onto victim devices. Some of the most common vectors for propagation today include:

• Email

• Web links and content (see Figure 12-5 for examples of blog spam)

image

Figure 12-5 Examples of blog SPAM on a WordPress comment moderation page. Malware has emerged that spreads using the web and social networking sites. It’s not just email and port scanning anymore, Dorothy.

• Network shares

• Direct network-based exploitation (often preceded by vulnerability scanning)

To identify malware propagation, you can search for signatures of the malware as it is transferred across the network, including packet payload content, sizes of transferred data, statistical flow analysis of ports and targeted addresses, and more. Often, evidence of malware propagation is captured or filtered by web proxies and email servers, making these systems valuable sources of evidence, particularly as client-side attacks continue to increase.

A classic example of direct network-based exploitation is the SQL Slammer worm (2003), which compromised Microsoft SQL servers. To propagate, Slammer sent a single UDP packet containing an exploit to UDP port 1434 at randomized IP addresses. It was a “one-shot, one-kill” attack, with no additional port scanning or traffic needed. The simplicity of Slammer was key to its explosive spread.

Forensic analysts can easily identify Slammer traffic by filtering for UDP port 1434 packets with content that matches the Slammer exploit payload. Here is the published Snort rule:77

77. “Snort,” 2010, http://snort.org.

sql.rules:# alert udp $HOME_NET any -> $EXTERNAL_NET 1434 (msg:"SQL Worm
    propagation attempt OUTBOUND"; flow:to_server; content:"|04|"; depth:1;
    content:"|81 F1 03 01 04 9B 81 F1|"; fast_pattern:only; content:"sock";
    content:"send"; reference:bugtraq,5310; reference:bugtraq,5311; reference:
    cve,2002-0649; reference:nessus,11214; reference:url,vil.nai.com/vil/
    content/v_99992.htm; classtype:misc-attack; sid:2004; rev:13;)
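Much the same check can be approximated at the packet-filter level. For example, the following sketch (the capture file name is hypothetical) uses a BPF expression to flag UDP datagrams bound for port 1434 whose first payload byte is 0x04, mirroring the depth-1 content test in the rule above; since the UDP header is 8 bytes long, udp[8] refers to the first byte of payload:

$ tcpdump -nn -r traffic.pcap 'udp dst port 1434 and udp[8] = 0x04'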

Symantec also pointed out in 2003 that classic examples of malicious LAN scanning often generated ARP requests for sequential addresses. “Another thing to look for is a succession of ARP requests for consecutive addresses from the same host, like this”:78

78. “Detecting Network Traffic That May Be Due to RPC Worms,” Symantec, 2011, http://securityresponse.symantec.com/avcenter/venc/data/detecting.traffic.due.to.rpc.worms.html.

11:43:50.435946 arp who-has 169.254.14.115 tell 169.254.56.166
11:43:50.438301 arp who-has 169.254.14.116 tell 169.254.56.166
11:43:50.445362 arp who-has 169.254.14.117 tell 169.254.56.166
11:43:50.460087 arp who-has 169.254.14.118 tell 169.254.56.166
11:43:50.466885 arp who-has 169.254.14.119 tell 169.254.56.166
11:43:50.482358 arp who-has 169.254.14.120 tell 169.254.56.166
11:43:50.484681 arp who-has 169.254.14.121 tell 169.254.56.166
11:43:50.498546 arp who-has 169.254.14.122 tell 169.254.56.166
11:43:50.505680 arp who-has 169.254.14.123 tell 169.254.56.166
11:43:50.514562 arp who-has 169.254.14.124 tell 169.254.56.166
11:43:50.531488 arp who-has 169.254.14.125 tell 169.254.56.166
11:43:50.534873 arp who-has 169.254.14.126 tell 169.254.56.166
11:43:50.546532 arp who-has 169.254.14.127 tell 169.254.56.166
11:43:50.554933 arp who-has 169.254.14.128 tell 169.254.56.166
11:43:50.570009 arp who-has 169.254.14.129 tell 169.254.56.166
11:43:50.577407 arp who-has 169.254.14.130 tell 169.254.56.166
11:43:50.588931 arp who-has 169.254.14.131 tell 169.254.56.166
11:43:50.600770 arp who-has 169.254.14.132 tell 169.254.56.166
11:43:50.606802 arp who-has 169.254.14.133 tell 169.254.56.166
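One quick way to surface this kind of ARP scanning in a capture file is to count ARP requests per sender. The sketch below assumes a recent version of tshark and a hypothetical capture file name; it extracts the sender IP of every ARP request and tallies the results, so a single host issuing hundreds of who-has requests in a short window stands out immediately:

$ tshark -r traffic.pcap -Y 'arp.opcode == 1' -T fields -e arp.src.proto_ipv4 |
    sort | uniq -c | sort -rn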

Modern malware often spreads through email, web, and social networking services. Figure 12-5 shows an example of malware spread through blog posts and links to infected web sites.

Similarly, the Waledac worm spread using social engineering techniques. Figure 12-6 shows a spam email generated by the Waledac worm. Clicking on the link causes execution of the W32.Waledac exploitation code.80

image

Figure 12-6 Spam generated by the Waledac worm. Clicking on the link leads to a web site, which triggers the W32.Waledac exploitation code. Image courtesy of Symantec. Reprinted with permission.79

79. “julyeml2.PNG (PNG Image, 567×313 pixels),” Symantec, July 3, 2009, http://www.symantec.com/connect/sites/default/files/images/julyeml2.PNG.

80. Ibid.

These types of propagation can be detected through web proxy and email server filtering and analysis, among other means.

12.2.2 Command-and-Control Communications

Command-and-control (C&C) channels allow attackers to manage and update infected systems remotely. They are a double-edged sword for both the malware developer and the network forensic investigator. On the one hand, this functionality allows remote attackers to intelligently adapt the behavior of compromised systems under their control, based on defensive response, environmental needs, or changing goals. On the other hand, command-and-control channels expose information about the malware and compromised system, providing investigators with ongoing mechanisms for tracking down victims and potentially tracing the compromise back to active controllers.

Common vectors for command-and-control channels include:

• HTTP

• Social networking sites (e.g., Twitter, Facebook)

• Peer-to-peer

• IRC

• Cloud computing environments

– Amazon EC2 (used by Zeus)

– Google App Engine

Malware developers have been devoting significant time and attention to creating robust, authenticated, covert C&C channels in order to minimize risk and maintain control over their territory. While it was once common to see cleartext IRC-controlled botnets (which were comparatively easy to spot and cut off), nowadays malware C&C channels are integrated into web traffic, social networking sites, and other less obvious traffic. C&C channels are also designed to operate using multiple strategies, in case one is cut off, and to allow systems within an internal LAN to receive updates through distributed networks even if they cannot directly contact outside servers.

Forensic investigators can detect and analyze C&C traffic through statistical flow analysis by noting changes in volume, directionality, sources and destinations, and timing of flow patterns. In cases where the traffic is not encrypted or where there are specific markers in packet contents and headers, content-based alerting mechanisms may be deployed. Commonly, investigators filter for DNS and HTTP queries to known “bad” addresses, or unusual domains, but with the rise of fast-flux networks that incorporate compromised servers on legitimate domains, this is becoming less effective.
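As a simple illustration, an investigator can extract every DNS query name observed in a capture and compare the list against known-bad or suspicious domains. The sketch below assumes a recent version of tshark; the capture file and domain list names are hypothetical:

$ tshark -r traffic.pcap -Y 'dns.flags.response == 0' -T fields \
    -e ip.src -e dns.qry.name | sort -u > dns-queries.txt
$ grep -i -f bad-domains.txt dns-queries.txt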

The Downadup worm is an excellent example that demonstrates the evolution of C&C channel strategies. Recall that initial variants made HTTP requests to a relatively small number of domains, whereas subsequent variants used a larger pool of potential domains that was harder for defenders to predict. Finally, later variants of the worm changed to a UDP-based peer-to-peer C&C system. Figure 12-7 shows a graph produced by SRI International, which displays W32.Downadup.A network activity over a period of 8 hours after infection. Notice the clear spikes in DNS traffic every 3 hours, as well as the patterns of port 80 and port 445 traffic. “We see that activity is confined to three service ports: 53/UDP (DNS), 80/TCP (HTTP) and 445/TCP(SMB),” wrote Porras et al. (SRI International). “The periodic spikes in DNS activity (every 3 hours) correspond to the Downadup rendezvous activity. The peaks are at 500 (not 250) because the Windows host attempts an additional DNS request lookup for <domain>.localdomain when the DNS A query for <domain> fails. The background DNS activity corresponds to repeated lookups for trafficconverter.biz (every 5 minutes).”81

81. Phillip Porras, Hassen Saidi, and Vinod Yegneswaran, “An Analysis of Conficker C,” SRI International, March 8, 2009, http://mtc.sri.com/Conficker/.

image

Figure 12-7 A graph of W32.Downadup.A network activity post-infection over an 8-hour period. Image courtesy of SRI International. Reprinted with permission.82

82. Phillip Porras, Hassen Saidi, and Vinod Yegneswaran, “An Analysis of Conficker C.”

Statistical flow analysis techniques can easily be used to identify the W32.Downadup.A network traffic, even if the precise destinations change.
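Periodic behavior of this kind can be surfaced without any content inspection. For example, tshark’s I/O statistics can bucket traffic into fixed time intervals; in a capture from an infected host (file name hypothetical), hourly counts of DNS, HTTP, and SMB traffic would make the 3-hour rendezvous spikes described above stand out:

$ tshark -q -r infected-host.pcap -z 'io,stat,3600,udp.port==53,tcp.port==80,tcp.port==445'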

As another example, recall that the Waledac worm communicates with other nodes in the distributed botnet using HTTP requests and responses. Although the C&C content is encrypted, it is still possible for investigators to identify Waledac C&C traffic based on HTTP header content and payload structure. “As you can see, the encrypted message data can be seen in the ‘a’ URI parameter of the POST request, and is followed by the ‘b’ URI parameter,” wrote Gilou Tenebro in his excellent paper on Waledac behavior.83 Figure 12-8 shows the format of an HTTP message used in Waledac’s C&C system, as presented by Tenebro.

83. Gilou Tenebro, “W32.Waledac Threat Analysis,” Symantec, 2009, http://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/W32_Waledac.pdf.

image

Figure 12-8 The format of an HTTP message used in Waledac’s C&C system, as presented in Gilou Tenebro’s excellent Waledac analysis paper. Image courtesy of Symantec. Reprinted with permission.84

84. Ibid.

12.2.3 Payload Behavior

The network activity and behavior of malware post-infection varies wildly, depending on the malware designer’s intent, the environment, and many other factors. Compromised system behavior typically includes:

• SPAM

• DoS

• Pirated software hosting

• Confidential information theft (spyware)

• Scanning for reconnaissance

• Keylogging

Each of these generates a different pattern of network traffic. As we have seen, some malware is designed for stealthy, long-term infiltration (APT), and may lie dormant on the network for months or years until activated. A network forensic investigator’s tactics will similarly vary, depending on the goals of the investigation, the method and timing of malware detection, and the available resources.

12.3 The Future of Malware and Network Forensics

If there is anything that the history of malware over the past two decades has taught us, it is that malware is evolving and will continue to evolve.

Malware needs to propagate. It needs to communicate, and it needs to achieve the goals for which it was designed. To do this, it needs to send and receive packets across the network. These are constants that we can expect to see well into the future. As the world becomes increasingly networked, it is likely that we will continue to see malware designed to take advantage of new and different methods of communication, data transfer, and human interactions. Network forensics will become even more relevant as the number of mobile devices continues to rise, along with the value of data captured and stored on them.

As long as old tactics continue to remain effective, we will continue to see them in use, especially for nontargeted and less skillful attacks. There are still botnets running with IRC C&C channels. There are still Sub7-Trojaned systems out there (really). However, as older malware fades away into niche environments, expect to see malware based on new strategies emerge and bloom.

The network forensic and defensive security strategies we employed yesterday may not be effective tomorrow—literally. As network forensic investigators, we cannot allow our thinking about how malware operates to become rigid or get in a rut. If you come across network traffic or logs that unexpectedly don’t fit your known model of behaviors, it is not necessarily the case that your analysis is wrong—you may be seeing something new. New species of malware are discovered every day.

In the biological world, every time we discover a new species, we encounter an adaptation that we didn’t expect. We have to examine the species and its environment in order to understand how it occupies its niche. The same is true for malware. We need to train ourselves to think like exploratory biologists, and stay on the lookout for things we’ve never seen.

Expect the unexpected.

12.4 Case Study: Ann’s Aurora


The Case: Ann Dercover is after SaucyCorp’s Secret Sauce recipe. She’s been trailing the lead developer, Vick Timmes, to figure out how she can remotely access SaucyCorp’s servers. One night, while conducting reconnaissance, she sees him log into his laptop (10.10.10.70) and VPN into SaucyCorp’s headquarters.

Leveraging her connections with international hacking organizations, Ann obtains an exploit for Internet Explorer and launches a client-side spear phishing attack against Vick Timmes. Ann carefully crafts an email to Vick containing tips on how to improve secret sauce recipes and sends it. Seeing an opportunity that could land him the Vice President of Product Development title (and corner office) he’s been coveting, Vick clicks on the link. Ann is ready to strike . . .

Meanwhile . . . Knowing that he is a high-value target, long ago Vick Timmes set up a traffic monitoring system on his home network. When suspicious activity is discovered relating to Vick’s account at SaucyCorp, he provides investigators with packet captures so they can help him identify a compromise.

Challenge: You are the forensic investigator. Your mission is to:

• Identify the source of the compromise.

• Recover malware from the packet capture and provide it to investigators for further analysis.

Network:

• Vick Timmes’s internal computer: 10.10.10.70

• External host: 10.10.10.10 [Note that for the purposes of this case study, we are treating 10.10.10.10 as a system on “the Internet.” In real life, this is a reserved, nonroutable IP address space.]

Evidence: You are provided with one file containing data to analyze:

evidence-malware.pcap—A packet capture (.pcap) file containing a small amount of network traffic from Vick Timmes’s home network.


12.4.1 Analysis: Intrusion Detection

Since we’re not quite sure what nastiness lurks in this packet capture, let’s begin by running it through an IDS. For this case, we’ll use Snort, configured with extra malware detection rules from SRI International’s Malware Threat Center.85

85. “Most Effective Malware-Related Snort Signatures,” SRI International, 2011, http://mtc.sri.com/live_data/signatures/.

$ snort -c /etc/snort/snort.conf -r evidence-malware.pcap
Running in IDS mode

        --== Initializing Snort ==--
Initializing Output Plugins!
Initializing Preprocessors!
Initializing Plug-ins!
Parsing Rules file "/etc/snort/snort.conf"
...
===============================================================================

Snort processed 2554 packets.
===============================================================================

Breakdown by protocol (includes rebuilt packets):
      ETH: 2554       (100.000%)
  ETHdisc: 0          (0.000%)
     VLAN: 0          (0.000%)
     IPV6: 0          (0.000%)
  IP6 EXT: 0          (0.000%)
  IP6opts: 0          (0.000%)
  IP6disc: 0          (0.000%)
      IP4: 2554       (100.000%)
  IP4disc: 0          (0.000%)
    TCP 6: 0          (0.000%)
    UDP 6: 0          (0.000%)
    ICMP6: 0          (0.000%)
  ICMP-IP: 0          (0.000%)
      TCP: 2553       (99.961%)
      UDP: 1          (0.039%)
     ICMP: 0          (0.000%)
  TCPdisc: 0          (0.000%)
  UDPdisc: 0          (0.000%)
  ICMPdis: 0          (0.000%)
     FRAG: 0          (0.000%)
   FRAG 6: 0          (0.000%)
      ARP: 0          (0.000%)
    EAPOL: 0          (0.000%)
  ETHLOOP: 0          (0.000%)
      IPX: 0          (0.000%)
    OTHER: 0          (0.000%)
  DISCARD: 0          (0.000%)
InvChkSum: 0          (0.000%)
   S5 G 1: 0          (0.000%)
   S5 G 2: 0          (0.000%)
    Total: 2554
===============================================================================

Action Stats:
ALERTS: 13
LOGGED: 13
PASSED: 0
===============================================================================

...
HTTP Inspect - encodings (Note: stream-reassembled packets included):
    POST methods:                   0
    GET methods:                    2
    Headers extracted:              2
    Header Cookies extracted:       0
    Post parameters extracted:      0
    Unicode:                        0
    Double unicode:                 0
    Non-ASCII representable:        0
    Base 36:                        0
    Directory traversals:           0
    Extra slashes ("//"):           0
    Self-referencing paths ("./"):  0
    Total packets processed:        1660
===============================================================================

dcerpc2 Preprocessor Statistics
  Total sessions: 0
===============================================================================

===============================================================================

Snort exiting

After running Snort on the packet capture, we check the alert file (by default, /var/log/snort/alert). Notably, Snort produced the following alerts, based on one of the Malware Threat Center’s rules:

[**] [1:5001684:99] E3[rb] BotHunter Malware Windows executable (PE) sent
    from remote host [**]
[Priority: 0]
04/28-17:40:00.841061 10.10.10.10:4444 -> 10.10.10.70:1036
TCP TTL:64 TOS:0x0 ID:37696 IpLen:20 DgmLen:1500 DF
***A**** Seq: 0xE31E89E1  Ack: 0x72ACC97B  Win: 0x16D0  TcpLen: 20

...

[**] [1:5001684:99] E3[rb] BotHunter Malware Windows executable (PE) sent
    from remote host [**]
[Priority: 0]
04/28-17:42:03.220699 10.10.10.10:4445 -> 10.10.10.70:1044
TCP TTL:64 TOS:0x0 ID:24030 IpLen:20 DgmLen:1500 DF
***A**** Seq: 0x559CF78D  Ack: 0x75FAD66D  Win: 0x16D0  TcpLen: 20

The rule that fired was as follows:

# -- Egg Download, Inbound: 5347 of 8211, from 12/23 to 05/31
alert tcp $EXTERNAL_NET !20 -> $HOME_NET any (msg:"E3[rb] BotHunter Malware
    Windows executable (PE) sent from remote host";  content: "MZ"; content: "
    PE|00 00|"; within:250; flow: established; sid:5001684; rev:99;)

This rule searches for the content “MZ” (the magic number of Windows executable files) followed by “PE|00 00|” within 250 bytes of the first match. When it fires, it indicates that a remote host has sent what appears to be a Windows executable (PE) file to a local system.

12.4.2 TCP Conversation: 10.10.10.10:4444–10.10.10.70:1036

Let’s examine the first stream that triggered this alert to see if we can recover the executable file and gain information about its function. First, we list all TCP conversations in the packet capture, as shown below using tshark (the output has been edited to fit on the page). Notice that in this packet capture, there are 10 TCP conversations, all of which are between 10.10.10.10 (remote host) and Vick’s computer, 10.10.10.70. One of these conversations relates to 10.10.10.10:4444 (remote host on TCP port 4444). Judging by the source and destination ports on both systems, this is the conversation that triggered the Snort alert above.

It is also worth noting that 8 of the 10 TCP conversations in this packet capture relate to 10.10.10.10:4445 (remote host on TCP port 4445). There is also one conversation relating to 10.10.10.10:8080 (remote host, TCP port 8080).

$ tshark -q -n -z conv,tcp -r evidence-malware.pcap
================================================================================

TCP Conversations
Filter:<No Filter>

                                     |     <-     |     ->      |
                                     |Frames Bytes|Frames Bytes |
10.10.10.10:4444 <-> 10.10.10.70:1036 424  120920  979  1293203
10.10.10.10:4445 <-> 10.10.10.70:1044 263   26876  664   869359
10.10.10.10:4445 <-> 10.10.10.70:1043  15     930   15      900
10.10.10.10:4445 <-> 10.10.10.70:1042  15     930   15      900
10.10.10.10:4445 <-> 10.10.10.70:1041  15     930   15      900
10.10.10.10:4445 <-> 10.10.10.70:1040  15     930   15      900
10.10.10.10:4445 <-> 10.10.10.70:1039  15     930   15      900
10.10.10.10:4445 <-> 10.10.10.70:1038  15     930   15      900
10.10.10.10:4445 <-> 10.10.10.70:1037  15     930   15      900
10.10.10.10:8080 <-> 10.10.10.70:1035   5     946    8     6463

Using tcpflow, let’s reconstruct and save the first stream which triggered the Snort alert above, 10.10.10.10:4444<->10.10.10.70:1036:

$ tcpflow -vr evidence-malware.pcap 'src host 10.10.10.10 and src port 4444
    and dst host 10.10.10.70 and dst port 1036'
tcpflow[23578]: tcpflow version 0.21 by Jeremy Elson <[email protected]>
tcpflow[23578]: looking for handler for datalink type 1 for interface
    evidence-malware.pcap
tcpflow[23578]: found max FDs to be 16 using OPEN_MAX
tcpflow[23578]: 010.010.010.010.04444-010.010.010.070.01036: new flow
tcpflow[23578]: 010.010.010.010.04444-010.010.010.070.01036: opening new
    output file

12.4.2.1 File Carving: 10.10.10.10:4444–10.10.10.70:1036

Now, let’s use the Foremost file carving tool to see if there are any files of interest in this TCP conversation:

$ foremost -T -i 010.010.010.010.04444-010.010.010.070.01036
$ cat /tmp/output_Fri_Jun__3_11_44_01_2011/audit.txt
Foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus
Audit File

Foremost started at Fri Jun 3 11:44:01 2011
Invocation: foremost -T -i 010.010.010.010.04444-010.010.010.070.01036
Output directory: /tmp/output_Fri_Jun__3_11_44_01_2011
Configuration file: /etc/foremost.conf
------------------------------------------------------------------
File: 010.010.010.010.04444-010.010.010.070.01036
Start: Fri Jun  3 11:44:01 2011
Length: 1 MB (1239098 bytes)

Num  Name (bs=512)         Size  File Offset   Comment

0:  00000000.dll       730 KB             4    04/03/2010 04:07:31
Finish: Fri Jun  3 11:44:01 2011

1 FILES EXTRACTED

exe:= 1
------------------------------------------------------------------

Foremost finished at Fri Jun  3 11:44:01 2011

It looks like Foremost carved out an executable file. We take the cryptographic checksums, as shown below:

$ md5sum 00000000.dll
b062cb8344cd3e296d8868fbef289c7c  00000000.dll

$ sha256sum 00000000.dll
14f489f20d7858d2e88fdfffb594a9e5f77f1333c7c479f6d3f1b48096d382fe  00000000.
    dll

Next, let’s run this executable through an antivirus scanner, ClamAV, to see if it is known malware:86

86. “Clam AntiVirus,” 2011, http://www.clamav.net/.

$ clamscan 00000000.dll
00000000.dll: OK

----------- SCAN SUMMARY -----------
Known viruses: 970347
Engine version: 0.96.5
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 0.71 MB
Data read: 0.71 MB (ratio 1.01:1)
Time: 4.889 sec (0 m 4 s)

ClamAV did not recognize the file we carved as malware. However, it is possible that a different antivirus scanner might have a signature for it. Another option for the investigator is to upload the executable file to an online service such as VirusTotal, which “analyzes suspicious files and URLs and facilitates the quick detection of viruses, worms, trojans, and all kinds of malware detected by antivirus engines.”87 There are benefits and drawbacks to using a service such as VirusTotal. The benefits are that you get free analysis of any file you upload, and can take advantage of collective intelligence gathered from the global malware research community and multiple antivirus vendors. The drawback is that you must provide a copy of your suspicious file to a third party, which may raise issues of security and privacy.

87. “VirusTotal—Free Online Virus, Malware and URL Scanner,” 2011, http://www.virustotal.com/.

For this case, we will upload our suspicious sample to VirusTotal. As you can see from the results below, 31 out of 43 antivirus vendors flagged the file as suspicious. There does not appear to be a clear consensus as to what the suspicious file is, precisely; it was variously identified as “TR/Swrort.A.964,” “Trojan.Generic.4165076,” “UnclassifiedMalware,” and more.

Here are the results from VirusTotal:

File name: 00000000.dll
Submission date: 2011-06-03 17:40:49 (UTC)
Current status: finished
Result: 31/ 43 (72.1%)

Antivirus Version Last Update Result
AhnLab-V3 2011.06.04.00 2011.06.03  Win-Trojan/Xema.variant
AntiVir 7.11.9.3  2011.06.03  TR/Swrort.A.964
Antiy-AVL 2.0.3.7 2011.06.03  -
Avast 4.8.1351.0  2011.06.03  Win32:Hijack-GL
Avast5  5.0.677.0 2011.06.03  Win32:Hijack-GL
AVG 10.0.0.1190 2011.06.03  Generic2_c.ADPV
BitDefender 7.2 2011.06.03  Trojan.Generic.4165076
CAT-QuickHeal 11.00 2011.06.03  Trojan.Swrort.a
ClamAV  0.97.0.0  2011.06.03  -
Commtouch 5.3.2.6 2011.06.03  W32/MalwareS.BHOC
Comodo  8934  2011.06.03  UnclassifiedMalware
DrWeb 5.0.2.03300 2011.06.03  -
Emsisoft  5.1.0.5 2011.06.03  Virus.Win32.Hijack!IK
eSafe 7.0.17.0  2011.06.02  -
eTrust-Vet  36.1.8365 2011.06.03  -
F-Prot  4.6.2.117 2011.06.03  W32/MalwareS.BHOC
F-Secure  9.0.16440.0 2011.06.03  Trojan.Generic.4165076
Fortinet  4.2.257.0 2011.06.03  -
GData 22  2011.06.03  Trojan.Generic.4165076
Ikarus  T3.1.1.104.0  2011.06.03  Virus.Win32.Hijack
Jiangmin  13.0.900  2011.06.01  Trojan/Generic.cfa
K7AntiVirus 9.104.4763  2011.06.03  Riskware
Kaspersky 9.0.0.837 2011.06.03  -
McAfee  5.400.0.1158  2011.06.03  Generic.dx!spo
McAfee-GW-Edition 2010.1D 2011.06.03  Generic.dx!spo
Microsoft 1.6903  2011.06.03  Trojan:Win32/Swrort.A
NOD32 6177  2011.06.03  probably a variant of Win32/Agent.FZFSSSB
Norman  6.07.07 2011.06.03  W32/Suspicious_Gen2.ATQPC
nProtect  2011-06-03.02 2011.06.03  Trojan/W32.Agent.748032.U
Panda 10.0.3.5  2011.06.03  Trj/Downloader.MDWNDARY
PCTools 7.0.3.5 2011.06.03  Trojan.Gen
Prevx 3.0 2011.06.03  -
Rising  23.60.03.09 2011.06.03  -
Sophos  4.66.0  2011.06.03  Troj/Swrort-B
SUPERAntiSpyware  4.40.0.1006 2011.06.03  -
Symantec  20111.1.0.186 2011.06.03  Trojan.Gen
TheHacker 6.7.0.1.215 2011.06.02  -
TrendMicro  9.200.0.1012  2011.06.03  TROJ_GEN.R07C1DI
TrendMicro-HouseCall  9.200.0.1012  2011.06.03  TROJ_GEN.R07C1DI
VBA32 3.12.16.0 2011.06.03  Trojan.Win32.Rozena.gjd
VIPRE 9473  2011.06.03  Trojan.Win32.Generic!BT
ViRobot 2011.6.3.4494 2011.06.03  -
VirusBuster 14.0.66.0 2011.06.03  Trojan.Swrort!CpBH1zymeR4
Additional information
MD5   : b062cb8344cd3e296d8868fbef289c7c
SHA1  : b22683394afda7c3fa1d559169ce479c1fdad4f9
SHA256: 14f489f20d7858d2e88fdfffb594a9e5f77f1333c7c479f6d3f1b48096d382fe
...
File size : 748032 bytes
First seen: 2010-05-22 23:31:53
Last seen : 2011-06-03 17:40:49
TrID:
DOS Executable Generic (100.0%)
...
PEInfo: PE structure information

[[ basic data ]]
entrypointaddress: 0x627C5
timedatestamp....: 0x4BB6BF03 (Sat Apr 03 04:07:31 2010)
machinetype......: 0x14c (I386)

[[ 4 section(s) ]]
name, viradd, virsiz, rawdsiz, ntropy, md5
.text, 0x1000, 0x757E8, 0x75800, 6.68, d691fd422e760657489d7308f452d24a
.rdata, 0x77000, 0x25C6C, 0x25E00, 5.97, 69cdfb2a638cbc03d431ca94f329d42d
.data, 0x9D000, 0x166EC, 0x11400, 5.16, bb850f7bbadd797bf072f6a1c88a5ed4
.reloc, 0xB4000, 0x9AB2, 0x9C00, 5.82, 1722d6b7db1ed23d9b035a3184fca518
...
ThreatExpert:
http://www.threatexpert.com/report.aspx?md5=b062cb8344cd3e296d8868fbef289c7c
...

Notice that at the end of the VirusTotal report, there is another link to a “ThreatExpert” page, as shown in Figure 12-9. According to ThreatExpert, the executable is “A malicious trojan horse or bot that may represent security risk for the compromised system and/or its network environment.”

image

Figure 12-9 ThreatExpert’s report on the executable file that we carved out.

Certainly worth saving for further analysis . . .

12.4.2.2 Traffic Analysis: 10.10.10.10:4444–10.10.10.70:1036

Now let’s take a closer look at the stream that we carved this file from, to see if we can gain any insight as to its function. Recall that we carved this executable file from the following TCP conversation:

10.10.10.10:4444      <-> 10.10.10.70:1036

Using Wireshark, we can filter on this conversation. In Figure 12-10, we can see that the conversation occurred on April 28, 2010. It began at 17:40:00.577135000 and completed at 17:41:26.898764000. As you can see in Figure 12-10(a), the first three packets show a complete TCP handshake: 10.10.10.70 (Vick’s computer) sends a TCP SYN, the remote server (10.10.10.10) sends a TCP SYN/ACK, and then 10.10.10.70 sends an ACK, completing the handshake.

image

Figure 12-10 This screenshot shows the TCP conversation between 10.10.10.10:4444 and 10.10.10.70:1036 on April 28, 2010. The top image (a) illustrates that the conversation began at 17:40:00.577135000, while the bottom image (b) shows that it completed at 17:41:26.898764000.

Using Wireshark’s “Protocol Hierarchy Statistics” feature, we can see that the conversation contains only IPv4/TCP traffic and no higher-layer protocols were identified. See Figure 12-11 for details. By default, Wireshark decodes protocols based on known port numbers, so this may be caused by a higher-layer protocol running on an unusual port, or simply a protocol that is not known to Wireshark.

image

Figure 12-11 Using Wireshark’s “Protocol Hierarchy Statistics” feature, we can see that the conversation contains only IPv4/TCP traffic and no higher-layer protocols were identified.

Figure 12-12 shows a screen capture of Wireshark, which illustrates the executable file as it was transmitted in the payload of a TCP segment. Note that the payload of the TCP segment begins immediately with the magic number “MZ” (0x4D5A), which corresponds with a Microsoft Windows executable file. This suggests that a file was transmitted in the payload of a TCP segment without a higher-layer file transfer protocol. Very unusual . . .

image

Figure 12-12 A screen capture of Wireshark, which illustrates the executable file as it was transmitted in the payload of a TCP packet.
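Since Wireshark dissects this payload as generic “data,” one way to hunt for other segments that begin with an executable header is to slice the undissected payload in a display filter. The sketch below assumes a recent version of tshark; matching is heuristic and can also hit mid-stream segments whose payload happens to start with these bytes:

$ tshark -r evidence-malware.pcap -Y 'data.data[0:2] == 4d:5a' \
    -T fields -e frame.number -e ip.src -e tcp.srcport -e ip.dst -e tcp.dstport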

12.4.3 TCP Conversations: 10.10.10.10:4445

Recall from our earlier list of TCP conversations that 8 out of 10 TCP conversations in the packet capture relate to 10.10.10.10:4445, with varying ports on Vick’s computer, 10.10.10.70. There was also a second Snort alert, for traffic between 10.10.10.10:4445 and 10.10.10.70:1044. Let’s take a closer look at the IP and TCP characteristics of the 10.10.10.10:4445 traffic.

For the purposes of this case study, we examine this traffic manually using tools such as tcpdump to provide you with a low-level understanding of the events. In real life, we recommend that you use an automated analysis tool of your choice to gather traffic statistics. A wide variety of options exist; please see the Finalist solutions for the Ann’s Aurora puzzle on ForensicsContest.com for some examples.88

88. Sherri Davidoff, Jonathan Ham, and Eric Fulton, “Network Forensics Puzzle Contest: Puzzle #6 Winners,” LMG Security, July 9, 2010, http://forensicscontest.com/2010/07/09/puzzle-6-winners/.

12.4.3.1 Traffic Analysis: 10.10.10.10:4445

In the tcpdump output below, we can see the first 12 seconds of traffic over TCP port 4445. Notice that there is no completed TCP handshake. Instead, Vick’s computer, 10.10.10.70, sent TCP SYN packets to the remote system 10.10.10.10:4445, and the remote system responded with a TCP RST each time. As we will see, there was no TCP SYN/ACK from the remote system until 119 TCP RST packets had already been sent! This is certainly unusual activity.

$ tcpdump -nn -r evidence-malware.pcap 'host 10.10.10.10 and port 4445 and
    host 10.10.10.70'
17:40:35.258314 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    553522758, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:35.258390 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 553522759, win 0, length 0
17:40:35.594943 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    553522758, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:35.594980 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 553522759, win 0, length 0
17:40:36.141827 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    553522758, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:36.141872 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 553522759, win 0, length 0
17:40:36.142471 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    553800369, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:36.142531 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 553800370, win 0, length 0
17:40:36.688700 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    553800369, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:36.688741 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 553800370, win 0, length 0
17:40:37.235554 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    553800369, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:37.235656 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 553800370, win 0, length 0
17:40:37.236520 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    554100968, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:37.236546 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 554100969, win 0, length 0
17:40:37.782456 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    554100968, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:37.782512 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 554100969, win 0, length 0
17:40:38.329315 IP 10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], seq
    554100968, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:38.329388 IP 10.10.10.10.4445 > 10.10.10.70.1037: Flags [R.], seq 0,
    ack 554100969, win 0, length 0
...

The traffic shown above has some interesting characteristics. Notice the pattern of TCP initial sequence number (ISN) changes in the traffic above. For the first three SYN packets, the TCP ISN is “553522758.” The next three packets have the ISN set to “553800369.” Scrolling down, we can see that 10.10.10.70 changes the TCP ISN of its SYN packets after every third packet in the traffic shown. It sends out a TCP SYN packet, receives a RST from the remote server, and resends the SYN packet two more times (with the same response each time) at approximately half-second intervals. After three attempts with the same TCP ISN, 10.10.10.70 changes the TCP ISN and tries again.
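One way to list these ISNs directly is to disable Wireshark’s relative sequence numbering and extract the raw sequence number of each outbound SYN, as sketched below (assuming a recent version of tshark):

$ tshark -o tcp.relative_sequence_numbers:false -r evidence-malware.pcap \
    -Y 'ip.src==10.10.10.70 && tcp.dstport==4445 && tcp.flags.syn==1 && tcp.flags.ack==0' \
    -T fields -e frame.time_relative -e tcp.seq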

Let’s also take a look at the IP IDs of each TCP SYN packet. In the verbose tcpdump output below, we can see that 10.10.10.70 increments the IP ID by one with each packet sent (first it is 359, then 360, 361, 362, etc.):

$ tcpdump -nnv -r evidence-malware.pcap 'host 10.10.10.10 and port 4445 and
    host 10.10.10.70 and tcp[13] &  0x02 == 0x02'
17:40:35.258314 IP (tos 0x0, ttl 128, id 359, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0x0e0e (correct),
        seq 553522758, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:35.594943 IP (tos 0x0, ttl 128, id 360, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0x0e0e (correct),
        seq 553522758, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:36.141827 IP (tos 0x0, ttl 128, id 361, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0x0e0e (correct),
        seq 553522758, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:36.142471 IP (tos 0x0, ttl 128, id 362, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0xd19e (correct),
        seq 553800369, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:36.688700 IP (tos 0x0, ttl 128, id 363, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0xd19e (correct),
        seq 553800369, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:37.235554 IP (tos 0x0, ttl 128, id 364, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0xd19e (correct),
        seq 553800369, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:37.236520 IP (tos 0x0, ttl 128, id 365, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0x3b63 (correct),
        seq 554100968, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:40:37.782456 IP (tos 0x0, ttl 128, id 366, offset 0, flags [DF], proto TCP
     (6), length 48)
    10.10.10.70.1037 > 10.10.10.10.4445: Flags [S], cksum 0x3b63 (correct),
        seq 554100968, win 65535, options [mss 1460,nop,nop,sackOK], length 0
...

Finally, let’s see how often the source port changes. Scrolling through the output of tcpdump below, we can see that the source port changes approximately every 10 to 15 seconds until the TCP handshake is established:

$ tcpdump -nnq -r evidence-malware.pcap 'host 10.10.10.10 and port 4445 and
    host 10.10.10.70 and tcp[13] &  0x02 == 0x02'
17:40:35.258314 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:35.594943 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:36.141827 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:36.142471 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:36.688700 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:37.235554 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:37.236520 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:37.782456 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:38.329315 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:38.329973 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:38.876194 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:39.313691 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:39.314346 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:39.860571 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:40.298079 IP 10.10.10.70.1037 > 10.10.10.10.4445: tcp 0
17:40:47.043801 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:47.407410 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:47.954312 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:47.954969 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:48.391806 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:48.938686 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:48.939329 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:49.485544 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:50.032408 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:50.033078 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:50.579291 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:51.016808 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:51.017456 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:51.563684 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:52.001171 IP 10.10.10.70.1038 > 10.10.10.10.4445: tcp 0
17:40:58.774241 IP 10.10.10.70.1039 > 10.10.10.10.4445: tcp 0
...
17:41:10.569276 IP 10.10.10.70.1040 > 10.10.10.10.4445: tcp 0
...
17:41:22.305270 IP 10.10.10.70.1041 > 10.10.10.10.4445: tcp 0
...
17:41:34.189450 IP 10.10.10.70.1042 > 10.10.10.10.4445: tcp 0
...
17:41:46.149972 IP 10.10.10.70.1043 > 10.10.10.10.4445: tcp 0
...
17:41:58.057545 IP 10.10.10.70.1044 > 10.10.10.10.4445: tcp 0
...

At first, all of the packets originating from the remote system, 10.10.10.10:4445, were TCP RST packets. In fact, a total of 119 TCP RST packets were sent from the remote system during the conversation examined:

$ tcpdump -nn -r evidence-malware.pcap 'host 10.10.10.10 and port 4445 and
    host 10.10.10.70 and tcp[13] & 0x14 == 0x14' | wc -l
reading from file evidence-malware.pcap, link-type EN10MB (Ethernet)
119

Finally, at 17:42:02.985580, the remote system 10.10.10.10 responded with a TCP SYN ACK packet:

$ tcpdump -nn -r evidence-malware.pcap 'host 10.10.10.10 and port 4445 and
    host 10.10.10.70 and tcp[13] & 0x12 == 0x12'
reading from file evidence-malware.pcap, link-type EN10MB (Ethernet)
17:42:02.985580 IP 10.10.10.10.4445 > 10.10.10.70.1044: Flags [S.], seq
    1436350344, ack 1979373165, win 5840, options [mss 1460,nop,nop,sackOK],
    length 0

We can see from the context that at this time a full TCP connection with a completed three-way handshake was finally established:

$ tcpdump -nn -r evidence-malware.pcap 'host 10.10.10.10 and port 4445 and
    host 10.10.10.70'
...
17:42:02.985483 IP 10.10.10.70.1044 > 10.10.10.10.4445: Flags [S], seq
    1979373164, win 65535, options [mss 1460,nop,nop,sackOK], length 0
17:42:02.985580 IP 10.10.10.10.4445 > 10.10.10.70.1044: Flags [S.], seq
    1436350344, ack 1979373165, win 5840, options [mss 1460,nop,nop,sackOK],
    length 0
17:42:02.985870 IP 10.10.10.70.1044 > 10.10.10.10.4445: Flags [.], ack 1, win
     65535, length 0
...

The pattern of traffic that we have just seen is typical of persistent outbound connection attempts, which are often characteristic of infected systems attempting to connect back to a command-and-control channel. Note that Vick’s internal system, 10.10.10.70, tried repeatedly to connect to a port on the remote server 10.10.10.10:4445. Each time, the remote server responded with a TCP RST packet, indicating that the port was closed. However, Vick’s system continued to send TCP SYN packets, attempting to connect to the same port on the remote host, until eventually the remote system accepted the connection and completed a TCP handshake. Obviously, 119 TCP RST responses within just a couple of minutes is unusual activity that might trigger an IDS alert, but what if Vick’s system were configured to attempt to connect only once per hour, or once per day? By adjusting the timing, an attacker could stay under the radar. This type of stealthy, automated, persistent outbound connection activity is characteristic of an APT.
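More generally, this kind of beaconing can be surfaced by tallying connection attempts per remote endpoint, independent of timing. The sketch below (assuming a recent version of tshark) counts the outbound SYNs in this case’s capture by destination; a single remote IP and port accumulating a large number of attempts, most of them answered only with RSTs, merits a closer look:

$ tshark -r evidence-malware.pcap -Y 'tcp.flags.syn==1 && tcp.flags.ack==0' \
    -T fields -e ip.dst -e tcp.dstport | sort | uniq -c | sort -rn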

12.4.3.2 File Carving: 10.10.10.10:4445

As shown in Figure 12-13, shortly after completing a TCP handshake, the server 10.10.10.10:4445 sent an executable file to 10.10.10.70:1044. The time that this file was sent, 17:42:03.220699, matches the time on the second Snort alert we saw earlier, “BotHunter Malware Windows executable (PE) sent from remote host.”

image

Figure 12-13 A screenshot of Wireshark showing that the server 10.10.10.10:4445 sent an executable file to 10.10.10.70:1044. In the Packet List panel, you can see TCP RST responses from the server, followed by a full TCP handshake, after which the executable file was transferred.

Using Wireshark’s “Follow TCP Stream” function, we can reconstruct the TCP conversation and save one side of the traffic (the data sent from 10.10.10.10:4445 to 10.10.10.70:1044). See Figure 12-14 for details.

image

Figure 12-14 Using Wireshark’s “Follow TCP Stream” function, we can reconstruct the TCP conversation and save one side of the traffic (the data sent from 10.10.10.10:4445 to 10.10.10.70:1044).
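
If you prefer to work from the command line, the same one-sided stream can be reconstructed with tcpflow, just as we do later for the port 8080 conversation. The command below is an equivalent sketch, not the method shown in the figure:

$ tcpflow -r evidence-malware.pcap 'src host 10.10.10.10 and src port 4445 and
     dst host 10.10.10.70 and dst port 1044'

The resulting flow file (named 010.010.010.010.04445-010.010.010.070.01044, following tcpflow's naming convention) can then be carved with Foremost exactly as shown next.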

Once the TCP conversation is reconstructed and saved, we can carve out any embedded files using the file carving tool Foremost, as shown below:

$ foremost -T -i wireshark-follow-4445
Processing: wireshark-follow-4445
|*|
$ cat /tmp/output_Sun_Jun__5_12_23_07_2011/audit.txt
Foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus
Audit File

Foremost started at Sun Jun  5 12:23:07 2011
Invocation: foremost -T -i wireshark-follow-4445
Output directory: /tmp/output_Sun_Jun__5_12_23_07_2011
Configuration file: /etc/foremost.conf
------------------------------------------------------------------
File: wireshark-follow-4445
Start: Sun Jun  5 12:23:07 2011
Length: 813 KB (833133 bytes)

Num  Name (bs=512)         Size  File Offset   Comment

0:  00000000.dll       730 KB            46    04/03/2010 04:07:31
Finish: Sun Jun  5 12:23:07 2011

1 FILES EXTRACTED

exe:= 1
------------------------------------------------------------------

Foremost finished at Sun Jun  5 12:23:07  2011
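
Before hashing, it is worth a quick sanity check that the carved file really is a Windows portable executable. A minimal sketch, assuming the carved file sits under the "exe" subdirectory of the Foremost output directory listed in the audit file above (adjust the path if your Foremost version places it elsewhere):

$ cd /tmp/output_Sun_Jun__5_12_23_07_2011/exe
$ file 00000000.dll
$ xxd 00000000.dll | head -n 1

The file utility should identify a Windows PE executable, and the first two bytes of the hex dump should be 4d 5a, the ASCII "MZ" signature that begins every PE file.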

As we can see from the cryptographic checksums below, this executable file is the same as the one that we carved out earlier from the previous conversation over port 4444:

$ md5sum 00000000.dll
b062cb8344cd3e296d8868fbef289c7c  00000000.dll

$ sha256sum 00000000.dll
14f489f20d7858d2e88fdfffb594a9e5f77f1333c7c479f6d3f1b48096d382fe  00000000.
    dll

Interesting! The same Snort rule was also triggered, providing corroborating evidence that the same file was sent twice.

12.4.4 TCP Conversation: 10.10.10.10:8080–10.10.10.70:1035

Let’s step back and examine the initial traffic between 10.10.10.10 and 10.10.10.70 to see if this context will provide any insight. The very first traffic between these two hosts occurred at 17:39:59.311284, as you can see in Figure 12-15. This appears to be HTTP traffic that Wireshark has dissected for us.

image

Figure 12-15 A screenshot of Wireshark showing the first frame exchanged between 10.10.10.10 and 10.10.10.70. This appears to be HTTP traffic that Wireshark has dissected for us.

12.4.4.1 HTTP Analysis: 10.10.10.10:8080–10.10.10.70:1035

In the first packet, you can see that Vick’s computer, 10.10.10.70, made an HTTP GET request for http://10.10.10.10:8080/index.php. The absence of a “Referer” header in this request indicates that Vick did not reach the page by following a link on a previous web page; he could, for example, have clicked on a link in a standalone email client instead.

In Figure 12-15, you can also see the client’s User-Agent string, as follows:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

The string “MSIE 6.0” indicates that the web browser in use on 10.10.10.70 was Microsoft Internet Explorer 6.0 (IE6). “Windows NT 5.1” means that the operating system was Windows XP. The Feature Token “SV1” indicates that the browser was “Internet Explorer 6 with enhanced security features.”89 Internet Explorer 6 was originally released in 2001; by April 2010 it was quite outdated, although still widely used. By 2011, Microsoft had launched a global campaign in an attempt to convince users to upgrade to a more recent version of Internet Explorer, writing “Friends don’t let friends use Internet Explorer 6.”90 Internet Explorer 6 running on Windows XP was known to have many security flaws. In 2010, Secunia reported that IE6 was affected by 151 Secunia advisories and 233 vulnerabilities, and that 15 percent of the advisories remained unpatched.91

89. “Understanding User-Agent Strings,” MSDN, March 2011, http://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx.

90. “Internet Explorer 6 Countdown,” Microsoft, June 30, 2011, http://ie6countdown.com/.

91. “Microsoft Internet Explorer 6.x,” Secunia, 2010, http://secunia.com/advisories/product/11/?task=statistics_2010.
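
User-Agent strings for all of the HTTP requests in a capture can also be pulled out in bulk with tshark, which is a quick way to spot unusual or inconsistent browser identifiers across many clients. The command below is a sketch; older versions of tshark use -R for the read filter, while newer versions use -Y:

$ tshark -r evidence-malware.pcap -R 'http.request' -T fields \
    -e ip.src -e http.user_agent | sort | uniq -c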

Subsequently, the server 10.10.10.10:8080 returned “text/html” data. As you can see in Figure 12-16, the server’s response appears to contain Javascript with very long, strange variable names and obfuscated code. We can also see from the HTTP headers that the server appears to be running Apache (although this field is set by the server itself and may not always be accurate).

image

Figure 12-16 A screenshot of Wireshark showing the response from the server 10.10.10.10:8080, which contains Javascript.

Next, Vick’s computer (10.10.10.70:1035) requested:

http://10.10.10.10:8080/index.phpmfKSxSANkeTeNrah.gif

Notice that the HTTP “Referer” header in this request is set to the previous page:

http://10.10.10.10:8080/index.php

This indicates that the long, strangely named image, index.phpmfKSxSANkeTeNrah.gif, was linked to by the previous page, index.php. Figure 12-17 shows a Wireshark screenshot of this HTTP request.

image

Figure 12-17 A screenshot of Wireshark showing an HTTP request from Vick’s computer to the server 10.10.10.10:8080, for the strange file “index.phpmfKSxSANkeTeNrah.gif.”
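
The full chain of requests and their Referer headers can also be listed directly from the packet capture, which makes relationships like this one easy to see at a glance. Another tshark sketch (again, substitute -Y for -R on newer versions):

$ tshark -r evidence-malware.pcap -R 'http.request and ip.dst == 10.10.10.10' \
    -T fields -e frame.time_relative -e http.request.uri -e http.referer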

Finally, we see that the server 10.10.10.10:8080 responded by sending a very small GIF image, with height and width both set to 1 pixel. The entire GIF image is contained within one TCP segment, as you can see by looking at the Packet Bytes panel in Figure 12-18.

image

Figure 12-18 A screenshot of Wireshark showing the response from 10.10.10.10:8080, which contains a tiny 1x1 GIF image.

12.4.4.2 File Carving: 10.10.10.10:8080–10.10.10.70:1035

Now let’s use tcpflow to automatically reconstruct and save the payload of the TCP port 8080 conversation that we just analyzed:

$ tcpflow -v -r evidence-malware.pcap 'src host 10.10.10.10 and src port 8080
     and dst host 10.10.10.70 and dst port 1035'
tcpflow[14550]: tcpflow version 0.21 by Jeremy Elson <[email protected]>
tcpflow[14550]: looking for handler for datalink type 1 for interface
    evidence-malware.pcap
tcpflow[14550]: found max FDs to be 16 using OPEN_MAX
tcpflow[14550]: 010.010.010.010.08080-010.010.010.070.01035: new flow
tcpflow[14550]: 010.010.010.010.08080-010.010.010.070.01035: opening new
    output file

We can use Foremost to carve out any files transferred, as shown below:

$ foremost -T -i 010.010.010.010.08080-010.010.010.070.01035
$ cat /tmp/output_Sun_Jun__5_15_37_32_2011/audit.txt
Foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus
Audit File

Foremost started at Sun Jun  5 15:37:32 2011
Invocation: foremost -T -i 010.010.010.010.08080-010.010.010.070.01035
Output directory: /tmp/output_Sun_Jun__5_15_37_32_2011
Configuration file: /etc/foremost.conf
------------------------------------------------------------------
File: 010.010.010.010.08080-010.010.010.070.01035
Start: Sun Jun  5 15:37:32 2011
Length: 5 KB (6019 bytes)

Num      Name (bs=512)         Size      File Offset      Comment

0:      00000011.gif           43 B            5976        (1 x 1)
1:      00000000.htm           5 KB             174
Finish: Sun Jun  5 15:37:32 2011

2 FILES EXTRACTED

gif:= 1
htm:= 1
------------------------------------------------------------------

Foremost finished at Sun Jun  5 15:37:32 2011

Changing into the Foremost output directory, we compute the MD5 and SHA256 cryptographic checksums of the files carved by Foremost:

$ md5sum gif/00000011.gif
df3e567d6f16d040326c7a0ea29a4f41  gif/00000011.gif

$ md5sum htm/00000000.htm
2351d02163332f722a50c71d587e507c  htm/00000000.htm

$ sha256sum gif/00000011.gif
548f2d6f4d0d820c6c5ffbeffcbd7f0e73193e2932eefe542accc84762deec87  gif
    /00000011.gif

$ sha256sum htm/00000000.htm
5f86eabe5493758269f1bfc5c073053ddd01ca04ff7252627b175d7efc1f4258  htm
    /00000000.htm

Recall that during our manual analysis we saw two types of data sent from the server to the client: text/html data and a GIF file. This correlates nicely with the output of Foremost, which carved one HTML file and one GIF image from the stream. Let’s take a moment to run these files through an antivirus scanner:

$ clamscan -r /tmp/output_Sun_Jun__5_15_37_32_2011
output_Sun_Jun__5_15_37_32_2011/audit.txt: OK
output_Sun_Jun__5_15_37_32_2011/gif/00000011.gif: OK
output_Sun_Jun__5_15_37_32_2011/htm/00000000.htm: Exploit.CVE_2010_0249 FOUND

----------- SCAN SUMMARY -----------
Known viruses: 970830
Engine version: 0.96.5
Scanned directories: 3
Scanned files: 3
Infected files: 1
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 1.00:1)
Time: 4.796 sec (0 m 4 s)

Wow! ClamAV reported that the HTML file contained an exploit, “Exploit.CVE_2010_0249.” According to the National Vulnerability Database, this exploit takes advantage of a vulnerability with the following characteristics:92

92. “Vulnerability Summary for CVE-2010-0249,” National Vulnerability Database (NVD), August 21, 2010, http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2010-0249.

“Use-after-free vulnerability in Microsoft Internet Explorer 6, 6 SP1, 7, and 8 on Windows 2000 SP4; Windows XP SP2 and SP3; Windows Server 2003 SP2; Windows Vista Gold, SP1, and SP2; Windows Server 2008 Gold, SP2, and R2; and Windows 7 allows remote attackers to execute arbitrary code by accessing a pointer associated with a deleted object, related to incorrectly initialized memory and improper handling of objects in memory, as exploited in the wild in December 2009 and January 2010 during Operation Aurora, aka 'HTML Object Memory Corruption Vulnerability.'”

Recall that the “User-Agent” browser string from Vick’s system, 10.10.10.70, indicated that he was running Microsoft Internet Explorer 6 on Windows XP. According to the National Vulnerability Database, this software was affected by CVE-2010-0249.

This vulnerability was famously exploited during the “Operation Aurora” attacks during late 2009 and early 2010. In January 2010, Google publicly announced that it had been the victim of targeted, sophisticated attacks, and that “As part of our investigation we have discovered that at least twenty other large companies from a wide range of businesses—including the Internet, finance, technology, media and chemical sectors—have been similarly targeted.”93 Days later, a copy of the Aurora exploit was reportedly published online and subsequently incorporated into the Metasploit Framework, where it was widely distributed.94

93. David Drummond, “A New Approach to China,” The Official Google Blog, January 12, 2010, http://googleblog.blogspot.com/2010/01/new-approach-to-china.html.

94. “Reproducing the ‘Aurora’ IE Exploit,” Metasploit, January 15, 2010, http://blog.metasploit.com/2010/01/reproducing-aurora-ie-exploit.html.

12.4.5 Timeline

Based on our analysis, let’s put together a timeline of events for investigators. As always, the timeline is simply a hypothesis based on the evidence we have gathered.

Here is our timeline of events, which took place on April 28, 2010 (times are in MST):

17:39:59.311284—Packet capture begins

17:39:59.311284—10.10.10.70 makes an HTTP GET request for http://10.10.10.10:8080/index.php

17:39:59.657213—10.10.10.10:8080 responds with obfuscated Javascript, which an antivirus scanner identifies as “Exploit.CVE_2010_0249.” This exploit takes advantage of the same vulnerability exploited as part of the “Operation Aurora” attacks.

17:39:59.773396—10.10.10.70 makes an HTTP GET request for http://10.10.10.10:8080/index.phpmfKSxSANkeTeNrah.gif

17:39:59.878427—10.10.10.10:8080 responds with a 1×1 GIF image

17:40:00.577135—TCP conversation between 10.10.10.10:4444 and 10.10.10.70:1036 begins

17:40:00.577502—TCP handshake completed between 10.10.10.10:4444 and 10.10.10.70:1036

17:40:00.841061—First IDS alert for conversation 10.10.10.10:4444–10.10.10.70:1036: “BotHunter Malware Windows executable (PE) sent from remote host”

17:40:00.841061—Windows executable sent from 10.10.10.10:4444 to 10.10.10.70:1036

17:40:35.258314—Repeated, regular connection attempts from 10.10.10.70 to 10.10.10.10:4445 begin. 10.10.10.70 sends TCP SYN packets, while the remote server responds with TCP RST packets.

17:41:26.898764—TCP conversation between 10.10.10.10:4444 and 10.10.10.70:1036 ends

17:42:02.985870—TCP handshake completed between 10.10.10.70:1044 and 10.10.10.10:4445

17:42:03.220699—First IDS alert for conversation 10.10.10.10:4445–10.10.10.70:1044: “BotHunter Malware Windows executable (PE) sent from remote host”

17:42:03.220699—Windows executable sent from 10.10.10.10:4445 to 10.10.10.70:1044

17:43:17.753022—TCP conversation ends between 10.10.10.70:1044 and 10.10.10.10:4445

17:43:17.753022—Packet capture ends
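
The TCP conversation start times in this timeline can be cross-checked by listing every initial SYN packet in the capture, along with its timestamp (the repeated connection attempts toward port 4445 will dominate the output):

$ tcpdump -nn -r evidence-malware.pcap 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'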

12.4.6 Theory of the Case

Now that we have put together a timeline of events, let’s summarize our theory of the case. Again, this is a working hypothesis strongly supported by the evidence, references, and experience:

• Vick Timmes was running Internet Explorer 6.0 on a Windows XP system.

• He clicked on a link (or otherwise executed code on his computer) that caused his browser to make an HTTP GET request for: http://10.10.10.10:8080/index.php

• The remote server 10.10.10.10:8080 responded with obfuscated Javascript that turned out to contain an exploit for CVE_2010_0249.

• Vick’s system made an HTTP GET request to the same remote server for a 1×1 GIF image, which the server provided.

• Vick’s system was vulnerable to the exploit for CVE_2010_0249 and was compromised as a result. The malware installed on his system was configured to make outbound connection attempts to a remote server.

• The malware connected back to 10.10.10.10:4444 and downloaded a Windows executable file.

• The malware repeatedly attempted to connect back to 10.10.10.10:4445. At first, the remote daemon was not listening, and the remote server responded with only TCP RST segments. After 119 TCP RST responses, the remote server finally responded with a TCP SYN/ACK and established a connection. Once this connection was established, the malware downloaded the same Windows executable file again.

12.4.7 Response to Challenge Questions

Now, let’s answer the investigative questions posed to us at the beginning of the case.

Identify the source of the compromise.

Judging by the HTTP requests, the source of the malicious code appears to be the remote server 10.10.10.10:8080. This may not be the ultimate origin of the attack; often, malicious hackers compromise unrelated systems and use them to serve malware or act as intermediaries in a botnet. Detailed analysis of the malware may provide more clues regarding the origins and authors of the attack.

Recover malware from the packet capture and provide it to investigators for further analysis.

We recovered three files for investigators to analyze:

– Suspicious Javascript that, according to antivirus software, contains an exploit for CVE_2010_0249:
MD5 checksum:
2351d02163332f722a50c71d587e507c

SHA 256 checksum:
5f86eabe5493758269f1bfc5c073053ddd01ca04ff7252627b175d7efc1f4258

– A Windows executable file (transmitted twice during the packet capture), which antivirus software identified as suspicious:
MD5 checksum:
b062cb8344cd3e296d8868fbef289c7c

SHA 256 checksum:
14f489f20d7858d2e88fdfffb594a9e5f77f1333c7c479f6d3f1b48096d382fe

– A 1×1 GIF image:
MD5 checksum:
df3e567d6f16d040326c7a0ea29a4f41

SHA 256 checksum:
548f2d6f4d0d820c6c5ffbeffcbd7f0e73193e2932eefe542accc84762deec87

12.4.8 Next Steps

After identifying a likely compromise and recovering malware, what are the appropriate next steps for SaucyCorp investigators? While next steps vary for each situation, here are some possibilities:

Containment/Eradication: Removing the attacker’s access to the network is usually a top priority. This can be very challenging with sophisticated malware typical of the APT. In some cases, malware can lie dormant for long periods of time before becoming active, making it difficult to identify compromised hosts. Here are a few options for containing the damage and eradicating the threat:

– Rebuild all systems suspected of being infected with malware (after gathering evidence as needed). Note that simply “cleaning” the system with an antivirus scanner may not suffice, especially in cases where zero-day exploits have been deployed. It is safest to reformat and reinstall the affected systems.

– Change all passwords that may have been compromised. This includes any passwords related to Vick Timmes or used on his local computer, including application passwords, operating system passwords, and the local administrator password.

– Configure the perimeter firewall/IDS to alert on traffic characteristic of the malware identified. This includes alerting on attempts to contact the known “bad” remote host, 10.10.10.10, as well as alerting on the persistent connection attempt patterns that we identified. Block suspicious connection attempts. (A sample rule is sketched after this list.)

– Consider using two-factor authentication for VPN access to the SaucyCorp network. Single-factor authentication leaves the internal network at much greater risk in the event that a remote host is compromised. Since SaucyCorp has limited control over remote laptops and mobile devices used on outside networks, it is wise to assume that credentials used on these systems are at high risk of compromise. SaucyCorp can set up two-factor VPN authentication to reduce the risk of an attacker connecting to the VPN with stolen credentials.
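
As a concrete illustration of the firewall/IDS recommendation above, a perimeter Snort rule for the known “bad” host might look like the sketch below. This is a hypothetical example rather than a rule deployed in this case; the sid value is arbitrary, and $HOME_NET is assumed to be defined in the sensor configuration:

alert tcp $HOME_NET any -> 10.10.10.10 any \
    (msg:"Outbound connection attempt to known malware host"; flags:S; \
    sid:1000001; rev:1;)

Matching on the beaconing behavior itself, rather than on a single IP address, generally requires rate-based rule options or an anomaly-detection tool such as BotHunter.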

Third-Party Communications: When a remote system is found to be hosting malware, it may be appropriate to notify the ISP, law enforcement, or the owners of the system. The precise method and timing of communications depends on the needs of the investigation, including whether or not the compromise has been publicized, the relevant laws and locations of the systems in question, and whether or not the owner or ISP is likely to be malicious.

Malware Analysis: We have provided three suspicious files for investigators to analyze. These files may contain more information regarding the malware’s behavior, purpose, and authors. If SaucyCorp has the resources to conduct malware analysis, a detailed study may help focus containment/eradication efforts, identify other compromised systems, or even facilitate an investigation that could track down the true source of the attack.

Additional Sources of Evidence: Here are some high-priority potential sources of additional evidence that might be useful in the case:

– VPN logs—When were Vick’s credentials used to connect to the internal network? Were any of these connections not initiated by Vick himself?

– Development server access logs—Since Vick is a developer for SaucyCorp, his credentials may have been used to access the Secret Sauce recipe and other valuable intellectual property. Review server access logs to determine whether there were any suspicious connections.

– Firewall logs—The firewall logs may provide more granularity regarding network activity relating to the incident.

– Hard drives of compromised systems—Forensic analysis of the compromised system hard drive may reveal detailed information about the malware, or at the very least, allow for an inventory of confidential information that may have been compromised.
