Chapter 8. Event Log Aggregation, Correlation, and Analysis

“They seem to have a fundamental misunderstanding of the Internet: nothing is too trivial.”

—Philip Lisiecki, MIT1

1. Robert J. Sales, “Random Hall residents monitor one of MIT’s most-washed web sites—MIT News Office,” April 14, 1999, http://web.mit.edu/newsoffice/1999/laundry-0414.html.

Application servers, routers, firewalls, network devices, cameras, HVAC systems, and all kinds of other devices generate event logs. Event logs are simply selected records that provide information about the state of the system and/or environment at a given time. Different types of devices generate different types of event logs. Event logs may include information about system access (such as server logins and logouts), startup and shutdown times, errors and problems, or just routine data such as the data-center temperature.

Event logs may be sent from individual devices to a central server, or sent to multiple servers, or stored on the local device and never sent over the network at all. There are an enormous number of event log formats, including proprietary, customized, and publicly standardized formats—a constant challenge for forensic investigators.

Why do network forensic investigators analyze event logs? Here are a few reasons:

• The event logs contain information directly relating to network functions, such as DHCP lease histories or network statistics.

• The event logs include records of network activity, such as remote login histories.

• The event logs have been transmitted over the network and therefore created network activity.

Often, event log analysis blurs the line between traditional hard drive forensics and network forensics. For example, event logs are often stored on the hard drive of a networked server, transmitted over the network, or describe network-based activity. Network investigators typically analyze event logfiles collected from networked devices using the same log analysis tools and techniques as those collected from local devices.

The quality of the results of a network forensic investigation tends to be directly proportional to the amount and granularity of the available logs. The more comprehensive and organized the logging system, the easier it will be to reconstruct past events accurately. This is why it is best to centrally manage and report on logs before an incident! That said, logging configuration can also be set up or changed during an ongoing investigation.

In this chapter, we review the sources of network event logs, methods of collection, and log aggregation architectures, and discuss pitfalls associated with networked event log aggregation and collection. We provide some examples, but keep in mind that analysis of any one type of log could fill an entire book. For detailed information on a specific type of log, you will need to research publicly available materials, collect vendor documentation, and in some cases conduct your own experiments in a network forensics lab.

8.1 Sources of Logs

The sources of event logs are wide and varied. All kinds of equipment and software can generate them, including:

• Operating systems of servers and workstations, such as Windows, Linux, or UNIX-based operating systems

• Applications, such as web, database, and DNS servers

• Network equipment, such as switches, routers, and firewalls

• Physical devices, such as cameras, access control systems, and HVAC systems

8.1.1 Operating System Logs

Operating system (OS) event logs are among the most common. Most OSs, including Windows, Linux, and UNIX-based systems, are capable of maintaining event logs that store records of system events. By default, these logs are not extensive, but they are usually customizable. Regulations such as HIPAA, as well as actual data breaches, have spurred many companies into collecting workstation and server authentication logs centrally.

OS event logs can include records of:

• Logins/logouts

• Execution of privileged commands

• System startup/shutdown

• Service activities and errors

8.1.1.1 Microsoft Windows Logs

Microsoft Windows NT systems supported event logging beginning in 1993 with Windows NT 3.1. Systems prior to Windows Vista used the Event Log service to record logs, and the native Event Viewer application to view and filter them. As of Windows Vista (2006), Microsoft redesigned the event logging system for Windows operating systems and replaced the older system with “Windows Eventing 6.0.” The new system is much more helpful for forensic analysts.

Event Log Service and Event Viewer

The Event Log Service and Event Viewer supported three standard sources of logs, defined in Microsoft’s documentation as:2

2. “Eventlog Key (Windows),” Microsoft, 2011, http://msdn.microsoft.com/en-us/library/aa363648(v=VS.85).aspx.

• Application—“Contains events logged by applications. For example, a database application might record a file error. The application developer decides which events to record.”

• Security—“Contains events such as valid and invalid logon attempts, as well as events related to resource use such as creating, opening, or deleting files or other objects. An administrator can start auditing to record events in the security log.”

• System—“Contains events logged by system components, such as the failure of a driver or other system component to load during startup.”

Depending on the version of Windows NT and the services installed, there can be other custom or application-specific logs as well.

Each event log entry includes a header and description. The header contains basic details such as the date and time, log type,3 computer hostname, user, category, and Event ID.4 The “type” can be one of five event types: Information, Warning, Error, Success Audit (Security Log), and Failure Audit (Security Log). The Event ID is a unique number associated with a specific type of event.5

3. “Eventlog Key (Windows).”

4. “How to view and manage event logs in Event Viewer in Windows XP,” Microsoft Support, May 7, 2007, http://support.microsoft.com/kb/308427.

5. Ibid.

Figure 8-1 shows an example of the Event Log Viewer used on Windows XP.

image

Figure 8-1 The Event Viewer on Windows XP. Notice that this screenshot shows a “Successful Logon,” which is a common type of log created by operating systems.

Event logs are stored on a locally mounted hard drive in the location specified within the Eventlog registry key. They are typically stored in a file with the extension “.evt”6 (although text and CSV formats are also natively supported), and limited to a set file size. The system is frequently configured to automatically overwrite older events according to a set time frame as needed to preserve hard drive space.

6. “How to move Event Viewer log files to another location in Windows 2000 and in Windows Server 2003,” Microsoft Support, March 2, 2007, http://support.microsoft.com/kb/315417.
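
On pre-Vista systems, you can inspect this configuration directly. As a minimal sketch (the on-disk log path varies by system), the following command queries the Eventlog registry key to reveal where the Security log is stored:

reg query "HKLM\SYSTEM\CurrentControlSet\Services\Eventlog\Security" /v File

Each log has its own subkey under Eventlog, and the “File” value specifies the location of the corresponding .evt file on disk.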

Microsoft Windows systems prior to Windows Vista did not natively support logging to a remote server. As a result, logs were nearly always stored locally (and therefore highly susceptible to corruption, deletion, or modification, especially in the event of system compromise). Over time, several popular third-party logging clients emerged to send logs to a remote server. These included Snare, Lasso, and Ntsyslog.7

7. Dimitri, “How to convert Windows messages to Syslog | LogLogic Community Portal,” July 21, 2008, http://open.loglogic.com/forum/how-convert-windows-messages-syslog.

Windows Eventing 6.0

Windows Eventing 6.0 includes additional types of event logs. There are two general categories: “Windows Logs” and “Applications and Services Logs.” The Applications and Services Logs include information useful for IT professionals and software developers.8 The “Windows Logs” include the same three categories of information logged by prior versions of Windows: Application, Security, and System. In addition, it includes a new “Setup” log, with records relating to application setup, and a “ForwardedEvents” log.

8. “Event Logs,” Microsoft, 2011, http://technet.microsoft.com/en-us/library/cc722404.aspx.

The new “ForwardedEvents” log marks a critical development for network forensic investigators. As of Windows Vista, Microsoft included built-in support for remote logging. Modern Windows clients can send logs to a remote system for collection and further analysis. This must be configured on both the client and server systems. The “ForwardedEvents” log is used to store events collected from remote systems.

The Windows Eventing 6.0 system is designed to log events in an XML format. This can allow investigators to perform highly granular queries using XPath 1.0 expressions in Event Viewer or custom tools.
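
As a brief sketch (run from an administrative command prompt on Vista or later; the count of five events is arbitrary), the built-in “wevtutil” utility accepts an XPath query directly. The following retrieves recent successful logons, which are recorded as Event ID 4624 on modern Windows systems (as seen in the Windows 7 logs later in this section):

wevtutil qe Security /q:"*[System[(EventID=4624)]]" /f:text /c:5

Similar filters can be built in Event Viewer’s “Filter Current Log” dialog, which exposes the underlying XPath query on its XML tab.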

In order to send logs to a central remote logging repository, Windows Eventing 6.0 relies on Microsoft’s implementation of the Web Services-Management (WS-Management) open standard, developed by the industry group Distributed Management Task Force (DMTF) to facilitate remote system management.9 The Windows Remote Management (WinRM) service uses the SOAP-based WS-Management protocol to exchange information between remote systems via HTTP/HTTPS.10

9. DMTF, “Web Services for Management (WS-Management) Specification,” 2010, http://www.dmtf.org/standards/published_documents/DSP0226_1.1.pdf.

10. Otto Helweg, “Quick and Dirty Large Scale Eventing for Windows,” Management Matters, July 8, 2008, http://blogs.technet.com/b/otto/archive/2008/07/08/quick-and-dirty-enterprise-eventing-for-windows.aspx.

Modern versions of Windows, including Windows Vista, Windows 7, and Windows Server 2008, natively include the WinRM service and can be configured to send event logs to a central collector system, or to act as central collector systems themselves. Windows Server 2003 R2 does not natively include the WinRM service, but it can be downloaded and configured for use as a central collector system or as an event log source. The WinRM service may also be downloaded for Windows XP and Windows Server 2003 clients so that these systems can be configured as event log sources.11

11. MSDN, “Windows Event Collector (Windows),” March 10, 2011, http://msdn.microsoft.com/en-us/library/bb427443(v=VS.85).aspx.

You can configure event log subscriptions on the central log collector using either the graphical Event Viewer or the “wecutil” command-line utility.12

12. Mark Minasi et al., Mastering Windows Server 2008 R2 (Indianapolis, IN: Wiley, 2010).
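
As a sketch of the command-line approach (the subscription filename below is a placeholder), the typical steps are to enable WinRM on each source computer, quick-configure the Windows Event Collector service on the collector, and then load a subscription definition:

REM On each source computer, enable the WinRM listener:
winrm quickconfig
REM On the collector, configure the Windows Event Collector service:
wecutil qc
REM Load a subscription previously saved as an XML definition file:
wecutil cs subscription.xml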

Example: Windows Event Logging

Here are some Security event logs from a Windows XP system (collected using the client software “Snare”). Notice that these logs include both failed and successful logons from two users, “sam” and “lila,” using the same workstation. Notice also that the system did not include the year in the timestamp! This is very common and can be challenging for forensic analysts, who may have to manually wade through logs and sort out the year through filesystem and timeline analysis.

Apr 17 11:49:54 192.168.1.26 MSWinEventLog      1       Security        40
         Fri Apr 17 11:49:54 2009        683     Security        SYSTEM  User
       Success Audit   N-D88E7A700E254 Logon/Logoff            Session
    disconnected from winstation:     User Name: sam     Domain: N-
    D88E7A700E254     Logon ID: (0x0,0x55A21C)     Session Name: RDP-Tcp#2
        Client Name: student-desktop     Client Address: 192.168.1.25
    26
Apr 17 11:49:54 192.168.1.26 MSWinEventLog      1       Security        41
         Fri Apr 17 11:49:54 2009        593     Security        sam     User
       Success Audit   N-D88E7A700E254 Detailed Tracking               A
    process has exited:     Process ID: 2356     Image File Name: C:\WINDOWS\
    system32\wuauclt.exe     User Name: sam     Domain: N-D88E7A700E254
    Logon ID: (0x0,0x55A21C)     27
Apr 17 11:50:18 192.168.1.26 MSWinEventLog      1       Security        43
         Fri Apr 17 11:50:18 2009        680     Security        SYSTEM  User
       Failure Audit   N-D88E7A700E254 Account Logon           Logon attempt
    by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0    Logon account:  lila
    Source Workstation: N-D88E7A700E254    Error Code: 0xC000006A        29
Apr 17 11:50:26 192.168.1.26 MSWinEventLog      1       Security        44
         Fri Apr 17 11:50:18 2009        529     Security        SYSTEM  User
       Failure Audit   N-D88E7A700E254 Logon/Logoff            Logon Failure:
        Reason: Unknown user name or bad password     User Name: lila
    Domain: N-D88E7A700E254     Logon Type: 2     Logon Process: Advapi
    Authentication Package: Negotiate     Workstation Name: N-D88E7A700E254
             30

As another example, the following logs were sent from a Windows 7 system to a central rsyslogd server using the Snare client. Notice that in this case the central rsyslogd server is configured to prepend all received messages with high-precision timestamps, including the year.

2011-04-25T15:19:29-06:00 fox-ws MSWinEventLog#0111#011Security#0112610#011
    Mon Apr 25 15:19:27 2011#0114776#011Microsoft-Windows-Security-Auditing
    #011bob#011N/A#011Success Audit#011fox-ws#011None#011#011The computer
    attempted to validate the credentials for an account.    Authentication
    Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0  Logon Account: bob   Source
     Workstation: FOX-WS  Error Code: 0x0#0112467
2011-04-25T15:19:29-06:00 fox-ws MSWinEventLog#0111#011Security#0112611#011
    Mon Apr 25 15:19:27 2011#0114648#011Microsoft-Windows-Security-Auditing
    #011bob#011N/A#011Success Audit#011fox-ws#011None#011#011A logon was
    attempted using explicit credentials.    Subject:   Security ID:  S-1-5-18
       Account Name:  FOX-WS$   Account Domain:  WORKGROUP   Logon ID:  0x3e7
      Logon GUID:  {00000000-0000-0000-0000-000000000000}    Account Whose
    Credentials Were Used:   Account Name:  bob   Account Domain:  fox-ws
    Logon GUID:  {00000000-0000-0000-0000-000000000000}    Target Server:
    Target Server Name: localhost   Additional Information: localhost
    Process Information:   Process ID:  0xdc8   Process Name:  C:\Windows\
    System32\winlogon.exe    Network Information:   Network Address: 127.0.0.1
       Port:   0    This event is generated when a process attempts to log on
    an account by explicitly specifying that account's credentials.  This most
     commonly occurs in batch-type configurations such as scheduled tasks, or
    when using the RUNAS command.#0112468
2011-04-25T15:19:29-06:00 fox-ws MSWinEventLog#0111#011Security#0112612#011
    Mon Apr 25 15:19:27 2011#0114624#011Microsoft-Windows-Security-Auditing
    #011bob#011N/A#011Success Audit#011fox-ws#011None#011#011An account was
    successfully logged on.    Subject:   Security ID:  S-1-5-18   Account
    Name:  FOX-WS$   Account Domain:  WORKGROUP   Logon ID:  0x3e7    Logon
    Type:   2    New Logon:   Security ID:  S
    -1-5-21-29357171-1333843320-2140510157-1002   Account Name:  bob   Account
     Domain:  fox-ws   Logon ID:  0x77710f   Logon GUID:
    {00000000-0000-0000-0000-000000000000}    Process Information:   Process
    ID:  0xdc8   Process Name:  C:\Windows\System32\winlogon.exe    Network
    Information:   Workstation Name: FOX-WS   Source Network Address:
    127.0.0.1   Source Port:  0    Detailed Authentication Information:
    Logon Process:  User32    Authentication Package: Negotiate   Transited
    Services: -   Package Name (NTLM only): -   Key Length:  0    This event
    is generated when a logon session is created. It is generated on the
    computer that was accessed.    The subject fields indicate the account on
    the local system that requested the logon. This is most commonly a
    service such as the Server service, or a local process such as Winlogon.
    exe or Services.exe.    The logon type field indicates the kind of logon
    that occurred. The most common types are 2 (interactive) and 3 (network).
        The New Logon fields indicate the account for whom the new logon was
    created (i.e., the account that was logged on).    The network fields
    indicate where a remote logon request originated. Workstation name is not
    always available and may be left blank in some cases.    The
    authentication information fields provide detailed information about this
    specific logon request.   - Logon GUID is a unique identifier that can be
    used to correlate this event with a KDC event.   - Transited services
    indicate which intermediate services have participated in this logon
    request.   - Package name indicates which subprotocol was used among the
    NTLM protocols.   - Key length

8.1.1.2 UNIX/Linux Event Logging

UNIX-based and Linux systems, such as Solaris, Ubuntu Linux, and Mac OS X, are distributed with a native logging utility based on “syslog.”

Syslog

Syslog is a client/server protocol designed for transmitting event notifications in an IP network. It was originally developed in the 1980s, although it wasn’t formally documented by a standards body until 2001 (IETF RFC 3164). Syslog (and its derivatives) is the default logging mechanism in most modern Linux/UNIX-based distributions. By default, the syslog service runs locally and allows system administrators to configure logging of local operating system and application data. Syslog can also be configured to receive event logs from other systems over a network socket (the default syslog configuration file is usually located at /etc/syslog.conf). Many applications include built-in functionality for interfacing with both local and remote syslog daemons. There are also several popular Windows clients that collect and forward logs to central syslog servers. For remote logging, the standard syslog port is UDP 514. Because syslog receives over UDP by default, the transmissions do not have transport-layer reliability, and packets may be dropped without recovery. There is no built-in encryption or authentication.

Facilities in syslog are different categories for messages. Logs are sent to a particular facility based (roughly) on their origin process. For example, messages created by a mail application are normally sent to the mail facility. These assignments are fully customizable by the administrator, and the facilities local0 through local7 are reserved for local customizations. Common syslog facilities include auth, authpriv, cron, daemon, kern, lpr, mail, mark, news, syslog, user, uucp, and local0 through local7.

Priorities indicate the severity or importance of the message. Syslog message priorities include: debug, info, notice, warning, warn (same as warning), err, error (same as err), crit, alert, emerg, and panic (same as emerg). You can configure syslog to store messages of specific facilities/priorities in different files. (By default, logs on UNIX/Linux systems are commonly stored in /var/log.)13

13. Greg Wettstein and Martin Schulze, “Linux man page,” 2011, http://linux.die.net/man/5/syslog.conf.
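
To make this concrete, here is a brief sketch of /etc/syslog.conf entries (the remote hostname is a placeholder). Each line pairs a facility.priority selector with an action, such as a local file or a remote server:

# Store all mail-facility messages in a local file
mail.*                          /var/log/mail.log
# Send auth/authpriv messages of priority "info" or higher to a
# remote syslog server (the "@" prefix means forward via UDP 514)
auth,authpriv.info              @loghost.example.com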

syslog-ng

Syslog-ng is the “next generation” syslog daemon. It includes a number of additional features, such as built-in support for encryption during transmission (TLS), enabling administrators to send messages over encrypted tunnels to the syslog-ng server. Syslog-ng also enables administrators to configure reliable transport-layer communication using TCP. There is an open-source version (syslog-ng OSE) licensed under LGPL, and also additional plugins available under a proprietary license (“Premium Edition” [PE]).14

14. BalaBit—IT Security, “Multiplatform Syslog Server and Logging Daemon,” 2011, http://www.balabit.com/network-security/syslog-ng/.

The syslog-ng configuration (typically found at /etc/syslog-ng/syslog-ng.conf) is based on the same concepts as syslog, but it allows for greater granularity. The concepts of facilities and priorities are still included. In the syslog-ng configuration file, message routes are determined by three components: source, destination, and filter.

• First, the user defines a source. This can be a file, UDP port, TCP port, or socket.

• Next, the user defines a destination. Again, syslog-ng can write messages to a file and send them out via TCP or UDP port, socket, or a user’s terminal.

• Then, the user defines a filter that specifies a combination of facilities and ports that will be logged.

• Finally, the user creates a “log” statement and includes a source, filter, and destination. This log statement is a rule by which syslog-ng routes messages.15

15. Jose Pedro Oliveira, “Linux man page,” 2004, http://linux.die.net/man/5/syslog-ng.conf.

Here is a simple example of the syslog-ng configuration file. In this example, the server is configured to accept remote authentication logs (UDP 514) and write them to a file (/var/log/remote.auth.log).

#Define remote message sources
source s_remote {  udp(); };
#Define destination filename
destination df_auth_remote { file("/var/log/remote.auth.log"); };
# Filter all messages from the auth and authpriv facilities
filter f_auth { facility(auth, authpriv); };
# Put it all together
log {
        source(s_remote);
        filter(f_auth);
        destination(df_auth_remote);
};

The final “log” statement instructs syslog-ng to take all logs from the source “s_remote,” sort out all auth and authpriv facility logs, and write these to the file /var/log/remote.auth.log.
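
On the client side, a corresponding configuration might look like the following sketch (the server hostname is a placeholder, and the local source drivers vary by platform). It reads from the local log socket and forwards everything to the central server:

# Read messages from the local log socket and from syslog-ng itself
source s_local { unix-dgram("/dev/log"); internal(); };
# Forward all messages to the central server via UDP 514
destination d_server { udp("loghost.example.com" port(514)); };
log { source(s_local); destination(d_server); };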

rsyslogd

Rsyslog, the “reliable and extended syslogd,”16 is an open-source (GPL) and very popular replacement for the original syslog daemon.17 Among many features, rsyslog includes built-in support for:

16. Rainer Gerhards and Michael Meckelein, “Ubuntu Manpage: rsyslogd—reliable and extended syslogd,” Ubuntu Manuals, 2010, http://manpages.ubuntu.com/manpages/hardy/man8/rsyslogd.8.html.

17. Rainer Gerhards, “Rainer’s Blog: Why does the world need another syslogd? (aka rsyslog vs. syslog-ng),” August 12, 2007, http://blog.gerhards.net/2007/08/why-does-world-need-another-syslogd.html.

• IPv6

• TCP and Reliable Event Logging Protocol (RELP) for reliable transport-layer transmission

• TLS/SSL for encrypted transmission of logs over the network

• Extremely granular control over output log format

• A configuration file format that is backward-compatible with syslogd

• High-precision timestamps and time zone logging

• ISO 8601, an international standard for communication of dates and times,18 and RFC 3339, an IETF profile of ISO 860119

18. “International Organization for Standardization,” Wikipedia, July 13, 2011, http://en.wikipedia.org/wiki/ISO_8601.

19. G. Klyne and C. Newman, “RFC 3339—Date and Time on the Internet: Timestamps,” IETF, July 2002, http://www.ietf.org/rfc/rfc3339.txt.

Practically speaking, these features are very useful for forensic analysts because they facilitate analysis, allow for highly granular logs, enable accurate timestamping and time synchronization, and provide for confidential and reliable transmission of logs across an IP network.20

20. Rainer Gerhards, “rsyslog vs. syslog-ng—a comparison,” rsyslog, May 6, 2008, http://www.rsyslog.com/doc/rsyslog_ng_comparison.html.
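
As a brief sketch, the following rsyslog directives (in the legacy configuration format; the collector hostname is a placeholder) enable high-precision RFC 3339 timestamps in local files and forward all messages to a central collector over TCP:

# Write local files using the built-in high-precision timestamp
# template (RFC 3339 format, including year and time zone)
$ActionFileDefaultTemplate RSYSLOG_FileFormat
# Forward all messages to the central collector; a single "@" would
# mean UDP, while "@@" selects TCP
*.* @@loghost.example.com:514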

Example: Linux Authentication Logs

In the example below, you can see authentication logs from an Ubuntu Linux server (8.04) running syslogd in default configuration. Notice that the example logs below contain records of privileged commands (“sudo” and “su”), successful remote logins by the user “marie” (“sshd”), and records of automatic activities run by the administrative “root” user (“CRON” jobs). Again, notice that the year was not included by default in the system timestamp.

Feb 27 15:39:28 bigserver sudo:    marie : TTY=pts/0 ; PWD=/home/marie ; USER
    =root ; COMMAND=/bin/su
Feb 27 15:39:28 bigserver sudo: pam_unix(sudo:session): session opened for
    user root by marie(uid=0)
Feb 27 15:39:28 bigserver sudo: pam_unix(sudo:session): session closed for
    user root
Feb 27 15:39:28 bigserver su[19070]: Successful su for root by root
Feb 27 15:39:28 bigserver su[19070]: + pts/0 root:root
Feb 27 15:39:28 bigserver su[19070]: pam_unix(su:session): session opened for
     user root by marie(uid=0)
Feb 27 16:09:01 bigserver CRON[19107]: pam_unix(cron:session): session opened
     for user root by (uid=0)
Feb 27 16:09:01 bigserver CRON[19107]: pam_unix(cron:session): session closed
     for user root
Feb 27 16:17:01 bigserver CRON[19118]: pam_unix(cron:session): session opened
     for user root by (uid=0)
Feb 27 16:17:01 bigserver CRON[19118]: pam_unix(cron:session): session closed
     for user root
Feb 27 16:30:01 bigserver CRON[19121]: pam_unix(cron:session): session opened
     for user root by (uid=0)
Feb 27 20:02:11 bigserver sshd[19224]: Accepted publickey for marie from
    10.146.28.43 port 38760 ssh2
Feb 27 20:02:11 bigserver sshd[19226]: pam_unix(sshd:session): session opened
     for user marie by (uid=0)
Feb 27 20:02:26 bigserver sudo: pam_unix(sudo:auth): authentication failure;
    logname=marie uid=0 euid=0 tty=/dev/pts/0 ruser= rhost=  user=marie
Feb 27 20:02:35 bigserver sudo:    marie : TTY=pts/0 ; PWD=/home/marie ; USER
    =root ; COMMAND=/bin/echo hi

Example: Linux Kernel Logs

The example below shows a different kind of operating system log from an Ubuntu Linux server (9.10) running sysklogd. The “kernel” logs below were created as a result of a system reboot. The server logged an extensive amount of information, from startup/shutdown times, to detailed CPU/RAM information, to networking and filesystem data (a brief snippet is shown below).

Feb 23 18:42:50 littleserver kernel: Kernel logging (proc) stopped.
Feb 23 18:42:50 littleserver kernel: Kernel log daemon terminating.
Feb 23 18:42:51 littleserver exiting on signal 15
Feb 23 20:53:20 littleserver syslogd 1.5.0#5ubuntu4: restart.
Feb 23 20:53:20 littleserver kernel: Inspecting /boot/System.map-2.6.31-22-
    server
Feb 23 20:53:20 littleserver kernel: Cannot find map file.
Feb 23 20:53:20 littleserver kernel: Loaded 64702 symbols from 47 modules.
Feb 23 20:53:20 littleserver kernel: [    0.000000] Initializing cgroup
    subsys cpuset
Feb 23 20:53:20 littleserver kernel: [    0.000000] Initializing cgroup
    subsys cpu
Feb 23 20:53:20 littleserver kernel: [    0.000000] Linux version 2.6.31-22-
    server (buildd@allspice) (gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9) ) #65-
    Ubuntu SMP Thu Sep 16 16:33:54 UTC 2010 (Ubuntu 2.6.31-22.65-server)
Feb 23 20:53:20 littleserver kernel: [    0.000000] Command line: root=/dev/
    md0 ro quiet splash
Feb 23 20:53:20 littleserver kernel: [    0.000000] KERNEL supported cpus:
Feb 23 20:53:20 littleserver kernel: [    0.000000]   Intel GenuineIntel
Feb 23 20:53:20 littleserver kernel: [    0.000000]   AMD AuthenticAMD
Feb 23 20:53:20 littleserver kernel: [    0.000000]   Centaur CentaurHauls
Feb 23 20:53:20 littleserver kernel: [    0.000000] BIOS-provided physical
    RAM map:
Feb 23 20:53:20 littleserver kernel: [    0.000000]  BIOS-e820:
    0000000000000000 - 000000000009fc00 (usable)
...
Feb 23 20:53:20 littleserver kernel: [    0.139714] CPU0: Intel(R) Core(TM)2
    Quad  CPU   Q8200  @ 2.33GHz stepping 07
Feb 23 20:53:20 littleserver kernel: [    0.140000] Booting processor 1 APIC
    0x1 ip 0x6000
Feb 23 20:53:20 littleserver kernel: [    0.010000] Initializing CPU#1
Feb 23 20:53:20 littleserver kernel: [    0.010000] Calibrating delay using
    timer specific routine.. 4607.26 BogoMIPS (lpj=23036301)
Feb 23 20:53:20 littleserver kernel: [    0.010000] CPU: L1 I cache: 32K, L1
    D cache: 32K
Feb 23 20:53:20 littleserver kernel: [    0.010000] CPU: L2 cache: 2048K

Subsequently, on February 24, the same system also recorded logs relating to the restart of the logging service itself (syslog), as well as routine time markers that syslog includes so that analysts can verify the logging server is working (even though there were no events).

Feb 24 06:25:17 littleserver syslogd 1.5.0#5ubuntu4: restart.
Feb 24 06:53:20 littleserver -- MARK --
Feb 24 07:13:20 littleserver -- MARK --
Feb 24 07:33:20 littleserver -- MARK --

8.1.2 Application Logs

Most applications generate logs of important events, including records of network-based access, debugging messages, and routine startup/shutdown logs. Application servers include:

• Web servers

• Database servers

• Mail servers

• DNS servers

• VoIP servers

• Firewalls

• Logging servers

• Authentication servers

• Filesharing servers

• . . . and much more.

Application servers are constantly evolving new features and changing output formats to keep up with the latest developments in hardware, software, and protocol specifications. As a result, application log contents and formats are also constantly changing. Although some applications create logs in well-documented, published formats, network forensic analysts constantly run into new log formats, or formats that are simply not well documented. In some cases, real-world application logs simply differ from the documented format, or the documentation is outdated. Furthermore, many applications allow local system administrators to customize log contents and formatting, in which case there may be no documentation at all as to the meaning of the output (other than what is listed in the application configuration).

8.1.2.1 Example—SMTP Logs

Here is an example of logs from a Postfix/SMTPD mail server. The logs begin by recording a message that was sent from “[email protected]” on the local system to “[email protected].” You can see that it was successfully sent through Google’s mail server, 209.85.222.47.

Subsequently, an unknown system (201.250.45.83, registered at the time to the ISP “Telefonica de Argentina”) unsuccessfully attempted to connect to the SMTP daemon on “bigserver” and relay a message from “[email protected]” to “[email protected].” (It is likely that this was an attempt to send a SPAM message.)

Sep 23 14:33:45 bigserver postfix/pickup[25480]: 9612218C069: uid=1001 from=<
    [email protected]>
Sep 23 14:33:45 bigserver postfix/cleanup[26011]: 9612218C069: message-id
    =<20090923203345.9612218C069@[email protected]>
Sep 23 14:33:45 bigserver postfix/qmgr[24702]: 9612218C069: from=<
    [email protected]>, size=1060, nrcpt=1 (queue active)
Sep 23 14:33:45 bigserver postfix/pickup[25480]: A501018C062: uid=1001 from=<
    [email protected]>
Sep 23 14:33:45 bigserver postfix/cleanup[26011]: A501018C062: message-id
    =<20090923203345.A501018C062@[email protected]>
Sep 23 14:33:45 bigserver postfix/qmgr[24702]: A501018C062: from=<
    [email protected]>, size=1064, nrcpt=1 (queue active)
Sep 23 14:33:45 bigserver postfix/pickup[25480]: B3A3718C06B: uid=1001 from=<
    [email protected]>
Sep 23 14:33:45 bigserver postfix/cleanup[26011]: B3A3718C06B: message-id
    =<20090923203345.B3A3718C06B@[email protected]>
Sep 23 14:33:45 bigserver postfix/qmgr[24702]: B3A3718C06B: from=<
    [email protected]>, size=1084, nrcpt=1 (queue active)
Sep 23 14:33:46 bigserver postfix/smtp[26064]: B3A3718C06B: to=<
    [email protected]>, relay=aspmx.l.google.com[209.85.222.47]:25, delay
    =0.57, delays=0.04/0.01/0.14/0
.37, dsn=2.0.0, status=sent (250 2.0.0 OK 1253738039 13si3213003pzk.59)
Sep 23 14:33:46 bigserver postfix/qmgr[24702]: B3A3718C06B: removed
Sep 23 15:19:55 bigserver postfix/smtpd[26160]: warning: 201.250.45.83:
    hostname 201-250-45-83.speedy.com.ar verification failed: Name or service
    not known
Sep 23 15:19:55 bigserver postfix/smtpd[26160]: connect from unknown
    [201.250.45.83]
Sep 23 15:19:56 bigserver postfix/smtpd[26160]: NOQUEUE: reject: RCPT from
    unknown[201.250.45.83]: 554 5.7.1 <[email protected]>: Relay access
    denied; from=<[email protected]> to=<[email protected]> proto=
    SMTP helo=<none>
Sep 23 15:19:57 bigserver postfix/smtpd[26160]: disconnect from unknown
    [201.250.45.83]
Sep 23 15:23:17 bigserver postfix/anvil[26163]: statistics: max connection
    rate 1/60s for (smtp:201.250.45.83) at Sep 23 15:19:55
Sep 23 15:23:17 bigserver postfix/anvil[26163]: statistics: max connection
    count 1 for (smtp:201.250.45.83) at Sep 23 15:19:55
Sep 23 15:23:17 bigserver postfix/anvil[26163]: statistics: max cache size 1
    at Sep 23 15:19:55

Below is a snippet of the mail server’s error log, which contains error messages generated by the Postfix service. These include records caused by mistakes in the local “mail” command usage, permission errors, and more.

Sep 20 21:53:09 bigserver postfix/sendmail[10815]: fatal: usage: sendmail [
    options]
Sep 20 22:27:48 bigserver postfix/sendmail[10961]: fatal: Recipient addresses
     must be specified on the command line or via the -t option
Sep 20 22:27:48 bigserver postfix/sendmail[10963]: fatal: Recipient addresses
     must be specified on the command line or via the -t option
Sep 20 22:28:29 bigserver postfix/sendmail[10979]: fatal: Recipient addresses
     must be specified on the command line or via the -t option
Sep 22 13:04:31 bigserver postfix/sendmail[24424]: fatal: usage: sendmail [
    options]
Sep 22 15:32:07 bigserver postfix/postmap[25785]: fatal: open database /etc/
    postfix/generic.db: Permission denied
Sep 22 15:55:40 bigserver postfix/postmap[26209]: fatal: open database /etc/
    postfix/virtual.db: Permission denied
Sep 22 17:01:33 bigserver postfix[27072]: error: to submit mail, use the
    Postfix sendmail command
Sep 22 17:01:33 bigserver postfix[27072]: fatal: the postfix command is
    reserved for the superuser

8.1.3 Physical Device Logs

Many types of physical devices can be connected to the network for the purposes of monitoring, logging, and/or control. Devices include:

• Cameras

• Access control systems, such as RFID readers on doors

• HVAC systems

• Uninterruptible power supplies (UPSs)

• Intensive care units at hospitals

• Electrical systems

• Laundry machines21

21. Kevin Der, “Laundry Monitoring to Go Online for All Dormitories,” The Tech, March 7, 2006, http://tech.mit.edu/V126/N9/9laundrytext.html.

• Bathrooms22

22. Riad Wahby, “Random Hall Bathroom Server,” 2001, http://bathroom.mit.edu/.


As the Internet emerged in the mid-1990s, a student at MIT, Philip Lisiecki, got tired of walking all the way to the basement of his dormitory to check whether a laundry machine was available. He decided to use photoresistors to monitor the laundry machines’ indicator lights, and then rigged up the system to send data across the dormitory’s old phone wiring.

“Once it was all running, everyone liked it,” Mr. Lisiecki commented. “I could tell people used it since every time I turned my machine off for a half hour, someone with a laundry basket would wander by my room to find out what was wrong.”23

23. Robert J. Sales, “Random Hall residents monitor one of MIT’s most-washed web sites—MIT News Office,” April 14, 1999, http://web.mit.edu/newsoffice/1999/laundry-0414.html.

Ultimately, laundry events were collected on a central server, laundry.mit.edu, and accessible over the World Wide Web. In 1999, the university newspaper ran a story on the system with the following report:

Shortly after the laundry server was created, housemaster Nina Davis-Millis, an MIT information technology librarian, suggested that it be included in a New York Public Library exhibit on innovative uses of the Internet. Her friend, who was organizing the exhibit, included it in a proposal for the exhibit.

“Her superiors were heartily displeased with her,” said Ms. Davis-Millis. “They told her that she was too gullible, that she apparently was not familiar with the noble MIT tradition of hacking, but that it ought to have been obvious to her that hooking washers and dryers to the Internet was impossible.” Thus, on the grounds that it couldn’t be done, Random Hall’s Internet laundry connection was not included in the NYPL Internet exhibit.

To which Mr. Lisiecki replies, “They seem to have a fundamental misunderstanding of the Internet: nothing is too trivial.”24

24. Ibid.


8.1.3.1 Example—Camera Logs

Below is an example of surveillance logs for an Axis camera system, generated by Zoneminder, an open-source, Linux-based “video camera security and surveillance solution” (http://www.zoneminder.com). The log sample below was kindly provided by Dr. Johannes Ullrich of the SANS Institute, who explained that the software “compares images and sends the alerts whenever the image comparison shows motion in the field of view.”

Feb 27 04:04:49 enterpriseb zma_m7[5628]: INF [frontaxis: 86496 - Gone into
    alarm state]
Feb 27 04:04:50 enterpriseb zma_m7[5628]: INF [frontaxis: 86498 - Gone into
    alert state]
Feb 27 04:04:50 enterpriseb zma_m7[5628]: INF [frontaxis: 86499 - Gone back
    into alarm state]
Feb 27 04:04:50 enterpriseb zma_m3[5648]: INF [AxisPTZ: 91951 - Gone into
    alarm state]
Feb 27 04:04:51 enterpriseb zma_m3[5648]: INF [AxisPTZ: 91952 - Gone into
    alert state]
Feb 27 04:04:51 enterpriseb zma_m7[5628]: INF [frontaxis: 86501 - Gone into
    alert state]
Feb 27 04:05:23 enterpriseb zma_m3[5648]: INF [AxisPTZ: 91986 - Gone into
    alarm state]
Feb 27 04:05:24 enterpriseb zma_m7[5628]: INF [frontaxis: 86535 - Gone into
    alarm state]
Feb 27 04:05:25 enterpriseb zma_m7[5628]: INF [frontaxis: 86536 - Gone into
    alert state]
Feb 27 04:05:25 enterpriseb zma_m3[5648]: INF [AxisPTZ: 91992 - Gone into
    alert state]

8.1.3.2 Example—Uninterruptible Power Supply Logs

Since power failures can have catastrophic impacts on network availability, network administrators naturally want to control and monitor UPS systems remotely. Apcupsd is a mature, open-source package for controlling and monitoring APC-brand UPS systems.25 It is supported on a wide variety of platforms, including UNIX and Linux-based systems, as well as most popular versions of Microsoft Windows.26

25. “APC Product Information for Uninterruptible Power Supply (UPS),” 2011, http://www.apc.com/products/category.cfm?id=13.

26. Adam Kropelin and Kern Sibbald, “APCUPSD User Manual,” APC UPS Daemon, January 16, 2010, http://www.apcupsd.com/manual/manual.html.

Below is an example of UPS logs generated by apcupsd. Many thanks to Dr. Johannes Ullrich for providing these sample logs.

Feb 13 03:26:22 enterpriseb apcupsd[2704]: Power failure.
Feb 13 03:26:25 enterpriseb apcupsd[2704]: Power is back. UPS running on
    mains.
Feb  2 13:52:09 enterpriseb apcupsd[2704]: Communications with UPS lost.
Feb  2 13:52:16 enterpriseb apcupsd[2704]: Communications with UPS restored.
Jan 29 23:30:28 enterpriseb apcupsd[2704]: Power failure.
Jan 29 23:30:31 enterpriseb apcupsd[2704]: Power is back. UPS running on
    mains.
Jan 13 09:08:51 enterpriseb apcupsd[2704]: Power failure.
Jan 13 09:08:55 enterpriseb apcupsd[2704]: Power is back. UPS running on
    mains.
Dec 30 17:16:32 enterpriseb apcupsd[2704]: Power failure.
Dec 30 17:16:35 enterpriseb apcupsd[2704]: Power is back. UPS running on
    mains.

8.1.4 Network Equipment Logs

Enterprise-class network equipment can generate extensive event logs. Often these logs are designed to be sent to a remote server via syslog or SNMP because the network devices themselves have very limited storage capacity.

Network equipment can include, among other things:

• Firewalls

• Switches

• Routers

• Wireless access points

8.1.4.1 Example—Apple Airport Extreme Logs

Below is an example of event logs downloaded from an Apple Airport Extreme. Notice that these logs include association and disassociation events, authentication logs, and records of accepted connections. Once again, the logs do not include a year.

Apr 17 13:01:29 Severity:5      Associated with station 00:16:eb:ba:db:01
Apr 17 13:01:29 Severity:5      Disassociated with station 00:16:eb:ba:db:01
Apr 17 13:01:29 Severity:1      WPA handshake failed with STA 00:16:eb:ba:db
    :01 likely due to bad password from client
Apr 17 13:01:29 Severity:5      Deauthenticating with station 00:16:eb:ba:db
    :01 (reserved 2).
Apr 17 13:01:30 Severity:5      Associated with station 00:16:eb:ba:db:01
Apr 17 13:01:30 Severity:5      Disassociated with station 00:16:eb:ba:db:01
Apr 17 13:01:31 Severity:5      Associated with station 00:16:eb:ba:db:01
Apr 17 13:01:34 Severity:5      Associated with station 00:16:eb:ba:db:01
Apr 17 13:01:34 Severity:5      Installed unicast CCMP key for supplicant
    00:16:eb:ba:db:01
Apr 17 13:13:01 Severity:5      Disassociated with station 00:16:cb:08:27:ce
Apr 17 13:13:01 Severity:5      Rotated CCMP group key.
Apr 17 13:40:03 Severity:5      Associated with station 00:16:cb:08:27:ce
Apr 17 13:40:03 Severity:5      Installed unicast CCMP key for supplicant
    00:16:cb:08:27:ce
Apr 17 13:40:43 Severity:5      Connection accepted from [fe80::216:cbff:fe08
    :27ce%bridge0]:51161.
Apr 17 13:40:45 Severity:5      Connection accepted from [fe80::216:cbff:fe08
    :27ce%bridge0]:51162.
Apr 17 13:40:45 Severity:5      Connection accepted from [fe80::216:cbff:fe08
    :27ce%bridge0]:51163.
Apr 17 13:49:18 Severity:5      Clock synchronized to network time server
    time.apple.com (adjusted +0 seconds).
Apr 17 13:57:13 Severity:5      Rotated CCMP group key.

For more details on network equipment logs, please see Chapter 9, “Switches, Routers, and Firewalls,” and Chapter 6, “Wireless: Network Forensics Unplugged.”

8.2 Network Log Architecture

The forensic quality of retained logs, and the strategies and methods for obtaining them, are strongly influenced by the environment’s network log architecture. Disparate logs accumulated on a fleet of systems don’t really help an enterprise security staff understand the “big picture” of what is happening on the network. Distributed logs also make it difficult for security staff to audit the past history of security-related events. Even worse for the investigator, it can become a nightmare to locate and obtain important evidence.

The answer to this problem is to centralize event logging in such a way that all events of interest are aggregated and can be correlated between multiple sources. It may not be the case that the target environment is instrumented in such a way, but we’ll discuss ways that this can be achieved, either by IT staff in advance or on-the-fly to facilitate an investigation.

8.2.1 Three Types of Logging Architectures

There are essentially three types of log architectures: local, remote decentralized, and centralized.

8.2.1.1 Local

Logs are collected on individual local hard drives. This is extremely common because it is the default configuration for most operating systems, applications, physical devices, and network equipment. However, local log storage presents issues for forensic investigators, such as:

• Collecting logs from different systems can be a lot of work. In some cases, log collection causes modification of the local system under investigation, which is certainly not desirable.

• Logs stored locally on a compromised or potentially compromised system may be modified or deleted. Even if there is no evidence to indicate modification, logs stored on compromised systems cannot be trusted.

• Time skew on disparate local systems is often significant, and can make it very difficult to correlate logs and create valid timelines.

• Typically, logs stored on local systems are not centrally configured, and the output formats may vary between systems (or may only include sparse, default log data).

• Only a limited amount of logs may be stored to conserve local disk space.

8.2.1.2 Remote Decentralized

Logs are sent to different remote storage systems throughout the network. Different types of logs may be stored on different servers. This is commonly seen in environments where there is decentralized management of IT resources, such as in universities where individual departments or labs manage their own small groups of servers.

• Remote storage of logs increases their forensic value. When logs are sent to a remote system, they are far less likely to be affected by a local system compromise (at the very least, they cannot be altered or modified after they are sent, unless the logging server is compromised as well).

• Time skew can be partially mitigated by having the logging servers timestamp incoming logs, although time skew between servers may still be an issue.

• Collecting logs from a logging server is usually far less work than collecting logs from endpoint devices, especially since the logging server is more likely to be under direct administrative control. That said, collecting logs from different log servers may still require substantial effort and coordination between teams.

• Sending logs to a remote server across the network introduces new challenges. Namely, reliability is a primary concern. If there is a network outage, logs may be dropped and lost forever. Security is also a concern; when transmitted in cleartext, as is most common, an attacker on the local network may be able to intercept, read, and perhaps even modify logs in transit. These issues can be addressed through the use of protocols that provide support for reliability such as TCP or RELP and encryption protocols such as TLS. However, configuring support for security features can be cumbersome, and network administrators in decentralized environments often do not have the resources to address these issues.

8.2.1.3 Centralized

Logs are centralized and aggregated on a central log server or a group of synchronized, centrally managed log servers. For the purposes of network forensics, a centralized logging infrastructure is typically the most desirable, for the following reasons:

• Logs are stored on a remote server, where they are not subject to modification or deletion in the event of an endpoint device compromise.

• Time skew can be addressed by stamping incoming logs as they arrive. Furthermore, when logging configuration is centralized, endpoint devices can be configured to maintain synchronized time and include granular time information in log output (so long as the endpoint device software supports these features).

• Centralized management typically allows for easy access to log data, and also facilitates on-the-fly configuration changes when needed to support an ongoing investigation.

• Issues of reliability and security of logs in transit can be centrally addressed. Network administrators can configure support for TCP, RELP, TLS, and other security features in central logging servers and centrally controlled clients.

• Aggregated logs can be easily analyzed using centralized log aggregation and analysis tools. (Please see Section 8.2.3 for details.)

As discussed previously, many network devices do not have sufficient storage capacity to maintain extensive forensic data. Fortunately, most network devices and conventional servers can be configured to send logs to a remote server that can aggregate forensic data from many sources. Central logging servers are simply servers configured to receive and store logs sent by other systems. They often store logs from many sources, including routers, firewalls, switches, and other servers. This helps system administrators keep tabs on many systems, and it enables investigators to find a wealth of data in one place.

The evidence stored on a central logging server varies greatly, depending on what systems were sending logs to it. Typically, you will find logs from many servers and workstation operating systems that were previously sent to the central logging server for storage and analysis. It is also common to find firewall logs, which include dates, times, source, destination, and protocols of the packets being logged.

8.2.2 Remote Logging: Common Pitfalls and Strategies

Automated remote logging is generally considered best practice in the log management industry. However, from a forensic perspective, there are potential pitfalls to keep in mind, and ways that investigators can compensate.

When event logs are sent across the network to a central server, they are placed at risk of loss or modification in transit. In addition, forensic investigators must consider issues such as time skew and confidentiality of the event logs in transit. Here is a brief discussion of major factors to consider when remote event logging is employed in a network forensic investigation, including reliability, time skew, confidentiality, and integrity.

8.2.2.1 Reliability

Can logs be lost as they are transmitted across the network? Frequently the answer is “yes.” For example, clients that use the traditional syslog daemon to send logs across the network must rely on UDP as a transport-layer protocol. UDP is a connectionless protocol that does not include support for reliable transport. When a syslog message is transmitted across the network via UDP, if the datagram is dropped in transit, the server will have no record of it and the client will not know to retransmit. UDP datagrams are also commonly dropped when the receiving application is overloaded due to a high volume of traffic.

For forensic investigators, reliability of event log communication is an important issue. With unreliable event logging architectures, it is possible for an attacker to execute a denial-of-service attack or initiate a network outage in order to prevent critical information from being logged on a central server. Accidental loss is also a problem. While investigators may be able to piece together a timeline of events from existing logs, if there is a chance that critical details are missing, the investigation may fail or the case may fall apart in court.

To address the issue of reliability, offshoots of the syslog daemon have added native support for transport of syslog messages over TCP. TCP is a connection-oriented protocol with built-in support for reliability, so if a packet is dropped in transit, the server will notice a missing sequence number or the client will not receive an acknowledgment of transmission and will resend.

Although TCP improves reliability at the transport layer, there are still higher-layer issues. Rainer Gerhards, author of rsyslog, has published a nice article where he discusses how local buffering of TCP packets on the client system can lead to dropped syslog messages in the event of a network or server outage.27 To address this issue, he developed the lightweight RELP,28 which is designed to ensure reliable transfer of syslog messages at a higher layer.

27. Rainer Gerhards, “Rainer’s Blog: On the (un)reliability of plain tcp syslog . . . ,” April 2, 2008, http://blog.gerhards.net/2008/04/on-unreliability-of-plain-tcp-syslog.html.

28. Rainer Gerhards, “RELP—The Reliable Event Logging Protocol (Specification),” March 19, 2008, http://www.librelp.com/relp.html.
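
As a sketch of how this looks in practice (the hostname and port are placeholders), rsyslog can be configured to forward messages via RELP on the client and receive them on the collector:

# Client side: load the RELP output module and forward everything
$ModLoad omrelp
*.* :omrelp:loghost.example.com:2514
# Server side: load the RELP input module and listen for messages
$ModLoad imrelp
$InputRELPServerRun 2514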

8.2.2.2 Time Skew

Time skew between endpoint systems is one of the biggest challenges for forensic investigators. It is difficult, if not impossible, to correlate logs between endpoint systems when local clock times (and therefore event log timestamps) are off. Even when the time skew between systems can be determined for a specific point in time, the clock on an endpoint system may have been running slower or faster at different points.

The best way to manage this problem is to synchronize clocks on all systems using NTP or a similar system. This can prevent problems due to clock skew during subsequent log analysis. Not all devices support time synchronization, however. Another option is for the central event logging server to add a timestamp to logs as they arrive. While this can be useful, it does not take into account network transit time; there is always a delay between the time that logs are generated on the endpoint system and the time that the logs are received by a remote logging server.
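
For example, on systems running the common ntpd daemon, clock synchronization can be configured with just a few lines in /etc/ntp.conf (the public pool servers below are one reasonable choice; an internal time server works equally well):

# Synchronize the local clock against several NTP servers;
# "iburst" speeds up the initial synchronization
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst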

Logging output formats may not include enough information to properly correlate timestamps between different systems. For example, as we have seen, often the year is not included by default in event logging output. Furthermore, the time zone is also typically not included by default, which can make it very difficult for investigators to correlate logs between systems located in geographically dispersed areas. When configuring log output formats for potential forensic use, make sure to include complete, high-precision timestamps with time zone information.

8.2.2.3 Confidentiality

You might not expect that maintaining the confidentiality of event logs is important, but event logs can reveal extensive amounts of information about user habits, system software and directories, security issues, and more (this is why they are so highly valuable for forensics!). Anyone with access to the LAN (wired or wireless) or a device on the network path may be able to capture and analyze the traffic. To maintain the confidentiality of event logs in transit, use a protocol such as TLS/SSL that ensures the data is encrypted as it is transmitted across the network.

8.2.2.4 Integrity

Ensuring the integrity of event logs in transit is extremely important. By default, most remote logging utilities do not provide any assurance of integrity. Event logs transmitted over UDP or TCP without higher-layer encryption may be intercepted and modified in transit. Even worse, an attacker could inject fake event logs into the network traffic. This is quite easy to do for many types of remote logging servers, such as traditional syslog servers listening on a UDP port.

Fortunately, many event logging architectures now support TLS/SSL, either natively or through the use of tunneling proxies such as stunnel. You can use TLS/SSL to protect the data in transit and mutually authenticate the server and client event logging systems.
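
For instance, here is a sketch of a client-side stunnel configuration that wraps plaintext TCP syslog traffic in TLS before it leaves the system (the certificate path, hostname, and ports are placeholders; 6514 is the registered port for syslog over TLS):

; Client-side stunnel.conf: accept plaintext locally, relay over TLS
client = yes
cert = /etc/stunnel/client.pem

[syslog-tls]
; the local syslog daemon is configured to forward to this port
accept = 127.0.0.1:5140
; stunnel encrypts the traffic and relays it to the collector
connect = loghost.example.com:6514

A matching stunnel instance on the logging server decrypts the traffic and hands it to the local syslog daemon.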

8.2.3 Log Aggregation and Analysis Tools

There are many tools available to facilitate log aggregation on central systems. Log aggregation tools typically work in a client-server model: an agent is installed on the endpoint system (or, in some cases, a native tool may be able to export logs), and a compatible central logging server is set up to listen on the network and receive logs as they are transmitted. Often, the central logging server software also includes powerful analysis capabilities.

Common agents installed on endpoints include:

• Syslog (and derivative) daemons, as previously discussed.

• System iNtrusion Analysis and Reporting Environment (SNARE)29—An open-source agent for Windows, Linux, Solaris, and more.

29. “Snare—Audit Log and EventLog analysis,” 2011, http://www.intersectalliance.com/projects/index.html.

Central aggregation and analysis software includes:

• Splunk30—Log monitoring, reporting, and search tool.

30. “Splunk | Operational Intelligence, Log Management, Application Management, Security and Compliance,” 2011, http://www.splunk.com.

• System Center Operations Manager (SCOM), formerly Microsoft Operations Manager (MOM)31—A monitoring and log aggregation product designed for Windows systems.

31. “System Center Operations Manager,” Wikipedia, June 23, 2011, http://en.wikipedia.org/wiki/Microsoft_Operations_Manager.

• Distributed log Aggregation for Data analysis (DAD)—An open-source log aggregation and analysis tool released under GPL.32 (Figure 8-2 is a screenshot of the open-source DAD log analysis tool).

32. D. Hoelzer, “DAD,” SourceForge, June 29, 2011, http://sourceforge.net/projects/lassie/.

image

Figure 8-2 A screenshot of the DAD open-source log aggregation and analysis tool. Image courtesy of D. Hoelzer. Reprinted with permission.36

• Cisco’s Monitoring, Analysis and Response System (MARS)33—Security monitoring for network devices and hosts (including Windows, Linux, and UNIX).

33. “Cisco Security Monitoring, Analysis, and Response System,” Wikipedia, October 19, 2010, http://en.wikipedia.org/wiki/Cisco_Security_Monitoring,_Analysis,_and_Response_System.

• ArcSight34—Commercial third-party log management and compliance solutions.

34. “ArcSight,” Wikipedia, July 14, 2011, http://en.wikipedia.org/wiki/ArcSight.

8.2.3.1 Splunk

Splunk is a proprietary, portable, highly extensible log aggregation and analysis tool. Figure 8-3 shows an example of Splunk. We’ll revisit Splunk several times throughout this book because it’s inexpensive (free for individual use up to 500 MB/day), versatile, scalable, and popular.

image

Figure 8-3 A simple example showing SSH service authentication logs in Splunk.

Splunk has a web-based interface and a database on the back end. It can accept input in a variety of forms, from reading a flat file to directly receiving syslog data over the network. Once Splunk has processed the data, you can run searches and reports.35

35. “Splunk | Operational Intelligence, Log Management, Application Management, Security and Compliance,” 2011, http://www.splunk.com.

36. “dbimage.php (JPEG Image, 640×463 pixels),” http://sourceforge.net/dbimage.php?id=92531.

8.3 Collecting and Analyzing Evidence

Since the topic of network forensics relating to event logs is so broad, we’ll use this as an opportunity to review and reinforce our network forensics methodology, OSCAR.

8.3.1 Obtain Information

When collecting and analyzing event logs, here is some specific information you may need to obtain:

Sources of Event Logs Identify sources of event logs that are likely to relate to your investigation. You can accomplish this by conducting interviews with key personnel, reviewing network architecture documents, and reading IT policies and procedures that pertain to the environment under investigation. You will want to answer questions such as:

– What event logs exist?

– Where are they stored?

– What are my technical options for accessing them?

– Who controls the event logs?

– How do we go about getting permission and access to collect them?

– How forensically sound are the event logs?

– Do the targeted systems have the capacity to support additional logging?

Resources Identify the resources you have available for event log collection, aggregation, and analysis. This includes equipment, communications capacity, time, money, and staff. For example, if you only have a 1TB hard drive for event log evidence storage, but there are 20TB of logs on the central logging server under investigation, you will either need to purchase more storage space or select a subset of logs to gather. Similarly, if you must collect the logs remotely but the network latency is high, this can limit the amount of data you are able to transfer in the time you have available. Questions to consider include:

– How much storage space do I have available?

– How much time do I have for collection and analysis?

– What tools, systems, and staff are available for collection and analysis?

Sensitivity For network-based investigations in particular, you have to consider how the sources of evidence and the network itself will be impacted by evidence collection. Some equipment, such as routers and firewalls, may be under heavy load and operating close to processor/memory/bandwidth capacity. Retrieving evidence from these systems may cause network or equipment slowness or outages, depending on the chosen method of collection. You will need to answer questions such as:

– How critical are the systems that store the event logs?

– Can they be removed from the network?

– Can they be powered off?

– Can they be accessed remotely?

– Would copying logs from these systems have a detrimental impact on equipment or network performance? If so, can we minimize the impact by collecting evidence at specific times or by scheduling downtime?

8.3.2 Strategize

In most enterprises, there are so many sources of event logs that taking the time to strategize is crucial. Otherwise, you may find that you run out of time or hard drive space before you have gathered the most important evidence, or you may overlook a valuable source of information.

As part of the “strategize” phase, review the information you’ve obtained, list and prioritize sources of evidence, plan the acquisition, and communicate with your team and enterprise staff.

8.3.2.1 Review Information

Once you’ve finished obtaining information, take the time to review all the information you have regarding the investigation. This may include:

• Goals and time frame of the investigation (very important!). It is worth reviewing your goals regularly during the investigation so that you can maintain perspective and stay on track.

• Potential sources of evidence.

• Resources available to you, such as hard drives for storing copies of event logs, secure storage space, staff, forensics workstations, and time.

• Sensitivity of networks and equipment that may be affected.

8.3.2.2 Prioritize Sources of Evidence

Acquiring evidence is expensive—literally. Every byte of data you copy takes time to transfer and uses up hard drive space. If you’re acquiring evidence over a network, copying log files can use up a large amount of bandwidth and slow down the network. Furthermore, the more evidence you acquire, the more data you have to sift through later during the analysis phase.

In any organization, there are likely to be an overwhelming number of possible sources of event logs, including workstations, servers, switches, routers, firewalls, NIDS/NIPS, access control systems, web proxies, and more. Usually, only a small percentage of these logs contain evidence relevant to your investigation. In order to use your resources efficiently, review the list of possible sources of evidence and identify those that are likely to be of the highest value to you.

Next, consider how much effort is required to obtain each source of evidence. When logs are centralized, it is usually fairly straightforward to gather copies of them. However, when logs are distributed on a variety of systems—such as hundreds of workstations or application servers managed by different departments—then technical or political hurdles can dramatically slow down the process. It is important to take these factors into consideration and anticipate challenges so that you can plan and budget accordingly.

After you’ve decided which sources of evidence are the most important, and estimated the resources required to obtain them, prioritize your evidence collection so that you can realize the greatest value from your efforts.

8.3.2.3 Plan Acquisition

In order to actually obtain copies of event logs, you will likely need to work with the system administrators who manage the equipment on which the event logs reside. Before you actually set foot onsite to acquire the evidence, work with your primary contact to determine who can best provide you with access to the evidence. Then, plan your method for acquisition. Will you have physical access to the system, or will you acquire evidence remotely? When and where will you acquire the evidence? The time of day may be especially important if the investigation must remain secret, or if the equipment that stores the evidence is under heavy load at certain hours.

8.3.2.4 Communicate

No investigator is an island. Once you have developed a plan (usually in conjunction with your investigative team and local contacts), make sure to communicate the final plan to everyone involved. Agree on a method and times for regular communication and updates, such as daily emails or weekly conference calls.

8.3.3 Collect Evidence

The method you use for collecting event log evidence will vary depending on the environment’s event logging architecture, your sources of evidence, and your available resources (among other factors). Potential methods include physical connection, manual remote connection, central log aggregation, and passive evidence acquisition.

8.3.3.1 Physical Connection

For logs stored locally on endpoint devices, you may choose to create a bit-for-bit forensic image of the physical storage media (such as a hard drive), and extract event log files directly from it using traditional hard drive forensic techniques. The benefits of this method are that you can retain an exact copy of the drive for later presentation in court (if necessary), and that from a forensics perspective, there are widely accepted standards for the process of forensic hard drive analysis.

However, if the event logs of interest are stored on more than a few endpoint systems, it may simply be impractical to invest the time and equipment necessary to forensically image multiple drives. Another major drawback is that logs stored locally are at higher risk of modification in the event of system compromise, and as a result are often considered less forensically valuable than logs stored on remote systems.

For logs stored on a central logging server, it is sometimes appropriate to take a bit-for-bit forensic image of the logging server’s hard drive. Again, this has the benefit of allowing a forensic copy of the server’s drive to be preserved and presented later. It can also allow for a very detailed analysis of logging server configuration. Supplemental information such as precise versions of event logging software can be helpful for later analysis.
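Where a full image is warranted, one common approach is a raw dd copy of the drive, hashed for later verification. Here is a minimal sketch, assuming the logging server's drive appears as /dev/sdb on a forensic workstation (ideally attached through a write blocker) and that a separate evidence volume is mounted at /mnt/evidence:

$ # Create a bit-for-bit image of the source drive:
$ sudo dd if=/dev/sdb of=/mnt/evidence/loghost-sdb.dd bs=4M conv=noerror,sync
$ # Hash both; for a clean, error-free read, the digests should match:
$ sudo sha256sum /dev/sdb /mnt/evidence/loghost-sdb.dd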

Commonly, network forensic investigators simply copy the logfiles off either an endpoint system or a central logging server using a physical port (e.g., eSATA or USB). This has the strong advantage of having a relatively low impact on system resources (i.e., copying files takes far less time, storage space, and I/O than making a bit-for-bit forensic duplicate of the drive). In addition, the system does not need to be taken offline or powered down in order to copy files. If you use this method, make sure to capture cryptographic checksums of the source and destination files to ensure that you have made an accurate duplicate, as in the brief example below.
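For example (the paths here are hypothetical), after copying a logfile to an evidence drive you can hash the source and the copy in one command and confirm the digests match:

$ # Hash the original and the copy; the digests should be identical:
$ sha256sum /var/log/auth.log /mnt/evidence/auth.log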

Physical collection of event logs is also useful when you want to minimize the network footprint of the investigation.

8.3.3.2 Manual Remote Connection

You may prefer to collect logs through manual remote examination of endpoint devices using services such as SSH, RDP, or an administrative web page. The benefits of this method are that it may enable you to examine systems that would otherwise be out of physical reach, and to collect logs directly from many more sources than you could in person.

One drawback of manual remote collection is that you will modify the system under examination simply by accessing it remotely (it is even possible to cause log rollover simply by logging into the device, if the logging system has reached a preset limitation on storage space). You will create network activity through the process of manual remote examination, which can also contribute to network congestion. Make sure you are aware of bandwidth and throughput limitations before transferring large quantities of event logs across the network.
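As a minimal sketch (the hostname and path are hypothetical), logs can often be pulled over SSH without opening an interactive session, which keeps your footprint on the device small:

$ # Run a single remote command and capture its output locally:
$ ssh admin@device.example.com "cat /var/log/messages" > device-messages.log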

8.3.3.3 Central Log Aggregation

If you are lucky, the event logs are already being sent to a central logging server (or a synchronized group of central logging servers). In this case, you will want to begin by researching the underlying log collection architecture to ensure that it is forensically sound and will meet your needs for evidence collection. For example, you should know the transport-layer protocol in use for log transmission, as well as mechanisms for authentication of logging client and server and encryption of data in transit, to determine the risk of event log loss or modification.

You can access the evidence on a central logging server in multiple ways, depending on how it is set up:

Console Log on to the central logging server using SSH, RDP, or a direct console connection, depending on the specific configuration. Browse files, copy specific logs for later analysis, burn them onto a CD, or simply view them. (A brief sketch of this approach follows this list.)

Web interface Many organizations use a log analysis tool such as Splunk, which facilitates centralized log analysis. Often, these include helpful web interfaces, with search and report-generating capabilities that can be extremely useful for identifying suspicious activity and correlating logs.

Proprietary interface Some logging servers are accessed using proprietary client software, which provides graphical analysis/report capabilities.
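As a brief sketch of console-based collection (the hostname and paths are hypothetical), you might copy selected logs from the central logging server over SSH:

$ # Pull the logs of interest to a local evidence volume:
$ scp analyst@loghost.example.com:/var/log/auth.log /mnt/evidence/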

In certain situations, you may choose to take a forensic image of the central logging server’s hard drive(s). This can be very resource-intensive. See “Physical Connection,” above, for details.

8.3.3.4 Passive Evidence Acquisition

In some cases, you may want to collect event logs as they are transmitted across the network through passive evidence acquisition techniques (please see Chapter 3, “Evidence Acquisition,” for details). This is effective in environments where you have access to the network segments over which the event log data is transmitted, and when the log data is not encrypted in transit (or in the rare situation where you have the ability to decrypt the log data in transit). Passive evidence acquisition may be your best option for event log collection in an environment where the IT staff are either unaware of your investigation or uncooperative.
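As a sketch, unencrypted syslog traffic bound for a central collector could be captured from a span/mirror port with tcpdump (the interface name is an assumption; traditional syslog uses UDP port 514):

$ # Capture full packets to a file for later analysis:
$ sudo tcpdump -i eth0 -s 0 -w syslog-capture.pcap 'udp port 514'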

8.3.4 Analyze

Strategies for conducting event log analysis are as varied as the sources of event logs themselves and the goals of specific investigations. For discussions of event log analysis relating to specific types of logs, please see Chapter 10, “Web Proxies”; Chapter 9, “Switches, Routers, and Firewalls”; and Chapter 7, “Network Intrusion Detection and Analysis.”

General techniques include:

• Dirty Values—Searching for specific keywords in logs (see the shell sketch following this list).

• Filtering—Narrowing down your search space by selecting logs based on time, source/destination, content, or other factors.

• Activity Patterns—Analyzing logs for patterns of activity and identifying suspicious activity based on the results.

• Fingerprinting—Creating a catalog of complex patterns and correlating these with specific activities to facilitate later analysis.
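As a quick command-line illustration of the first two techniques (the filename, keyword, hostname, and time window below are hypothetical stand-ins for whatever your case involves):

$ # Dirty values: search the logs for a suspicious keyword
$ grep -i "sshd" auth.log
$ # Filtering: narrow the search space to one host and a one-hour window
$ grep "app-srv" auth.log | grep "2011-04-26T19:"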

Figure 8-3 shows an example of analysis using Splunk. In this case, we have searched for all logs containing the word “sshd.” This effectively filters the logs so that they include only information relating to the SSH remote login service. We can see the results graphically represented, and can click on any time to view the logs in detail. You can see that there were seven results for our search at 10:51 AM on Friday, April 17, 2009. These logs appear to be attempts to SSH into the account “student” on the server “ids.” At first the SSH attempts failed, but at 10:51:33 there was a successful login to “student” from 192.168.1.10.

Based on these results, our next step might be to examine the patterns of activity specifically relating to the “student” account on any system. Perhaps the “student” account was compromised through a password-guessing attack—or perhaps the user had simply forgotten the password temporarily. We could also examine all logs relating to the “ids” system to see if there was any further evidence of suspicious behavior.

Analysis tools are not perfect! Notice that Splunk listed a year (2009) in Figure 8-3. However, there is no year in the original syslog event logs—just a month, day, and time. Analysis tools can sometimes produce unexpected or incorrect results. Whenever possible, correlate events using multiple sources of evidence, and confirm findings by checking original evidence.
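You can see why by examining the raw records: a traditional syslog timestamp includes only a month, day, and time. The hypothetical record below (the PID and port are invented for illustration) shows the classic format; there is simply no year field for an analysis tool to read:

Apr 17 10:51:33 ids sshd[2112]: Accepted password for student from
    192.168.1.10 port 49152 ssh2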

8.3.5 Report

Event logs are frequently used as the basis for conclusions drawn in reports. Here are a few good tips for incorporating evidence from event logs into your forensic reports:

• A picture is worth a thousand words. It is always a good idea to include graphical representations of event log analysis when you have the option. Charts and graphs generated by Splunk and similar tools can be very powerful.

• Make sure to include detailed information regarding your sources of event logs and your process for collecting them. Generally this is appropriate for an appendix of the report or supplemental materials.

• Remember to include information regarding your methodology and the analysis tools you used. This is especially important because analysis tools are not perfect. The more widely known and tested your tools, the more likely they are to be accepted in a courtroom setting.

• Always retain and reference your original sources of evidence so that you can support your reported findings.

8.4 Conclusion

Event logs are some of the most valuable sources of evidence for forensic investigators, particularly when they are stored on a secure central server and can be correlated with multiple log sources. Application servers, firewalls, access control systems, network devices, and many other types of equipment generate event logs and are often capable of exporting them to a remote log server for aggregation.

It is important for the forensic investigator to be aware of common pitfalls associated with event log analysis, including incorrect or incomplete timestamps, questions of reliability and integrity, and confidentiality. With these in mind, event logs are an important source of evidence, and can be analyzed with a variety of command-line or visual tools.

8.5 Case Study: L0ne Sh4rk’s Revenge


The Case: Inspired by Mr. X’s successful exploits at the Arctic Nuclear Fusion Research Facility, L0ne Sh4rk decides to try the same strategy against a target of his own: Bob’s Dry Cleaners! The local franchise destroyed one of his favorite suits last year and he has decided it is payback time. Plus, they have a lot of credit card numbers.

Meanwhile . . . Unfortunately for L0ne Sh4rk, Bob’s Dry Cleaners is on the alert, having been attacked by unhappy customers before. Security staff notice a sudden burst of failed login attempts to their SSH server in the DMZ (10.30.30.20), beginning at 18:56:50 on April 26, 2011. They decide to investigate.

Challenge: You are the forensic investigator. Your mission is to:

• Evaluate whether the failed login attempts were indicative of a deliberate attack. If so, identify the source and the target(s).

• Determine whether any systems were compromised. If so, describe the extent of the compromise.

Bob’s Dry Cleaners keeps credit card numbers and personal contact information for their Platinum Dry Cleaning customers (many of whom are executives). They need to make sure that this credit card data remains secure. If you find evidence of a compromise, provide an analysis of the risk that confidential information was stolen. Be sure to carefully justify your conclusions.

Network: Bob’s Dry Cleaners network consists of three segments:

• Internal network: 192.168.30.0/24

• DMZ: 10.30.30.0/24

• The “Internet”: 172.30.1.0/24 [Note that for the purposes of this case study, we are treating the 172.30.1.0/24 subnet as “the Internet.” In real life, this is a reserved nonroutable IP address space.]

Evidence: Security staff at Bob’s Dry Cleaners collect operating system logs from servers and workstations, as well as firewall logs. These are automatically sent over the network from each system to a central log collection server running rsyslogd (192.168.30.30). Security staff have provided you with log files from the time period in question. These log files include:

auth.log—System authentication and privileged command logs from Linux servers

workstations.log—Logs from Windows workstations

firewall.log—Cisco ASA firewall logs

Security staff also provide you with a list of important systems on the internal network:

[Table: important systems on the internal network (not reproduced)]

8.5.1 Analysis: First Steps

Let’s begin by examining the logs relating to the failed login attempts. Based on reports from security staff, we know that the activity began at 18:56:50 and targeted 10.30.30.20, which corresponds with the hostname “baboon-srv.” Since this is a Linux server, let’s browse for corresponding logs in the auth.log evidence file. The first failed login attempts we see are as follows:

2011-04-26T18:56:50-06:00 baboon-srv sshd[6423]: pam_unix(sshd:auth):
    authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost
    =172.30.1.77  user=root
2011-04-26T18:56:53-06:00 baboon-srv sshd[6423]: Failed password for root
    from 172.30.1.77 port 60372 ssh2
2011-04-26T18:56:56-06:00 baboon-srv sshd[6423]: last message repeated 2
    times
2011-04-26T18:56:56-06:00 baboon-srv sshd[6423]: PAM 2 more authentication
    failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=172.30.1.77  user=
    root

From these records, we see that the remote host 172.30.1.77 attempted to log in to the SSH server on baboon-srv, targeting the account “root.” The “root” account is the default administrative user on most Linux/UNIX systems. This is a very common target for brute-force attacks, and a failed remote login attempt is certainly suspicious.

8.5.2 Visualizing Failed Login Attempts

Note that each initial “authentication failure” log is followed by additional entries indicating that there were two more failed login attempts. It’s important to remember that failed login attempts are not always recorded individually; PAM aggregates repeated failures, producing a series of event logs in the pattern shown above.

Next, let’s use a visualization tool to get a better picture of the volume and time frame of the failed login attempts. Figure 8-4 is a screenshot of Splunk showing all activity from auth.log from the host “baboon-srv.” As you can see, the bulk of the activity occurred between 18:56 and 19:05.


Figure 8-4 A chart in Splunk showing all activity from auth.log relating to “baboon-srv.” The bulk of the activity occurs between 18:56 and 19:05.

After importing our log files into Splunk, we can use regular expressions to define specific fields in the logs that are of interest to us, such as a field named “auth_rhost,” which specifies the source of the remote login attempt (see “rhost=” in the SSH event log). Zooming in on our time frame of interest, we can select each field, filter on it, and view statistics. Figure 8-5 shows remote SSH login attempts between 18:56 and 19:06, with the auth_rhost field selected. As you can see, only one remote host attempted to log in to baboon-srv, and that was 172.30.1.77.


Figure 8-5 A screenshot of Splunk showing remote SSH login attempts between 18:56 and 19:06, with the auth_rhost field selected. There is only one remote host attempting to log in to baboon-srv, and that is 172.30.1.77.

Drilling down even further, we see that the login attempts have a distinct, regular pattern. Figure 8-6 shows a closeup of SSH remote login attempts during just one minute (18:57:00–18:57:59). As you can see, there are two events logged approximately every six seconds, with only slight variation. The corresponding events, shown below the chart, are a record of one failed remote login attempt, followed by a record of two more failed remote login attempts (these are the only event logs that contain the “auth_rhost” field, which we have filtered on). This means there are a total of three failed login attempts every six seconds, for an average of one login attempt every two seconds.


Figure 8-6 A screenshot of Splunk showing SSH remote login attempts during just one minute (18:57:00–18:57:59). Note the regular pattern of two log events every six seconds, which after careful examination of the logs translates to an average of one login attempt every two seconds.

The regularity of these failed login attempts is a strong indicator that the remote system is running a brute-force password-guessing attack utility, such as “medusa.” Such utilities are designed to use a password dictionary to attempt to guess a login password for a remote system. The attack utility is typically configured to run either until the attack is successful or the wordlist is exhausted. Since the SSH server needs time to process each login attempt, brute-force utilities are commonly set to space login attempts by at least one to three seconds, or longer if the attack is intended to be slow and stealthy.
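One way to verify this regularity outside of Splunk is to compute the intervals between failed attempts directly from the log timestamps. Here is a sketch using GNU date (the ISO 8601 timestamp is the first whitespace-delimited field in these logs):

$ # Convert each failure timestamp to epoch seconds, then build a
$ # histogram of the intervals between consecutive attempts:
$ grep "Failed password for root" auth.log | awk '{print $1}' \
    | while read ts; do date -d "$ts" +%s; done \
    | awk 'NR > 1 {print $1 - prev} {prev = $1}' \
    | sort -n | uniq -c

A tight cluster of interval values supports the conclusion that the attempts were generated by an automated tool rather than a human at a keyboard.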

8.5.3 Targeted Accounts

Now that we have a clear indication of a brute-force password-guessing attack against the SSH server running on baboon-srv, the next questions are: What accounts were targeted? Was the attack successful?

In Splunk, let’s also define a field called “auth_ssh_target_user,” which contains the username targeted in the remote SSH login attempts (see the “user=” tag in the SSH event logs). We can simply select that field in Splunk and view statistics relating to event logs that contain this field. Figure 8-7 shows that only two accounts were targeted, “root” and “bob,” along with relative percentages of the logs that contain authentication failure messages relating to each account.


Figure 8-7 In Splunk, we defined a field called “auth_ssh_target_user,” which contains the username targeted in the remote SSH login attempts. Only two accounts were targeted: “root” and “bob.”

To generate these statistics, we filtered only on event logs containing “auth_ssh_target_user,” which matches events of the following formats:

2011-04-26T18:57:19-06:00 baboon-srv sshd[6433]: pam_unix(sshd:auth):
    authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost
    =172.30.1.77  user=root
2011-04-26T18:57:26-06:00 baboon-srv sshd[6433]: PAM 2 more authentication
    failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=172.30.1.77  user=
    root

As you can see, there are two types of matching events: one that records a single login attempt, and another that records two login attempts. We can use “grep” (here with its “-c” option, which counts matching lines just as piping to “wc -l” would) to quickly count the number of each type of log for each of the targeted accounts, and calculate a total number of failed login attempts per account.

As shown in the results below, there were 41 + (2 * 40) = 121 failed login attempts for the “root” account:

$ grep "authentication failure" auth.log | grep "baboon-srv" | grep "user=
    root"  | grep -c "pam_unix(sshd:auth): authentication failure"
41
$ grep "authentication failure" auth.log | grep "baboon-srv" | grep "user=
    root"  | grep -c "PAM 2 more authentication failures"
40

Likewise, there were 29 + (2 * 28) = 85 failed login attempts for the “bob” account.

$ grep "authentication failure" auth.log | grep "baboon-srv" | grep "user=bob
    "  | grep -c "pam_unix(sshd:auth): authentication failure"
29
$ grep "authentication failure" auth.log | grep "baboon-srv" | grep "user=bob
    "  | grep -c "PAM 2 more authentication failures"
28

We can also graph the number of event logs relating to each user over time, as shown in Figure 8-8. Notice that the failed login attempts for the “root” account occur first and are immediately followed by attempts to log in to the account “bob.” Again, this fits common activity patterns of brute-force password-guessing utilities, which are often configured with a list of usernames as input, and conduct attacks against each account in series.


Figure 8-8 A graph created in Splunk, showing the number of event logs relating to each user over time. The failed login attempts for the “root” account occur first, and are immediately followed by attempts to log in to the account “bob.”

8.5.4 Successful Logins

Now that we have strong evidence of a brute-force password-guessing attack, let’s turn our attention to the question of whether the attack was successful.

In the auth.log file, the last failed SSH login attempt against baboon-srv is at 19:04:05, for the account “bob,” as shown below:

$ grep "authentication failure" auth.log | grep "baboon-srv" | grep "sshd" |
    tail -1
2011-04-26T19:04:05-06:00 baboon-srv sshd[6561]: pam_unix(sshd:auth):
    authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost
    =172.30.1.77  user=bob

As you can see below, there are only two successful SSH remote login attempts for baboon-srv, both of which occurred shortly after the long series of failed login attempts ended, and both of which are from the same remote host as the failed login attempts:

$ grep "Accepted password" auth.log | grep "baboon-srv" | grep "sshd"
2011-04-26T19:04:07-06:00 baboon-srv sshd[6561]: Accepted password for bob
    from 172.30.1.77 port 49214 ssh2
2011-04-26T19:04:33-06:00 baboon-srv sshd[6632]: Accepted password for bob
    from 172.30.1.77 port 49215 ssh2

This strongly indicates that the brute-force password-guessing utility found the correct password, and the attacker logged into the system! Judging by the time frame, it is likely that the first successful login was executed by the automated password-guessing utility, since it occurs two seconds after the last failed login attempt, fitting with the previously established pattern. After this, it is fully 26 seconds before the second login attempt. Based on the longer time interval, the second login attempt may have been a manual login executed by the attacker, after seeing that the automated brute-force attack succeeded.

8.5.5 Activity Following Compromise

Let’s take a closer look at the event logs for the time frame after the first successful login to the “bob” account. The snippet of logs below shows the authentication logs sent from baboon-srv to the central logging server during the time frame of interest:

2011-04-26T19:04:07-06:00 baboon-srv sshd[6561]: Accepted password for bob
    from 172.30.1.77 port 49214 ssh2
2011-04-26T19:04:07-06:00 baboon-srv sshd[6561]: pam_unix(sshd:session):
    session opened for user bob by (uid=0)
2011-04-26T19:04:08-06:00 baboon-srv sshd[6631]: Received disconnect from
    172.30.1.77: 11:
2011-04-26T19:04:08-06:00 baboon-srv sshd[6561]: pam_unix(sshd:session):
    session closed for user bob
2011-04-26T19:04:33-06:00 baboon-srv sshd[6632]: Accepted password for bob
    from 172.30.1.77 port 49215 ssh2
2011-04-26T19:04:33-06:00 baboon-srv sshd[6632]: pam_unix(sshd:session):
    session opened for user bob by (uid=0)
2011-04-26T19:05:10-06:00 baboon-srv sudo: pam_unix(sudo:auth):
    authentication failure; logname=bob uid=0 euid=0 tty=/dev/pts/0 ruser=
    rhost=  user=bob
2011-04-26T19:05:18-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/bin/vi /var/log/auth.log
2011-04-26T19:05:34-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/sbin/tcpdump -nni eth0
2011-04-26T19:07:03-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/bin/apt-get update
2011-04-26T19:07:15-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/bin/apt-get install nmap
2011-04-26T19:14:53-06:00 baboon-srv sshd[6632]: pam_unix(sshd:session):
    session closed for user bob

After the second successful remote SSH login to the account “bob,” we see an attempt to use the command “sudo.” This is a widely used Linux/UNIX utility for running privileged commands. By default, many Linux systems log both failed and successful uses of the “sudo” command. As you can see below, the first attempt to use the “sudo” command failed. Since “sudo” normally requires that users enter their password to execute privileged commands, this event likely indicates that the attacker typed the wrong password.

2011-04-26T19:05:10-06:00 baboon-srv sudo: pam_unix(sudo:auth):
    authentication failure; logname=bob uid=0 euid=0 tty=/dev/pts/0 ruser=
    rhost=  user=bob

However, in subsequent logs, it is clear that the attacker was able to successfully execute privileged commands using “sudo.” The logs show that the attacker used a text editor, “vi,” to open the authentication logs on the local server:

2011-04-26T19:05:18-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/bin/vi /var/log/auth.log

This strongly implies that the attacker edited the authentication logs stored locally on baboon-srv, probably to conceal his or her tracks. Fortunately, security staff at Bob’s Dry Cleaners also send logs to a remote log collection server, where they are not so easily modified in the event of a compromise!

Next, the attacker ran “tcpdump,” a utility that sniffs traffic on the local network. From the command flags, the attacker did not save the traffic to a file, but merely sent it to standard output: there is no “-w” option (which writes captured packets to a file); “-nn” simply disables name resolution, and “-i eth0” selects the interface. This may have been useful for quickly determining what types of traffic were sent across the local network, and picking out some addresses in use.

2011-04-26T19:05:34-06:00 baboon-srv sudo:       bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/sbin/tcpdump -nni eth0

Subsequently, the attacker used the APT package management system to install the “nmap” port scanning utility, as shown below. It is likely that other, nonprivileged commands were run before this, which gave the attacker basic information about system configuration and platform. Keep in mind that not all commands were necessarily recorded; we only see records of privileged commands in these logs.

2011-04-26T19:07:03-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/bin/apt-get update
2011-04-26T19:07:15-06:00 baboon-srv sudo:      bob : TTY=pts/0 ; PWD=/home/
    bob ; USER=root ; COMMAND=/usr/bin/apt-get install nmap
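One quick way to enumerate every privileged command recorded for this host in a single pass is to filter on the “COMMAND=” tag that sudo writes into each of these records:

$ # List all privileged commands logged for baboon-srv:
$ grep "COMMAND=" auth.log | grep "baboon-srv"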

Finally, over seven minutes later, we see that the attacker logged out:

2011-04-26T19:14:53-06:00 baboon-srv sshd[6632]: pam_unix(sshd:session):
    session closed for user bob

What happened during those seven minutes for which there are no logs? Did the attacker use nmap to conduct a port scan of the internal network? Since you can use nmap without having to run privileged commands, it is entirely possible that the attacker ran nmap or other utilities without leaving a trail in our auth.log file.

8.5.6 Firewall Logs

Now let’s take a look at our firewall logs and see if we can find evidence of any other activity relating to baboon-srv (10.30.30.20) during the time frame of interest.

In Figure 8-9, we see all of the events in the firewall logs relating to the IP address “10.30.30.20.” As you can see, there is a sudden spike of activity during our time frame of interest, at 19:08.


Figure 8-9 A screenshot of Splunk showing a sudden spike of activity during the time frame of interest, at 19:08.

As before, when importing the firewall logs, we took the time to define fields of interest (for Cisco events of type 6-106100, which are the vast majority of the events in firewall.log). In this case, we used regular expressions to define the following fields in firewall.log:

fw_src_ip—Source IP address

fw_dst_ip—Destination IP address

fw_dst_port—Destination port

Zooming in on the spike of activity at 19:08, we can click on the field “fw_src_ip” to retrieve statistics, which show us that the IP address 10.30.30.20 was the source IP address in 98.756% of the logs. Drilling down on the vast majority of activity where 10.30.30.20 was the source IP address, we can see that there was activity targeting a wide variety of ports for a very short period of time, just after 19:08 (see Figure 8-10). This activity pattern is commonly associated with port scanning; port scanners are often configured to send traffic to a wide variety of ports on different destination systems in order to enumerate open services.


Figure 8-10 A chart in Splunk showing the number of firewall events targeting certain destination ports, where the source IP address is 10.30.30.20. Only the top 10 most active destination ports are shown, with the rest grouped together as “OTHER.”
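The same destination-port statistics can be pulled from the raw firewall log with standard shell tools. Here is a sketch (the sed expression assumes the single-line %ASA-6-106100 record format shown later in this section):

$ # Extract the destination port from each event sourced from 10.30.30.20,
$ # then count and rank the ports by frequency:
$ grep 'dmz/10.30.30.20(' firewall.log \
    | sed 's/.*-> [a-z]*\/[0-9.]*(\([0-9]*\)).*/\1/' \
    | sort -n | uniq -c | sort -rn | head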

In Figure 8-11, you can see that during the four seconds between 19:08:00 and 19:08:04, 10.30.30.20 made connections to over 200 destination IP addresses on over 200 ports. Again, this is a common activity pattern associated with port scanning. This makes sense given that an attacker had installed the “nmap” port scanning utility on 10.30.30.20 just a minute prior.


Figure 8-11 As illustrated in Splunk, during the four seconds between 19:08:00 and 19:08:04, 10.30.30.20 made connections to over 200 destination IP addresses on over 200 ports. This is indicative of a port scan.

As shown in Figure 8-10, approximately 45 seconds later (from 19:08:54 to 19:09:00), there was a burst of activity targeting port 3389. In fact, as shown in Figure 8-12, after 19:08:05, 10.30.30.20 sent traffic only to port 3389, although over 100 IP addresses were targeted. This pattern is typical of a port sweep, in which an automated scanner attempts to connect to a specific port on a wide range of remote systems in order to find running instances of a particular service.


Figure 8-12 A screenshot of Splunk showing activity after 19:08:05 relating to source IP address 10.30.30.20 in the firewall logs. Notice that the only destination port targeted was port 3389, even though there were over 100 destination IP addresses targeted. This is indicative of a port sweep.

Port 3389 (TCP) is commonly associated with the Remote Desktop Protocol (RDP), often used for making remote connections to Windows workstations and servers. Examining the traffic to destination port 3389 over all time, we can see that the system 192.168.30.101 received four connections on port 3389, two more than any other destination IP address (see Figure 8-13, which shows the top 10 destination IP addresses for connections on port 3389).


Figure 8-13 Splunk screenshot showing the top 10 destination IP addresses for connections on port 3389. The system 192.168.30.101 received four connections on port 3389, two more than any other destination IP address.

Pulling out all logs relating to destination 192.168.30.101:3389, we see the following four events:

$ grep '192.168.30.101(3389)' firewall.log
2011-04-26T19:08:58-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(49814) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:09:37-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(50215) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:09:37-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(50216) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:10:47-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(50217) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]

These logs indicate that the firewall allowed four connections from 10.30.30.20 to 192.168.30.101 on port 3389, which is commonly associated with RDP. Were any of these connections successful?

8.5.7 The Internal Victim—192.168.30.101

Let’s take a look at the workstation logs associated with 192.168.30.101. Based on information from Bob’s Dry Cleaners’ security staff, this system has the hostname “dog-ws.” Interestingly, as you can see below, at 19:11:08 there was a successful logon (event type 528) for the user account “bob.” This was just seconds after the firewall logged a remote connection attempt from 10.30.30.20 to 192.168.30.101 (and there certainly could have been a small time skew between the firewall and workstation). Notice in the log excerpt below, the field that reads “Logon Type: 10.” According to Microsoft, logon type 10 is titled “RemoteInteractive,” and indicates “A user logged on to this computer remotely using Terminal Services or Remote Desktop.”37

37. “Audit logon events: Security Configuration Editor; Security Services,” Microsoft, January 21, 2005, http://technet.microsoft.com/en-us/library/cc787567(WS.10).aspx.

$ grep 528 workstations.log | grep -i dog-ws
2011-04-26T19:11:08-06:00 dog-ws MSWinEventLog#0111#011Security#011754#011Tue
     Apr 26 19:11:01 2011#011528#011Security#011bob#011User#011Success Audit
    #011DOG-WS#011Logon/Logoff#011#011Successful Logon:     User Name: bob
        Domain: DOG-WS     Logon ID: (0x0,0x155A04D)     Logon Type: 10
    Logon Process: User32       Authentication Package: Negotiate
    Workstation Name: DOG-WS     Logon GUID:
    {00000000-0000-0000-0000-000000000000}  #011698

Notice also that the targeted account was “bob,” which is the same as the likely compromised account on 10.30.30.20.

At this point, we have evidence that a remote attacker broke into 10.30.30.20 by guessing the SSH password for the account “bob.” From there, the attacker conducted network port scans and sweeps, and then likely connected to 192.168.30.101 using the account “bob” via RDP.

Examining subsequent activity for the account “bob” on dog-ws, we see that shortly after logon, “bob” executed a command shell:

2011-04-26T19:11:52-06:00 dog-ws MSWinEventLog#0110#011Security#011774#011Tue
     Apr 26 19:11:25 2011#011592#011Security#011bob#011User#011Success Audit
    #011DOG-WS#011Detailed Tracking#011#011A new process has been created:
        New Process ID: 2676     Image File Name: C:\WINDOWS\system32\cmd.exe
        Creator Process ID: 2136     User Name: bob     Domain: DOG-WS
    Logon ID: (0x0,0x155A04D)    #011715

Subsequently, “bob” executed the command “ftp.exe”:

2011-04-26T19:11:52-06:00 dog-ws MSWinEventLog#0110#011Security#011776#011Tue
     Apr 26 19:11:52 2011#011592#011Security#011bob#011User#011Success Audit
    #011DOG-WS#011Detailed Tracking#011#011A new process has been created:
        New Process ID: 2716     Image File Name: C:\WINDOWS\system32\ftp.exe
        Creator Process ID: 2676     User Name: bob     Domain: DOG-WS
    Logon ID: (0x0,0x155A04D)    #011717

FTP is a program commonly used to transfer files between remote systems. This is certainly a concern, as the attacker could have used this program to transfer confidential data to external systems.

Let’s examine the firewall logs again to see if 192.168.30.101 made any other connections of interest. The output below shows all firewall logs associated with “192.168.30.101” (“dog-ws”). Notice that the logs begin with a variety of connection attempts from 10.30.30.20, during the time frame for which we identified port scanning activity (19:08:00–19:08:02). After that, you can see more connection attempts from 10.30.30.20 to port 3389, during the time frame in which we identified a port sweep for port 3389 (19:08:58). Shortly thereafter, we see a connection at approximately the same time that a successful remote logon for the account “bob” was logged in workstations.log (19:10:47).

$ grep '192.168.30.101' firewall.log
2011-04-26T19:08:00-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(36570) -> inside/192.168.30.101(80) hit-cnt 1 first
    hit [0x26ae55dd, 0x0]
2011-04-26T19:08:01-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(36943) -> inside/192.168.30.101(80) hit-cnt 1 first
    hit [0x26ae55dd, 0x0]
2011-04-26T19:08:02-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(47207) -> inside/192.168.30.101(443) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:08:02-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(47276) -> inside/192.168.30.101(443) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:08:58-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(49814) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:09:37-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(50215) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:09:37-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(50216) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:10:47-06:00 ant-fw : %ASA-6-106100: access-list dmz permitted
    tcp dmz/10.30.30.20(50217) -> inside/192.168.30.101(3389) hit-cnt 1 first
    hit [0xda142b8f, 0x0]
2011-04-26T19:11:39-06:00 ant-fw : %ASA-6-106100: access-list inside
    permitted tcp inside/192.168.30.101(1399) -> outside/172.30.1.77(21) hit-
    cnt 1 first hit [0x2989a4a8, 0x0]

Finally, at 19:11:39 we see a permitted FTP connection from 192.168.30.101 to 172.30.1.77(21), the external attacker system! (This is also approximately the same time as the FTP event we found associated with the account “bob” in workstations.log.)

8.5.8 Timeline

Based on our event log analysis, we can create a working hypothesis of what likely transpired. Of course, we must make some educated guesses in order to interpret the activity we have seen. However, our analysis is strongly supported by the evidence, references, and experience. With that in mind, here is a timeline of events for the incident at Bob’s Dry Cleaners on April 26, 2011:

17:17:01—auth.log begins

18:47:57—firewall.log begins

18:50:59—workstations.log begins

18:56:50–19:04:05—A series of failed login attempts from 172.30.1.77 to 10.30.30.20, targeting the accounts “root” and “bob,” as recorded in auth.log. The regular pattern of login attempts and volume of attempts indicates the use of a brute-force password-guessing utility against the exposed SSH server.

19:04:07—Successful remote login via SSH from 172.30.1.77 to 10.30.30.20, for account “bob,” as recorded in auth.log.

19:04:08—SSH connection closed for account “bob” on 10.30.30.20, as recorded in auth.log.

19:04:33—Successful remote login via SSH from 172.30.1.77 to 10.30.30.20, for account “bob,” as recorded in auth.log.

19:05:10—User “bob” attempts to run a privileged command using sudo, but fails to successfully authenticate, as recorded in auth.log.

19:05:18—User “bob” uses “sudo” to open the local authentication log, auth.log, using the “vi” text editor on 10.30.30.20, as recorded in auth.log.

19:05:34—User “bob” uses “sudo” to run the sniffer “tcpdump” on 10.30.30.20, as recorded in auth.log.

19:07:15—User “bob” uses “sudo” to install nmap on 10.30.30.20, as recorded in auth.log.

19:08:00–19:08:04—A burst of permitted activity from 10.30.30.20 targeting over 200 destination ports on approximately 200 IP addresses, as recorded in firewall logs. Indicative of a port scan.

19:08:54–19:09:00—A burst of permitted activity from 10.30.30.20 targeting port 3389 on approximately 200 IP addresses, as recorded in firewall logs. Indicative of a port sweep.

19:10:47—Permitted connection from 10.30.30.20 to 192.168.30.101:3389 (RDP) recorded in firewall logs.

19:11:08—Successful remote logon for the user account “bob” on dog-ws (192.168.30.101) recorded in workstations.log.

19:11:39—Permitted outbound FTP connection from 192.168.30.101 to 172.30.1.77(21) recorded in firewall logs.

19:11:52—Command “cmd.exe” executed by “bob” on dog-ws (192.168.30.101), as recorded in workstations.log.

19:11:52—Command “ftp.exe” executed by “bob” on dog-ws (192.168.30.101), as recorded in workstations.log.

19:14:53—SSH session closed from 172.30.1.77 to 10.30.30.20 for user “bob,” as recorded in auth.log.

19:17:01—auth.log ends.

19:28:24—firewall.log ends.

19:45:46—workstations.log ends.

8.5.9 Theory of the Case

Now that we have put together a timeline of events, let’s summarize our theory of the case. Again, this is a working hypothesis strongly supported by the evidence, references, and experience:

• The attacker (172.30.1.77) launched a brute-force password-guessing attack against the external SSH server, 10.30.30.20 (baboon-srv). There were 121 login attempts for the account “root,” which all failed. Subsequently, there were 85 failed login attempts for the account “bob.” The 86th login attempt was successful.

• The attacker logged into the victim server, 10.30.30.20 (baboon-srv), using SSH.

• On 10.30.30.20 (baboon-srv), the attacker found that the “bob” account had the ability to run privileged commands using “sudo.” The attacker used this to edit local authentication logs, sniff internal network traffic, and install the “nmap” port scanning utility.

• On 10.30.30.20 (baboon-srv), the attacker ran the nmap port scanning utility. First, the attacker conducted a port scan of the internal network and, later, the attacker conducted a port sweep for port 3389 (RDP).

• Pivoting through 10.30.30.20 (baboon-srv), the attacker logged on to the internal Windows workstation 192.168.30.101 (dog-ws) as “bob” using RDP. Judging by the timing and prior events, it is likely that the attacker was able to log on by reusing the password for the “bob” account, which was discovered during the brute-force password-guessing attack on 10.30.30.20.

• On 192.168.30.101 (dog-ws), the attacker used FTP to connect outbound directly to the external system 172.30.1.77. It is likely that any data transfer was conducted in FTP passive mode, as no port 20 connection was logged.

8.5.10 Response to Challenge Questions

Now, let’s answer the investigative questions posed to us at the beginning of the case.

Evaluate whether the failed login attempts were indicative of a deliberate attack. If so, identify the source and the target(s). The failed login attempts clearly showed a regular pattern of one login attempt approximately every two seconds. The short periodicity and regular pattern of these login attempts are indicative of an automated client process configured to make regular login attempts. The first targeted account was “root,” the default administrator account on most Linux systems, and the second targeted account was “bob,” a reasonable guess for a company named “Bob’s Dry Cleaners.” Combined, there were over 200 login attempts within 8 minutes. All of the login attempts originated from an external system, 172.30.1.77, and targeted 10.30.30.20.

The volume and pattern of the failed login attempts clearly indicate that the login attempts were likely part of a deliberate attack to break into the exposed SSH service on 10.30.30.20. The failed login attempts stopped as soon as a successful login was made from 172.30.1.77 to 10.30.30.20, using the account “bob.”

To summarize, the failed login attempts were likely the result of a deliberate attack. The target of the attack was 10.30.30.20 (baboon-srv) and the source was 172.30.1.77. The targeted user accounts were “root” and “bob.”

Determine whether any systems were compromised. If so, describe the extent of the compromise.

Given that the brute-force password-guessing attack against 10.30.30.20 (baboon-srv) ended with a successful login using the account “bob,” it is likely that this system and the “bob” account were compromised. Furthermore, we have seen evidence that the attacker subsequently pivoted through 10.30.30.20 and logged into the internal workstation 192.168.30.101 (dog-ws), again using the account “bob.” The internal workstation 192.168.30.101 then made a direct connection over port 21 (FTP) to the external attacker system, 172.30.1.77. Based on this activity, it is wise to assume that 192.168.30.101 (dog-ws) is also compromised, and that files may have been uploaded or downloaded using FTP.

Any accounts with credentials stored on either 10.30.30.20 (baboon-srv) or 192.168.30.101 (dog-ws) should also be assumed to be compromised, since the attacker could have downloaded hashed passwords to run through a password-cracking utility.

8.5.11 Next Steps

Using event logs from servers, workstations, and the firewall, we have learned a significant amount about the events that likely transpired on April 26, 2011. As forensic investigators, where do we go from here? Although this depends on the resources and goals of the organization initiating the investigation, the next steps often involve a two-pronged approach: gathering additional evidence, and containing/eradicating any compromise. In this case, common next steps might include:

Containment/Eradication: What can Bob’s Dry Cleaners do to contain the damage and prevent further compromise? Here are a few options:

– Change all passwords that may have been compromised. This includes any passwords related to the DMZ victim, 10.30.30.20, and the internal system 192.168.30.101. Organizations that “play it safe” may choose to reset all passwords.

– Rebuild the two compromised systems, 10.30.30.20 and 192.168.30.101 (after gathering evidence from them as needed).

– Tighten firewall rules to more strictly limit access from the DMZ to the internal network. Is there really a need for DMZ systems to access internal systems? Limit this access to the greatest extent possible.

– Block outbound connections on TCP port 21 (FTP), and any other ports that are not needed.

– Remove or restrict access to the externally exposed SSH service, if possible.

– Consider using two-factor authentication for external access to the network. Single-factor authentication is risky and leaves the system at much higher risk of compromise due to brute-force password guessing, as we have seen.

Additional Sources of Evidence: Here are some high-priority potential sources of additional evidence that might be useful in the case:

– Flow records—If Bob’s Dry Cleaners is collecting flow record data from the firewall, this may help to determine the size and directionality of the data transfer between 192.168.30.101 and 172.30.1.77.

– Hard drives of compromised systems—Forensic analysis of the compromised system hard drive may reveal detailed information about the attacker’s activities or, at the very least, allow for an inventory of confidential information that may have been compromised.
