10
BASIC REAL-WORLD SCENARIOS


Beginning with this chapter, we’ll dig into the meat of packet analysis as we use Wireshark to analyze real-world network problems. I’ll introduce a series of problem scenarios by describing the context of the problem and providing the information that was available to the analyst at the time. Having laid the groundwork, we’ll turn to analysis as I describe the method used to capture the appropriate packets and step you through the process of working toward a diagnosis. Once analysis is complete, I’ll point toward potential solutions and give an overview of the lessons learned.

Throughout, remember that analysis is a very dynamic process. Thus, the methods I use to analyze each scenario may not be the same ones that you would use. Everyone approaches problem solving and reasoning through their own lens. The most important thing is that the result of the analysis solves a problem, but even when it doesn’t, it’s critical to learn from failures as well. Experience is the thing we get when we don’t get what we want, after all.

In addition, most problems discussed in this chapter can probably be solved with methods that don’t necessarily involve a packet sniffer, but what’s the fun in that? When I was first learning how to analyze packets, I found it helpful to examine typical problems in atypical ways by using packet analysis techniques, which is why I present these scenarios to you.

Missing Web Content

http_espn_fail.pcapng

In the first scenario we’ll look at, our user is Packet Pete, a college basketball fan who doesn’t keep late hours and usually misses the West Coast games. The first thing he does when he sits down at his workstation every morning is visit http://www.espn.com/ for the previous night’s final scores. When Pete browses to ESPN this morning, he finds that the page is taking a long time to load, and when it finally does, most of the images and content are missing (Figure 10-1). Let’s help Pete diagnose this issue.


Figure 10-1: ESPN is failing to load properly.

Tapping into the Wire

This issue is isolated to Pete’s workstation and is not affecting any others, so we’ll start by capturing packets directly from there. To do this, we’ll install Wireshark and capture packets while browsing to the ESPN website. Those packets are found in the file http_espn_fail.pcapng.

Analysis

We know Pete’s issue is that he’s unable to view a website he is browsing to, so we’re primarily going to be looking at the HTTP protocol. If you read the previous chapter, you should have a basic understanding of what HTTP traffic between a client and server looks like. A good place to start looking is at the HTTP requests being made to the remote server. You can do this by applying a filter for GET requests (using http.request.method == "GET"), but this can also be done by simply selecting Statistics ▶ HTTP ▶ Requests from the main drop-down menu (Figure 10-2).


Figure 10-2: Viewing HTTP requests to ESPN

From this overview, it appears the capture is limited to seven different HTTP requests, and they all look like they are associated with the ESPN website. Each request contains the string espn within the domain name, with the exception of cdn.optimizely.com, which is a content delivery network (CDN) used to deliver advertising to a multitude of sites. It’s common to see requests to various CDNs when browsing to websites that host advertisements or other external content.
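If you’d like to double-check this overview outside the GUI, the same information can be pulled from the capture with a short script. Here is a minimal sketch using Python and scapy (assuming both are installed and the capture file is in your working directory); it looks for TCP payloads that begin with GET and tallies the Host header of each request. Requests split across segments aren’t handled, which is fine for a quick sanity check like this one.

    from collections import Counter
    from scapy.all import rdpcap, TCP, Raw

    hosts = Counter()
    for pkt in rdpcap("http_espn_fail.pcapng"):
        if pkt.haslayer(TCP) and pkt.haslayer(Raw):
            payload = bytes(pkt[Raw].load)
            if payload.startswith(b"GET "):
                # Pull the Host header out of the plaintext HTTP request.
                for line in payload.split(b"\r\n"):
                    if line.lower().startswith(b"host:"):
                        hosts[line.split(b":", 1)[1].strip().decode()] += 1

    for host, count in hosts.most_common():
        print(f"{count:3d}  {host}")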

With no clear leads to follow, the next step is to look at the protocol hierarchy of the capture file by selecting Statistics ▶ Protocol Hierarchy. This will allow us to spot unexpected protocols or peculiar distributions of traffic per protocol (Figure 10-3). Keep in mind that the protocol hierarchy screen is based on the currently applied display filter. Be sure to clear the previously applied filter to get the expected results based on the entire packet capture.


Figure 10-3: Reviewing the protocol hierarchy of the browsing session

The protocol hierarchy isn’t too complex, and we can quickly decipher that there are only two application-layer protocols at work: HTTP and DNS. As you learned in Chapter 9, DNS is used to translate domain names to IP addresses. So, when you browse to a site like http://www.espn.com/, your system may need to send out a DNS query to find the IP address of the remote web server if it doesn’t already know it. Once a DNS reply with the appropriate IP address comes back, that information can be added to a local cache, and HTTP communication (using TCP) can commence.

Although nothing looks out of the ordinary here, the 14 DNS packets are notable. A DNS request for a single domain name is typically contained in a single packet, and the response also constitutes a single packet (unless it’s very large, in which case DNS will utilize TCP). Since there are 14 DNS packets here, it’s possible that as many as seven DNS queries were generated (7 queries + 7 replies = 14 packets). Figure 10-2 did show HTTP requests to seven different domains, but Pete only typed a single URL into his browser. Why are all of these extra requests being made?

In a simple world, visiting a web page would be as easy as querying one server and pulling all of its content in a single HTTP conversation. In reality, an individual web page may provide content hosted on multiple servers. All of the text-based content could be in one place, the graphics could be in another, and embedded videos could be in a third. That doesn’t include ads, which could be hosted on multiple providers spanning dozens of individual servers. Whenever an HTTP client parses HTML code and finds a reference to content on another host, it will attempt to query that host for the content, which can generate additional DNS queries and HTTP requests. This is exactly what happened here when Pete visited ESPN. While he may have intended to view content only from a single source, references to additional content were found in the HTML code, and his browser automatically requested that content from multiple other domains.
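If you want to see exactly which names were looked up, a quick script can tally the DNS traffic as well. This sketch (again Python with scapy, under the same assumptions as before) counts queries and responses and prints the names being resolved.

    from scapy.all import rdpcap, DNS

    queries, responses, names = 0, 0, set()
    for pkt in rdpcap("http_espn_fail.pcapng"):
        if pkt.haslayer(DNS):
            dns = pkt[DNS]
            if dns.qr == 0:                      # 0 = query, 1 = response
                queries += 1
                if dns.qd is not None:
                    names.add(dns.qd.qname.decode().rstrip("."))
            else:
                responses += 1

    print(f"{queries} queries, {responses} responses")
    for name in sorted(names):
        print(" ", name)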

Now that we understand why all of these extra requests exist, our next step is to examine the individual conversations associated with each request (Statistics ▶ Conversations). Reviewing the Conversations window (Figure 10-4) provides an important clue.


Figure 10-4: Reviewing IP conversations

We discovered earlier that there were seven DNS requests and seven HTTP requests to match. With that in mind, it would be reasonable to expect that there would also be seven matching IP conversations, but that isn’t the case. There are eight. How can that be explained?

One thought might be that the capture was “contaminated” by an additional conversation unrelated to the problem at hand. Ensuring your analysis doesn’t suffer due to irrelevant traffic is certainly something you should be cognizant of, but that isn’t the issue with this conversation. If you examine each HTTP request and note the IP address the request was sent to, you should be left with one conversation that doesn’t have a matching HTTP request. The endpoints for this conversation are Pete’s workstation (172.16.16.154) and the remote IP 203.0.113.94. This conversation is represented by the bottom line in Figure 10-4. We note that 6,774 bytes were sent to this unknown host but zero bytes were sent back: that’s worth digging into.

If you filter down into this conversation (right-click the conversation and choose Apply As Filter ▶ Selected ▶ A<->B), you can apply your knowledge of TCP to identify what’s gone wrong (Figure 10-5).


Figure 10-5: Reviewing the unexpected connection

With normal TCP communication, you expect to see a standard SYN-SYN/ACK-ACK handshake sequence. In this case, Pete’s workstation sent a SYN packet to 203.0.113.94, but we never see a SYN/ACK response. Not only this, but Pete’s workstation sent multiple SYN packets to no avail, eventually leading his machine to send TCP retransmission packets. We’ll talk more about the specifics of TCP retransmissions in Chapter 11, but the key takeaway here is that one host is sending packets that it never receives a response to. Looking at the Time column, we see that the retransmissions continue for 95 seconds without a response. In network communications, this is slower than molasses.
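The same failed handshake can be spotted programmatically. The sketch below (Python/scapy, a simplification that matches connections only by address and port) records every outgoing SYN and reports the pairs for which no SYN/ACK ever came back.

    from scapy.all import rdpcap, IP, TCP

    syns, synacks = set(), set()
    for pkt in rdpcap("http_espn_fail.pcapng"):
        if pkt.haslayer(IP) and pkt.haslayer(TCP):
            flags = str(pkt[TCP].flags)              # e.g. "S", "SA", "PA"
            if "S" in flags and "A" not in flags:    # bare SYN
                syns.add((pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport))
            elif "S" in flags and "A" in flags:      # SYN/ACK
                # Reverse the tuple so it lines up with the original SYN's direction.
                synacks.add((pkt[IP].dst, pkt[TCP].dport, pkt[IP].src, pkt[TCP].sport))

    for key in sorted(syns - synacks):
        print("No SYN/ACK ever seen for", key)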

We have identified seven DNS requests, seven HTTP requests, and eight IP conversations. Since we know that the capture is not contaminated with extra data, it’s reasonable to think that the mysterious eighth IP conversation is probably the source of Pete’s slowly and incompletely loading web page. For some reason, Pete’s workstation is trying to communicate with a device that either doesn’t exist or just isn’t listening. To understand why this is happening, we won’t look at what’s in the capture file; instead, we’ll consider what isn’t there.

When Pete browsed to http://www.espn.com/, his browser identified resources hosted on other domains. To retrieve that data, his workstation generated DNS requests to find their IP addresses, then connected to them via TCP so that an HTTP request for the content could be sent. For the conversation with 203.0.113.94, there is no DNS request to be found. So, how did Pete’s workstation know about that address?

If you remember our discussion about DNS in Chapter 9 or are otherwise familiar with it, you know that most systems implement some form of DNS caching. This allows a system to reference a locally stored DNS-to-IP address mapping it has already retrieved rather than generating a new DNS request every time it visits a frequently accessed domain. Eventually, these DNS-to-IP mappings expire, and a new request must be generated. However, if a mapping changes while a cached entry is still considered valid, the device won’t issue a new query on its next visit and will instead attempt to connect to an address that is no longer correct.
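The caching behavior is easy to picture with a toy model. This is not how any particular operating system implements its resolver cache; it’s just a sketch of the idea that a cached answer suppresses new lookups until its time to live (TTL) expires, even if the real-world mapping has changed in the meantime. The domain name here is a made-up placeholder.

    import time

    class ToyDnsCache:
        def __init__(self):
            self._entries = {}                    # name -> (address, expiry timestamp)

        def put(self, name, address, ttl):
            self._entries[name] = (address, time.time() + ttl)

        def lookup(self, name):
            entry = self._entries.get(name)
            if entry and time.time() < entry[1]:
                return entry[0]                   # still "valid": no query goes on the wire
            return None                           # expired or absent: a real DNS query is needed

    cache = ToyDnsCache()
    cache.put("cdn.example.com", "203.0.113.94", ttl=3600)
    print(cache.lookup("cdn.example.com"))        # returns the stale address for the next hour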

In Pete’s case, that is exactly what happened. Pete’s workstation already had a cached DNS-to-IP mapping for a domain that hosts content for ESPN. Since this cached entry exists, a DNS request was not generated, and his system attempted to go ahead and connect to the old address. However, that address was no longer configured to respond to requests. As a result, the requests timed out, and the content never loaded.

Fortunately for Pete, clearing his DNS cache manually takes only a few keystrokes on the command line or in a terminal window (on a Windows workstation, for example, ipconfig /flushdns). Alternatively, he could just try again in a few minutes, by which time the DNS cache entry will probably have expired and a new request will be generated.

Lessons Learned

That’s a lot of work just to find out that Kentucky beat Duke by 90 points, but we walk away with a deeper understanding of the relationship between network hosts. In this scenario, we were able to work toward a solution by assessing multiple data points related to the requests and conversations occurring within the capture. From there, we were able to spot a few inconsistencies that took us down a path toward finding the failed communication between the client and one of ESPN’s content delivery servers.

In the real world, diagnosing problems is rarely as simple as scrolling through a list of packets and looking for the ones that look funny. Troubleshooting even the simplest problems can result in very large captures that rely on the use of Wireshark’s analysis and statistics features to spot anomalies. Getting familiar with this style of analysis is critical to successful troubleshooting at the packet level.

If you’d like to see an example of what normal communication looks like between a web browser and ESPN, try browsing to the site while capturing traffic yourself and see if you can identify all of the servers responsible for delivering content.

Unresponsive Weather Service

weather_broken.pcapng weather_working.pcapng

Our second scenario once again involves our pal Packet Pete. Among his many hobbies, Pete fancies himself an amateur meteorologist and doesn’t go more than a few hours without checking current conditions and the forecast. He doesn’t rely solely on the local news forecast though; he actually runs a small weather station outside his home that reports data up to https://www.wunderground.com/ for aggregation and viewing. Today, Pete went to check his weather station to see how much the temperature had dropped overnight, but found that his station hadn’t reported in to Wunderground in over nine hours, since around midnight (Figure 10-6).


Figure 10-6: The weather station hasn’t sent a report in nine hours.

Tapping into the Wire

In Pete’s network, the weather station mounted on his roof connects to a receiver inside his house through an RF connection. That receiver plugs into his network switch and reports statistics to Wunderground through the internet. This architecture is diagrammed in Figure 10-7.


Figure 10-7: Weather station network architecture

The receiver has a simple web-based management page, but Pete logged into it only to find a cryptic message about the last synchronization time with no additional guidance for troubleshooting—the software doesn’t provide any detailed error logging. Since the receiver is the hub of communication for the weather station infrastructure, it makes sense to capture packets transmitted to and from that device to try to diagnose the issue. This is a home network, so port mirroring is probably not an option on the SOHO switch. Our best bet is to use a cheap tap or to perform ARP cache poisoning to intercept these packets. The captured packets are contained in the file weather_broken.pcapng.

Analysis

Upon opening the capture file, you’ll see that we’re dealing with HTTP communication once again. The packet capture is limited to a single conversation between Pete’s local weather receiver 172.16.16.154 and an unknown remote device on the internet, 38.102.136.125 (Figure 10-8).


Figure 10-8: Isolated weather station receiver communication

Before we examine the characteristics of the conversation, let’s see if we can identify the unknown IP. Without extensive research, we might not be able to find out whether this is the exact IP address that Pete’s weather receiver should be talking to, but we can at least try to confirm that it belongs to the Wunderground infrastructure, starting with a WHOIS query. You can conduct a WHOIS query through most domain registration or regional internet registry websites, such as http://whois.arin.net/. In this case, it looks like the IP belongs to Cogent, an internet service provider (ISP) (Figure 10-9). PSINet Inc. is also mentioned here, but a quick search reveals that most PSINet assets were acquired by Cogent in the early 2000s.


Figure 10-9: WHOIS data identifies the owner of this IP.

In some cases, if an IP address is registered directly to an organization, the WHOIS query will return that organization’s name. However, many times a company will simply utilize IP address space from an ISP without registering it directly to itself. In these cases, another useful tactic is to search for the autonomous system number (ASN) associated with an IP address. Organizations are required to register for an ASN to support certain types of routing on the public internet. There are a number of ways to look up IP-to-ASN associations (some WHOIS lookups provide it automatically), but I like using Team Cymru’s automated lookup tool (https://asn.cymru.com/). Using that tool for 38.102.136.125, we see that it is associated with AS 36347, which is listed as “Wunderground – The Weather Channel, LLC, US” (Figure 10-10). That tells us that the device the weather station is communicating with is at least in the right neighborhood. If we were unable to identify the correct affiliation for this address, it might be worth exploring whether Pete’s receiver was talking to the wrong device, but the address checks out.


Figure 10-10: IP-to-ASN lookup for the external IP address
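Team Cymru also exposes this lookup through a WHOIS interface at whois.cymru.com, which is handy if you want to script it. The sketch below assumes that service is still offered in its current bulk-query form and that outbound TCP port 43 is permitted; the exact output format may change over time.

    import socket

    def asn_lookup(ip):
        query = f"begin\nverbose\n{ip}\nend\n".encode()
        with socket.create_connection(("whois.cymru.com", 43), timeout=10) as s:
            s.sendall(query)
            chunks = []
            while True:
                data = s.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode(errors="replace")

    print(asn_lookup("38.102.136.125"))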

With the unknown host characterized, we can dig into the details of the communication. The conversation is relatively short. There is a TCP handshake, a single HTTP GET request and response, and a TCP teardown. The handshake and teardown appear to be successful, so whatever issue we are experiencing is probably contained within the HTTP request itself. To examine this closely, we’ll follow the TCP stream (Figure 10-11).


Figure 10-11: Following the TCP stream of the weather receiver communication

The HTTP communication begins with a GET request from Pete’s weather receiver to Wunderground. No HTTP content was transmitted, but a significant amount of data was transmitted in the URL. Transferring data through the URL query string is common for web applications, and it looks like the receiver is passing weather updates using this mechanism. For instance, you see fields like tempf=43.0, dewptf=13.6, and windchillf=43.0. The Wunderground collection server is parsing the list of fields and parameters from the URL and storing them in a database.
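If you copy the request URL out of Wireshark, the parameter list is easy to pull apart with Python’s standard library. The URL below is a shortened, hypothetical stand-in built from the field names visible in the capture, not the receiver’s actual request.

    from urllib.parse import urlsplit, parse_qs

    # Hypothetical, abbreviated version of the kind of URL the receiver sends.
    url = ("http://collection.example.com/weatherstation/update"
           "?ID=STATION123&PASSWORD=0000&tempf=43.0&dewptf=13.6&windchillf=43.0")

    params = parse_qs(urlsplit(url).query)
    for field, values in params.items():
        print(f"{field} = {values[0]}")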

At first glance, everything looks fine with the GET request to the Wunderground server. But a look at the corresponding reply shows an error was reported. The server responded with an HTTP/1.0 200 OK response code, indicating that the GET request was received and successful, but the body of the response contains a useful message, INVALIDPASSWORDID|Password or key and/or id are incorrect.

If you look back up at the request URL, you’ll see the first two parameters passed are ID and PASSWORD. These are used to identify the weather station call sign and authenticate it to the Wunderground server.

In this case, Pete’s weather station ID is correct, but his password is not. For some unknown reason, it has been replaced by zeros. Since the last known successful communication was at midnight, it’s possible an update was applied or the receiver rebooted and lost the password configuration.

NOTE

While many developers choose to pass parameters in URLs, it’s generally frowned upon to do this with passwords as seen here. That’s because requested URLs are transmitted in plaintext when using HTTP without added encryption, such as HTTPS. Therefore, a malicious user who happens to be listening on the wire could intercept your password.

At this point, Pete was able to access his receiver and type in the new password. Shortly thereafter, his weather station began syncing data again. An example of successful weather station communication can be found in weather_working.pcapng. The communication stream is shown in Figure 10-12.


Figure 10-12: Successful weather station communication

The password is now correct, and the Wunderground server responds with a success message in the HTTP response body.

Lessons Learned

In this scenario, we encountered a third-party service that facilitated network communication by using features available within another protocol (HTTP). Fixing communication problems with third-party services is something you’ll encounter often, and packet analysis techniques are very well suited for troubleshooting these services when proper documentation or error logging isn’t available. This is becoming more common now that Internet of Things (IoT) devices, such as this weather station, are popping up all around us.

Fixing such problems requires the ability to inspect unknown traffic sequences and derive how things are supposed to be working. Some applications, such as the HTTP-based weather data transmission in this scenario, are fairly simple. Others are quite complex, requiring multiple transactions, the addition of encryption, or even custom protocols that Wireshark may not natively parse.

As you investigate more third-party services, you’ll eventually start learning about common patterns developers use to facilitate network communication. This knowledge will increase your effectiveness when troubleshooting them.

No Internet Access

In many scenarios, you may need to diagnose and solve internet connectivity problems. We’ll cover some common problems you might encounter.

Gateway Configuration Problems

nowebaccess1.pcapng

Our next scenario presents a common problem: a user cannot access the internet. We have verified that the user can access all the internal resources of the network, including shares on other workstations and applications hosted on local servers.

The network architecture is straightforward, as all clients and servers connect to a series of simple switches. Internet access is handled through a single router serving as the default gateway, and IP-addressing information is provided by DHCP. This is a very common scenario in small offices.

Tapping into the Wire

To determine the cause of the issue, we can have the user attempt to browse the internet while our sniffer is listening on the wire. We use the information from Chapter 2 (see Figure 2-15) to determine the most appropriate method for placing our sniffer.

The switches on our network don’t support port mirroring, but because we already have to interrupt the user to conduct our test, it’s reasonable to take their machine offline briefly. This isn’t a high-throughput scenario, so even an inexpensive tap would be appropriate here if one were available. The resulting file is nowebaccess1.pcapng.

Analysis

The traffic capture begins with an ARP request and reply, as shown in Figure 10-13. In packet 1, the user’s computer, with a MAC address of 00:25:b3:bf:91:ee and IP address 172.16.0.8, sends an ARP broadcast packet to all computers on the network segment in an attempt to find the MAC address associated with the IP address of its default gateway, 172.16.0.10.


Figure 10-13: ARP request and reply for the computer’s default gateway

A response is received in packet 2, and the user’s computer learns that 172.16.0.10 is at 00:24:81:a1:f6:79. Once this reply is received, the computer has a route to a gateway that should be able to direct it to the internet.

Following the ARP reply, the computer attempts to resolve the website’s DNS name to an IP address in packet 3. As shown in Figure 10-14, it does this by sending a DNS query packet to its primary DNS server, 4.2.2.2.


Figure 10-14: A DNS query sent to 4.2.2.2

Under normal circumstances, a DNS server would respond to a DNS query very quickly, but that’s not the case here. Rather than a response, we see the same DNS query sent a second time to a different destination address. As shown in Figure 10-15, in packet 4, the second DNS query is sent to the secondary DNS server configured on the computer, which is 4.2.2.1.


Figure 10-15: A second DNS query sent to 4.2.2.1

Again, no reply is received from the DNS server, and the query is sent again 1 second later to 4.2.2.2. This process repeats itself, alternating between the primary and secondary configured DNS servers over the next several seconds, as shown in Figure 10-16. The entire process takes around 8 seconds, at which point the user’s browser reports that the website is inaccessible.


Figure 10-16: DNS queries are repeated until communication stops.

Based on the packets we’ve seen, we can begin to pinpoint the source of the problem. First, we see a successful ARP request to what we believe is the default gateway router for the network, so we know that device is online and communicating. We also know that the user’s computer is actually transmitting packets on the network, so we can assume there isn’t an issue with the protocol stack on the computer itself. The problem clearly begins to occur when the DNS request is made.

In the case of this network, DNS queries are resolved by an external server on the internet (4.2.2.2 or 4.2.2.1). This means that for resolution to take place correctly, the router responsible for routing packets to the internet must successfully forward the DNS queries to the server, and the server must respond. This all must happen before HTTP can be used to request the web page itself.

Because no other users are having issues connecting to the internet, the network router and remote DNS server are probably not the source of the problem. The only thing remaining to investigate is the user’s computer itself.

Upon deeper examination of the affected computer, we find that rather than receiving a DHCP-assigned address, the computer has manually assigned addressing information, and the default gateway address is set incorrectly. The address set as the default gateway is not a router and cannot forward the DNS query packets outside the network.

Lessons Learned

The problem in this scenario resulted from a misconfigured client. While the problem itself turned out to be simple, it significantly impacted the user. Troubleshooting a simple misconfiguration like this one could take quite some time for someone lacking knowledge of the network or the ability to perform a quick packet analysis, as we’ve done here. As you can see, packet analysis is not limited to large and complex problems.

Notice that because we didn’t enter the scenario knowing the IP address of the network’s gateway router, Wireshark didn’t identify the problem exactly, but it did tell us where to look, saving valuable time. Rather than examining the gateway router, contacting our ISP, or trying to find the resources to troubleshoot the remote DNS server, we were able to focus our troubleshooting efforts on the computer itself, which was, in fact, the source of the problem.

NOTE

Had we been more familiar with this particular network’s IP-addressing scheme, analysis could have been even faster. The problem could have been identified immediately once we noticed that the ARP request was sent to an IP address different from that of the gateway router. These simple misconfigurations are often the source of network problems and can typically be resolved quickly with a bit of packet analysis.

Unwanted Redirection

nowebaccess2.pcapng

In this scenario, we again have a user who is having trouble accessing the internet from their workstation. However, unlike the user in the previous scenario, this user can access the internet. Their problem is that they can’t access their home page, https://www.google.com/. When the user attempts to reach any domain hosted by Google, they are directed to a browser page that says, “Internet Explorer cannot display the web page.” This issue is affecting only this particular user.

As with the previous scenario, this is a small network with a few simple switches and a single router serving as the default gateway.

Tapping into the Wire

To begin our analysis, we have the user attempt to browse to https://www.google.com/ while we use a tap to listen to the traffic that is generated. The resulting file is nowebaccess2.pcapng.

Analysis

The capture begins with an ARP request and reply, as shown in Figure 10-17. In packet 1, the user’s computer, with a MAC address of 00:25:b3:bf:91:ee and an IP address of 172.16.0.8, sends an ARP broadcast packet to all computers on the network segment in an attempt to find the MAC address associated with the host’s IP address 172.16.0.102. We don’t immediately recognize this address.


Figure 10-17: ARP request and reply for another device on the network

In packet 2, the user’s computer learns that the IP address 172.16.0.102 is at 00:21:70:c0:56:f0. Based on the previous scenario, we might assume this is the gateway router’s address and that packets will once again be forwarded through it to the external DNS server. However, as shown in Figure 10-18, the next packet is not a DNS request but a TCP packet from 172.16.0.8 to 172.16.0.102. It has the SYN flag set, indicating that this is the first packet in the handshake for a new TCP-based connection between the two hosts.


Figure 10-18: TCP SYN packet sent from one internal host to another

Notably, the TCP connection attempt is made to port 80 on 172.16.0.102, which is typically associated with HTTP traffic.

As shown in Figure 10-19, this connection attempt is abruptly halted when host 172.16.0.102 sends a TCP packet in response (packet 4) with the RST and ACK flags set.


Figure 10-19: TCP RST packet sent in response to the TCP SYN

Recall from Chapter 8 that a packet with the RST flag set is used to terminate a TCP connection. Here, the host at 172.16.0.8 attempted to establish a TCP connection to the host at 172.16.0.102 on port 80. Because that host has no service listening for requests on port 80, a TCP RST packet is sent to terminate the connection. This process repeats three times before communication finally ends, as shown in Figure 10-20. At this point, the user receives a message in their browser saying that the page can’t be displayed.


Figure 10-20: The TCP SYN and RST packets are seen three times in total.

After examining the configuration of another network device that is working correctly, we become concerned about the ARP request and reply in packets 1 and 2: the ARP request isn’t for the address of the gateway router but for some unknown device. Following the ARP request and reply, we would expect to see a DNS query sent to our configured DNS server in order to find the IP address associated with https://www.google.com/, but we don’t. There are two conditions that could prevent a DNS query from being made:

•     The device initiating the connection already has the DNS name-to-IP address mapping in its DNS cache (as in the first scenario in this chapter).

•     The device connecting to the DNS name already has the DNS name-to-IP address mapping specified in its hosts file.

Upon further examination of the client computer, we find that the computer’s hosts file has an entry mapping www.google.com to the internal IP address 172.16.0.102. This erroneous entry is the source of our user’s problems.

A computer will typically use its hosts file as the authoritative source for DNS name-to-IP address mappings, and it will check that file before querying an outside source. In this scenario, the user’s computer checked its hosts file, found the entry for www.google.com, and decided that the site was actually hosted on its own local network segment. Next, it sent an ARP request to that host, received a response, and attempted to initiate a TCP connection to 172.16.0.102 on port 80. However, because the remote system was not configured as a web server, it wouldn’t accept the connection attempts.
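You can see this precedence in action on a test machine. With an entry such as 172.16.0.102 www.google.com present in the hosts file, the system resolver hands back the local address and no DNS query ever appears on the wire; the snippet below simply asks the resolver and prints whatever it returns.

    import socket

    # If the hosts file contains a line like:
    #     172.16.0.102    www.google.com
    # the operating system resolver returns that address directly, without
    # sending a DNS query -- exactly the behavior observed in this capture.
    print(socket.gethostbyname("www.google.com"))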

Once the hosts file entry was removed, the user’s computer began communicating correctly and was able to access https://www.google.com/.

NOTE

To examine your hosts file on a Windows system, open C:\Windows\System32\drivers\etc\hosts. On Linux, view /etc/hosts.

This very common scenario is one that malware has been using for years to redirect users to websites hosting malicious code. Imagine if an attacker were to modify your hosts file so that every time you went to do your online banking, you were redirected to a fake site designed to steal your account credentials!

Lessons Learned

As you continue to analyze traffic, you will learn both how the various protocols work and how to break them. In this scenario, the DNS query wasn’t sent because the client was misconfigured, not because of any external limitations or misconfigurations.

By examining this problem at the packet level, we were able to quickly spot an unknown IP address and to determine that DNS, a key component of this communication process, was missing from the exchange. Using this information, we were able to identify the client as the source of the problem.

Upstream Problems

nowebaccess3.pcapng

As with the previous two scenarios, in this scenario, a user complains of no internet access from their workstation. This user has narrowed the issue down to a single website, https://www.google.com/. Upon further investigation, we find that this issue is affecting everyone in the organization—no one can access Google domains.

The network is configured as in the two prior scenarios, with a few simple switches and a single router connecting the network to the internet.

Tapping into the Wire

To troubleshoot this issue, we first browse to https://www.google.com/ to generate traffic. Because this issue is network-wide, ideally any device in the network should be able to reproduce the issue using most capture methods. The file resulting from the capture via a tap is nowebaccess3.pcapng.

Analysis

This packet capture begins with DNS traffic instead of the ARP traffic we are used to seeing. Because the first packet in the capture is sent to an external address, and packet 2 contains a reply from that address, we can assume that the ARP process has already happened and that the MAC-to-IP address mapping for our gateway router already exists in the ARP cache of the host at 172.16.0.8.

As shown in Figure 10-21, the first packet in the capture is from the host 172.16.0.8 to address 4.2.2.1 and is a DNS packet. Examining the contents of the packet, we see that it is a query for the A record for www.google.com that will map the DNS name to an IP address.


Figure 10-21: DNS query for www.google.com A record

The response to the query from 4.2.2.1 is the second packet in the capture file, as shown in Figure 10-22. Here, we see that the name server that responded to this request provided multiple answers to the query. At this point, all looks good, and communication is occurring as it should.


Figure 10-22: DNS reply with multiple A records

Now that the user’s computer has determined the web server’s IP address, it can attempt to communicate with the server. As shown in Figure 10-23, this process is initiated in packet 3, with a TCP packet sent from 172.16.0.8 to 74.125.95.105. This destination address comes from the first A record provided in the DNS query response seen in packet 2. The TCP packet has the SYN flag set, and it’s attempting to communicate with the remote server on port 80.


Figure 10-23: The SYN packet is attempting to initiate a connection on port 80.

Because this is a TCP handshake process, we know that we should see a TCP SYN/ACK packet sent in response, but instead, after a short time, another SYN packet is sent from the source to the destination. This process occurs once more after approximately one second, as shown in Figure 10-24, at which point communication stops and the browser reports that the website could not be found.


Figure 10-24: The TCP SYN packet is attempted three times with no response received.

As we troubleshoot this scenario, consider what we already know. The workstation within our network can reach the outside world, because the DNS query to our external DNS server at 4.2.2.1 is successful. The DNS server responds with what appear to be valid addresses, and our host attempts to connect to one of them. Also, the local workstation we are attempting to connect from appears to be functioning.

The problem is that the remote server simply isn’t responding to our connection requests; not even a TCP RST packet is sent. This might occur for several reasons: a misconfigured web server, a corrupted protocol stack on the web server, or a packet-filtering device on the remote network (such as a firewall). Assuming there is no local packet-filtering device in place, all of the remaining potential causes lie on the remote network and are beyond our control. In this case, the web server was not functioning correctly, and no attempt to access it succeeded. Once the problem was corrected on Google’s end, communication was able to proceed.

Lessons Learned

In this scenario, the problem wasn’t one that we could correct. Our analysis determined that the issue wasn’t with the hosts on our network, our router, or the external DNS server providing us with name resolution services. The issue lay outside our network infrastructure.

Sometimes discovering that a problem isn’t really ours not only relieves stress but also saves face when management comes knocking. I have fought with many ISPs, vendors, and software companies who claim that an issue is not their fault, but as you’ve just seen, packets don’t lie.

Inconsistent Printer

In the next scenario, an IT help desk administrator is having trouble resolving a printing issue. Users in the sales department are reporting that the high-volume printer is malfunctioning. When a user sends a large print job to the printer, it will print several pages and then stop printing before the job is done. Multiple driver configuration changes have been attempted but have been unsuccessful. The help desk staff would like you to ensure that this isn’t a network problem.

Tapping into the Wire

inconsistent_printer.pcapng

The common thread in this problem is the printer, so we begin by placing our sniffer as close to the printer as we can. While we can’t install Wireshark on the printer itself, the switches used in this network are advanced layer 3 switches, so we can use port mirroring. We’ll mirror the port used by the printer to an empty port and connect a laptop with Wireshark installed to this port. Once this setup is complete, we’ll have a user send a large print job to the printer so we can monitor the output. The resulting capture file is inconsistent_printer.pcapng.

Analysis

A TCP handshake between the network workstation sending the print job (172.16.0.8) and the printer (172.16.0.253) initiates the connection at the start of the capture file. Following the handshake, a 1,460-byte TCP data packet is sent to the printer in packet 4 (Figure 10-25). The amount of data can be seen in the far right side of the Info column in the Packet List pane or at the bottom of the TCP header information in the Packet Details pane.


Figure 10-25: Data being transmitted to the printer over TCP

Following packet 4, another data packet is sent containing 1,460 bytes of data, as you can see in Figure 10-26. This data is acknowledged by the printer in packet 6.


Figure 10-26: Normal data transmission and TCP acknowledgments

The flow of data continues until the last few packets in the capture are reached. Packet 121 is a TCP retransmission packet and a sign of trouble, as shown in Figure 10-27.

A TCP retransmission packet is sent when one device sends a TCP packet to a remote device and the remote device doesn’t acknowledge the transmission. Once a retransmission threshold is reached, the sending device assumes that the remote device did not receive the data, and it retransmits the packet. This process is repeated a few times before communication effectively stops.
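Wireshark flags retransmissions for us, but the underlying check is simple enough to reproduce yourself. This sketch (Python/scapy, ignoring corner cases such as TCP keepalives and sequence number wraparound) reports any data-bearing segment whose sequence number has already been seen on the same flow.

    from scapy.all import rdpcap, IP, TCP

    seen = set()
    for i, pkt in enumerate(rdpcap("inconsistent_printer.pcapng"), start=1):
        if pkt.haslayer(IP) and pkt.haslayer(TCP) and len(pkt[TCP].payload) > 0:
            key = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport, pkt[TCP].seq)
            if key in seen:
                print(f"packet {i}: possible retransmission (seq {pkt[TCP].seq})")
            seen.add(key)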


Figure 10-27: These TCP retransmission packets are a sign of a potential problem.

In this scenario, the retransmission is sent from the client workstation to the printer because the printer failed to acknowledge the transmitted data. As shown in Figure 10-27, if you expand the SEQ/ACK analysis portion of the TCP header along with the additional information beneath it, you can view the details of why this is a retransmission. According to the details processed by Wireshark, packet 121 is a retransmission of packet 120. Additionally, the retransmission timeout (RTO) for the retransmitted packet was around 5.5 seconds.

When analyzing the delay between packets, you can change the time display format to suit your situation. In this case, because we want to see how long after the previous packet each retransmission occurred, select View ▶ Time Display Format ▶ Seconds Since Previous Captured Packet. Then, as shown in Figure 10-28, you can clearly see that the retransmission in packet 121 occurs 5.5 seconds after the original packet (packet 120) is sent.


Figure 10-28: Viewing the time between packets is useful for troubleshooting.

The next packet is another retransmission of packet 120. The RTO of this packet is 11.10 seconds, which includes the 5.5 seconds from the RTO of the previous packet. A look at the Time column of the Packet List pane tells us that this retransmission was sent 5.6 seconds after the previous retransmission. This appears to be the last packet in the capture file, and, not coincidentally, the printer stops printing at approximately this time.

In this scenario, we have the benefit of dealing with only two devices inside our own network, so we just need to determine whether the client workstation or the printer is to blame. We can see that data is flowing correctly for quite some time, and then at some point, the printer simply stops responding to the workstation. The workstation gives its best effort to get the data to its destination, as evidenced by the retransmissions, but the effort is met with no response. This issue is reproducible and happens regardless of which computer sends a print job, so we assume the printer is the source of the problem.

After further analysis, we find that the printer’s RAM is malfunctioning. When large print jobs are sent to the printer, it prints only a certain number of pages, likely until certain regions of memory are accessed. At that point, the memory issue causes the printer to be unable to accept any new data, and it ceases communication with the host transmitting the print job.

Lessons Learned

Although this printer problem wasn’t the result of a network issue, we were able to use Wireshark to pinpoint the problem. Unlike previous scenarios, this one centered solely on TCP traffic. Since TCP is concerned about reliably transmitting data, it often leaves us with useful information when two devices simply stop communicating.

In this case, when communication abruptly stopped, we were able to pinpoint the location of the problem based on nothing more than TCP’s built-in retransmission functionality. As we continue through our scenarios, we will often rely on functionality like this to troubleshoot more complex issues.

No Branch Office Connectivity

stranded_clientside.pcapng stranded_branchdns.pcapng

In this scenario, we have a company with a central headquarters office and a newly deployed remote branch office. The company’s IT infrastructure is mostly contained within the central office using a Windows server-based domain. This infrastructure is supported by a domain controller, a DNS server, and an application server that hosts web-based software used daily by the organization’s employees. The branch office is connected to the central office by routers over a wide area network (WAN) link. Inside the branch office are user workstations and a slave DNS server that should receive its resource record information from the upstream DNS server at the corporate headquarters. Figure 10-29 shows a map of each office and how the offices are linked together.


Figure 10-29: The relevant components for the stranded branch office issue

The deployment team is rolling out new infrastructure to the branch office when it finds that no one can access the intranet web application server from the branch office network. This server is located at the main office and is accessed through the WAN link. This connectivity issue affects all users at the branch office. All users can access the internet and other resources within the branch.

Tapping into the Wire

Because the problem lies in communication between the main and branch offices, there are a couple of places we could collect data to start tracking down the problem. The problem could be with the clients inside the branch office, so we’ll start by port mirroring one of those computers to check what it sees on the wire. Once we’ve collected that information, we can use it to point toward other collection locations that might help solve the problem. The initial capture file obtained from one of the clients is stranded_clientside.pcapng.

Analysis

As shown in Figure 10-30, our first capture file begins when the user at the workstation address 172.16.16.101 attempts to access an application hosted on the application server at headquarters, 172.16.16.200. This capture contains only two packets. It appears as though a DNS request is sent to 172.16.16.251 for the A record for appserver in the first packet. This is the DNS name for the server at 172.16.16.200 in the central office.

As you can see in Figure 10-31, the response to this packet is a server failure, which indicates that something is preventing the DNS query from resolving successfully. Notice that this packet does not answer the query since it is an error (server failure).


Figure 10-30: Communication begins with a DNS query for the appserver A record.


Figure 10-31: The query response indicates a problem upstream.

We now know that the communication problem is related to some DNS issue. Because the DNS queries at the branch office are resolved by the on-site DNS server at 172.16.16.251, that’s our next stop.

To capture the appropriate traffic from the branch DNS server, we’ll leave our sniffer in place and simply change the port-mirroring assignment so that the DNS server’s traffic, rather than the workstation’s traffic, is now mirrored to our sniffer. The result is the file stranded_branchdns.pcapng.

As shown in Figure 10-32, this capture begins with the query and response we saw earlier, along with one additional packet. This additional packet looks a bit odd because it is attempting to communicate with the primary DNS server at the central office (172.16.16.250) on the standard DNS server port 53, but it is a TCP packet rather than the UDP we’re used to seeing.


Figure 10-32: This SYN packet uses port 53 but is not UDP.

To figure out the purpose of this packet, recall our discussion of DNS in Chapter 9. DNS usually uses UDP, but it uses TCP when the response to a query exceeds a certain size. In that case, we’ll see some initial UDP traffic that triggers the TCP traffic. TCP is also used for DNS during a zone transfer, when resource records are transferred between DNS servers, which is likely the case here.

The DNS server at the branch office location is a slave to the DNS server at the central office, meaning that it relies on the central office server for its resource records. The application server that users in the branch office are trying to access is located inside the central office, which means that the central office DNS server is authoritative for that server. For the branch office server to resolve a DNS request for the application server, the DNS resource record for that server must be transferred from the central office DNS server to the branch office DNS server. This is likely the source of the SYN packet in this capture file.

The lack of response to this SYN packet tells us that the DNS problem is the result of a failed zone transfer between the branch and central office DNS servers. Now we can go one step further by figuring out why the zone transfer is failing. The possible culprits for the issue can be narrowed down to the routers between the offices or the central office DNS server itself. To determine which is at fault, we can sniff the traffic of the central office DNS server to see whether the SYN packet is making it to the server.
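One way to exercise the same code path directly, without waiting on another capture, is to attempt the transfer from the branch DNS server using the dnspython library. This is a sketch only; the zone name is a placeholder for whatever zone the slave server is actually configured to pull. If the attempt hangs or times out, the TCP path to port 53 is suspect.

    import dns.query
    import dns.zone

    # Attempt an AXFR (zone transfer) from the central office DNS server.
    # "corp.example.com" is a placeholder for the real zone name.
    try:
        zone = dns.zone.from_xfr(dns.query.xfr("172.16.16.250", "corp.example.com", timeout=10))
        print(f"Transfer succeeded: {len(zone.nodes)} names in zone")
    except Exception as exc:
        print(f"Zone transfer failed: {exc}")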

I haven’t included a capture file for the central office DNS server traffic because there was none. The SYN packet never reached the server. Upon dispatching technicians to review the configuration of the routers connecting the two offices, it was found that inbound port 53 traffic on the central office router was configured to allow only UDP traffic and to block inbound TCP traffic. This simple misconfiguration prevented zone transfers from occurring between servers, thus preventing clients within the branch office from resolving queries for devices in the central office.

Lessons Learned

You can learn a lot about investigating network communication issues by watching crime dramas. When a crime occurs, the detectives begin by interviewing those most affected. Leads that result from that examination are pursued, and the process continues until a culprit is found.

In this scenario, we began by examining the target (the workstation) and established leads by finding the DNS communication issue. Those leads took us to the branch DNS server, then to the central DNS server, and finally to the router, which was the source of the problem.

When performing analysis, try thinking of packets as clues. The clues don’t always tell you who committed the crime, but they often take you to the culprit eventually.

Software Data Corruption

tickedoffdeveloper.pcapng

Some of the most frequent arguments in IT are between developers and network administrators. Developers always blame poor network engineering and malfunctioning equipment for program errors. In turn, network administrators tend to blame bad code for network errors and slow communication.

In this scenario, a programmer has developed an application for tracking the sales at multiple stores and reporting back to a central database. To save bandwidth during normal business hours, the application does not update in real time. Data is accumulated throughout the day and then transmitted at night as a comma-separated value (CSV) file to be inserted into the central database.

This newly developed application isn’t functioning correctly. The files sent from the stores are being received by the server, but the data being inserted into the database is not correct. Sections are missing, data is in the wrong place, and some portions of the data are missing. Much to the dismay of the network administrator, the programmer blames the network for the issue. They are convinced that the files are becoming corrupted while in transit from the stores to the central data repository. Our goal is to determine whether they are right.

Tapping into the Wire

To collect the data we need, we can capture packets at one of the stores or at the central office. Because the issue is affecting all the stores, it should occur at the central office if it is network related—that is the only common thread among all stores (other than the software itself).

The network switches support port mirroring, so we’ll mirror the port the server is plugged into and sniff its traffic. The traffic capture will be isolated to a single instance of a store uploading its CSV file to the collection server. This result is the capture file tickedoffdeveloper.pcapng.

Analysis

We know nothing about the application the programmer has developed, other than the basic flow of information on the network. The capture file appears to start with some FTP traffic, so we’ll investigate to see whether it is indeed the mechanism that is transporting this file.

Looking at the packet list first (Figure 10-33), we can see that 172.16.16.128 initiates communication to 172.16.16.121 with a TCP handshake. Since 172.16.16.128 initiates the communication, we can assume that it is the client and that 172.16.16.121 is the server that compiles and processes the data. Following the handshake completion, we begin seeing FTP requests from the client and responses from the server.


Figure 10-33: The initial communication helps identify the client and server.

We know that some transfer of data should be happening here, so we can use our knowledge of FTP to locate the packet where this transfer begins. The FTP connection and data transfer are initiated by the client, so from 172.16.16.128 we should see the FTP STOR command, which is used to upload data to an FTP server. The easiest way to find this command is to build a filter.

Because this capture file is littered with FTP request commands, rather than sorting through the hundreds of protocols and options in the expression builder, we can build the filter we need directly from the Packet List pane. To do so, we first need to select a packet with an FTP request command present. We’ll choose packet 5, since it’s near the top of our list. Then expand the FTP section in the Packet Details pane and expand the USER section. Right-click the Request Command: USER field and select Prepare a Filter. Finally, choose Selected.

This will prepare a filter for all packets that contain the FTP USER request command and put it in the filter dialog. Next, as shown in Figure 10-34, edit the filter by replacing the word USER with the word STOR, so that it reads ftp.request.command == "STOR".


Figure 10-34: This filter helps identify where data transfer begins.

We could narrow down the filter further by providing the client’s IP address and specifying it as the source of the connection by adding && ip.src == 172.16.16.128 to the filter, but since this capture file is already isolated to a single client, that isn’t necessary here.

Now apply this filter by pressing ENTER, and you’ll see that only one instance of the STOR command exists in the capture file, at packet 64.

Now that we know where data transfer begins, click the packet to select it and clear the filter by clicking the X button above the Packet List pane. Your screen should now show all the packets with packet 64 selected.

Examining the capture file beginning with packet 64, we see that this packet specifies the transfer of the file store4829-03222010.csv, as shown in Figure 10-35.


Figure 10-35: The CSV file is being transferred using FTP.

The packets following the STOR command use a different port but are identified as part of an FTP-DATA transmission. We’ve verified that data is being transferred, but we have yet to establish whether the programmer is right or wrong. To do so, we need to show whether the contents of the file are intact after traversing the network, so we’ll proceed to extract the transferred file from the captured packets.

When a file is transferred across a network in an unencrypted format, it is broken down into segments and reassembled at its destination. In this scenario, we captured packets as they reached their destination but before they were reassembled. The data is all there; we simply need to reassemble it by extracting the file as a data stream. To perform the reassembly, select any of the packets in the FTP-DATA stream (such as packet 66) and click Follow TCP Stream. The results are displayed as shown in Figure 10-36. This looks like a normal CSV-formatted text file containing sales order data.

The data is readable because it is being transferred in plaintext over FTP, but we can’t be sure that the file is intact based on the stream alone. To extract the data in its original format, click the Save As button and specify the name of the file as displayed in packet 64. Then click Save.

The result of this save operation should be a CSV file that is an exact byte-level copy of the file originally transferred from the store system. The file can be verified by comparing the MD5 hash of the original file with that of the extracted file. The MD5 hashes should be the same, as shown in Figure 10-37.
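Both the extraction and the verification can also be scripted. The sketch below (Python with scapy) collects every data-bearing segment sent from the store system on anything other than the FTP control port, orders the segments by sequence number, writes them out, and compares MD5 hashes. It assumes the data connection is the only other TCP stream between these two hosts and ignores corner cases such as sequence number wraparound; the filename for the store’s original copy is a placeholder.

    import hashlib
    from scapy.all import rdpcap, IP, TCP

    segments = {}
    for pkt in rdpcap("tickedoffdeveloper.pcapng"):
        if (pkt.haslayer(IP) and pkt.haslayer(TCP)
                and pkt[IP].src == "172.16.16.128"
                and pkt[TCP].dport != 21):                   # skip the FTP control channel
            payload = bytes(pkt[TCP].payload)
            if payload:
                segments.setdefault(pkt[TCP].seq, payload)   # keep the first copy of each segment

    data = b"".join(segments[seq] for seq in sorted(segments))
    with open("extracted.csv", "wb") as out:
        out.write(data)

    extracted_md5 = hashlib.md5(data).hexdigest()
    with open("original_from_store.csv", "rb") as f:         # placeholder for the store's copy
        original_md5 = hashlib.md5(f.read()).hexdigest()

    print("extracted:", extracted_md5)
    print("original: ", original_md5)
    print("match" if extracted_md5 == original_md5 else "MISMATCH")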


Figure 10-36: The TCP stream shows what appears to be the data being transferred.


Figure 10-37: The MD5 hashes of the original file and the extracted file are equivalent.

Once the files are compared, we can state that the network is not to blame for the database corruption occurring within the application. The file transferred from the store to the collection server is intact when it reaches the server, so any corruption must be occurring when the file is processed by the application on the server side.

Lessons Learned

One great thing about packet-level analysis is that you don’t need to deal with the clutter of applications. Poorly coded applications greatly outnumber the good ones, but at the packet level, none of that matters. In this case, the programmer was concerned about all of the mysterious components their application was dependent upon, but at the end of the day, their complicated data transfer that took hundreds of lines of code is still no more than FTP, TCP, and IP. Using what we know about these basic protocols, we were able to ensure the communication process was flowing correctly and even extract files to prove the soundness of the network. It’s crucial to remember that no matter how complex the issue at hand, it still comes down to packets.

Final Thoughts

In this chapter, we’ve covered several scenarios in which packet analysis allowed us to gain a better understanding of problematic communication. Using basic analysis of common protocols, we were able to track down and solve network problems in a timely manner. While you may not encounter exactly the same scenarios on your network, the analysis techniques presented here should prove useful as you analyze your own unique problems.
