We’re about halfway through the chapter now, so I think it’s a good time for a quick “midterm” test.
The infamous web hacker Miss Black Cat is visiting Dave’s photo gallery site, looking around for some interesting vulnerabilities she can exploit. She starts at the page www.photos.cxx/welcome.php. When she views the HTML source of the page—as all good attackers always do—she sees the following code:
Question: Which page is Miss Black Cat most likely to visit next in her search for vulnerabilities?
a. photos.php
b. vote.php
c. suggestion.php
d. problem.php
Answer: None of the above! (Yes, I know this was an unfair trick question.) Miss Black Cat is a very savvy attacker, and she knows that her choices are never limited to just what the site developers meant for her. Instead of following one of these links to get to her next page, it’s likely that she would try typing one of the following addresses into her browser just to see if any of them really exist:
www.photos.cxx/private_photos.php
www.photos.cxx/personal_photos.php
This is very similar to the file extension guessing attack we talked about in the previous section; in fact, both of these attacks belong to a larger category of web application attacks referred to as forceful browsing.
Forceful browsing attacks aren’t always completely “blind,” as they were in the examples we just showed, where the attacker simply guessed at page names like private_photos.php. Sometimes an attacker has better grounds to suspect that an unreferenced file exists on the server, just waiting to be uncovered.
If an attacker sees that a page has a reference to a resource like www.photos.cxx/images/00042.jpg or www.photos.cxx/stats/05152011.xlsx, then he’s likely to try browsing for files with names close to those. The file “00042.jpg” looks as if it might be named based on an incrementing integer, so he might try “00041.jpg” or “00043.jpg.” Likewise, “05152011.xlsx” looks as if it might be named based on a date (May 15, 2011), so he might try forcefully browsing for other files with names like “05162011.xlsx.” Filenames like these are dead giveaways that there are other similar files in the same folder.
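To see how little effort this kind of guessing takes, here is a minimal Python sketch that produces the "neighbor" names for a sequentially numbered or date-stamped file. The function names are hypothetical, and the date layout (month, day, four-digit year) is an assumption based on the example filename above. You could use something like this to audit your own folders before an attacker does.

```python
import re
from datetime import datetime, timedelta


def sequential_neighbors(filename, span=1):
    """Return names whose zero-padded number is within `span` of the original."""
    match = re.fullmatch(r"(\d+)(\.\w+)", filename)
    if match is None:
        return []
    digits, ext = match.groups()
    number, width = int(digits), len(digits)
    return [
        f"{n:0{width}d}{ext}"
        for n in range(max(number - span, 0), number + span + 1)
        if n != number
    ]


def date_neighbors(filename, span=1, fmt="%m%d%Y"):
    """Return names for the days just before and after an MMDDYYYY-stamped file."""
    match = re.fullmatch(r"(\d{8})(\.\w+)", filename)
    if match is None:
        return []
    stamp, ext = match.groups()
    try:
        day = datetime.strptime(stamp, fmt)
    except ValueError:
        # Eight digits that do not form a valid date in the assumed layout.
        return []
    return [
        (day + timedelta(days=offset)).strftime(fmt) + ext
        for offset in range(-span, span + 1)
        if offset != 0
    ]
```

Calling sequential_neighbors("00042.jpg") yields 00041.jpg and 00043.jpg, and date_neighbors("05152011.xlsx") yields 05142011.xlsx and 05162011.xlsx. If any of those names exist in your folder and aren't meant to be public, you have a problem.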
This attack is essentially the same insecure direct object reference attack that we covered in Chapter 7, except in this case it’s an attack against the file system and not an attack against a database index. The solution to the problem is also essentially the same: Ensure that you’re applying proper access authorization on a resource-by-resource basis. If all of the files in a particular directory are meant to be publicly accessible even though they’re not necessarily linked into the rest of the application, then you’re fine as-is. If some of those files need to remain private, then move them into a separate directory and require appropriate authentication and authorization to get to them.
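As a rough illustration of a resource-by-resource check, the sketch below maps each directory to the role required to read it and denies anything it doesn't recognize. The ACCESS_RULES table, the "staff" role, and the is_allowed function are all hypothetical names made up for this example; a real application would hook the same idea into its own authentication framework.

```python
from pathlib import PurePosixPath

# Hypothetical access table: directory -> role required (None means public).
ACCESS_RULES = {
    "/images": None,
    "/stats": "staff",
}


def is_allowed(request_path, user_roles):
    """Decide whether a user holding `user_roles` may fetch `request_path`.

    Anything outside a listed directory is denied by default.
    """
    path = PurePosixPath(request_path)
    if ".." in path.parts:
        # Refuse attempts to climb out of a directory.
        return False
    for directory, required_role in ACCESS_RULES.items():
        root = PurePosixPath(directory)
        if path == root or root in path.parents:
            return required_role is None or required_role in user_roles
    return False
```

With this table, an anonymous visitor can fetch /images/00042.jpg but not /stats/05152011.xlsx, no matter how accurately the filename was guessed. The important design choice is the last line: a path that matches no rule is refused rather than served.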
If Cat’s guesses at filenames turn up nothing, then she may move on to try some guesses at common directory names:
Also, the insecure direct object reference attack can be a very effective way to find hidden directories. If there’s a “jan” folder and a “feb” folder, there are probably “mar” and “apr” folders too. And again, this is not necessarily a bad thing. In fact, some Model-View-Controller (MVC) application architectures intentionally work this way, so that a user can search for a particular range of data by manually changing a date string in a URL (for example, “201104” to “201105”). But whether to expose this data or not should be up to you, not to an attacker.
If any of the common directory names or date/integer object reference manipulations that Cat tries comes back with anything besides an HTTP 404 “Not Found” error, she’s in business. She’ll be happy even if she just gets redirected to a real page, for example, if her request for www.photos.cxx/test/ redirects her to the default page www.photos.cxx/test/default.php. This means that she’s found an area of the web site she wasn’t supposed to be in and whose only defense was probably the fact that nobody outside the organization was supposed to know that it existed. This page and this directory are unlikely to have strong authentication or authorization mechanisms, and probably won’t have proper input validation on the controls either. Why would developers bother hardening code that only they use? (Or so they thought...)
While this would be good for Cat, what she really wants is for the server to return a directory listing of the requested folder. You can see in Figure 8-6 what a directory listing would look like. It’s basically the same thing you’d see if you opened a folder in your OS on your local machine: It shows all the files and subdirectories present in that folder. If an attacker gets this, he won’t have to keep making random blind guesses; he’ll know exactly what’s there. Always configure your web server to disable directory listings.
Figure 8-6 A directory listing of a guessed folder
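Production web servers control directory listings through a configuration setting, but the idea is easy to show with Python's built-in http.server module, which generates listings by default. The sketch below overrides the handler's list_directory method so that a folder without an index page is refused instead of listed. NoListingHandler and make_server are hypothetical names, and this module is meant for demonstration, not for hosting a real site.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class NoListingHandler(SimpleHTTPRequestHandler):
    """Serve static files, but never generate a directory index."""

    def list_directory(self, path):
        # The stock handler calls this when a requested folder has no
        # index.html or index.htm. Refuse instead of listing the contents.
        self.send_error(404, "Not Found")
        return None


def make_server(directory, port=0):
    """Build (but do not start) a server rooted at `directory`.

    Port 0 asks the operating system for any free port.
    """
    handler = functools.partial(NoListingHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Individual files in the folder are still served when requested by name; only the automatic index is suppressed. Turning off listings does not replace the authorization checks discussed earlier.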
Status Code Deltas
We said just a minute ago that our attacker Cat was looking for any HTTP status code result from her probing attacks besides 404 “Not Found.” To be a little more accurate, it’s not so much that she’d be looking for a certain status code, but more that she’d be looking for a change or delta between status codes.
For example, let’s say she looked for a completely random folder that would be almost 100 percent guaranteed not to exist, something like www.photos.cxx/q2o77xz4/. If the server returns an HTTP 403 “Forbidden” response code, that doesn’t necessarily mean that the folder exists and that she’s stumbled upon a secret hidden directory. It could just mean that the server has been configured to always return 403 Forbidden for nonexistent directories.
On the other hand, if a request for www.photos.cxx/q2o77xz4/ turns up a 404 Not Found response but a request for www.photos.cxx/admin/ comes back with 403 Forbidden or 401 Unauthorized, then that’s a good sign that the /admin directory does actually exist. From Cat’s perspective, this is nowhere near as useful as getting an actual directory listing, but it may help with some other attacks such as a directory traversal.
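The comparison Cat is making can be written down in a few lines. This is only a sketch of the reasoning, with a hypothetical function name and labels of our own choosing: the baseline is the status the server returns for a path that almost certainly doesn't exist, and the probe is the status for the guessed path.

```python
def classify_probe(baseline_status, probe_status):
    """Interpret a probe response relative to a known-nonexistent baseline."""
    if probe_status == baseline_status:
        # Same answer as for a random path: the response tells her nothing.
        return "no signal"
    if probe_status in (401, 403):
        return "probably exists, access denied"
    if 200 <= probe_status < 400:
        return "exists or redirects"
    return "different, needs a closer look"
```

So classify_probe(403, 403) gives "no signal", while classify_probe(404, 403) gives "probably exists, access denied". You can run the same comparison against your own server to find out what it gives away.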
The final form of forceful browsing we’ll be talking about is less commonly seen than the others, but still very dangerous when you do see it. Sometimes developers write web applications with an implicit workflow in mind. They might assume that users will first visit their welcome page, then view some catalog item pages, maybe put some items in a shopping cart, and then check out and pay. Of course, without explicit checks in place, users can visit pages in any order they want. Here’s an example of what might go wrong for an application developer when an attacker decides to manipulate an implicit workflow.
So many people have loved the photos in Dave’s photo gallery application that he’s decided to make it into a side business and sell prints. On the page www.photos.cxx/view_photo.php, he adds a new button “Buy a Print” that redirects users to www.photos.cxx/buy_print.php. Once they’ve chosen the size of print that they want, along with any matting or framing options, they get redirected again to www.photos.cxx/billing.php. Here, they give their credit card information so Dave can bill them for their new artwork. Finally, they get redirected to www.photos.cxx/shipping.php where they enter their shipping address. Figure 8-7 shows how this workflow proceeds, or at least how it proceeds for legitimate, honest users.
Figure 8-7 The legitimate www.photos.cxx print purchase workflow
Unfortunately, while Dave is a very good photographer, his web application security skills are not quite up to the same level. In this case, Dave just assumed that users would follow his implicit workflow, moving from page A to page B to page C the way he intended them to. But he never added any code to ensure this. Miss Black Cat (being a photography lover herself) comes into Dave’s gallery, picks a photo she likes on view_photo.php, chooses her print options on buy_print.php, but then skips completely over the billing.php page to go straight to shipping.php. (Cat may be keen on photography, but she was never very big on actually paying for things.) Figure 8-8 shows how Cat bypasses the intended workflow by forcefully browsing her way through the application.
Figure 8-8 An attacker exploits a forceful browsing vulnerability in the www.photos.cxx print purchase workflow.
Again, you won’t often see forceful browsing vulnerabilities like this in traditional thin-client “Web 1.0” applications, but they are a little more common in RIAs. Ajax and Flex client-side modules sometimes make a series of asynchronous calls back to their server-side components. If the application is implicitly relying on these calls happening in a certain order (that is, the “choosePrint” call should be made before the “enterBillingInfo” call, which should be made before the “enterShippingInfo” call), then the exact same type of vulnerability can occur.
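The fix is to make the workflow explicit on the server. Here is a minimal sketch, assuming the server keeps one CheckoutSession object per user in its own session store (never in anything the client can edit). The class, its methods, and the step names are hypothetical; the steps simply mirror Dave's pages.

```python
# Hypothetical step names, in the order Dave intends them to happen.
WORKFLOW = ("buy_print", "billing", "shipping")


class CheckoutSession:
    """Server-side record of which checkout steps a user has finished."""

    def __init__(self):
        self.completed = set()

    def can_enter(self, step):
        """Allow a step only if every earlier step has been completed."""
        if step not in WORKFLOW:
            return False
        earlier = WORKFLOW[:WORKFLOW.index(step)]
        return all(done in self.completed for done in earlier)

    def complete(self, step):
        """Mark a step finished, rejecting any attempt to skip ahead."""
        if not self.can_enter(step):
            raise PermissionError(f"step {step!r} requested out of order")
        self.completed.add(step)
```

With a check like this at the top of shipping.php, Cat's request would be rejected because "billing" is not yet in her completed set. The same check applies unchanged to asynchronous calls from a rich client, since it depends only on what the server has recorded and not on which page the browser claims to have come from.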