Defining the Same-Origin Policy

The same-origin policy is essentially an agreement among browser manufacturers—mainly Microsoft, Apple, Google, Mozilla and Opera—on a standard way to limit the functionality of scripting code running in users’ web browsers. You might wonder why this is a good thing and why we would want any limits on scripting functionality. If so, don’t worry; we’ll go into this in detail in the next section. Until then, please trust us that without the same-origin policy, the World Wide Web would be more like a Wild West Web where anything would go, no data would be safe, and you’d never even think about using a credit card to buy something there.

In short, the same-origin policy states that when a user is viewing a web page in a browser, script running on that web page should only be able to read from or write to the content of another web page if both pages have the same “origin.” It goes on to define “origin” as a combination of the page’s application layer protocol (HTTP or HTTPS), its TCP port (usually 80 for HTTP or 443 for HTTPS), and its domain name (for example, “www.amazon.com” or “www.mhprofessional.com”).
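To make the definition concrete, here is a minimal sketch in Python of how an origin could be derived from a URL. The helper names origin and same_origin are our own, chosen for illustration; real browsers implement this check internally and handle more cases than this sketch does.

```python
from urllib.parse import urlsplit

# Default TCP ports for the two protocols discussed in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, domain, port) triple for a URL.

    Illustrative helper only; not a browser API.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # urlsplit reports the port as None when the URL does not name one,
    # so fall back to the protocol's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """Two pages share an origin only if all three components match."""
    return origin(url_a) == origin(url_b)
```

With these definitions, origin("http://www.amazon.com/") evaluates to ("http", "www.amazon.com", 80), and the path portion of the URL plays no part in the comparison.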

This concept is easier to explain with an example. Let’s say you’re getting bored with your desktop background photo and you want to look for a new one, so you open your browser to the page http://www.flicker.cxx/galleries/. If there were any scripting code running on this page, which other web pages could that script read from?

https://www.flicker.cxx/galleries/
No, the script could not read from this page: the protocols between the two pages are different (HTTP vs. HTTPS).

http://www.photos.cxx/galleries
www.photos.cxx and www.flicker.cxx are completely different domains. The script could not read from this page.

http://my.flicker.cxx/galleries/
This one is a little trickier, but it still won’t work: my.flicker.cxx is not the same domain as www.flicker.cxx.

http://flicker.cxx/galleries/
This is another tricky one, but removing part of the domain name does not change the fact that the domains are different. This still won’t work.

http://mirror1.www.flicker.cxx/galleries/
Adding to the domain name doesn’t change the fact either. The script could not read from this page.

http://www.flicker.cxx:8080/galleries/
The answer to this one is a little more complicated. If you’re using any browser besides Microsoft’s Internet Explorer (IE), then the script could not read from that page, because the two pages use different TCP ports. (Remember that for HTTP pages, if you don’t explicitly specify a port, the default value is 80. So http://www.flicker.cxx/ is actually the same page as http://www.flicker.cxx:80/.) However, IE acts a little differently from the other major browsers: IE does not use a page’s port as part of its origin; instead, it uses the page’s defined security zone, which is one of “Internet,” “Local intranet,” “Trusted sites,” or “Restricted sites.” The user can configure which sites fall into which zones, so if you’re using IE, then the script could potentially read from this page, depending on the user’s setup.

http://www.flicker.cxx/favorites
In this case, all three of the important URL attributes are the same: the protocol (HTTP), the port (80), and the domain (www.flicker.cxx). So the answer is “yes,” the script could read from this page. The fact that it’s a different directory doesn’t make any difference to the same-origin policy.

Table 5-1 shows the results of attempting a scripting request from http://www.flicker.cxx/galleries/ to specific URLs.

image

Table 5-1 The Same-Origin Policy as Applied to a Hypothetical Page http://www.flicker.cxx/galleries/
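The walkthrough above can also be expressed as a short script. The sketch below is written in Python purely for illustration; it models the protocol, domain, and port comparison used by browsers other than IE, and it does not attempt to model IE’s security zones. The names origin, explain, SOURCE, and CANDIDATES are our own and are not part of any browser API.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, domain, port) triple for a URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def explain(source, target):
    """Return 'allowed', or name the first origin component that differs."""
    names = ("protocol", "domain", "port")
    for name, a, b in zip(names, origin(source), origin(target)):
        if a != b:
            return f"blocked: different {name}"
    return "allowed"


# The page the script is running on, and the pages it tries to read.
SOURCE = "http://www.flicker.cxx/galleries/"
CANDIDATES = [
    "https://www.flicker.cxx/galleries/",
    "http://www.photos.cxx/galleries",
    "http://my.flicker.cxx/galleries/",
    "http://flicker.cxx/galleries/",
    "http://mirror1.www.flicker.cxx/galleries/",
    "http://www.flicker.cxx:8080/galleries/",
    "http://www.flicker.cxx/favorites",
]

results = {url: explain(SOURCE, url) for url in CANDIDATES}
for url, verdict in results.items():
    print(f"{url:45} {verdict}")
```

Under these assumptions, only http://www.flicker.cxx/favorites comes back as allowed. The HTTPS URL is blocked on protocol, the port 8080 URL is blocked on port, and the remaining four are blocked on domain.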

An Important Distinction: Client-Side vs. Server-Side

It’s important to note that the same-origin policy has absolutely no effect on what pages or sites any server-side code can access. The server at www.flicker.cxx is free to make requests to my.flicker.cxx, mirror1.www.flicker.cxx, google.com, or even any intranet sites that the company may have that aren’t directly accessible via the Internet. The same-origin policy only applies to browsers running client-side scripting code. You might be wondering what the difference is, and why browsers would go to the trouble of implementing a restriction like this when there are no restrictions at all for the server code. The answer is simple: cookies.

As we saw in the last chapter, any time you visit a page in a web browser, the browser automatically sends all the cookies it’s saved for that site along with your request for the page. So, for example, when I visit www.amazon.com, my browser sends a cookie back with my request (in this particular case, the cookie’s name is “x-main”) with a value that uniquely identifies me to Amazon (let’s say the value is “12345”). You can see a diagram of this in Figure 5-1.

image

Figure 5-1 Every request I make to www.amazon.com automatically includes the cookie “x-main.”

Since I’m the only person in the world with a www.amazon.com x-main cookie value of “12345,” the Amazon server knows that I’m the person visiting the site, and it personalizes the content accordingly. This is why when I visit Amazon, I get a completely different page than when you visit Amazon. It shows a banner with my name (“Hello Bryan. We have recommendations for you.”), it shows that I’m a member of their Amazon Prime free shipping club, and since I’ve recently ordered books on Objective-C programming, it shows me pictures and information for other programming books I might want to buy.

However, if I set up a web application (for example, “www.bryanssite.cxx”) and program its server-side code to call out to www.amazon.com, the response it gets back from Amazon won’t have any of my personal information, because the request it sends out won’t have my cookie value. Figure 5-2 shows this in action. Even if I’m the one making the request to www.bryanssite.cxx, my browser is only going to send the cookies for www.bryanssite.cxx. It won’t send the cookies for www.amazon.com or www.google.com or for any other site.

image

Figure 5-2 Requests made from server to server don’t include the user’s cookies for the target web site.
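The difference between the two figures can be modeled in a few lines of Python. This is a deliberately simplified sketch: the HttpClient class is hypothetical, it matches cookies on the exact host name only, and it ignores the domain, path, and expiry rules that real browsers apply. The “x-main” cookie comes from the example above; the “session” cookie for www.bryanssite.cxx is invented for illustration.

```python
from urllib.parse import urlsplit


class HttpClient:
    """Toy HTTP client that attaches the cookies it holds for the target site."""

    def __init__(self, cookie_jar=None):
        # Maps a host name to the cookies saved for that host.
        self.cookie_jar = cookie_jar or {}

    def request_headers(self, url):
        """Build the headers this client would send when requesting url."""
        host = urlsplit(url).hostname
        headers = {"Host": host}
        cookies = self.cookie_jar.get(host, {})
        if cookies:
            headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
        return headers


# My browser holds my cookies for every site I have visited.
browser = HttpClient({
    "www.amazon.com": {"x-main": "12345"},
    "www.bryanssite.cxx": {"session": "abcde"},
})

# The server behind www.bryanssite.cxx has no access to my browser's cookie jar.
server = HttpClient()

print(browser.request_headers("http://www.amazon.com/"))
print(browser.request_headers("http://www.bryanssite.cxx/"))
print(server.request_headers("http://www.amazon.com/"))
```

In this model, the browser’s request to www.amazon.com carries the x-main cookie, its request to www.bryanssite.cxx carries only that site’s own cookie, and the server’s request to www.amazon.com carries no Cookie header at all.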

So the main point of the same-origin policy is not to prevent web applications from reading resources from other sites, but rather to prevent web applications from reading personalized (or more specifically, credentialed), potentially sensitive and private resources from other sites. Now that we have a clearer understanding of what the same-origin policy is, let’s take a closer look at why we need it.

A World Without the Same-Origin Policy

The one thing that application developers and attackers may have in common is a shared loathing of the same-origin policy, and for exactly the same reason: it keeps them from getting to other sites’ data. Their reasons for wanting this data may vary: the attacker wants data so he can sell it, and the developer may just have an idea for creating a new mashup (a web site combining the functionality of two or more other sites; for example, TwitterVision is a mashup of Twitter and Google Maps that shows you where tweets are coming from in real time). But for better or worse, the same-origin policy gets in the way in both cases.

To demonstrate what we mean about both developers and attackers hating the same-origin policy, let’s imagine a world without it. In this world, an online florist, Amy’s Flowers, is building its new web site. Amy’s Flowers knows that a lot of people buy flowers for their mothers’ birthdays. So whenever you visit www.amysflowers.cxx, the client-side page script makes a request to get your personalized page at http://calendar.google.com. The script looks through the Google Calendar page contents to see if you have an upcoming event titled something like “Mom’s Birthday.”

If the script finds your mom’s birthday, its next step is to check your e-mail to see if maybe you forgot about it last year. It makes requests to gmail.com, mail.yahoo.com, and hotmail.com to see if it gets back personalized pages from any of those services. If so, the script looks through the mailbox data in those pages for messages from last year around your mom’s birthday containing words like “disappointed” or “hurt” or “neglected.”

The script’s final step is to check to see how much money you have in your bank account, so that Amy’s Flowers can offer you an appropriately priced bouquet for your budget. Again, the script makes requests to Bank of America, Chase, and Wells Fargo to look for personalized page responses. It sees that you’ve just paid your car payment, so you don’t have much money in your account right now, but it also sees that you’re due to get paid in the next week and you’ll have more money then. Now Amy’s Flowers has everything it needs to offer you a completely personalized, full-service shopping experience:

“Welcome to Amy’s Flowers, Bryan! Did you remember that your mother’s birthday is coming up on the 23rd? Since you forgot last year, you might want to splurge a little on the extra-large Bouquet of Magnificence. Don’t worry; we’ll wait to bill you until Friday when your paycheck clears. Surely a mother’s love is worth at least as much as that new Porsche Boxster?”

By this point, I’m sure you can see both the good side and the bad of the same-origin policy. A world without it would undoubtedly have some amazing web applications, but at an enormous cost to both privacy and security.
