Chapter 33

Restful


33.1 Constraints

  • Interactive: end-to-end between an active agent (e.g. a person) and a backend.
  • Separation between client and server. Communication between the two is synchronous in the form of request–response.
  • Stateless communication: every request from client to server must contain all the information necessary for the server to serve the request. The server should not store the context of an ongoing interaction; session state is kept on the client.
  • Uniform interface: clients and servers handle resources, which have unique identifiers. Resources are operated on with a restrictive interface consisting of creation, modification, retrieval and deletion. The result of a resource request is a hypermedia representation that also drives the application state.

33.2 A Program in this Style

  1 #!/usr/bin/env python
  2 import re, string, sys
  3
  4 with open("../stop_words.txt") as f:
  5     stops = set(f.read().split(",")+list(string.ascii_lowercase))
  6 # The "database"
  7 data = {}
  8
  9 # Internal functions of the "server"-side application
 10 def error_state():
 11     return "Something wrong", ["get", "default", None]
 12
 13 # The "server"-side application handlers
 14 def default_get_handler(args):
 15     rep = "What would you like to do?"
 16     rep += "\n1 - Quit" + "\n2 - Upload file"
 17     links = {"1" : ["post", "execution", None],
                 "2" : ["get", "file_form", None]}
 18     return rep, links
 19
 20 def quit_handler(args):
 21     sys.exit("Goodbye cruel world...")
 22
 23 def upload_get_handler(args):
 24     return "Name of file to upload?", ["post", "file"]
 25
 26 def upload_post_handler(args):
 27     def create_data(filename):
 28         if filename in data:
 29             return
 30         word_freqs = {}
 31         with open(filename) as f:
 32             for w in [x.lower() for x in re.split("[^a-zA-Z]+", f.read())
                          if len(x) > 0 and x.lower() not in stops]:
 33                 word_freqs[w] = word_freqs.get(w, 0) + 1
 34         word_freqsl = list(word_freqs.items())
 35         word_freqsl.sort(key=lambda x: x[1], reverse=True)
 36         data[filename] = word_freqsl
 37
 38     if args is None:
 39         return error_state()
 40     filename = args[0]
 41     try:
 42         create_data(filename)
 43     except Exception:
 44         return error_state()
 45     return word_get_handler([filename, 0])
 46
 47 def word_get_handler(args):
 48     def get_word(filename, word_index):
 49         if word_index < len(data[filename]):
 50             return data[filename][word_index]
 51         else:
 52             return ("no more words", 0)
 53
 54     filename = args[0]; word_index = args[1]
 55     word_info = get_word(filename, word_index)
 56     rep = '\n#{0}: {1} - {2}'.format(word_index+1, word_info[0],
                                         word_info[1])
 57     rep += "\n\nWhat would you like to do next?"
 58     rep += "\n1 - Quit" + "\n2 - Upload file"
 59     rep += "\n3 - See next most-frequently occurring word"
 60     links = {"1" : ["post", "execution", None],
 61              "2" : ["get", "file_form", None],
 62              "3" : ["get", "word", [filename, word_index+1]]}
 63     return rep, links
 64
 65 # Handler registration
 66 handlers = {"post_execution" : quit_handler,
 67             "get_default" : default_get_handler,
 68             "get_file_form" : upload_get_handler,
 69             "post_file" : upload_post_handler,
 70             "get_word" : word_get_handler}
 71
 72 # The "server" core
 73 def handle_request(verb, uri, args):
 74     def handler_key(verb, uri):
 75         return verb + "_" + uri
 76
 77     if handler_key(verb, uri) in handlers:
 78         return handlers[handler_key(verb, uri)](args)
 79     else:
 80         return handlers[handler_key("get", "default")](args)
 81
 82 # A very simple client "browser"
 83 def render_and_get_input(state_representation, links):
 84     print(state_representation)
 85     sys.stdout.flush()
 86     if type(links) is dict: # many possible next states
 87         input = sys.stdin.readline().strip()
 88         if input in links:
 89             return links[input]
 90         else:
 91             return ["get", "default", None]
 92     elif type(links) is list: # only one possible next state
 93         if links[0] == "post": # get "form" data
 94             input = sys.stdin.readline().strip()
 95             links.append([input]) # add the data at the end
 96             return links
 97         else: # get action, don't get user input
 98             return links
 99     else:
100         return ["get", "default", None]
101
102 request = ["get", "default", None]
103 while True:
104     # "server"-side computation
105     state_representation, links = handle_request(*request)
106     # "client"-side computation
107     request = render_and_get_input(state_representation, links)

33.3 Commentary

REST, REpresentational State Transfer, is an architectural style for network-based interactive applications that explains the Web. Its constraints form an interesting set of decisions whose main goals are extensibility, decentralization, interoperability, and independent component development, rather than performance.

When learning about REST, one is invariably led to the Web. Unfortunately, that approach has a few problems that hamper, rather than help, the learning process. First, it is too easy to blur the line between the architectural style (i.e. the model, a set of constraints) and the concrete Web. Second, the examples for REST that use HTTP and Web frameworks require some previous knowledge of the Web – and that's a catch-22.

REST is a style – a set of constraints for writing networked applications. This style is interesting in itself, independent of the fact that it captures the essence of the Web. This chapter focuses on the set of constraints stated by REST by using the same term-frequency example used throughout the book.

On purpose, this chapter doesn't cover the parts of the style that pertain to the network, but it covers the main constraints of REST.

Our example program interacts with the user by presenting them options and acting on the corresponding resources. Here is an excerpt of an interaction:

$ python tf-33.py
 What would you like to do?
 1 - Quit
 2 - Upload file
U> 2
 Name of file to upload?
U> ../pride-and-prejudice.txt
 #1: mr - 786
 What would you like to do next?
 1 - Quit
 2 - Upload file
 3 - See next most-frequently occurring word
U> 3
 #2: elizabeth - 635
 What would you like to do next?
 1 - Quit
 2 - Upload file
 3 - See next most-frequently occurring word

Lines starting with U> denote user input. The words and their frequencies are presented one by one, on demand, by decreasing order of frequency. It's not hard to imagine what this interaction would look like in HTML on a browser.

Let's look at the program, starting at the bottom. Lines #102–107 are the main instructions. The program starts by creating a request (line #102). Requests in our program are lists with three elements: a method name, a resource identifier and additional data from the client (the caller) to the server (the provider) on certain operations. The request created in line #102 invokes the method GET on the default resource, and provides no additional data, because GET operations retrieve, rather than post, data on the server. The program then goes into an infinite ping-pong between the provider-side code and the client-side code.

In line #105, the provider is asked to handle the request.[1] As a result, the provider sends back a pair of data that we might call hypermedia:

  • The first element of the pair is the application's state representation – i.e. some representation of the view, in MVC terms.
  • The second element of the pair is a collection of links. These links constitute the set of possible next application states: the only possible next states are those presented to the user via these links, and it's the user who decides which state the application will go to next.

On the real Web, this pair is one unit of data in the form of HTML or XML. In our simple example, we want to avoid complicated parsing functions, so we simply split the hypermedia into those separate parts. This is similar to having an alternative form of HTML that would render all the information of the page without any embedded links, and show all the possible links at the bottom of the page.
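The "links at the bottom of the page" picture can be made concrete with a small sketch. The function render_page below is hypothetical (it is not part of the chapter's program) and assumes the dictionary form of links used by the example; it merges a state representation and its links into one textual page.

```python
def render_page(representation, links):
    """Merge a state representation and its links into one text 'page'.

    Hypothetical helper: assumes links is a dict mapping a menu key to a
    [verb, uri, args] request, as in the chapter's example program.
    """
    lines = [representation, "", "Links:"]
    for key in sorted(links):
        verb, uri, _args = links[key]
        lines.append("  [{0}] {1} {2}".format(key, verb.upper(), uri))
    return "\n".join(lines)


links = {"1": ["post", "execution", None], "2": ["get", "file_form", None]}
page = render_page("What would you like to do?", links)
print(page)
```

Nothing in the rendered page lets the user reach a state that the server did not advertise, which is the point of the second element of the pair.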

In line #107, the client takes the hypermedia response from the server, renders it on the user's screen and returns back another request, which embodies an input action from the user.

Having looked at the main interaction loop, let's now look at the provider-side code. Lines #73–80 are the request handler function. That function checks to see if there is a handler registered for the specific request and if so, it calls it; otherwise it calls the get_default handler.

The handlers of the application are registered in a dictionary just above (lines #66–70). Their keys encode the operation (GET or POST) and the resource to be operated on. As per constraint of the style, REST applications operate on resources using a very restrictive API consisting of retrieval (GET), creation (POST), updates (PUT) and removal (DELETE); in our case, we use only GET and POST. Also in our case, our resources consist of:

  • default, when no resource is specified.
  • execution, the program itself, which can be stopped per user's request.
  • file forms, the data to be filled out for uploading a file.
  • files, the uploaded files whose words are counted.
  • words, the words of an uploaded file, with their frequencies.
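To see how the verb-plus-resource keys keep the interface small, here is a minimal sketch of what registering one more operation might look like. The delete_file key and file_delete_handler are assumptions made for illustration; the chapter's program uses only GET and POST, and its fallback calls the get_default handler rather than returning a fixed message.

```python
# Stand-in "database" with one already-uploaded file (illustrative data).
data = {"a.txt": [("mr", 786)]}


def file_delete_handler(args):
    # Hypothetical handler: remove an uploaded file from the "database".
    if args is None or args[0] not in data:
        return "Something wrong", ["get", "default", None]
    del data[args[0]]
    return "Deleted " + args[0], ["get", "default", None]


# Keys follow the chapter's convention: verb + "_" + resource.
handlers = {"delete_file": file_delete_handler}


def handle_request(verb, uri, args):
    key = verb + "_" + uri
    if key in handlers:
        return handlers[key](args)
    # Simplified fallback for this sketch only.
    return "Unknown request", ["get", "default", None]


rep, links = handle_request("delete", "file", ["a.txt"])
print(rep)
```

The server core does not change when a resource gains an operation; only the registration dictionary grows.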

Let's look through each one of the request handlers:

  • default_get_handler (lines #14–18) This handler simply constructs the default view and default links, and returns them (line #18). The default view consists of two menu options – quit and upload a file (lines #15–16). The default links are a dictionary mapping out the possible next states of the application, of which there are two (line #17): if the user chooses option "1" (quit), the next request will be a POST on the execution resource with no data; if she chooses "2" (upload a file), the next request will be a GET on the file_form resource with no data. This already sheds light on what hypermedia is, and how it encodes the possible next states: the server is sending an encoding of where the application can go next.
  • quit_handler (lines #20–21) This handler stops the program (line #21).
  • upload_get_handler (lines #23–24) This handler returns a "form", which is just a textual question, and only one next possible state, a POST on the file resource. Note that the link, in this case, has only two parts instead of three. At this point, the server doesn't know what the user's answer is; because this is a "form", it will be up to the client to add the user's data to the request.
  • upload_post_handler (lines #26–45) This handler first checks if there is, indeed, an argument given (line #38), returning an error if there isn't (line #39). When there is an argument, it is assumed to be a file name (line #40); the handler tries to create the data from the given file (lines #41–44). The function for creating the data (lines #27–36) is similar to all functions we have seen before for parsing the input file. At the end, the words are stored in the "database", which in this case is just a dictionary in memory mapping file names to words (line #7). The handler ends by calling the word_get_handler for the given file name and word number 0 (line #45). What this means is that after successfully loading the file specified by the user, the upload function comes back to the user with the "page" associated with getting words in general, requesting the top word of the file that has just been uploaded.
  • word_get_handler (lines #47–63) This handler gets a file name and a word index (line #54);[2] it retrieves the word at the given index of the given file from the database (line #55), and constructs the view showing that word and listing the menu options (lines #56–59), which in this case are: quit, upload a file, and get the next word. The links associated with the menu options (lines #60–62) are: POST to the execution for quit, GET the file form for another file upload, and GET the word. This last link is very important and it will be explained next.

In short, request handlers take any input data sent by the client, process the request, construct a view and a collection of links, and send these back to the client.

The link for the next word in line #62 is illustrative of one of the main constraints of REST. Consider the situation above, where the interaction presents a word, say, the 10th most frequently occurring word in the file, and is meant to be able to continue by showing the next word, word number 11. Who keeps the counter – the provider or the client? In REST, providers are meant to be unaware of the state of the interaction with the clients; the session state is to be passed to the clients. We do that by encoding the index of the next word in the link itself: ["get", "word", [filename, word_index+1]]. Once we do that, the provider doesn't need to keep any state regarding past interactions with the client; next time, the client will simply come back with the right index.
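On the Web, the same idea shows up as state carried in the URL's query string. The sketch below is illustrative only: the helper names word_link and parse_word_link are assumptions, and the host and parameter names simply follow the pattern http://tf.com/word?file=...&index=... rather than any real service. It uses Python 3's standard urllib.parse module.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def word_link(filename, index):
    # Build a link that carries the whole session state (file and position).
    query = urlencode({"file": filename, "index": index})
    return "http://tf.com/word?" + query


def parse_word_link(url):
    # Recover the state from the link; the server remembers nothing itself.
    query = parse_qs(urlsplit(url).query)
    return query["file"][0], int(query["index"][0])


url = word_link("../pride-and-prejudice.txt", 10)
print(url)
print(parse_word_link(url))
```

Because everything needed to serve the request travels in the link, the same URL can be bookmarked, retried, or served by a different server process and still lead to the same word.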

The last part of the program is client-side code, a simple textual "browser" defined in lines #83–100. The first thing this browser does is to render the view on the screen (lines #84–85). Next, it interprets the links data structure. We have seen two kinds of link structures: a dictionary, when there are many possible next states, and a simple list, when there is only one next state (we saw this in line #24).

  • When there are many possible states, the browser requests input from the user (line #87), checks whether it corresponds to one of the links (line #88) and, if so, it returns the request embodied in that link (line #89); if the link doesn't exist, it returns the default request (line #91).
  • When there is only one next state, it checks whether the link is a POST (line #93), meaning that the data is a form. In this case, too, it requests input from the user (i.e. the user fills out the form, line #94), then it appends the form data to the next request (line #95) and returns the request in that link. This is the equivalent of an HTTP POST, which appends user-entered data (aka message body) to the request after the request header.
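The two cases above can be sketched as a function that takes the input source as a parameter, which makes the browser logic easy to exercise without a terminal. The name next_request and the read_input parameter are assumptions for this sketch; unlike line #95 of the listing, it builds a new request list instead of appending to the link it received.

```python
DEFAULT_REQUEST = ["get", "default", None]


def next_request(links, read_input):
    """Turn a links structure plus user input into the next request."""
    if isinstance(links, dict):
        # Many possible next states: the user picks one by its key.
        return links.get(read_input(), DEFAULT_REQUEST)
    if isinstance(links, list):
        if links[0] == "post":
            # A "form": attach the user's data as the request's third part.
            return links + [[read_input()]]
        # A plain GET link needs no user input.
        return links
    return DEFAULT_REQUEST


menu = {"1": ["post", "execution", None], "2": ["get", "file_form", None]}
print(next_request(menu, lambda: "2"))
print(next_request(["post", "file"], lambda: "a.txt"))
```

Copying the link rather than mutating it is a design choice of this sketch; it matters only if the server were to reuse the same list object across responses.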

33.4 This Style in Systems Design

The constraints explained above embody a significant portion of the spirit of Web applications. Not all Web applications follow them, but most do.

One point of contention between the model and reality is how to handle application state. REST calls for transferring the state to the client at every request, and using URLs that describe actions on the server without any hidden information. In many cases, that turns out to be impractical – too much information would need to be sent back and forth. At the opposite end of the spectrum, many applications hide a session identifier in cookies, which travel in a header that is not part of the resource identifier. The cookies identify the user. The server can then store the state of the user's interaction (in a database, for example), and retrieve/update that state on the server side at every request from the user. This makes the server-side application more complex and potentially less responsive, because it needs to store and manage the state of interaction with every user.
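For contrast with the chapter's program, here is a minimal sketch of the session-based alternative just described. All names (sessions, start_session, next_word_stateful) and the sample data are assumptions for illustration; the session identifier plays the role a cookie would play on the Web.

```python
import uuid

# Server-side store of interaction state, keyed by session identifier.
sessions = {}

# Stand-in word-frequency "database" (illustrative data).
data = {"f.txt": [("mr", 786), ("elizabeth", 635)]}


def start_session(filename):
    # The server hands out an opaque identifier and remembers the rest.
    session_id = uuid.uuid4().hex
    sessions[session_id] = {"file": filename, "index": 0}
    return session_id


def next_word_stateful(session_id):
    # The request carries only the identifier; the position lives on the server.
    state = sessions[session_id]
    word = data[state["file"]][state["index"]]
    state["index"] += 1
    return word


sid = start_session("f.txt")
first = next_word_stateful(sid)
second = next_word_stateful(sid)
print(first, second)
```

Two identical requests now produce different answers, and the sessions dictionary grows with every user, which is the extra bookkeeping the paragraph above refers to.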

The synchronous, connectionless, request/response constraint is another interesting constraint that goes against the practice in many distributed systems. In REST, servers do not contact clients; they simply respond to clients' requests. This constraint shapes and limits the kinds of applications that are a good fit for this style. For example, real-time multi-user applications are not a good fit for REST, because they require the server to push data to clients. People have been using the Web to build these kinds of applications, using periodic client polls and long polls. Although it can be done, these applications are clearly not a good fit for the style.
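The polling workaround can be sketched in a few lines. MessageBoard and poll_once are hypothetical names for this illustration; the "server" object only ever answers calls, and the client keeps its own cursor, so the exchange stays within the request/response constraint.

```python
class MessageBoard:
    """Stand-in for a server that only responds to requests."""

    def __init__(self):
        self.messages = []

    def post(self, text):
        self.messages.append(text)

    def get_since(self, cursor):
        # Return the messages after position cursor, plus the new cursor.
        return self.messages[cursor:], len(self.messages)


def poll_once(board, cursor):
    # One client poll: ask for anything new since the last known position.
    return board.get_since(cursor)


board = MessageBoard()
board.post("hello")
new_messages, cursor = poll_once(board, 0)
print(new_messages, cursor)
```

A real client would repeat poll_once on a timer, and most polls would come back empty; that wasted traffic is one reason such applications fit the style poorly.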

The simplicity of the interface between clients and servers – resource identifiers and the handful of operations on them – has been one of the major strengths of the Web. This simplicity has enabled independent development of components, extensibility and interoperability that wouldn't be possible in a system with more complex interfaces.

33.5 Historical Notes

The Web started as an open information management system for physicists to share documents. One of the main goals was for it to be extensible and decentralized. The first Web servers came online in 1991, and a few browsers were developed early on. Since then, the Web has seen an exponential growth, to become the most important software infrastructure on the Internet. The evolution of the Web has been an organic process driven by individuals and corporations, and many hits and misses. With so many commercial interests at stake, keeping the platform open and true to its original goals has sometimes been a challenge. Over the years, several corporations have tried to add features and optimizations that, although good in some respects, would violate the principles of the Web.

By the late 1990s, it was clear that, even though the Web had evolved organically and without a master plan, it had very particular architectural characteristics that made it quite different from any other large-scale networked system that had been attempted before. Many felt it was important to formalize what those characteristics were. In 2000, Roy Fielding's doctoral dissertation described the architectural style that underlies the Web, which he called REST – REpresentational State Transfer. REST is a set of constraints for writing applications that explains, to some extent, the Web. The Web diverges from REST in some points, but, for the most part, the model is quite accurate with respect to reality.

33.6 Further Reading

Fielding, R. (2000). Architectural Styles and the Design of Network-based Software Architectures. Doctoral dissertation, University of California, Irvine. Available at http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
Synopsis: Fielding's PhD dissertation explains the REST style of network applications, alternatives to it, and the constraints that it imposes.

33.7 Glossary

Resource: A thing that can be identified.

Uniform resource identifier: (URI) A unique, universally accepted, resource identifier.

Uniform resource locator: (URL) A URI that encodes the location of the resource as part of its identifier.

33.8 Exercises

33.1 Another language. Implement the example program in another language, but preserve the style.

33.2 Upwards. The example program traverses the list of words always in the same direction: in decreasing order of frequency. Add an option in the user interaction that allows the user to see the previous word.

33.3 A different task. Write one of the tasks proposed in the Prologue using this style.

[1] In Python, *a unpacks a list a into positional arguments.

[2] On the Web, this would look like http://tf.com/word?file=...&index=...
