11
Designing an API in context

This chapter covers

  • Adapting communications to goals and data
  • Considering needs and limitations of consumers and provider
  • Choosing an API style based on context

In the previous chapter, we started to discover that we had been designing APIs while ignoring most of the context in which they exist. We explored the network context and how it can impact the design of APIs. But there are other contextual elements to consider in order to design APIs that will actually fulfill all your consumers' needs and also be implementable. As we’ve seen, designing APIs requires us to focus on the consumers first, but it also requires us to keep an eye on the provider’s side.

Do you know how the QWERTY keyboard layout was invented at the end of the 19th century? The most common story is that it was created to solve a mechanical problem. On a typewriter, letters are mounted on metal arms that can clash and jam if two neighboring keys are pressed at the same time or in rapid succession. To avoid this mechanical problem and allow users to type faster, commonly used letter pairs were placed far away from each other. This story, if true, means that the QWERTY design was influenced by internal concerns. But according to Koichi Yasuoka and Motoko Yasuoka from Kyoto University:1 

1 Koichi Yasuoka and Motoko Yasuoka, “On the Prehistory of QWERTY,” Kyoto University, March 2011 (https://doi.org/10.14989/139379).

“The early keyboard of Type-Writer was derived from Hughes-Phelps Printing Telegraph, and it was developed for Morse receivers. The keyboard arrangement very often changed during the development, and accidentally grew into QWERTY among the different requirements. QWERTY was adopted by Teletype in the 1910’s, and Teletype was widely used as a computer terminal later.”

Koichi Yasuoka and Motoko Yasuoka

According to this research, the design was in fact influenced by the context in which typewriters were used. Regardless of its origin, the funny thing is that this relic of the past is still widely used today. I have inspected my smartphone and did not find any metal arms behind its touchscreen; but in most countries, digital keyboards for the Latin (or Roman) alphabet still use the QWERTY layout or a local variant, like AZERTY in France, for example. Even if it does not make sense anymore, people are used to it, and the few who dared to try to change this habit were not really successful. So how objects are built and how they work, as well as how they are used and what their users are used to, can influence their design. The same goes for APIs, as shown in figure 11.1.

11-01.png

Figure 11.1 Provider and consumer contexts influence API design.

While most developers might be used to consuming JSON-based APIs, taking full advantage of the HTTP protocol, there are dark corners of the software industry where XML still rules, and POST is the only possible HTTP method. The banking industry is used to ISO 20022 standard messages, which could be considered complex and not user-friendly, but trying to provide APIs supporting other simpler formats to banking companies can cause more problems than the ones these formats are supposed to solve.

Context impacting design is not limited to consumer contexts, either. The provider’s context can also influence design, even if API designers do everything they can to hide the provider’s perspective (see section 2.4). Representing a goal involving human controls (such as some cases of international money transfers) with a synchronous request/response mechanism might not be the best option. That is why, when we design APIs, we must choose the best way to communicate, taking into account both consumers’ and providers’ potential limitations, and even consider other styles of APIs beyond REST. If we do not do that, the APIs we design might not be fully usable or implementable.

11.1 Adapting communication to the goals and nature of the data

So far, we have been talking about synchronous web APIs that allow consumers to send requests to providers and get responses immediately. But depending on the nature of an API’s goals and data, a unitary and synchronous request/response-based mechanism might not be the most efficient representation. You might have to deal with long processing times, send events to consumers, or process multiple elements in one shot. As an API designer, you must have tools other than synchronous request/response in your toolbox to deal with such cases.

11.1.1 Managing long processes

A synchronous request/response mechanism is not always the best option to represent a goal. Sometimes you might have to provide asynchronous goals. For example, the Banking API provides a transfer money goal that allows both national and international money transfers. But according to banking regulations, depending on which country and bank the target account is located in and the transfer amount, some documents might have to be provided in order to explain the nature of the transaction. Therefore, the consumer (the Awesome Banking App, for example) must provide the source and destination for each transfer (see section 5.3). To determine the valid sources and destinations, it uses the aggregated list sources and destinations goal.

The data returned by this goal not only describes all possible source and destination combinations and their minimum and maximum amounts, but also indicates in which cases documentation justifying the transaction must be provided. If documentation is required, the consumer can use the upload transfer document goal to send that and get a reference. Figure 11.2 shows what happens afterwards: validation by a human.

11-02.png

Figure 11.2 A money transfer requiring human validation

Once the document is uploaded, the consumer can use the transfer money goal, indicating the source account, the destination account, the amount, and the document reference. Unfortunately, the money transfer cannot be triggered immediately because the provided document has to be validated by a human being. So in this case, the money transfer response status is 202 Accepted (instead of 201 Created, which would be the response if no validation was required). This means the money transfer request has been accepted but will be processed later.

The returned data indicates the current status of the money transfer (PENDING), the transfer’s ID (T123), and the "self" URL in _links. The consumer can later use the read transfer goal using the provided ID or self URL to check the transfer’s current status. This status can be either PENDING (if no action has yet been performed), VALIDATED (if the document has been validated by a human being, but the transfer has not yet been performed), or EXECUTED (if the money transfer has been completed). Note that when accessing the transfer’s status using a GET /transfers/T123 HTTP request, cache directives (see section 10.2.2) can provide some hints about when it is wise to retry this call to get updated information.

As you can see, depending on the nature of the goal, representing what actually happens from a functional perspective with a synchronous request/response mechanism might not be possible. Here, it would mean consumers waiting for several minutes (or even hours, if not days) to get a response, which obviously is unthinkable. In such cases, the API has to provide a goal to receive the request, which can take quite a long time to process, and then a way to get the status of this request’s processing later. Providing information about when to make another request by taking advantage of protocol features or by simply returning data benefits both consumer and provider by avoiding unnecessary calls.
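This consumer-side behavior can be sketched in a few lines of Python. The response shape (status, id, _links) and the max-age hint are assumptions based on the example above, not a definitive client:

```python
import re

def parse_retry_delay(cache_control, default=60):
    """Extract a polling delay in seconds from a Cache-Control header value.

    Falls back to a default when no max-age directive is present.
    """
    match = re.search(r"max-age=(\d+)", cache_control or "")
    return int(match.group(1)) if match else default

def is_final(transfer):
    """A transfer needs no further polling once it has been executed."""
    return transfer.get("status") == "EXECUTED"

# Hypothetical 202 Accepted body returned by the transfer money goal
pending = {"id": "T123", "status": "PENDING",
           "_links": {"self": "/transfers/T123"}}

delay = parse_retry_delay("max-age=120")  # wait before the next read transfer call
```

A consumer would sleep for `delay` seconds between read transfer calls until `is_final` returns true, instead of hammering the API.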

11.1.2 Notifying consumers of events

Consumer-to-provider communication is not always the most efficient way of communicating. Indeed, sometimes it can be useful to let the provider take the initiative.

In the previous section, we saw that consumers might have to make repeated API calls to ask, “Is this money transfer done?” Such behavior is called polling, and it can be quite annoying for both consumers and providers: many unnecessary calls can be made. It would be great if the Banking API could instead tell its consumers when a money transfer is actually done.

Reversing the consumer/provider communication can be done using a webhook, which is often described as a “reverse API.” Figure 11.3 shows how such a mechanism could be used with the Awesome Banking App to notify the consumer of an executed money transfer.

11-03.png

Figure 11.3 Using a webhook to notify the consumer of the execution of a money transfer

As before, the Awesome Banking App calls the Banking API to request a money transfer that requires (human) validation (1). The Banking API again responds with a 202 Accepted status to indicate that the request has been accepted and will be processed later. Now the mobile application does not have to poll (regularly make calls to) the Banking API to get the transfer’s status. Instead, once the money transfer has actually been executed, the Banking API (or more probably another module managed inside the Banking Company’s systems) sends a POST request to the Awesome Banking App’s webhook URL, https://awesome-banking.com/events (2). The request’s body contains some data about the event that occurred, like the ID of the user who initiated the money transfer, the transfer ID, the event’s status, and the transfer’s "self" link, for example.

When the Awesome Banking App backend implementing the webhook receives this event, it can look for the mobile phone identifier corresponding to the user and send a notification using the iOS or Android notification system to the mobile application to signify that money transfer T123 has been executed (3). Finally, the mobile application can use the read transfer goal to get further information that was not included in the event or notification (4).
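A minimal sketch of the webhook consumer’s logic, assuming a hypothetical lightweight event format (userId, transferId, status, and a _links.self reference):

```python
def handle_event(event, device_registry):
    """Turn a lightweight webhook event into a push notification payload.

    Looks up the end user's device; the app later fetches full details
    through the event's self link rather than relying on the event body.
    """
    device = device_registry.get(event["userId"])
    if device is None:
        return None  # unknown user, nothing to notify
    return {
        "device": device,
        "message": f"Money transfer {event['transferId']} {event['status'].lower()}",
        "fetch": event["_links"]["self"],  # read transfer goal URL
    }

devices = {"U42": "ios-device-token-123"}  # user ID -> phone identifier
event = {"userId": "U42", "transferId": "T123", "status": "EXECUTED",
         "_links": {"self": "/transfers/T123"}}
notification = handle_event(event, devices)
```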

Such a mechanism is not restricted to an asynchronous communication initiated by the consumer. It can also be used to notify consumers of events that are generated without any consumer interaction. For example, events could be sent when new transactions occur on a bank account. The Awesome Banking App could take advantage of this for its dashboard, owner, and account screens (see section 10.2). It could also rely on cached data as long as no such event is sent.

More specific and custom events could be sent too. For example, the Banking API could provide an alerting system that sends events based on transactions or balance data. Using such a feature, the Awesome Banking App could allow its users to configure alerts like, “Let me know when my account balance is below $200” or “Let me know when a card payment above $120 is made.” The Banking API would send these alert events through the webhook only for users who have configured those.

This looks great, but how does the Banking Company, the provider of the Banking API, know the Awesome Banking App’s webhook URL and its interface? In section 8.1, you saw how consumers have to register to be able to consume an API and how they are identified when they send a request. When registering the Awesome Banking App on the Banking API developer portal, its developer team indicated its webhook URL, which can be used to notify the consumer of events.

This webhook is an API that is implemented by the Awesome Banking App team, but its interface contract and behavior are defined by the Banking API team in order to ensure that all consumers expose the same webhook API. It would obviously be a nightmare for the Banking Company to let each consumer design its own webhook interface contract, as it would then have to code the webhook calls specifically for each of its consumers.

Like any API, a webhook API must be designed to hide the provider’s perspective and to be usable and evolvable. Depending on your needs, a single webhook might receive all possible events, or there might be multiple webhooks: one for each event type. Each event might provide a little data or a lot. It will be up to you to decide what’s appropriate.

Having a single webhook that receives lightweight, generic events is usually a good strategy. Such a webhook API is quite simple to implement and to consume, and adding new events is easy. You should always decide on what design to use according to your context.

There is another important characteristic that must not be overlooked when dealing with webhook APIs—security. A webhook can be exposed on the internet, and some malicious people might try to send false events in order to hack the provider’s systems. That’s one of the reasons why using lightweight, generic events is a good option; consumers have to call the provider to get detailed information.

It’s crucial that the access to the webhook API be secured in order to ensure that only the API provider can actually use it. Securing a webhook can be done using various techniques, such as provider IP address whitelisting (bear in mind that such whitelists might be hard to maintain), sending a secret token when posting to the webhook, encrypting and signing the request, using mutual TLS, and so on.
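The secret-token and signing techniques can be sketched with an HMAC over the raw request body; the secret exchange (presumably at consumer registration) and the way the signature travels with the request are assumptions:

```python
import hashlib
import hmac

SECRET = b"secret-shared-at-consumer-registration"  # assumed exchanged out of band

def sign(body: bytes) -> str:
    """Provider side: compute a signature sent along with the webhook call."""
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()

def verify(body: bytes, signature: str) -> bool:
    """Webhook side: recompute and compare in constant time."""
    return hmac.compare_digest(sign(body), signature)

body = b'{"transferId": "T123", "status": "EXECUTED"}'
signature = sign(body)
```

The constant-time comparison (`compare_digest`) matters: a naive `==` could leak timing information to an attacker probing the webhook.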

As you saw in chapter 8, API designers don’t have much to say about the technical side of API security, but they heavily contribute from a functional perspective. You have to ensure that events do not contain sensitive data and that the data provided allows consumers to react securely. For example, if an event concerns a specific user, an API’s consumers must be able to identify that user through the event’s data. Otherwise, a user can get undue access to other users' data. This again promotes the use of lightweight events to limit the damage that can be done if this should occur.

Webhooks basically are APIs implemented by API consumers but defined and used by API providers to send notifications of events. These events can be triggered by consumer or provider actions. This is not the only way of implementing notifications, but with this model, providers can notify consumers about events when they happen and don’t have to wait for the consumers to make API calls themselves.

11.1.3 Streaming event flows

When an API provides data that always changes to consumers using a basic request/response goal, you can be sure that they will poll it continuously, making repeated API calls in order to get new or updated data. Suppose the Banking API provided data about stocks for trading account portfolios. There are different options for doing so, as shown in figure 11.4.

11-04.png

Figure 11.4 How the Banking API could provide stock information

The Banking API could offer a read stock goal that provides detailed information about a specific stock and its price (1). Consumers wanting to always have the latest stock price might call this goal in a loop (once they get the new data, they trigger another call, endlessly).

In section 10.2.2, you discovered caching and conditional requests (2). Could these be of any help? Unfortunately, those would be useless, at least when the stock exchanges are open, because the stock prices can change every second (if not more frequently). The cache’s time-to-live would be so short that caching would be ineffective and conditional requests would always return updated data. This data is so volatile that even using polling, consumers might not be aware of all price variations as the price could vary between calls.

In section 10.3.7, you saw that sometimes we have to check if a goal really fulfills consumers' needs. Maybe this goal is not the right one. What about adapting the API’s design and providing a list stock prices goal returning the n latest price variations and offering cursor-based pagination (3)? Consumers could indicate the last price ID they received and, in that way, be sure to not miss any price variations. That could be an interesting option if consumers are willing to get not-quite-real-time data, but consumers would still poll this goal endlessly.
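Such cursor-based retrieval could be sketched as follows, assuming a hypothetical price list ordered oldest to newest with unique IDs:

```python
def prices_after(prices, last_id=None):
    """Return the price variations that occurred after a given cursor.

    Without a cursor, or with an expired/unknown one, the whole
    available window is returned so the consumer misses nothing.
    """
    if last_id is None:
        return prices
    ids = [p["id"] for p in prices]
    if last_id not in ids:
        return prices
    return prices[ids.index(last_id) + 1:]

feed = [{"id": "P1", "price": 101.2},
        {"id": "P2", "price": 101.5},
        {"id": "P3", "price": 101.4}]
```

A consumer polling with the last ID it received is guaranteed to see every variation, at the cost of a call per polling interval.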

A change in stock price looks like an event that consumers could be notified of. So what about a webhook (4)? Because the provider knows when a stock price changes, it can post that event to consumers' webhooks as soon as it occurs. But that means always sending price variations of all stocks to all consumers.

Using a WebSub system or a custom WebSub-like one, as we discussed briefly in the previous section, consumers could subscribe to a few stock price variation feeds instead of receiving notifications about all of them. They would still get this data all the time, though.

But what about consumers that want to show real-time data for only a small period of time, while their end users are looking at their portfolios or at a specific stock, for example? They would have to find a way to forward these event flows. How could an API server (like the Banking API’s) send a stream of events requested by a consumer (5)? Figure 11.5 contrasts a basic request/response API call and a Server-Sent Events (SSE) stream that can be used in such cases.

11-05.png

Figure 11.5 Streaming events to consumers with HTTP SSE

At the top of the figure, the consumer requests the latest prices of the APL stock as an application/json document using a GET /stocks/APL/prices request with the appropriate Accept header. By default, the server returns the list of prices for the last five minutes. The document has an items attribute containing the list of prices. To get more recent data, the consumer will have to make another request using cursor-based pagination.

The bottom of the figure shows how all this could be handled with an SSE stream. The request is almost the same, but the consumer now indicates that it wants the data as a text/event-stream document. The server responds with a 200 OK success status and a document whose content type is text/event-stream, as requested. Each price event is represented by a line starting with data:, and each contains the same data as an element of the items list seen previously.

The huge difference in this approach is that now the price events are provided as a stream; the returned document is not a static and finished one anymore. The server adds a data: line for each new price event occurring for the APL stock. It will go on doing this until there are no more events or until the consumer closes the connection. Using SSE, a server can send event data to consumers.

Regarding the design of the event data, you’ll recall that in the webhook use case, I recommended that you put the least possible amount of data in events and that consumers get additional data with another regular call to the API. But in such a streaming use case, regardless of the technology used (SSE or something else), it is usually better to provide as much data as possible because consumers will want all the data without having to make another independent call to the API. As always, though, this is not mandatory; it can depend on the context. Everything else you have learned in this book applies too: events and their data must make sense for consumers and must be easy to understand and use, to evolve, and to secure.

Note that using content negotiation and providing both application/json and text/event-stream media types is not required; the Banking API could only provide the streaming version. It is also not mandatory to use the same path to provide these two different representations. The Banking API could use different paths such as /stocks/{stockId}/prices and /stocks/{stockId}/price-events. The API could also provide a way of getting price events for multiple stocks with a request like GET /stock-prices?stockIds=APL,APA,CTA. In the response to this request, each event sent via a data: line will concern one of the APL, APA, or CTA stocks; the consumer will be able to tell which one by checking the stockId property value.

Although the SSE specification provides more features than just the data: lines, it’s quite simple. The following listing shows the various possibilities.

Listing 11.1 The complete SSE specification2 

2  “W3C Working Draft,” April 2009, Eds. Ian Hickson, Google, Inc. (https://www.w3.org/TR/2009/WD-eventsource-20090421/#event-stream-interpretation).

: this is a comment  
 
data: this is text data  
  
data: {"json": "data"}  
 
data: this is multi-  
data: line data
 
id: optional event ID  
event: optional event type
data: event data
 
retry: 10000  

①   The stream can be commented with a line starting with a colon (:).

②   Each line starting with data: is an event; an event data line basically contains text.

③   A blank line separates each event.

④   Because data is text, you can use JSON or XML.

⑤   Multiple data: lines can be used for multiline data.

⑥   Each event can be completed with an optional ID and event type.

⑦   The retry interval tells the consumer not to reconnect before 10000 ms (10 s) have passed if the connection is lost.
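These rules are simple enough that a rough parser fits in a few lines. This sketch covers only the fields shown in listing 11.1 and ignores some edge cases of the full specification:

```python
def parse_sse(stream: str):
    """Parse a text/event-stream document into a list of event dicts.

    Multiple data: lines are joined with newlines; comment lines
    (starting with a colon) are ignored; a blank line ends an event.
    """
    events, current, data_lines = [], {}, []

    def flush():
        if data_lines or current:
            current["data"] = "\n".join(data_lines)
            events.append(current)

    for line in stream.splitlines():
        if line.startswith(":"):
            continue  # comment
        if line == "":
            flush()
            current, data_lines = {}, []
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        else:
            field, _, value = line.partition(":")
            current[field] = value.strip()
    flush()  # handle a final event not followed by a blank line
    return events

stream = (": keep-alive\n\n"
          'data: {"stockId": "APL", "price": 101.2}\n\n'
          "id: 42\nevent: price\ndata: multi-\ndata: line\n\n"
          "retry: 10000\n")
events = parse_sse(stream)
```

In practice, browser consumers get this parsing for free from the EventSource API; this is only to show how little machinery the format requires.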

There are a few other things to know about SSE:

  • It relies on the HTTP protocol but is not part of it; it was created as a standard for HTML5 by the W3C.
  • It is quite simple to use for browser-based consumers because it was designed for them, but there are libraries available for almost any language.
  • The event data can only be text (simple text, JSON, XML, and so on). If you need to send binary data, like images, you have to encode that in text.
  • An SSE stream can take advantage of HTTP compression.
  • It is a unidirectional stream: once the connection is established, the consumer cannot send data to the server using this connection.

Because it relies on the HTTP protocol, no specific infrastructure is required to host an API using this technology, but be warned: using SSE means that HTTP connections remain open for quite a long time. Therefore, the infrastructure hosting the API has to be tuned to support long-lived parallel connections. Note that it might be useful to send different types of events over a single SSE stream; to do so, you can take advantage of the event property.

Now, suppose the Banking Company wants to provide some chat features to allow end users to discuss their accounts with humans or bots. In this case, it might be preferable to provide bidirectional communication, allowing both the consumer and provider to send events. Unfortunately, SSE only allows unidirectional communication from server to consumer; but thankfully, there are other solutions. Such a need is usually met using the WebSocket protocol as defined by RFC 6455 (https://tools.ietf.org/html/rfc6455), which is widely adopted for chats and games. We will not go into detail on the infrastructure, but know that this approach requires more work on the infrastructure side than an HTTP-based SSE stream.

A WebSocket relies on a raw TCP connection, which might not be allowed to pass through corporate proxies without modifying their configuration. Regarding the messages exchanged over this protocol, it is up to you, the API designer, to define them without relying on any standard. But remember that you can copy what others have done.

Most WebSocket APIs rely on typed messages as with SSE, except that in this case, both the consumer and the provider can send messages. If you need to link an event request sent through the WebSocket to its event response, you just need to add some unique identifier to the messages.
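This correlation technique can be sketched with a pending-request map; the message shape (type, id, payload) is an illustrative assumption, not part of the WebSocket protocol:

```python
import uuid

pending = {}  # requests awaiting a response, keyed by message ID

def send_request(message_type, payload):
    """Build a typed message with a unique ID and remember it until answered."""
    message = {"type": message_type, "id": str(uuid.uuid4()), "payload": payload}
    pending[message["id"]] = message
    return message

def handle_response(response):
    """Match an incoming message to its original request by ID."""
    return pending.pop(response["id"], None)  # None: unsolicited or duplicate

req = send_request("getAccount", {"accountId": "A1"})
resp = {"type": "account", "id": req["id"], "payload": {"balance": 123.45}}
matched = handle_response(resp)  # the original getAccount request
```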

There are different ways of streaming events. The important thing for an API designer is to know that a request/response mechanism is not the only option. When dealing with high-volatility data and real-time data, streaming events not only from provider to consumer but also from consumer to provider is an option that should be considered.

11.1.4 Processing multiple elements

The various API examples you have seen so far have provided two ways of reading data. Some goals can provide access to single elements and others to multiple ones. For instance, the Banking API allows consumers to read a single account with the read account goal or multiple ones with the list accounts goal. But when it comes to creation-, modification-, or deletion-related goals, we have only seen goals that work on a single element. Depending on the elements being manipulated and the context, it might be useful to be able to process multiple elements with a single API call instead of having to make many API calls, each processing a single element at a time.

To explore this topic, let’s add some more personal financial management features to the Banking API. We can let consumers modify transactions to define personalized categories, add comments about them, and also check them; checking a transaction is similar to marking an email as read. To do so, we’ll add an update transaction goal represented by a PATCH /transactions/{transactionId} request. The following listing shows the JSON schema of the expected body.

Listing 11.2 The JSON schema of the update transaction goal’s body

openapi: "3.0.0"
...
components:
  schemas:
    ...
    UpdateTransactionRequest:
      description: |
        At least one of the comment, customCategory, or checked
        properties must be provided
      properties:  
        comment:
          type: string
          example: My new Ibanez electric guitar
        customCategory:
          type: string
          example: Music Gear
        checked:
          type: boolean
          description: |
            Checking a transaction is similar to marking an email as read.
            True if the transaction has been checked, false otherwise.

①   The body is composed of three properties: comment, customCategory, and checked.

The comment, customCategory, and checked properties are all optional; consumers can update one, two, or all of them. The transaction’s ID is also not needed in the body because it is provided in the /transactions/{transactionId} resource’s path. No other properties of a transaction, like its amount or date, can be updated in this case.

Providing a ‘Mark all as checked’ or a ‘Mark all selected as read’ feature in the Awesome Banking App requires us to update each transaction. Depending on the transaction count, having to make an API call to update each transaction might be a problem. As we saw when reading data (section 10.3), we could try to aggregate all these unitary calls into a single one. That is, we could allow consumers to update multiple transactions in a single API call by proposing an update transactions (with an s) goal. As shown in figure 11.6, such a goal could be represented by a PATCH /transactions request. Listing 11.3 shows the JSON schema of its body.

11-06.png

Figure 11.6 Checking multiple transactions in one call

Listing 11.3 The JSON schema of the update transactions goal’s body

openapi: "3.0.0"
...
components:
  schemas:
    ...
    UpdateTransactionsRequest:
      required:
        - items
      properties:
        items:
          type: array
          minItems: 1
          maxItems: 100
          items:
            allOf:  
              - required:
                  - id
                properties:
                  id:
                    type: string
                    description: Transaction ID
              - $ref: "#/components/schemas/UpdateTransactionRequest"

①   No more than 100 updates at a time

②   Same data as the unitary call plus the transaction ID; allOf aggregates the provided JSON Schemas

The updated data for all transactions is provided as an object containing an items property, which is a list of 1 to 100 transactions. This is the same kind of representation used in the response body of a list something goal, such as list transactions. The properties provided for each transaction are the same as the ones provided for the unitary goal (comment, customCategory, and checked), plus the id because the transaction ID cannot be in the resource path. To check multiple transactions in one call, consumers need to provide id and checked properties for each checked transaction.
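Building such a bulk body from a list of transaction IDs could look like this sketch, which also enforces the 1-to-100 items constraint from the schema:

```python
MAX_ITEMS = 100  # maxItems constraint from the request schema

def build_check_request(transaction_ids):
    """Build an update transactions body that checks the given transactions."""
    if not 1 <= len(transaction_ids) <= MAX_ITEMS:
        raise ValueError(f"between 1 and {MAX_ITEMS} transactions per call")
    return {"items": [{"id": tid, "checked": True} for tid in transaction_ids]}

body = build_check_request(["T1", "T2", "T3"])  # body of PATCH /transactions
```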

That’s for the request, but what about the response? When updating a single transaction, the update transaction goal can signify that the update has been done with a 200 OK HTTP status, that there was something wrong with the request with a 400 Bad Request status, or that the transaction ID is unknown with a 404 Not Found. When processing multiple transactions simultaneously, if all the transactions are successfully updated, a 200 OK status could be returned.

The same goes in the case of an error: a 400 Bad Request response could be returned even if the problem is an invalid transaction ID. A 404 status code can only be returned if the resource’s path is unknown, which would not be the case here. This is slightly different from the unitary update. And what if some transactions can be updated and some cannot? Should the Banking API implementation stop at the first error and return a 400 response without processing any valid transaction updates?

If you remember our discussion in section 5.2.4, you know that the answer to this question is no, because this would make the API less usable. Consumers would have to make many calls to fix each error one by one (and an API not processing the valid updates could be quite infuriating). The update transactions goal must return all errors, process all valid transaction updates, and also indicate which updates were successfully done. That means returning multiple statuses; fortunately, there is an HTTP status code for that:

“The 207 (Multi-Status) status code provides status for multiple independent operations….”

WebDAV

The 207 status is defined in RFC 4918, which allows clients to perform remote web content authoring operations.3  It provides new methods, headers, media types, and statuses to facilitate resource management and especially to manipulate multiple resources in one call—thanks to this 207 status. The following listing shows an example of what a WebDAV server should return, according to RFC 4918, in a 207 response when deleting multiple resources.

3 “HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV),” L. Dusseault, Ed., June 2007 (https://tools.ietf.org/html/rfc4918).

Listing 11.4 A 207 Multi-Status response as described in RFC 4918

<?xml version="1.0" encoding="utf-8" ?>
<d:multistatus xmlns:d="DAV:">
  <d:response>
    <d:href>http://www.example.com/container/resource3</d:href>
    <d:status>HTTP/1.1 423 Locked</d:status>
    <d:error><d:lock-token-submitted/></d:error>
  </d:response>
  <d:response>
    <d:href>http://www.example.com/container/resource4</d:href>
    <d:status>HTTP/1.1 200 OK</d:status>
  </d:response>
</d:multistatus>

This is an XML document containing a list, with each element composed of an href (the URL of the processed resource), a status (the unitary HTTP status), and an optional error message. Note that compression and encoding are handled at the upper level (the response sent by the API server); each unitary response therefore uses the same encoding and compression.

RFC 4918 describes various 207 responses for specific HTTP methods (such as PROPPATCH and PROPFIND), but these are XML-based and too specific to the WebDAV context to be reused elsewhere. That is why I chose to keep just the 207 status and define my own (JSON) format for the request and response bodies of the update transactions goal, as shown in figure 11.7.

11-07.png

Figure 11.7 Contrasting multiple responses to the update transaction goal with a single response to the update transactions goal

A 207 Multi-Status response to an update transactions request is an object with an items property, which is a list containing as many elements as are in the items list provided in the request. The response list is ordered exactly like the request one: the response to the third request is in the third position in the response list. For each element, consumers get exactly the same information they would have gotten making a unitary call. Here, that means a status and a body containing the HTTP status and the response body for each transaction update attempt.
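On the consumer side, handling such a response could be sketched by pairing each response element with its request item by position:

```python
def split_results(request_items, response_items):
    """Separate successful updates from failed ones in a 207 response.

    The response list is assumed ordered exactly like the request list.
    """
    succeeded, failed = [], []
    for sent, result in zip(request_items, response_items):
        target = succeeded if result["status"].startswith("2") else failed
        target.append((sent, result.get("body")))
    return succeeded, failed

request = [{"id": "T1", "checked": True}, {"id": "T135", "checked": True}]
response = [{"status": "200 OK", "body": {"id": "T1", "checked": True}},
            {"status": "404 Not Found",
             "body": {"message": "Transaction T135 not found"}}]
succeeded, failed = split_results(request, response)
```

The consumer can then retry or surface only the failed items instead of replaying the whole batch.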

The first status in the list is a success status (200 OK), and its body contains the updated resource. The last two requests were not processed because of a comment that was too long and an unknown transaction ID. For each of these, the body contains the error data structure (seen in section 5.2.4) that would have been returned for a unitary call. If the consumer sends an invalid list in its request (with more than 100 elements, for example), the status will be 400 Bad Request. And if headers are usually returned for unitary calls, a headers map can also be added to each element. Listing 11.5 shows the complete JSON Schema, and listing 11.6 shows an example.

Listing 11.5 The multi-status response’s JSON Schema

openapi: "3.0.0"
...
components:
  schemas:
    MultipleStatusResponse:
      required:
        - items
      properties:
        items:  
          type: array
          minItems: 1
          maxItems: 100
          items:
            required:
              - status
            properties:
              status:  
                type: string
                description: HTTP status
                example: 404 Not Found
              headers:
                additionalProperties:  
                  type: string
                description: HTTP headers map
                example:
                  My-Custom-Header: CUSTOM_VALUE
                  Another-Custom-Header: ANOTHER_CUSTOM_VALUE
              body:
                description: |
                  Transaction if status is 200 OK, Error otherwise
                oneOf:  
                 - $ref: "#/components/schemas/Error"
                 - $ref: "#/components/schemas/Transaction"
                example:
                  message: Transaction T135 not found

①   Contains one element for each transaction of the request

②   The HTTP status

③   A <string, string> map for the headers (it could also be a name, value list).

④   The body—a Transaction if status is 200 OK or an Error otherwise (one of the provided JSON Schemas)

Listing 11.6 An example generated using the JSON Schema

{
  "items": [
    {
      "status": "404 Not Found",
      "headers": {
        "My-Custom-Header": "CUSTOM_VALUE",
        "Another-Custom-Header": "ANOTHER_CUSTOM_VALUE"
      },
      "body": {
        "message": "Transaction T135 not found"
      }
    }
  ]
}

Remember that what is shown in figure 11.7 and listings 11.5 and 11.6 is my own interpretation of what the content of a 207 Multi-Status response might look like. We can use the same design to replace or delete multiple resources with a PUT /resources or a DELETE /resources?ids=1,2,5,6,9 request. To create multiple resources at a time, there are a few things to consider.

We could use a POST /resources request, but what if we also want consumers to be able to create a single resource at a time? As long as a create resources goal can create one or more resources, a consumer could pass a single resource in the list. We could also accept both a list of resources and a single resource in the request body (try to describe such an operation, its request body, and the various responses using the OpenAPI Specification, as described in chapter 4). But what if we want to make a clear separation between the create resource and create resources goals using different paths, for security concerns, for example?

It is not uncommon to see POST /resources/batch requests to create multiple resources in one call; such paths break the /collection/{resourceId} pattern, but at least consumers will understand at first sight what they can do. Depending on how security is handled, providing two different paths might be unavoidable. In an ideal world, however, I would prefer to provide a single POST /resources path, accepting either a list of resources or a single resource, and allow only consumers granted a batch resource creation scope to send requests containing a list of resources.
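As a sketch of the single-path option, a request body accepting either a single resource or a list of resources can be described with the OpenAPI Specification's oneOf keyword (the Resource schema name and the size limits are placeholders):

```yaml
paths:
  /resources:
    post:
      requestBody:
        content:
          application/json:
            schema:
              oneOf:
                - $ref: "#/components/schemas/Resource"
                - type: array
                  minItems: 1
                  maxItems: 100
                  items:
                    $ref: "#/components/schemas/Resource"
```

The implementation can then return a unitary response or a 207 Multi-Status one, depending on which variant it received.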

Be warned that the partial processing strategy (processing valid items even if the provided list contains invalid ones) discussed in this section might not be the right one to choose in all cases. Processing only a portion of the provided items can cause problems in some contexts, so always check the consequences before introducing such behavior. If partial processing does not make sense, the API can return a more classical 200 OK on success and, for example, a 400 Bad Request if the request is invalid.

As you can see, APIs are under no obligation to provide only ways to process single resources; there are contexts in which processing multiple resources in a single call can be useful. Whatever the solution you design, remember that consumers must get the same data, including protocol data like headers or status codes for HTTP and errors that they would have gotten for unitary requests. They must be able to make the connection between each element of their request and each element of the API’s response. And do not forget to handle global controls and errors; for example, limiting the number of elements that can be provided in the request.

11.2 Observing the full context

You saw in section 10.1 that designing APIs requires us to think about how the APIs will actually be used by consumers, mostly for the consumers' sake but also for the provider’s. And now we have discovered that it also requires us to care about the true nature of goals or data in order to provide efficient, usable, and implementable APIs (see section 11.1). All this means that designing APIs requires more than just focusing on consumer needs and avoiding the provider’s perspective. Designing APIs requires us to fully observe the context in which they will be consumed and provided, in order to ensure that they fulfill all consumers' needs in the best possible way and are actually implementable by providers.

11.2.1 Being aware of consumers' existing practices and limitations

Fulfilling all consumers' needs means designing APIs that provide all the needed goals in an easy-to-understand and easy-to-use way; it also means being careful about some aspects that could be called nonfunctional requirements. These nonfunctional requirements basically concern how the API goals and data will actually be represented. Consumers can be used to certain practices or have some limitations that must be taken into account when designing APIs.

You saw in section 5.1 and in sections 6.1.3 and 6.1.4 that APIs designed using simple representations and standards, and following common practices, are easier to understand, easier to use, and more interoperable. But this can go far beyond just using crystal-clear names and standard date formats or applying commonly used path patterns. Existing practices can have a deeper impact on API design.

For example, suppose the Banking Company wants to provide a bank details verification API that confirms if an account number actually exists at any bank and belongs to a given person. Such a service could be useful to companies using direct debit for payments. To be paid, companies withdraw funds from their customer’s bank account. When doing so, companies would be glad to be sure that the provided information actually matches an existing bank account belonging to their customer before selling them any goods or products.

Such an API seems quite simple to design. It proposes a single verify bank details goal. This goal expects an account number in IBAN format and the account owner’s first name and last name. It returns simple OK feedback on success and detailed information on error; for example, if the account number exists but the owner’s name does not exactly match because of a typo. Based on what you have learned, how would you represent such a goal? Figure 11.8 shows three ways of doing so.

11-08.png

Figure 11.8 Adapting design to what consumers are used to

Based on what you have learned in section 8.4, you know that it is not a good idea to represent this bank details verification goal with a GET /bank-details request with firstName, lastName, and iban query parameters (1). Indeed, being sensitive data, IBANs (account numbers) and first and last names cannot be passed as query parameters—they could be logged anywhere! So you would probably represent this goal with a POST /bank-details-verifications request (2), its body being a JSON object containing mandatory iban, firstName, and lastName properties. If the request is a valid one, it can return a 200 OK response with a body containing the status of the verification, indicating whether the provided bank details are valid. If the request is invalid (for example, if an IBAN with an invalid format has been provided or the lastName property is missing), a 400 Bad Request response containing a JSON object with an informative message and details about the problem(s) encountered can be returned, as you saw in section 5.2.4.
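A minimal sketch of such a request body could look like the following (the values are illustrative, reusing the sample data from the ISO 20022 listings later in this section):

```json
{
  "iban": "JPXX098367887987098",
  "firstName": "Spike",
  "lastName": "Spiegel"
}
```

The 200 OK response body could then contain, for example, a simple Boolean valid property; its exact design is up to you.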

Such an API seems easy to understand and easy to use by anyone. But before designing this API, we did not check the actual consumers' context; and in this case, this is a critical mistake. The targeted consumers are the Banking Company’s corporate consumers, who will consume the API using financial COTS (commercial off-the-shelf) software. The people working with such financial software (and the software itself) are not used to custom JSON data; rather, they are used to standard ISO 20022 financial XML messages (3). Let’s take a closer look at this third design option: both request and response are based on ISO 20022 financial XML messages as shown in listings 11.7 and 11.8.

Listing 11.7 An ISO 20022 IdentificationVerificationRequestV02 XML message

<?xml version="1.0" encoding="utf-8"?>
<Document>
  <IdVrfctnReq>
    <Assgnmt>
      <MsgId>MSGID_001</MsgId>
      <CreDtTm>2012-12-13T12:12:12</CreDtTm>
    </Assgnmt>
    <Vrfctn>
      <Id>VRFID_001</Id>
      <PtyAndAcctId>
        <Pty>
          <Nm>Spike Spiegel</Nm>  
        </Pty>
        <Acct>
          <IBAN>JPXX098367887987098</IBAN>  
        </Acct>
      </PtyAndAcctId>
    </Vrfctn>
  </IdVrfctnReq>
</Document>

①   Account holder’s first name and last name

②   Account’s IBAN

Listing 11.8 An ISO 20022 IdentificationVerificationReportV02 XML message

<?xml version="1.0" encoding="utf-8"?>
<Document>
  <IdVrfctnRpt>
    <Assgnmt>
      <MsgId>MSGID_001</MsgId>
      <CreDtTm>2012-12-13T12:12:12</CreDtTm>
    </Assgnmt>
    <Rpt>
      <OrgnlId>VRFID_001</OrgnlId>
      <Vrfctn>true</Vrfctn>  
      <OrgnlPtyAndAcctId>  
        <Pty>
          <Nm>Spike Spiegel</Nm>
        </Pty>
        <Acct>
          <IBAN>JPXX098367887987098</IBAN>
        </Acct>
      </OrgnlPtyAndAcctId>
    </Rpt>
  </IdVrfctnRpt>
</Document>

①   Verification status

②   Verified information (name and IBAN)

The IdentificationVerificationRequestV02 message shown in listing 11.7 is a standard bank details verification request. It contains an IBAN in the Document.IdVrfctnReq.Vrfctn.PtyAndAcctId.Acct.IBAN property and the owner’s first name and last name in Document.IdVrfctnReq.Vrfctn.PtyAndAcctId.Pty.Nm. There are also a request ID and creation date, and a verification ID.

The IdentificationVerificationReportV02 message shown in listing 11.8 is the standard response to such a request. It contains the original request data and a Boolean flag that is true if the verification succeeds and false otherwise (Document.IdVrfctnRpt.Rpt.Vrfctn).

As the ISO 20022 standard only describes messages, not how they are transmitted, we can at least keep the spirit of the second version of the API. The verify bank details goal could still be represented by a POST /bank-details-verifications request, but now its body is an IdentificationVerificationRequestV02 ISO 20022 XML message. The response could still be 200 OK if the IdentificationVerificationRequestV02 input message is a valid one, but now the response body is an IdentificationVerificationReportV02 message. If the request is invalid, a 400 Bad Request response is returned along with a custom XML message mapping to the JSON error message that we are used to (the ISO 20022 standard does not describe how such errors are to be handled).

The resulting API design is not that bad, but according to what you have learned, especially in section 5.1, such ISO 20022 XML messages could be considered complex (and we could also consider that XML is not really trendy anymore). But in this context, the targeted consumers natively speak using ISO 20022 XML messages; and therefore, the API must use them. Consuming the XML API within the financial COTS software used by the targeted consumers would be quite easy. If the API used custom JSON messages, consuming the API might require more work on the consumer side and, in some cases, might not be possible. But when in Rome, do as the Romans do.

Choosing a suitable representation is not about choosing what we as API designers are used to or what we might consider good design or fashionable; it is about choosing what is appropriate in the desired context. Always check if the targeted domain or consumers have specific practices that you should follow in your API design. Such practices could be the use of standards or the way they represent data, name things, manage errors, or anything else.

That takes care of the existing practices that might influence the design of APIs, but what about the limitations? Let’s say that the Banking Company also wants to target noncorporate/nonfinancial consumers, who are definitely not used to ISO 20022 XML and are more comfortable with simple JSON. A smart API design could take advantage of content negotiation (see section 6.2.1) to handle that. Consumers would just have to set the Content-Type and Accept headers to application/xml when they want to use the ISO 20022 standard, or to application/json to use the simple JSON format. That’s great; the API is adaptable enough to fulfill the needs of two different types of consumers.
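With content negotiation, the same goal is exposed at a single path; only the headers and body format change. A sketch of the XML variant, reusing the path discussed earlier:

```
POST /bank-details-verifications HTTP/1.1
Content-Type: application/xml
Accept: application/xml

<?xml version="1.0" encoding="utf-8"?>
<Document>...</Document>
```

A consumer preferring JSON would instead send Content-Type: application/json and Accept: application/json headers along with a JSON body, and would get a JSON response back.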

But unfortunately, after interviewing some of the developers of the financial COTS systems used by the targeted customers, it seems that most of them cannot handle content negotiation easily. It would be wiser, then, to consider the API’s default format to be XML, or perhaps to allow consumers to specify which format they want to use when registering on the developer portal. Not being able to pass a simple header seems quite ridiculous, but that can happen.

Don’t take it for granted that all consumers can do what you are used to. Consumers might have technical limitations, like the COTS software being unable to add headers to an HTTP request, but there are many other possibilities. Some consumers might not be able to use any HTTP method other than GET or POST. You saw in section 10.1 how mobile applications can be limited by network capabilities. And in section 11.1, we talked about webhooks. Not all consumers can implement those easily.

To avoid discovering too late that existing practices or limitations go against what you are used to, you will have to show empathy toward your targeted consumers. Don’t hesitate to talk to them, question them, and discuss your designs with them—you won’t regret it.

Of course, as discussed in section 10.3.8, all this must not be done at the expense of usability and reusability. Do not try to please a few consumers with highly specific needs in a single API. Instead, consider creating different API layers or letting consumers create their own backend for frontend APIs.

11.2.2 Carefully considering the provider’s limitations

In section 2.4, you learned to design APIs while avoiding exposing purely internal concerns to consumers. But avoiding exposing the provider’s perspective does not mean wearing blinders and totally ignoring it. Indeed, when designing an API, we have to take into consideration what is happening behind the API in order to propose a design that will actually be not only usable, but also implementable. Figure 11.9 shows some examples.

11-09.png

Figure 11.9 Provider’s limitation examples

If the Banking Company wants to provide trading-related goals like buy stocks or sell stocks, its API designers must be aware that stock exchanges are not always open in order to create an adequate design. A consumer trying to buy stocks on a closed stock exchange should get an error telling them that the operation is not possible at the moment. Such an error could be represented by a 503 Service Unavailable HTTP status code. As you learned in section 5.2.3, the error should be accompanied by some data that will help the consumer, such as the stock exchange’s opening time. And as discussed in section 5.3.2, it could also be useful to add goals listing the available stock exchanges or providing details on a stock exchange’s market calendar and trading hours to prevent such errors.
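As a sketch of what such a response might look like (the body’s property names are illustrative assumptions, not a standard), the custom body can be complemented by the standard Retry-After HTTP header:

```
HTTP/1.1 503 Service Unavailable
Retry-After: 28800

{
  "message": "The stock exchange is currently closed",
  "opensAt": "2025-06-02T09:00:00+09:00"
}
```

The Retry-After header tells generic HTTP clients how many seconds to wait, while the body gives human- and consumer-friendly details.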

Stock trading is outside the scope of our Banking API, but we saw an example of a functional limitation that impacted the API’s design in section 11.1.1. An international money transfer above a given amount must be validated by a human being; therefore, it cannot be represented by a basic request/response goal: an asynchronous representation must be used instead. It might be worth investigating whether the human validation step in the international money transfer process can be avoided. That would allow us to provide a real-time and more consumer-friendly synchronous goal instead of an asynchronous one.

But provider limitations are not only functional—they can also be technical. The bank details verification service you saw in section 11.2.1 relies not only on the Banking Company itself, but also on other banks. To verify that a bank account exists at another bank, the Banking Company has to communicate with that bank through an asynchronous, standardized interbank messaging service. This system’s service-level agreement states that a verification must take less than five seconds.

Building a synchronous request/response API goal on top of such a system could be problematic. It would mean that the consumer might have to wait for up to five seconds for a response, which could seem like an eternity (especially for mobile consumers). Therefore, instead of a synchronous request/response goal, the API should let consumers send a verification request and then get the result later or even be notified by a webhook that a result is available.
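One possible representation (this flow is only an illustration, not the only valid design) is to return a 202 Accepted status along with the URL of a resource representing the pending verification, which the consumer can poll later:

```
POST /bank-details-verifications HTTP/1.1

HTTP/1.1 202 Accepted
Location: /bank-details-verifications/VRFID_001

GET /bank-details-verifications/VRFID_001 HTTP/1.1

HTTP/1.1 200 OK

{ "status": "PENDING" }
```

Once the interbank system has answered, the same GET request would return the verification result; a webhook could also notify the consumer that the result is available, avoiding polling altogether.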

Some limitations can be quite trivial, like “Oops, we don’t have an existing unique ID to identify transfers.” Don’t panic; in that case, you might want to use composite IDs made up of the various IDs or values needed to identify something. In this example, an accountId and a transferId could be used in a GET /transfers/{accountId}-{transferId} request. Note that if this composite ID is purely an internal concern, the visible interface contract can stay opaque and only show a GET /transfers/{id} request, the value of id being a composite ID returned by the list transfers goal.
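As a minimal sketch of the opaque variant (the function names and the ":" separator are assumptions, not part of any standard), the two internal IDs can be encoded into a single opaque value that consumers use as-is:

```python
import base64


def to_opaque_id(account_id: str, transfer_id: str) -> str:
    # Combine the two internal IDs and encode them so that consumers
    # only ever see a single opaque id in GET /transfers/{id}.
    raw = f"{account_id}:{transfer_id}".encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


def from_opaque_id(opaque_id: str) -> tuple[str, str]:
    # Restore base64 padding, then split back into the internal IDs
    # on the implementation side.
    padded = opaque_id + "=" * (-len(opaque_id) % 4)
    decoded = base64.urlsafe_b64decode(padded).decode()
    account_id, transfer_id = decoded.split(":", 1)
    return account_id, transfer_id
```

The list transfers goal would return to_opaque_id(...) values, and the implementation of GET /transfers/{id} would call from_opaque_id(...) to retrieve the underlying account and transfer.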

Like functional limitations, technical limitations on the provider’s end must be questioned—but they must be questioned carefully. Don’t be fooled by the true technical limitation example you’ve just seen. Unlike functional limitations, which usually tend to be true problems that are not easily solved, technical ones are more often than not false limitations that can be solved with little effort through changes to the implementation. Such little effort avoids major impacts on the API design and, most importantly, the consumers.

I can’t count how many times I’ve heard things like, “We can’t aggregate this data; unitary calls already take too much time!” when all that was needed was to activate compression (see section 10.2.1), add missing indexes in the database, or optimize some database requests. Such a simple change can often result in awesome performance, allowing designers to implement the supposedly impossible feature.

As another example, not so long ago, in big, old companies just discovering that the web was not only about websites using the POST and GET HTTP methods, you might have heard that using the DELETE HTTP method was impossible because it was blocked by firewalls. All that was needed to solve this problem was to talk to the people in charge of network security and explain the new needs so they could modify the firewalls' configurations to allow HTTP methods other than POST and GET. (Note that this specific HTTP method problem shouldn’t exist anywhere anymore; at least, I hope so!)

As an API designer, this means that, while designing, you should be aware of the whole chain between consumers, the point where the API is exposed, and its actual implementation, as well as what is happening inside it. That way, you can spot technical limitations as soon as possible and solve these problems either within the implementation or by adapting the design.

Technical limitations will usually revolve around response time, the scalability or availability of underlying systems, and network restrictions. For example, it is quite annoying to discover in production that an API request takes more than two seconds to complete, and this problem could have been solved by adding more CPU, optimizing the implementation, or, as a last resort, adapting the API’s design. Consumers could be unpleasantly surprised to discover that your API is unavailable for 15 minutes every day at midnight thanks to a daily reboot or backup procedure. It could also be unnerving to realize that each of your carefully crafted 5XX errors is replaced by a generic 500 Server Error whose body is an HTML page, thanks to a zealous old-fashioned firewall or a misconfigured API gateway.

The important thing to remember is that designing an API requires you to have a deep understanding of what really happens before and after requests are made so that you can spot possible functional or technical limitations. Any potential limitation must be questioned because it might be possible to totally or partially resolve the issue through the implementation (in a broad sense) without impacting the API’s design. Only through careful consideration of the problem will you be able to adapt the API design appropriately, should that become necessary.

Functional or technical limitations on the provider’s end can take many forms and have as many solutions based on adapted communication or adequate goals, input/output properties, or error handling. But whatever the solution, you must always conceal the provider’s perspective as much as possible in order to provide easy-to-understand and easy-to-use APIs.

11.3 Choosing an API style according to the context

When you’ve mastered or are used to using a tool like a hammer, it’s very tempting to treat all problems like nails. This is a cognitive bias called the law of the instrument, the law of the hammer, or Maslow’s law (https://en.wikipedia.org/wiki/Law_of_the_instrument). Such a bias can also have another effect: screwdriver users might think that a screwdriver is a better tool than a hammer, while hammer users might think the opposite. This could be called the fannish folk law.

But a hammer will not solve all problems, and a screwdriver is not better than a hammer; each tool is as useful as the other, but in different contexts. This book is about web API design, not carpentry or woodworking, but the same concerns apply in the tech industry too. Choosing which tool(s) you will use to design a remote API must not be done based on what you are used to, what is fashionable, or your personal preference; it must be done according to the context. And being able to choose the right tool requires you to know more than one.

Web APIs can easily be reduced to unitary and synchronous request/response + REST + HTTP 1.1 + JSON web APIs, which is nowadays one of the most commonly used ways to enable software-to-software communication in order to expose goals fulfilling targeted users' needs. Therefore, API designers could be tempted to use this set of tools in all situations, in all contexts. In this book, this toolset is only used to expose fundamental API design principles that you can use when designing other types of remote APIs.

We’ve already discovered some other tools that can be added to our toolboxes to be used in the appropriate contexts. In section 6.2.1, for example, you saw that JSON is not the only possible data format for APIs; you can use XML, CSV, PDF, or many other formats. You also saw in section 11.2.1 that it can even be counterproductive to use JSON in a context where consumers are used to an existing standardized XML format. In section 10.3.6, you learned that REST APIs are not the only option when creating web APIs: using a query language can bring more flexibility when requesting data (but fewer caching possibilities). In section 11.1, you discovered that a synchronous request/response consumer-to-provider mechanism is not the only way of enabling communication between two systems; we can create asynchronous goals, notify consumers of events, stream data, and even process multiple elements in one call. And in section 10.2.1, you learned that HTTP 2 can be used instead of the good old HTTP 1.1 protocol.

We already know that context plays an important role in the choice of tools, and we already know about several different tools. But as API designers and software and systems designers, in general, we need to broaden our perspective in order to be sure to avoid the law of the instrument. In order to do so, we will explore some alternatives to REST APIs and web APIs in this section.

11.3.1 Contrasting resource-, data-, and function-based APIs

At the time of this book’s writing, there are three main ways of creating web APIs: REST, gRPC, and GraphQL. Will they still be there in five or 10 years? Will they still be the same? Only time will tell.

Is one of them better than the others? No! It depends on needs and context. The approaches shown in figure 11.10 represent three different visions of APIs: REST is resource-oriented, gRPC is function-oriented, and GraphQL is data-oriented. Each of these has its pros and cons.

11-10.png

Figure 11.10 Contrasting resource-, data- and function-based APIs

You should know by now what a REST API is. As you have seen throughout this book, and especially in section 3.5.1, a REST API—or RESTful API—is an API that conforms (or at least tries to conform) to the REST architectural style introduced by Roy Fielding.4  Such an API is resource-based and takes advantage of the underlying protocol (the HTTP protocol, in this case). Its goals are represented by the use of standard HTTP methods on resources with the results being represented by standard HTTP status codes.

4 See his PhD dissertation “Architectural Styles and the Design of Network-Based Software Architectures” at https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf.

In the Banking API, reading an account’s details could be represented by a GET /owners/123 request, returning a 200 OK HTTP status along with all the customer’s data if this 123 owner exists or a 404 Not Found HTTP status if not. Updating the same owner’s VIP status could be done with a PATCH /owners/123 request, whose body would contain the new value.

Relying on an existing protocol favors consistency and makes APIs predictable, as you saw in section 6.1. Indeed, upon seeing any resource, a consumer might try to use the OPTIONS HTTP method to determine what can be done with it, or even try the GET method to read it or PUT or PATCH to update it. Even the most obscure 4XX HTTP status code will be understood by any consumer as an error on the consumer’s end. Such an API can also take advantage of all the existing features of HTTP, such as caching and conditional requests; designers do not have to reinvent the wheel. Server-to-consumer streaming capabilities can be added too, using SSE (see section 11.1.3). But this does not make the design of the API simple.

You have seen throughout this book that even if the HTTP protocol provides some kind of framework, it does not magically prevent us from creating terrible REST APIs. It is still up to designers to choose resource paths (/owner or /owners?) and to decide how to represent data, provide informative feedback on errors or successes beyond HTTP statuses, and more.

The gRPC framework was created by Google. The g originally stood for Google, and RPC stands for Remote Procedure Call. An RPC API simply exposes functions.

In a function-based API, reading the 123 owner could be done by calling the readOwner(123) function, and updating that owner’s VIP status could be done by calling updateOwner(123, { "vip": true }). The gRPC framework uses the HTTP 1.1 or 2 protocol as a transport layer, without using its semantics. It does not provide any standard caching mechanism. Note that it can take advantage of the HTTP 2 protocol to propose bidirectional and streaming communication. It can also use the Protocol Buffer data format, which is less verbose than XML or JSON (you can also use this format in a REST API).
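A hypothetical Protocol Buffers service definition for these owner-related functions could look like the following sketch (all the names are illustrative, not from any real API):

```proto
syntax = "proto3";

// Hypothetical function-based API exposing owner operations
service OwnerService {
  rpc ReadOwner (ReadOwnerRequest) returns (Owner);
  rpc UpdateOwner (UpdateOwnerRequest) returns (Owner);
}

message ReadOwnerRequest {
  string id = 1;
}

message UpdateOwnerRequest {
  string id = 1;
  Owner owner = 2;
}

message Owner {
  string id = 1;
  bool vip = 2;
}
```

Note that nothing in this contract tells consumers which functions read data and which modify it; that is conveyed only by the names the designers choose.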

Whereas in a resource-based API case, the underlying protocol provides some kind of framework, especially to describe what kind of action is being taken and what the result is, in a function-based API, it is usually up to the designers to choose their own semantics for almost everything. So, how would you represent a goal such as list owners? Should it be a listOwners(), readOwners(), or retrieveOwners() function? The same goes when it comes to modifying data. Should the API provide a saveOwner() or updateOwner() function?

For errors, the gRPC framework provides a standard error model including a few standard codes that map to HTTP status codes (https://cloud.google.com/apis/design/errors). For example, when calling readOwner(123), a NOT_FOUND code (mapping to a 404 Not Found HTTP status) can be returned along with an Owner 123 does not exist message. The error model can be completed with additional data in order to provide more informative feedback. As with a REST API, it is up to the designers to choose how to do that (see section 5.2.3) and also how to represent data.

We covered GraphQL briefly in section 10.3.6; it’s a query language for APIs created by Facebook. A GraphQL API basically provides access to a data schema allowing consumers to retrieve exactly the data they want. It is protocol-agnostic, meaning that any protocol that lets us send requests and get responses could be used; but because the HTTP protocol is the most widely adopted, it usually is the chosen one.

Like gRPC, GraphQL does not provide any standard caching mechanism. A POST /graphql request with the { "query": "{ owner(id:123) { vip } }" } query in its body would only return owner 123’s VIP status. And when it comes to creating or updating data, GraphQL behaves like any RPC API. It uses functions that are called mutations. Updating owner 123’s VIP status would require us to call the updateOwner mutation, which takes the owner’s ID and an owner object containing the new VIP status.
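Such a mutation could look like the following sketch (the mutation’s exact signature depends on the API’s schema):

```graphql
mutation {
  updateOwner(id: 123, owner: { vip: true }) {
    vip
  }
}
```

The fields listed inside the mutation (here, vip) define which data the consumer wants back after the update, just as in a query.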

GraphQL also comes with a standard error model that can be extended. Listings 11.9 and 11.10 show a query and a response with a standard error, respectively.

Listing 11.9 A GraphQL query

{
  owner(id: 123) {
    vip
    accounts {
      id
      balance
      name
    }
  }
}

Listing 11.10 A GraphQL response with an error

{
  "errors": [
    {
      "message": "No balance available for account with ID 1002.",
      "locations": [ { "line": 6, "column": 7 } ],  
      "path": [ "owner", "accounts", 1, "balance" ]
    }
  ],
  "data": {
    "owner": {
      "vip": true,
      "accounts": [
        {
          "id": "1000",
          "balance": 123.4,
          "name": "James account"
        },
        {
          "id": "1002",
          "balance": null,  
          "name": "Enterprise account"
        }
      ]
    }
  }
}

①   Points to error in query

②   Indicates the result property affected by the error

③   The actual property affected by the error

The query shown in listing 11.9 requests owner 123’s VIP status and account IDs, balances, and names. Unfortunately, as shown in listing 11.10, the balance could not be retrieved for the second account. The standard error model contains, for each error, a human-readable message, the possible sources of the error in the GraphQL query in locations (balance is on the sixth line and starts at the seventh character of the query), and the optional path of the affected property in the returned data (the null balance is in data.owner.accounts[1].balance).

Such an error seems to be the provider’s fault and not the consumer’s, but this is not indicated. It’s up to the designers to choose how to add information to this standard error model in order to provide fully informative feedback. And obviously, like in REST and gRPC APIs, it’s up to the designers to choose how to design the data model.

From a design perspective, we can see that these three ways of creating APIs embody three different representations of an API’s goals: resources (REST), functions (gRPC, and also creations and modifications in GraphQL), and data (reads in GraphQL). Fundamentally, any read goal can be represented in any of these API styles. Create, modify, delete, and do goals can be represented either by a resource/method pair or by a function. Each approach comes with more or less standardized elements favoring consistency and, hence, facilitating usability and design.

But whatever the provided framework, designers still have a lot of work to do in order to design decent APIs. Regardless of the API style they choose, designers still have to identify users, goals, inputs, outputs, and errors, and choose the best possible consumer-oriented representations while avoiding the provider’s perspective.

From a technical perspective, we have three different API tools or technologies that can be used over the HTTP protocol. The use of the HTTP protocol is important because it is widely accepted, and you usually do not need many, if any, modifications to your infrastructure to host or use an HTTP-based API. There are some differences between the three tools, however.

REST APIs rely on the HTTP protocol and can benefit from features such as content negotiation, caching, and conditional requests. GraphQL and gRPC do not provide such mechanisms but have other interesting features. Thanks to the use of HTTP/2 and the Protobuf data format, gRPC-based APIs can provide high performance. They also provide streaming and bidirectional communication between consumer and provider. (Note that REST APIs can provide one-way streaming from provider to consumer with SSE.) And as seen in section 10.3.6, GraphQL’s querying capabilities let consumers get all the data they want, and only the data they want, in a single request, but at the expense of caching capabilities.
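As a quick illustration of what a REST API gets from HTTP for free, here is a conditional request relying on an ETag; the /owners/123 path and the ETag value are illustrative:

```http
GET /owners/123 HTTP/1.1
If-None-Match: "a7d3f"

HTTP/1.1 304 Not Modified
ETag: "a7d3f"
```

Because the consumer’s cached version (identified by the "a7d3f" entity tag) is still current, the provider can reply 304 Not Modified without resending the representation, saving bandwidth and processing.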

Concerning the provider’s context and especially the implementation, you obviously don’t have much control over the queries that could be made by consumers in a data-based API. In non-infinitely-scalable systems, too many complex requests could result in a load higher than the underlying systems can support and terribly long response times if the implementation is not ready to prevent that. With a resource- or function-based API, it is quite easy to avoid such problems. Because each goal’s behavior is usually predictable, the solicited systems are known, and rate limiting can be used to protect the underlying systems. You can specify that each consumer cannot make more than x requests per second on the API, and even specialize this rate limiting by consumer and/or goal.
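When a consumer exceeds such a quota, the API can simply respond with a 429 Too Many Requests status. The Retry-After header is standard HTTP; the X-RateLimit-* headers shown below are a widespread convention rather than a standard:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
```

Informing consumers when they can retry, and how much of their quota remains, turns a blunt protection mechanism into usable feedback.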

For data-based APIs, you could limit the number of queries or their size, but that would be pointless because it would not prevent unexpectedly complex queries from being made. You could limit the number of nodes in a request (containing one or more queries) or accept only preregistered requests, but that would be done at the expense of flexibility, making the data-based API choice almost useless. In all cases (REST, gRPC, GraphQL), a good practice would be to limit the number of items returned by default in lists.
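In GraphQL, such a limit is often exposed as a pagination argument on list fields. The first argument shown below is a convention popularized by the Relay specification, not a part of GraphQL itself, so treat this as an illustrative sketch:

```graphql
{
  owner(id: 123) {
    accounts(first: 10) {
      id
      balance
    }
  }
}
```

The implementation can then enforce a sensible default and a maximum value for first when the consumer omits it or asks for too much.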

So, which approach should you use? Such a choice cannot be made prior to analyzing your context and needs. Once you know who your consumers are and understand their contexts, the goals they need, and how they will be used, and you understand the provider’s context, you can choose what kind of API will be the most appropriate. Although each context will be different, nowadays the rule of thumb is to choose REST by default. If there are very specific needs that cannot be fulfilled by a well-designed REST API, you might want to try GraphQL or gRPC.

Choosing REST by default could be seen as an example of the law of the instrument (also known as the golden hammer), but the REST approach is capable of fulfilling most needs. It is the most widely adopted way of creating APIs, and most developers are used to it (remember section 11.2.1). Choose GraphQL for private APIs in mobile environments only if a well-designed REST API hosted in a well-configured environment is not possible (see section 10.2), and if

  • You actually need advanced querying capabilities.
  • You do not plan to make your API public or share it with partners.
  • You do not care about caching.
  • You are sure to be able to protect the underlying systems through the implementation or through infinite scalability.

Finally, choose gRPC APIs for internal-application-to-internal-application communication only if milliseconds really matter, if you do not care about caching or you are willing to handle it without relying on HTTP, and if you do not plan to make the API public or share it with partners. Also bear in mind that this choice might not be exclusive. You have already seen in section 10.3.8 that different layers of an API can fulfill different needs. Building a mobile BFF exposing a GraphQL API or a more specialized REST API is totally legit. An application can also expose a gRPC interface for internal consumers and a REST interface for external ones.

11.3.2 Thinking beyond request/response- and HTTP-based APIs

As API designers, we must be aware that request/response HTTP-based APIs are not the only way of enabling communication between applications. We talked about events and streaming in section 11.1, but mostly from an HTTP perspective. When you build an event-based system, a provider can notify consumers about events using a webhook or WebSub system, both HTTP-based. But this could also be done using a messaging system such as RabbitMQ. If it’s for internal purposes, it might be more effective to directly connect providers and consumers to such tools.
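Whatever the transport, the notification itself still has to be designed. As a purely hypothetical sketch, an event posted to a consumer’s webhook (or published on a message broker) could look like this; the event type, timestamp field, and payload structure are entirely up to the designers:

```json
{
  "type": "OWNER_VIP_STATUS_CHANGED",
  "occurredAt": "2019-06-21T14:55:00Z",
  "data": { "ownerId": "123", "vip": true }
}
```

The same consumer-oriented design principles seen throughout this book apply to such messages just as they do to request/response APIs.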

When dealing with the Internet of Things (IoT), energy consumption efficiency is a key concern, and two-way communication over unreliable networks or with sleeping devices is almost a standard. The Message Queuing Telemetry Transport (MQTT) protocol is a message-based protocol designed to deal with such constraints. When streaming events, you can use SSE over HTTP for provider-to-consumer communication. But this could also be done using the WebSocket protocol, which is not HTTP-based (as seen in section 11.1.3). And if you need to process a massive flow of events, Kafka Streams could be an option.

An entire book would be needed to cover the design and architecture of event-based systems, and that is not this book’s purpose. But you can at least take advantage of what you have learned in this book to design the event notifications and streams. The point to remember here is that HTTP-based communication is not the only option; and, in certain contexts, it should actually be avoided at all costs. In the next chapter, you will discover the different types of API documentation and how designers can participate in their creation.

Summary

  • Unitary request/response, consumer-to-provider communication is not the only option; you can also design asynchronous goals, notify consumers of events, stream data, and process multiple elements in one call.
  • Designing APIs requires us to be aware of the consumers' contexts, including their network environments, habits, and limitations.
  • Designing APIs requires us to carefully consider the provider’s limitations, to spot them early, and to solve problems without impacting the design if possible, or by adapting the design if not.
  • Designing APIs requires us to ignore fashion and personal preferences. Just because you like or know a certain tool/design/practice doesn’t mean that it will be the ideal solution for all API design matters.