Software Calamity: December 2013

Stateless: A Best Practice That Isn't Absolute

One of the best practices to follow when designing RESTful APIs is to make the requests stateless.

In simple terms, it means that an API request should contain all the parameters it needs for the server to execute it without depending on the result of a previous request.

In theory, stateless requests bring the following benefits:

Scaling: Any server in a cluster can handle the request.
Simplified Caching: It's easier to cache the result of a request by using the explicit request state as a cache key.
Simplified Troubleshooting: Everything needed to reproduce a request is in the request itself. When parts of the request state aren't explicitly stated, additional investigation is required to figure them out.
Reduced Learning Curve: Each API request is self-contained. To call the required API, developers don't need to learn or use multiple APIs in sequence.
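
To illustrate the caching benefit, here is a minimal Python sketch of how a cache key could be derived from the explicit state of a request. The function name `cache_key` and the `currency` parameter are assumptions made for this example, not part of any particular framework.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str) -> str:
    """Build a cache key from the explicit state carried by the request URL."""
    parts = urlsplit(url)
    # Sort the query parameters so that equivalent requests share one key,
    # regardless of the order in which the client listed them.
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    query = urlencode(params)
    return f"{parts.path}?{query}" if query else parts.path


# Both requests carry the same state, so they map to the same key.
key_a = cache_key("http://server.com/product/1234?language=french&currency=cad")
key_b = cache_key("http://server.com/product/1234?currency=cad&language=french")
```

Because the key depends only on what the request contains, any server in the cluster would compute the same key for the same request.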

Following this best practice can be quite challenging since the software requirements sometimes conflict with it. For instance, any API that requires authentication usually relies on one API request to authenticate the user in order to supply valid credentials to subsequent requests.

Since this best practice isn't entirely "black" or "white", I found that defining different levels of statefulness helps when I need to develop new APIs or explain the concept.

Statefulness Levels

Permanent Data
Changing Data
User Specific Data
Session Specific Data
Random Behavior

Level 1: Permanent Data

The request passes explicit state data, and the response is always the same for a given state.

Permanent stateless resources: A request made against a read-only data warehouse that stores historical information will always return the same result.
Permanent resources with a state: A resource may be delivered in different languages. The language selection is stated explicitly in the request, so each of the following requests yields its own result, and that result never changes.
Example:

http://server.com/product/1234?language=french
http://server.com/product/1234?language=english

Caching those responses is trivial.
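
As a rough sketch of why this is trivial: when the response depends only on the explicit parameters, plain memoization is enough and nothing ever needs to be invalidated. The catalog contents and the `get_product` function below are hypothetical stand-ins for a real data store and handler.

```python
from functools import lru_cache

# Hypothetical read-only catalog standing in for the permanent data store.
_CATALOG = {
    (1234, "french"): "Chaise",
    (1234, "english"): "Chair",
}


@lru_cache(maxsize=None)
def get_product(product_id: int, language: str) -> str:
    """Level 1: the result depends only on the explicit arguments."""
    return _CATALOG[(product_id, language)]


first = get_product(1234, "french")   # read from the catalog
second = get_product(1234, "french")  # served from the cache
stats = get_product.cache_info()
```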

Level 2: Changing Data

The request passes explicit state data, but the response may change over time, even when the state is the same.

Examples:

Files: A picture stored at http://server.com/picture.bmp might change over time and might eventually be taken offline.
Dynamic Data: A business object stored at http://server.com/product/1234 might change over time and might eventually be deleted.

Caching those responses is relatively easy. The challenge lies in defining caching rules to ensure that entries get invalidated in a reasonable time frame.
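
One common invalidation rule is a fixed time-to-live. The sketch below is a minimal illustration of that idea, not a production cache. The `TTLCache` class, its `clock` parameter (injected so the behavior can be checked without waiting), and the 60-second value are all assumptions made for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache whose entries expire a fixed number of seconds after loading."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a cache key to (expiry time, cached value).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, load: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh
        # Missing or expired: reload and remember the new expiry time.
        value = load()
        self._entries[key] = (now + self._ttl, value)
        return value


# Usage with a fake clock so that time can be advanced by hand.
fake_now = [0.0]
versions = []


def load_product() -> str:
    versions.append(1)
    return f"product v{len(versions)}"


cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
at_start = cache.get("/product/1234", load_product)
fake_now[0] = 30.0
still_cached = cache.get("/product/1234", load_product)
fake_now[0] = 61.0
after_expiry = cache.get("/product/1234", load_product)
```

A shorter time-to-live lowers the risk of serving stale data at the cost of more reloads, which is the trade-off behind "a reasonable time frame".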

Level 3: User Specific Data

The response may look different based on the user performing the request.

Access Control: Different users might get a different result for the request.
Example: Given http://server.com/product/1234, users who don't have access to it receive a "403 Forbidden" response, while users with access receive the information on the object.

Hidden Data: The response data is based on information hidden from a user.
Example: The API http://server.com/publicity retrieves the advertisement you see on a page. The content of this advertisement usually depends on information associated with your user account, and this information isn't available to the client performing the request.

Personal URL: A URL for which the response is bound to be different between users.
Example: The URL http://server.com/my-cart. The response data will change based on the user performing the request.

Note: This can be avoided if the URL explicitly states the cart identifier, such as http://server.com/cart/5678. This way, API developers can deal with more than one cart, and users might also be able to share their cart with other users.

User Preferences: An API that leverages user preferences stored on the server.
Example: Users store their preferred language on the server. When different users perform the request http://server.com/product/1234 they each get a different result based on their preferred language.

Note: This can be avoided by tracking the state on the client side. The client application can read the user preferences and then use them explicitly in every request:

Load the current user preferences at http://server.com/user/4567/preferences, which returns language=french
Reuse the loaded preference in the request http://server.com/product/1234?language=french
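
The two steps above could be sketched on the client side as follows. This is only an illustration: `fetch` is a hypothetical stand-in for whatever HTTP client the application uses, and the function names are assumptions. The language is attached as a regular `?` query parameter.

```python
from typing import Callable, Dict
from urllib.parse import urlencode

BASE_URL = "http://server.com"


def load_preferences(user_id: int,
                     fetch: Callable[[str], Dict[str, str]]) -> Dict[str, str]:
    """Step 1: load the user's preferences once and keep them on the client."""
    return fetch(f"{BASE_URL}/user/{user_id}/preferences")


def product_url(product_id: int, preferences: Dict[str, str]) -> str:
    """Step 2: state the preference explicitly in every product request."""
    query = urlencode({"language": preferences["language"]})
    return f"{BASE_URL}/product/{product_id}?{query}"


def fake_fetch(url: str) -> Dict[str, str]:
    # Stand-in for a real HTTP call; always answers with the same preference.
    return {"language": "french"}


preferences = load_preferences(4567, fake_fetch)
url = product_url(1234, preferences)
```

With the language in the URL, the product request no longer depends on who sends it, which moves it back toward the lower levels.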

Caching those responses is more difficult. On the server side, extra steps are required to determine the correct cache key to use. Edge caching can be impossible since the edge servers won't have all the data they need to return the right response.
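
To make the "extra steps" concrete, here is a hedged sketch of a server-side cache key for the user preference case. The `USER_LANGUAGE` table and the `cache_key_for` function are hypothetical. The point is that the server must first resolve the implicit state before it can consult the cache.

```python
from typing import Dict

# Hypothetical server-side store of each user's preferred language.
USER_LANGUAGE: Dict[int, str] = {
    4567: "french",
    4568: "english",
    4569: "french",
}


def cache_key_for(path: str, user_id: int) -> str:
    """Resolve the implicit user state, then build the key from it."""
    # Extra step: look up the state that the request did not carry.
    language = USER_LANGUAGE[user_id]
    # Key on the resolved state rather than the user identifier, so users
    # who share a language can share a cache entry.
    return f"{path}?language={language}"


key_french_a = cache_key_for("/product/1234", 4567)
key_french_b = cache_key_for("/product/1234", 4569)
key_english = cache_key_for("/product/1234", 4568)
```

An edge server has no access to a table like `USER_LANGUAGE`, which is why it cannot compute this key on its own.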

Level 4: Session Specific Data

The response data may look different based on the user session performing the request.

Session Preferences: The server stores user preferences based on their session.
Example: A user logs in twice to the same server using two different applications. The user configures one application in French and the other in English, then performs the request http://server.com/product/1234 on both applications and gets a different result from each.

Note: this can be avoided by making the client application track the selected language state.

Caching those responses is as difficult as caching those of level 3.

Level 5: Unpredictable/Random Behavior

The response may look different even when the request is executed with the same parameters, by the same user, in the same session.

Random Behavior: The behavior isn't predictable.
Example: An API that returns a random number.

Lower is better

With those levels, the goal is to reach the lowest possible level to gain the benefits of statelessness while satisfying the application requirements.

I hope you'll find them as useful as I did.

Tuesday, December 31, 2013

5 Levels of REST API Response Statefulness