Rethinking an API for Cacheability

At the beginning of anything new, there is just so much uncertainty. The first versions of an API built to support a growing, yet-to-be-defined product, can easily be built on top of quicksand. You’re doomed to make mistakes, but if you overcome those errors and get something tangible you may get the second chance: redoing, but better.

Redoing gives you the possibility of starting with the favorable, unfair, advantage of knowing (a little) about what doesn’t work. You may not know precisely what works, but you know a few things that doesn’t.

And this is amazing for your API.

I recently wrote about our API migration from REST/JSON to GraphQl at letsevents.com.br. A migration between different API query protocols is an opportunity for redesigning your API as a whole. If you limit it to a protocol translation you’re not using all the additional knowledge acquired the first time and wasting resources. By the way, if you decide to redo something, make sure you do it clearly better to compensate at least for the loss in market time.

Our original API grew like a little cute monster — as most software tend to do. As new features needed to be implemented, stuff was added to the API, on top of what was already there. Fields. Endpoints. Relationships. More fields.

As the figures in a geometric pattern from a coloring book, there was no explicit starting point, no well defined way things were supposed to communicate, although they were somewhat correlated.

Our REST API as a geometric pattern from a coloring book
It is quite hard explore an API like this for the first time without reading detailed, boring documentation.

Fresh start: what to return?

In the API world we looked at that question in terms of data vs. policies. Policies are a subset of the whole business logic, which somehow includes also your data, or, at least, the way you model and validate it. Our APIs were full of policies and by rethinking our design we realized these were actual prejudicial.

Our business domain talks about Lists of people who attend to Events. On every List, we have many Invitations, which roughly represents the attendance of a person to an Event through a List. We have a logged in user (Viewer) that may be allowed or not to delete any one of the invitations in each list of an event. This question can be answered with an authorization policy. This would be calculated, server-side, for each invitation in an event (which may have thousands of them).

For simplicity, let’s say we could calculate this policy considering data about the Event (it may be owned by the viewer), the List (it may have been created by the viewer), or the Invitation (he may be either the invitee, inviter or both). Since we had thousands of invitations, performance could become an issue, but we could cache these calculations to prevent repeating them every time, right?

Kind of.

Policies usually depend on who is asking. Caching would need to happen on a per-viewer basis, and given our permission model includes multiple staff members for the event, caching could help but wouldn’t be really efficient, performance and memory wise.

But if our API would only provide the final products (data) we have built, instead of policies, caching would become so much easier. Caching Invitation responses could be as trivial as handling a few timestamps, as it doesn’t depend on the Viewer anymore. It’s up to our clients to calculate the policy themselves from the raw data returned by the API, but this is hardly a problem as most policies can be expressed as simple boolean equations.

Having the data may be another issue, specially if you have an api that raises rigid walls within your data. But once you have the required data, you’re good to go.

Policies are just one example of information an API may return that depends on multiple contexts. Whenever one such calculation can also (safely) be done client side, you should consider it. Obviously, delegating the calculation of policies to the client side doesn’t mean your server doesn’t need to enforce them, whenever a client request an operation on your data.

Originally posted on Medium
https://medium.com/@samuelbrando/rethinking-an-api-for-cacheability-3a4f7910f9dc

Endpoints raise rigid walls within your data

Relationships allow you to build amazing products and services. Don’t let your API get in the way.

We were quite happy with our REST api at letsevents.com.br, until we needed to render a lot of data on a single request, which was quite painful performance-wise. We were in our way to break down the data requirements by analyzing scenarios and tailoring requests to the essential information. From the webapp perspective, we would define what data was needed for each use-case, so that we could gradually request it in tiny little pieces, on demand.

… but …

Endpoints raise rigid walls within your data.

And we used to need those walls. They helped us reason about the server-client data flow, limit the context of authorization policies, name things. They helped us limit the exact quality that makes our data special: the relationships within.
Oops..!

With a resource-centric api, achieving a granular data retrieval logic often requires multiple requests. You would be happy if you could get away with requests you can paralelize. But the truth is harsh and on most cases they’re quite interdependent. You know, since your data is full of relationships.
So we acknowledged that going granular increased our client complexity, and perceived performance would also suffer.

We realized we had to change focus: to write an API that expands possibilities, allowing clients to name new relationships, extracting information we did not knew we had. The cool thing is, facebook had already thought about this and developed Graphql.

In less then a week we’ve made a proof of concept implementation of a GraphQL api using the amazing ruby gem, and had most of our REST api translated (oops!). We’re talking about implementing roughly the requirements that took our engineering team months do develop.

Adapting our Backbone.js based client to use the new api was easy and straightforward: we just needed to redefine the Backbone.sync behavior and tweak some variables. Nothing major.

But translating is not enough. We’re now rethinking the ways clients interact with our data given the new capabilities brought by GraphQL. In this process we’ve learned a few more things, like how nice to have a self-descriptive api, with an amazing interactive console, and not having to write or maintain so much documentation.
I’ll let you know how that goes any time soon :)

Originally posted on Medium
https://medium.com/@samuelbrando/endpoints-raise-rigid-walls-within-your-data-8f47c0bc2667