drunomics: Why we don't use GraphQL
At drunomics we are building decoupled Drupal sites for more than five years. During this time, GraphQL has always been a popular choice for decoupled Drupal sites among professional or enterprise projects, thanks to the well maintained GraphQL contrib module. Still, I've vetted against using GraphQL for various enterprise projects, even though sometimes it was appealing to customers. In this blog post, I'd like to summarize why we don't use GraphQL:
General complexity
GraphQL is not only a new query language to learn for both frontenders and backenders, moreover the backend has to support any kind of queries the frontenders make. On the frontend side of things, additional libraries and tooling is needed to handle the protocol.
Loose contracts
GraphQL gives a lot of power to frontend developers, but that comes with a huge price: No defined or a very loosely defined contract, i.e. the data model or more specifically the GraphQL schema layered on top. Based upon this loose contract the frontend may compose any kind of queries, which the backend has to support. What leads to the next point:
Complex queries
When the backend is exposing the Drupal data schema directly, potentiallly a lot of things become leaked unwanted and changing things might became hard, because: Who knows what data properties the frontend uses and queries for? It's quite hard to optimize for every use-case.
However, the backend may compose it's own GraphQL schema and provide exactly the data model as needed by the client, the frontend. That's indeed, a great option to have, but it requires additional work and code to translate between the schema and the real data model behind. It makes it possible to change the underlying data model and schema mapping, while staying with the same or compatible GraphQL output and schema. But is that code performing the mapping performant enough? Does it work correctly? That's quite hard to tell without knowing exactly the queries one has to optimize and test for. So things are or become complex.
Performance
First of all, GraphQL is bad for caching since it makes use of POST requests. The typical work-around is to use shortened, hashed queries and to access them via GET requests, what can help to mitigate the issue. But this comes at the cost of tying the deployed frontend and backend versions, thus increasing overall system and deployment complexity. That way, the main GraphQL advantage - flexibility at the frontend - gets lost. So not an easy or great compromise to make.
Client driven data fetching
With GraphQL, the web browser (or generally the client) sends a query to the server, specifying the exact data it needs. While this can help to reduce payload size, it puts the client in the "driving seat". That often leads to additional round trips being required: Based upon the first request, often additional data is required for rendering it. This additional data often has to be fetched in additional requests, thus requiring another or multiple round-trips to the server and thus increasing latency.
In contrast, when the server is in the "driving seat", it may efficiently do all queries and resolve additional data, and then send the resulting data over the slower network once.
Security
GraphQL queries can expose sensitive data if not properly secured. This can be mitigated by implementing proper authentication and authorization mechanisms. However, this can get very complex easily: Since the server does not know the queries needed by the client, it needs to handle every possible combination a client may request. Unfortunately, it's commonly rather easy for hackers to purposely write computationally very expensive (GraphQL) queries and to send them to the server, thus opening the door for DDOS or even DOS attacks.
Besides that, due to the complexity of the backend having to cover all possible combinations, the danger for data leaking accidentially becomes rather high.
The conclusion
GraphQL comes with a couple of issues, which are - as usual - solvable. That's a price one might want to pay in certain situations, if the benefits are worth it. Thus, is using GraphQL a good idea? As so often, it depends. But in my experience, it's more often not, than it is.
Alternatives are RESTful
The typical alternative to GraphQL is a RESTful API. As usual, with Drupal there are a couple of good options:
- Drupal comes with the JSON-API out-of-the box, which is a great feature to have. While it's good fit in certain situations, it also faces some of the issues mentioned above, most notable "Client driven data fetching" and "Loose contracts".
- Developers may use Drupal's API to provide custom-coded RESTful endpoints for the client. That addresses all mentioned concerns, but requires backend development time for every feature and most notable careful planing. This comes with the downside of frontend developers loosing the flexibility. (By the way, this is what GraphQL is loved for!)
- Configurable RESTful endpoints. In order to improve the development process and gain flexibility in the frontend, we developed a solution for providing custom RESTful endpoints that are configurable via Drupal, by frontend developers. For that, we improved the Custom Elements module, which is part of Lupus Decoupled Drupal, such that it integrates with Drupal's configuration sytem and provides an UI for customizing output by entity view-mode. That way, in many situations, we can tick all the boxes, while enabling the frontend developer to work efficiently. I'll share more details about the new Custom Elements UI in a dedicated blog post later this week.