The real story behind Apollo GraphQL

An interview with Apollo CEO Geoff Schmidt, diving deeper into the origins of the supergraph.

Following my recent post reminiscing about the origins of Apollo GraphQL and imagining its future, Apollo CEO Geoff Schmidt reached out to me to provide some color on the real story behind his life’s work. Geoff confessed that my mention of the Distributed Data Protocol made him nostalgic, and he proceeded to explain Apollo GraphQL’s background in fascinating detail. What follows is an interview edited from our original conversation.

Sam Hatoum, Xolv.io: In my post I presumed that, given Meteor’s expertise in distributed data, you, Nick Martin, Matt DeBergalis, and your team saw great potential in expanding on the original GraphQL spec. What was it exactly that inspired you guys to start Apollo GraphQL?

Geoff Schmidt, Apollo GraphQL: Apollo began life as the Meteor 2 data system, designed from all of the learnings from Livedata, which ultimately could be summarized as: there needed to be an abstraction layer in between the client and the server, rather than embedding MongoDB queries in the client. There was a diverse range of drivers for that, from the need for first-class SQL support, through the demand we were seeing in the enterprise to connect Meteor to a wide range of backends (not just databases but existing message buses, APIs, etc.), to reducing the tightness of the coupling between the frontend and the backend.

We noticed devs preferred embedding the data dependencies of a UI component in the Blaze template rather than creating a new Distributed Data Protocol subscription on the backend, which was starting to cause friction in larger teams. We also saw that Meteor was being used in larger and larger scale apps and needed to keep pace with that, and while we never met a Meteor app that we couldn't make scale, sometimes it took a PhD in Meteor to do it.

At the same time, we saw how React was growing faster than Blaze because it was incrementally adoptable into existing enterprise apps and we saw the failure of Cordova/Phonegap to capture meaningful market share for AAA mobile apps. We also noticed the strong desire of the community to reuse the same DDP layer across not just their webapps but their native mobile apps as well.

SH: Interesting that Apollo is the Meteor 2 data system. How did your learnings from Meteor inform the vision for Apollo?

GS: Our goals for Apollo were very clear right from the start. The ambition we had was to create a next generation data system that (1) worked with any new or existing backend, be it database, API, or message bus, (2) worked with any frontend—web, native mobile, or IoT—(3) could scale to teams of thousands of engineers, and (4) could scale to the world's largest websites (e.g. power the front page of the New York Times, which Apollo in fact now does).

Another goal we had was to make this data system available in a form that could be incrementally adopted into existing enterprise apps—and prove that by making it available as a package that could be adopted separately from the rest of Meteor.

SH: What needed to be done differently to achieve those goals?

GS: We admitted to the superiority (at least in the enterprise) of what I'd been calling the "Derby model". DerbyJS was an early competitor to Meteor. It made you create an abstract schema representing your data, and then you could map that schema back to specific databases. This had the conceptual promise (at least) that you could switch your backend from SQL to MongoDB and back again without having to rewrite much of your code, because you were writing against this abstract schema through an ORM.

In the original Meteor design, I took the opposite position, i.e., no abstraction layer, and instead adopted an isomorphic approach where clients could write actual MongoDB queries against the actual MongoDB schema, mediated by a normalized client-side cache/DB emulator. The idea was that SQL would be supported the same way, so you'd write actual SQL on the frontend.

The reason for this was to expose the actual capabilities of the database. SQL databases, NoSQL databases, graph databases, etc., all can do fundamentally different things, and I thought that by putting the database behind an abstraction layer, your database would be reduced to the lowest common denominator. That isomorphic architecture turned out to be a great way for a small team to write a greenfield app super quickly.

However, we were seeing that the tight coupling it created between frontend and backend made enterprise development harder than it needed to be. If you are integrating with 20 different enterprise backend services you don't want to have to know the technology that powers each one. It is then worth investing in a Derby-style abstract schema, even though that's more effort and complexity up front.

Similarly, and also to accommodate larger and more complex apps, we would move UI component data dependency specification from the backend (a DDP "publication") into the UI components, by letting the frontend send a query over the wire to the backend. The reason we didn't do this in the first place was security—I thought it would be hard to convince people to use Meteor if the client was sending arbitrary queries to the server. The DDP publication model let us say that Meteor's security model was fundamentally no different from REST's—clients can only run queries that have effectively been explicitly whitelisted by the server by creating a publication.
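The shift Geoff describes—letting a UI component declare its own data needs instead of relying on a server-defined publication—is exactly what a client-sent GraphQL operation looks like in practice. A minimal illustrative sketch (the types and fields here are hypothetical, not from Meteor's or Apollo's actual schemas):

```graphql
# A query a UI component might colocate with its view code and send
# over the wire, replacing a server-defined DDP publication.
# Whitelisting can still be enforced server-side, e.g. by only
# executing operations the server has previously registered.
query CommentThread($postId: ID!) {
  post(id: $postId) {
    title
    comments(first: 20) {
      author {
        name
      }
      body
      postedAt
    }
  }
}
```

The component owns its data dependencies, while the server retains the ability to restrict which operations it will execute—the "query whitelisting" tooling Geoff mentions.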

This was in contrast to the model used by Asana's Luna framework, which used "cosimulation" to compute data dependencies—the server had a copy of the client's code, and would effectively walk the DOM to determine what data the client would need. That had both performance and security challenges, so we went with a much more conservative model for Meteor because I didn't want to take on securely executing arbitrary client-side queries on top of all of the other technology we were building. But we discovered that as codebases get bigger, that tight coupling between frontend and backend becomes more of a problem. So we decided to bite the bullet and build a general purpose query planner that was secure and performant enough to accept arbitrary client-side queries, and the needed tooling around it such as query whitelisting.

SH: When did GraphQL come into the picture?

GS: We already knew that our new data engine needed both an abstract schema definition language and a query language against that abstract schema. I had started on an in-house design, but we ultimately decided to use GraphQL as the query language for Apollo, even though at the time it lacked some things that I considered critical features. We wanted to build a broad user base quickly, and we thought that if we made Apollo speak GraphQL, it would help us market it to React users who were being told by Facebook that GraphQL was the future.

SH: I remember the initial enthusiastic reaction of the dev world when you launched Apollo GraphQL. What do you think were the reasons behind that?

GS: Apollo quickly took almost 100% "GraphQL market share" because what people were really looking for wasn't a new query language, but a new distributed data engine. And our data engine, built from all of the learnings of Meteor, was exactly what was needed, independent of the syntax used to query it. By contrast, Relay repeated some of the same mistakes that Meteor made early on—it had some cool computer science but you needed a PhD in Relay to understand how it worked, and you couldn't connect it to arbitrary backend systems because it made strong and rigid assumptions, such as a global namespace of IDs, to make its magic possible.

Interestingly, from speaking with FB engineers before the release, I believe that FB's original plan was to release Relay without GraphQL because they thought that the valuable part was the normalized cache algorithms in Relay, while the proprietary “GraphQL” syntax used to talk to the FB backend to feed that cache was an awkward implementation detail. I was told that they were working on a REST binding of Relay, but at some point they evidently abandoned that and decided to document the GraphQL spec instead.

We then ultimately wrote the Apollo Federation spec to put the most critical missing pieces from our design back into GraphQL, and I think the success of Federation really vindicates the idea that those elements of our original design (and of DDP's!)—such as a robust concept of primary keys—are important to a workable system. At the same time, I do think that adopting GraphQL as the query language, despite its pre-Federation limitations, was hugely helpful in bootstrapping a broad and diverse community.
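For context on what "a robust concept of primary keys" looks like in Federation: subgraphs mark entity types with a `@key` directive naming the fields that identify an entity, so the router can join data for the same entity across services. A minimal illustrative sketch (type and field names are hypothetical):

```graphql
# users subgraph: declares User as an entity keyed by "id"
type User @key(fields: "id") {
  id: ID!
  name: String!
}

# reviews subgraph (a separate schema file in practice):
# contributes fields to the same User entity, joined by its key
type User @key(fields: "id") {
  id: ID!
  reviews: [Review!]!
}

type Review {
  id: ID!
  body: String!
  author: User!
}
```

The key plays the role a primary key plays in a database—or that document IDs played in DDP—giving the system a reliable identity to stitch the distributed graph together.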

SH: The supergraph concept is an independent innovation. When did you come up with that?

GS: The supergraph vision actually goes back to DDP, and somewhere deep in the archives there are designs for DDP v2 with all sorts of cool features that would eventually show up in Apollo. In fact, the pitch we gave to our investors Andreessen Horowitz and Matrix back in 2016 to justify the initial Apollo OSS investment was very similar to our current supergraph vision.

SH: Looking back at your open-source journey, what did you learn, and how have things changed for you over all these years?

GS: One of our biggest learnings over the last decade has been coalition building, and patience—how to divide a big vision into small pieces, and how to use building a broad coalition around each piece, establishing it as the industry standard, as the gate to advance to the next piece. That's slowed down our pace of technological innovation compared to the Meteor days, but enormously broadened the adoption of the platform. It's also created a situation where we have access to an essentially unlimited amount of capital to build the next great thing, even in the current macroeconomy. It's really wonderful to see that patience pay off and to see us start to be able to get some more pieces of the bigger supergraph vision on the board.


A big thank you to Geoff for telling that origin story in such great detail!

I had an intuition that the spirit of DDP was instilled in the supergraph. I was once told by an art enthusiast friend that every artist has a center, and you can immediately recognize it in all their work once you see it. Before speaking to Geoff, my assumptions about Apollo’s origins were close to the mark, but I was truly fascinated by his corrections and insider details.

As someone passionate about the product development lifecycle, I’m impressed by Apollo’s innovative vision that took the API world by storm, and the patience, confidence and execution that it took to create a platform that is now a vital part of modern development stacks.

Let me know if you have any questions or thoughts in the comments below.


Let us help you on your journey to Quality Faster

We at Xolv.io specialize in helping our clients get more for less. We can get you to the holy grail of continuous deployment where every commit can go to production — and yes, even for large enterprises.

Feel free to schedule a call or send us a message below to see how we can help.
