GraphQL and the N+1 Problem: Solving it with DataLoader
It’s 2018, and everyone is moving from REST to GraphQL. The promise is enticing: "Fetch exactly what you need, and nothing more." But there's a performance trap waiting for the unwary: the N+1 query problem.
What is the N+1 Problem?
Imagine a query to fetch 10 posts and their authors:
query {
  posts {
    title
    author {
      name
    }
  }
}
A naive resolver implementation would run the following queries:
- One query to fetch all posts (1 query).
- For each post, a separate query to fetch the author (N queries).
With the 10 posts above, that is 11 database queries for a single request. With 100 posts, it is 101!
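To make the count concrete, here is a minimal sketch of the naive pattern. The in-memory `db` object and its `findPosts` / `findUserById` helpers are hypothetical stand-ins for a real data layer, and the `queryCount` field exists only to show how many queries are issued.

```javascript
// Hypothetical in-memory "database" that counts every query it receives.
const db = {
  queryCount: 0,
  users: [{ id: 1, name: 'Ada' }, { id: 2, name: 'Grace' }],
  posts: [
    { title: 'First', authorId: 1 },
    { title: 'Second', authorId: 2 },
    { title: 'Third', authorId: 1 },
  ],
  async findPosts() {
    this.queryCount += 1;
    return this.posts;
  },
  async findUserById(id) {
    this.queryCount += 1;
    return this.users.find((user) => user.id === id);
  },
};

// Naive resolution: one query for the list, then one query per post.
async function resolvePostsNaively() {
  const posts = await db.findPosts(); // 1 query
  return Promise.all(
    posts.map(async (post) => ({
      title: post.title,
      author: await db.findUserById(post.authorId), // N queries
    }))
  );
}

resolvePostsNaively().then((result) => {
  console.log(`${result.length} posts resolved with ${db.queryCount} queries`);
});
```

With three posts this sketch issues four queries, and the author with id 1 is fetched twice.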
The Solution: DataLoader
Facebook released DataLoader, a library that uses batching and caching to solve this. Instead of fetching each author individually, DataLoader "waits" until the end of the current event loop tick, collects all the IDs requested during that tick, and passes them to your batch function in a single call, which can then fetch them with one query.
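const DataLoader = require('dataloader');

// 1. Create a batch loading function.
// `db` is assumed to be your own data-access layer with a MongoDB-style query API.
const batchUsers = async (ids) => {
  console.log(`Fetching IDs: ${ids}`); // Should log once per batch, not once per post!
  const users = await db.users.find({ id: { $in: ids } });
  // Important: return exactly one entry per requested ID, in the same order as `ids`.
  // Index the rows once instead of scanning the array for every ID.
  const usersById = new Map(users.map((user) => [user.id, user]));
  return ids.map((id) => usersById.get(id) || null); // null for IDs that were not found
};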
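
// 2. Instantiate the loader per request, for example in your server's context
// function, so cached users are never shared between requests.
// `createContext` is an illustrative name; wire it into your server's context option.
const createContext = () => ({
  userLoader: new DataLoader(batchUsers),
});

// 3. Use it in your resolver through the context argument
const resolvers = {
  Post: {
    author: (post, args, context) => context.userLoader.load(post.authorId),
  },
};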
How it Works Under the Hood
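In Node.js, DataLoader schedules the batch function with process.nextTick(), queued behind an already-resolved Promise so that pending promise callbacks get a chance to call load() first. When userLoader.load(id) is called: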
- It pushes the ID into a queue.
- It returns a Promise.
- Once the current execution stack is empty, the batch function is called with all queued IDs.
- The individual Promises are resolved with the results from the batch.
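The steps above can be sketched with a deliberately simplified loader. `TinyBatchLoader` is a hypothetical class written for illustration, not DataLoader's actual source: it calls `process.nextTick()` directly and leaves out caching, options, and result validation.

```javascript
// Simplified batching loader: queue -> Promise -> dispatch -> resolve.
class TinyBatchLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.queue = []; // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      // Schedule one dispatch when the first key of a new batch is queued.
      if (this.queue.length === 0) {
        process.nextTick(() => this.dispatch());
      }
      this.queue.push({ key, resolve, reject });
    });
  }

  async dispatch() {
    // Take ownership of the current queue so later load() calls start a new batch.
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((entry) => entry.key));
      // Results are matched to callers by position, hence the ordering rule.
      batch.forEach((entry, index) => entry.resolve(values[index]));
    } catch (error) {
      batch.forEach((entry) => entry.reject(error));
    }
  }
}

// Usage: three load() calls in the same tick produce a single batch call.
const batchCalls = [];
const loader = new TinyBatchLoader(async (keys) => {
  batchCalls.push(keys);
  return keys.map((key) => `user-${key}`);
});

Promise.all([loader.load(1), loader.load(2), loader.load(3)]).then((users) => {
  console.log(users, `batch calls: ${batchCalls.length}`);
});
```

Because results are handed back by index, the sketch also shows why a batch function has to return values in the same order as the keys it received.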
Memoization Cache
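DataLoader also has a built-in cache. If you request the same author multiple times in the same request, it will return the cached Promise immediately without calling the batch function again for that ID. The cache lives on the loader instance, which is why the loader should be created per request rather than shared across users.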
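The idea of caching the Promise itself, rather than the resolved value, can be illustrated with a small sketch. `createCachedLoad` and `fetchOne` are hypothetical names used only here, and this is a stand-in for the concept, not DataLoader's implementation.

```javascript
// Simplified promise memoization keyed by ID.
function createCachedLoad(fetchOne) {
  const cache = new Map(); // key -> Promise of the value
  return function load(key) {
    if (!cache.has(key)) {
      // Store the Promise right away so concurrent callers share one fetch.
      cache.set(key, fetchOne(key));
    }
    return cache.get(key);
  };
}

let fetchCount = 0;
const loadUser = createCachedLoad(async (id) => {
  fetchCount += 1;
  return { id, name: `user-${id}` };
});

const first = loadUser(7);
const second = loadUser(7);
console.log(first === second); // both calls receive the same Promise object
Promise.all([first, second]).then(() => {
  console.log(`fetches: ${fetchCount}`);
});
```

Storing the Promise means a second request for the same ID is served even while the first fetch is still in flight.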
In 2018, building a GraphQL API without DataLoader is like driving a car with the handbrake on. It’s an essential tool for any production-grade backend.