1. GraphQL Basics
1.1 Ideas
GraphQL is a language
GraphQL is a query language with its own syntax, just as other programming languages have theirs:
query {
  author(id:1) {
    id
    firstName
    lastName
  }
  post(id:1) {
    id
    title
  }
}
The backend server needs to parse this query string and handle the request.
Usually an HTTP POST request
GraphQL requests are usually sent as an HTTP POST request, with a body like:
{
"query": "mutation {\n upvotePost(postId:1) {\n id\n title\n author {\n id\n firstName\n lastName\n }\n votes\n }\n}",
"variables": null
}
curl:
$ curl -X POST -H 'Content-Type: application/json' -d '{"query":"mutation {\n upvotePost(postId:1) {\n id\n title\n author {\n id\n firstName\n lastName\n }\n votes\n }\n}","variables":null}' http://localhost:3000/graphql
GraphQL is a string when making a query
As you can see in the previous example, GraphQL is a custom language, and its query is sent as a string inside the POST JSON body (field query).
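The variables field is null in the examples above because the argument is written inline in the query. As a minimal sketch (the helper name buildRequestBody is hypothetical and not part of any library), the same mutation can declare a $postId variable and pass the value through variables, so the client does not have to build the query by string concatenation:

```typescript
// Hypothetical helper: wraps a GraphQL document and its variables into the JSON POST body.
interface GraphQLRequestBody {
  query: string;
  variables: Record<string, unknown> | null;
}

function buildRequestBody(query: string, variables: Record<string, unknown> | null = null): string {
  const body: GraphQLRequestBody = { query, variables };
  // JSON.stringify escapes the newlines of the query as \n, as seen in the curl example.
  return JSON.stringify(body);
}

const upvoteMutation = `mutation ($postId: Int!) {
  upvotePost(postId: $postId) {
    id
    title
    votes
  }
}`;

const requestBody = buildRequestBody(upvoteMutation, { postId: 1 });
console.log(requestBody);
```

The resulting string is what goes into the -d argument of curl or the body of any HTTP client.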
GraphQL is explicit
This means you need to write down every field you want in your request; nothing is implied.
Wrong:
query {
  friend {
    # client: I guess you know what I want, no?
    # server: NO!
  }
}
Correct:
query {
  friend {
    # client: Hey dude, I need A B C D ...
    # server: Sure!
    name
    age
    hobby
  }
}
1.2 Example
Let’s walk through an example to get familiar with a GraphQL implementation. I won’t cover much of the basic syntax in this post; you can find it here: https://graphql.org/learn/.
graphql-tools
GraphQL is a design / concept, initially created by Facebook:
- Official site: https://graphql.org/
- Spec: https://spec.graphql.org/
So GraphQL itself is NOT AN IMPLEMENTATION. When you develop a real app, you have to choose an implementation / tool / library.
Apollo is one choice, and the pioneer in this area. graphql-tools is another, and is more developer / project friendly.
In short, Apollo wants to build an ecosystem, from the frontend client to the backend, with all the tools provided by their company, and it has a paid service. On the other hand, graphql-tools is a set of tools that helps you develop GraphQL backends (mostly your schema). It won’t limit you: you can even choose your favorite web server (express, koa, or others) without any restriction.
So in this example, I will show you how to use graphql-tools.
Types
All your models, APIs, and other definitions are treated as typeDefs by graphql-tools. You can put them in a single *.graphql file or in multiple files.
type Author {
id: Int!
firstName: String
lastName: String
"""
the list of Posts by this author
"""
posts: [Post]
}
type Post {
id: Int!
title: String
votes: Int
"""
the Author of this post
"""
author: Author
}
# the schema allows the following query:
type Query {
posts: [Post]
post(id: Int!): Post
author(id: Int!): Author
}
# this schema allows the following mutation:
type Mutation {
upvotePost(postId: Int!): Post
}
You only need to pay attention to:
- Query:
  - It’s not an ordinary model type; it’s the root entry point of all the query APIs
  - Put all your GET RESTful APIs here
  - This area is the core of the GraphQL server
- Mutation:
  - Same as above, it’s the root entry point of all the mutation APIs
  - Put all your POST / PUT / DELETE RESTful APIs here
  - This area is not very GRAPHQL WAY and is a bit ugly, e.g. a lot of updateXXX, deleteXXX, etc.
Resolvers
resolvers are the actual business logic handlers in a GraphQL backend. Each one answers the query for a specific field of a type and returns the data the client needs.
The signature of a resolver is fixed:
type someResolver = (source: undefined, args: Record<string, any>, context: YourContextType, info: GraphQLResolveInfo) => Promise<SomeGraphqlType>;
- source:
  - if the current resolver is a root level resolver, this source is always undefined
  - if the current resolver is a sub-resolver, it gets the parent resolver’s response as source
  - see the example below: Author.posts or Post.author
- args: the GraphQL arguments passed to the field
- context: the context object initialized at GraphQL server startup; the details are covered later
- info: runtime details about the current request; in most cases you won’t need this
Here is an example. First, the definition part in TypeScript:
export interface Context {
postDb: PostDB;
authorDb: AuthorDB;
}
export interface RootResolvers<TSource = undefined, TArgs = Record<string, any>, TContext = Context, TInfo = GraphQLResolveInfo> {
Query: {
posts: (source: TSource, args: TArgs, context: TContext, info: TInfo) => Promise<Post[]>;
post: (source: TSource, args: { id: number }, context: TContext, info: TInfo) => Promise<Post>;
author: (source: TSource, args: { id: number }, context: TContext, info: TInfo) => Promise<Author>;
};
Mutation: {
upvotePost: (source: TSource, args: { postId: number }, context: TContext, info: TInfo) => Promise<Post>;
};
Author: {
posts: (source: Author, args: TArgs, context: TContext, info: TInfo) => Promise<Post[]>;
};
Post: {
author: (source: Post, args: TArgs, context: TContext, info: TInfo) => Promise<Author>;
};
}
Second, the implementation in TypeScript:
export default {
Query: {
posts: async (_o, _a, ctx) => ctx.postDb.findAll(),
post: async (_, { id }, ctx) => ctx.postDb.findById(id),
author: async (_, { id }, ctx) => ctx.authorDb.findById(id)
},
Mutation: {
upvotePost: async (_, { postId }, ctx) => {
const post = await ctx.postDb.findById(postId);
if (!post) {
throw new Error(`Couldn't find post with id ${postId}`);
}
post.votes += 1;
return post;
}
},
Author: {
posts: async (source, _args, ctx) => ctx.postDb.findByAuthorId(source.id)
},
Post: {
author: async (source, _args, ctx) => ctx.authorDb.findById(source.authorId)
}
} as RootResolvers;
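The most important point is: for every type you define in the schema, you have to provide resolvers for the fields whose data is not already present on the object returned by the parent resolver (here Author.posts and Post.author), so that the server can give a response.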
Context
During the server startup phase, the context is initialized and provided to the GraphQL server. Commonly it contains facilities such as the DB or the cache.
const context = {
postDb: new PostDB(),
authorDb: new AuthorDB()
} as Context;
app.use(
'/graphql',
graphqlHTTP({
// ...,
context,
// ...
})
);
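Because resolvers reach the databases only through the context, a resolver can be exercised without any HTTP server by handing it a stub context. The following is a minimal sketch under that assumption: the types are simplified stand-ins for the ones in this post, and FakePostDB with its sample post is hypothetical.

```typescript
// Simplified stand-ins for the post's types; FakePostDB is a hypothetical test double.
interface Post {
  id: number;
  title: string;
  votes: number;
}

interface PostStore {
  findById(id: number): Promise<Post | undefined>;
}

interface Context {
  postDb: PostStore;
}

// Same logic as the upvotePost resolver above, with the unused info argument omitted.
const upvotePost = async (_source: undefined, args: { postId: number }, ctx: Context): Promise<Post> => {
  const post = await ctx.postDb.findById(args.postId);
  if (!post) {
    throw new Error(`Couldn't find post with id ${args.postId}`);
  }
  post.votes += 1;
  return post;
};

class FakePostDB implements PostStore {
  private readonly posts: Post[] = [{ id: 1, title: 'Introduction to GraphQL', votes: 2 }];

  public async findById(id: number): Promise<Post | undefined> {
    return this.posts.find((p) => p.id === id);
  }
}

// No HTTP server is involved: the resolver only sees what the context gives it.
const stubContext: Context = { postDb: new FakePostDB() };
upvotePost(undefined, { postId: 1 }, stubContext).then((post) => console.log(post.votes)); // prints 3
```

Swapping the real PostDB for a stub like this is one reason to keep all facilities on the context instead of importing them inside resolvers.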
Startup
For server startup, it’s just like a normal express server startup. We use express-graphql as an additional middleware to handle all the GraphQL requests.
You need to prepare the schema of your GraphQL server, then provide it to the middleware.
Example of building the schema:
import * as LibPath from 'path';
import { makeExecutableSchema } from '@graphql-tools/schema';
import { mergeResolvers, mergeTypeDefs } from '@graphql-tools/merge';
import { loadFiles } from '@graphql-tools/load-files';
const makeSchema = async () => {
const typesSourceStrArr = await loadFiles(LibPath.join(__dirname, 'types/**/*.graphql'));
const types = mergeTypeDefs(typesSourceStrArr);
const resolversSourceStrArr = await loadFiles(LibPath.join(__dirname, 'resolvers/**/*.js'));
const resolvers = mergeResolvers(resolversSourceStrArr);
return makeExecutableSchema({
typeDefs: types,
resolvers
});
};
export const schemaPromise = makeSchema();
It reads all the types/**/*.graphql type files and all the resolvers/**/*.js resolver files, then makes a schema. Of course, if you have more, like fragments or directives, they should all be handled here.
Startup:
const app = express();
const schema = await schemaPromise;
app.use(express.json());
morgan.token('body', (req, res) => JSON.stringify((req as any).body));
app.use(morgan('[:date[clf]] :method :url :status :req[content-length] :body'));
const context = {
postDb: new PostDB(),
authorDb: new AuthorDB()
} as Context;
// eslint-disable-next-line @typescript-eslint/ban-ts-comment
// @ts-ignore
app.use(
'/graphql',
graphqlHTTP({
schema,
context,
graphiql: true,
customFormatErrorFn: (error) => {
console.log(error);
return {
message: error.message,
locations: error.locations,
stack: error.stack ? error.stack.split('\n') : [],
path: error.path
};
}
})
);
app.listen(3000, () => {
console.info('Listening on http://localhost:3000/graphql');
});
Further
Further, we need to talk more about the GRAPHQL WAY.
Say we have several APIs:
- GET /post/:id returns the Post data without detailed author data
- GET /post/:id/author returns the Author detail according to the post id
- GET /post/:id/all returns the Post data with detailed author data
- GET /author/:id returns the Author data without the posts list
- GET /author/:id/posts returns the posts list according to the author id
- GET /author/:id/all returns the Author data with the posts list
As you can see, when a requirement needs to join multiple types in one API call, a RESTful API becomes a bit verbose.
Next, let’s see how GraphQL handles such a case. To keep it short, let’s focus on the Author case.
We have defined Author in TypeScript and GraphQL:
export interface Author {
id: number;
firstName: string;
lastName: string;
posts?: Post[];
}
type Author {
id: Int!
firstName: String
lastName: String
posts: [Post]
}
According to the client’s query:
- query { author(id:1) { id firstName lastName } } is the same as GET /author/:id
- query { author(id:1) { posts { id title votes } } } is the same as GET /author/:id/posts
- query { author(id:1) { id firstName lastName posts { id title votes } } } is the same as GET /author/:id/all
It’s really flexible, right? But we need to see how the backend achieves this. See the posts resolver below, which belongs to the Author type:
export default {
Author: {
posts: async (source, _args, ctx) => ctx.postDb.findByAuthorId(source.id)
},
};
In addition, have a look at the DB:
export class AuthorDB {
private readonly authors: Author[];
constructor() {
this.authors = [
{ id: 1, firstName: 'Tom', lastName: 'Coleman' },
{ id: 2, firstName: 'Sashko', lastName: 'Stubailo' },
{ id: 3, firstName: 'Mikhail', lastName: 'Novikov' }
] as Author[];
}
public async findById(authorId: number): Promise<Author | undefined> {
return Promise.resolve(find(this.authors, { id: authorId }));
}
}
As you can see, the Author data stored in the DB has no posts-related info. It means the Author returned from the DB has no posts attached.
Actually, when the client adds posts under the Author query, the flow when the Query.author request arrives at the server is:
- it first goes to resolvers.Query.author to retrieve the target Author from the DB (which has no posts)
- then it goes to resolvers.Author.posts to retrieve the posts accordingly (the sub-resolver receives the Author data as source, as mentioned above)
The Author.posts resolver means: the resolver that handles the field posts of the Author type. graphql-tools knows when to go to this resolver.
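The two steps above can be pictured with a hand-written stand-in for the execution engine. This is a simplified sketch, not how graphql-tools or the underlying engine is actually implemented; resolveAuthorWithPosts, the authorId field, and the in-memory data are hypothetical.

```typescript
interface Post { id: number; authorId: number; title: string; }
interface Author { id: number; firstName: string; lastName: string; }

// Hypothetical in-memory data standing in for AuthorDB and PostDB.
const authors: Author[] = [{ id: 1, firstName: 'Tom', lastName: 'Coleman' }];
const posts: Post[] = [
  { id: 1, authorId: 1, title: 'Introduction to GraphQL' },
  { id: 2, authorId: 1, title: 'Advanced GraphQL' }
];

const resolvers = {
  Query: {
    // Root resolver: source is undefined.
    author: async (_source: undefined, args: { id: number }): Promise<Author | undefined> =>
      authors.find((a) => a.id === args.id)
  },
  Author: {
    // Sub-resolver: source is the Author returned by Query.author.
    posts: async (source: Author): Promise<Post[]> => posts.filter((p) => p.authorId === source.id)
  }
};

// Hand-rolled stand-in for what happens for: query { author(id:1) { id posts { id title } } }
async function resolveAuthorWithPosts(id: number): Promise<{ id: number; posts: Post[] } | null> {
  const author = await resolvers.Query.author(undefined, { id }); // step 1
  if (!author) {
    return null;
  }
  const authorPosts = await resolvers.Author.posts(author); // step 2: parent result passed as source
  return { id: author.id, posts: authorPosts };
}

resolveAuthorWithPosts(1).then((result) => console.log(JSON.stringify(result)));
```

The point of the sketch is only the order of the calls and the value handed over as source; a real engine derives both from the schema and the incoming query.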
We talked about this in the Resolvers chapter: the designer needs to be very careful. When a field is added to the schema and its data is not already present on the parent object, it needs a resolver to return the data. If the server is missing this resolver, the client will always get null as the result for this field.
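The null result can be illustrated with a simplified model of the default behavior: when a field has no resolver of its own, the value is read from the same-named property of the parent object. The function below is a hypothetical stand-in written for this post, not the library’s actual code.

```typescript
// Simplified model: with no explicit resolver, read the same-named property from the parent.
function defaultResolve(source: Record<string, unknown>, fieldName: string): unknown {
  const value = source[fieldName];
  // A missing property ends up as null in the response.
  return value === undefined ? null : value;
}

// The Author as returned by AuthorDB: it carries no posts property.
const authorFromDb: Record<string, unknown> = { id: 1, firstName: 'Tom', lastName: 'Coleman' };

console.log(defaultResolve(authorFromDb, 'firstName')); // prints Tom
console.log(defaultResolve(authorFromDb, 'posts')); // prints null: nothing filled in the posts field
```

This is also why plain fields such as id or firstName work without a dedicated resolver, while posts does not.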
2. Pros and Cons
2.1 Pros
Strong Validation
- Explicit
- Version control
- Errors are reported beforehand, when the query is parsed
Flexible
query {
  po1: post(id: 1) {
    id
    title
  }
  # alias
  po2: post(id: 2) {
    id
    title
  }
  author(id: 1) {
    id
    firstName
    posts {
      id
      title
      # You can nest endlessly here, like matryoshka (Russian nesting dolls)
      author {
        id
      }
    }
  }
}
Fewer HTTP Requests
If the backend is designed in THE GRAPHQL WAY, it can significantly reduce the number of HTTP requests between the client and the backend, since most of the data the client needs can be put inside one or a few GraphQL queries instead of being fetched with a lot of HTTP requests.
Less Bandwidth Consumption
Following the previous point: since JWTs and cookies are commonly used, the headers are very big and have to be attached to every HTTP request. So if the backend is designed in THE GRAPHQL WAY, it can reduce the bandwidth consumption.
Also, the flexibility of GraphQL helps reduce consumption in a scenario like:
- The client wants A.half and B.half
- RESTful needs to fetch A and B, then filter out the required fields and join them, so the bandwidth consumed is the size of A + B
- GraphQL only needs one query to fetch A.half + B.half; the HTTP overhead of one API call is saved, and the other half of A + B is saved too
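The A.half + B.half scenario can be made concrete with a rough size comparison. This is only a sketch: the two records and their field names are made up for illustration, sizes are counted in characters of the JSON payload, and header overhead is ignored.

```typescript
// Hypothetical records A and B; the field names are made up for illustration.
const recordA = { id: 1, firstName: 'Tom', lastName: 'Coleman', bio: 'x'.repeat(200) };
const recordB = { id: 1, title: 'Introduction to GraphQL', votes: 2, content: 'y'.repeat(500) };

// Keep only the requested keys, the way a GraphQL selection limits the response.
function pick<T extends object, K extends keyof T>(record: T, keys: K[]): Pick<T, K> {
  const result = {} as Pick<T, K>;
  for (const key of keys) {
    result[key] = record[key];
  }
  return result;
}

// RESTful: two responses, each carrying the full record.
const restSize = JSON.stringify(recordA).length + JSON.stringify(recordB).length;

// GraphQL: one response carrying only the selected fields of both.
const graphqlSize = JSON.stringify({
  a: pick(recordA, ['id', 'firstName']),
  b: pick(recordB, ['id', 'title'])
}).length;

console.log({ restSize, graphqlSize });
```

The gap grows with the size of the fields the client does not need, which is exactly the part a RESTful endpoint returns anyway.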
Enhance Server Performance
If the backend is well designed in THE GRAPHQL WAY, meaning every query contains multiple data fetching requests, this may reduce the DB or cache IO for duplicate data, which lowers the server load and improves performance.
Enhance Client Performance
Same as for the backend: it significantly reduces the number of HTTP requests the browser / client has to send (if the design follows THE GRAPHQL WAY).
2.2 Cons
Difficult to read the logs
A GraphQL request contains the query as a string in the request body, so it ends up in the log as an escaped string inside the JSON object, like:
{"query":"query {\n complexity(id:1) {\n id\n field1\n field2\n field3\n field4\n field5\n type1 {\n id\n tf11\n tf12\n }\n type2 {\n id\n tf21\n tf22\n }\n additionalField1\n additionalField2\n additionalField3\n }\n}","variables":null}
It’s very difficult to recognize the request (even worse for a large query), especially when you need to identify field-related issues.
Additionally, since the log contains the GraphQL query string with all the fields in it, the log for a single request is very large and extremely hard to follow (with tail -f you are soon drowned in the log flow).
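One way to soften this is to compact the query before it is logged. The helper below is a hypothetical sketch (formatQueryForLog and its maxLength default are not from any library); note that collapsing whitespace would also alter string arguments that contain newlines, so it is meant for log output only.

```typescript
// Hypothetical helper: turns a GraphQL request body into a short single-line log entry.
function formatQueryForLog(body: { query?: string }, maxLength = 120): string {
  // Collapse newlines and runs of spaces into single spaces.
  const compact = (body.query ?? '').replace(/\s+/g, ' ').trim();
  return compact.length > maxLength ? `${compact.slice(0, maxLength)}...` : compact;
}

const loggedBody = {
  query: 'query {\n complexity(id:1) {\n id\n field1\n type1 {\n id\n tf11\n }\n }\n}',
  variables: null
};

console.log(formatQueryForLog(loggedBody));
// prints: query { complexity(id:1) { id field1 type1 { id tf11 } } }
```

Such a helper could be called from the morgan body token shown in the Startup section instead of the plain JSON.stringify.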
Difficult to trace
For a microservice system, API tracing is a must. For tools like Jaeger, there is no official plugin that recognizes the GraphQL query string inside the request body. Decoding the GraphQL query string, picking out the necessary data, and putting it in the tracing log can be an issue.
Difficult to debug
If you want to make a test call to a RESTful API with curl, you can simply write it manually:
$ curl -X GET http://localhost:3000/complexity/1
Have a look at the GraphQL example below. It’s almost impossible to write without the help of a tool. Even with tools, it’s still very difficult:
- GraphQL is a string, so it’s difficult to write manually, especially the query string in the JSON body (escaped with \n)
- GraphQL is explicit, so a large query is very difficult to handle: you have to list all the fields, and nobody can remember that many
$ curl -X POST -d '{"query":"query {\n complexity(id:1) {\n id\n field1\n field2\n field3\n field4\n field5\n type1 {\n id\n tf11\n tf12\n }\n type2 {\n id\n tf21\n tf22\n }\n additionalField1\n additionalField2\n additionalField3\n }\n}","variables":null}' -H 'content-type: application/json' http://localhost:3000/graphql
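A small script can at least take over the escaping. The sketch below builds the curl command from a readable multi-line query; toCurlCommand is a hypothetical helper, and it assumes a POSIX shell where the body is wrapped in single quotes.

```typescript
// Hypothetical helper: builds a curl command from a readable multi-line query.
function toCurlCommand(url: string, query: string, variables: Record<string, unknown> | null = null): string {
  const body = JSON.stringify({ query, variables });
  // Escape single quotes so the body can sit inside a single-quoted shell string.
  const shellSafeBody = body.replace(/'/g, `'\\''`);
  return `curl -X POST -H 'content-type: application/json' -d '${shellSafeBody}' ${url}`;
}

const complexityQuery = `query {
  complexity(id:1) {
    id
    field1
    type1 {
      id
      tf11
    }
  }
}`;

console.log(toCurlCommand('http://localhost:3000/graphql', complexityQuery));
```

This removes the manual \n escaping, but the second problem stays: every field still has to be listed by hand.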
Difficult to be designed and implemented
For GraphQL, there is no fixed API (it is flexible): all you need to do is define the types in *.graphql, then write resolvers to handle them. So for nested / joined models, it’s the responsibility of the backend to keep them correctly defined and resolved. (You can find an example above, in the Example chapter: Author.posts.)
The backend API designer needs to carefully consider how those nested fields / models / types fit together. The cache system / IO system also needs careful design, since the same record may be queried multiple times in the same GraphQL API call, e.g.:
- one GraphQL call contains 2 separate queries: a post query and an author query
- the post query retrieves post item id 1 from the DB
- the author query also retrieves post item id 1 from the DB for the author.posts field (since this post is written by this author)
- this means there are two (or more) DB IOs for the same record
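One common way to limit this kind of duplicate IO is a cache that lives only for the duration of a single request. The sketch below shows the idea for lookups by id; CountingPostDB and RequestScopedPostLoader are hypothetical names, and libraries such as DataLoader implement the same idea more completely (including batching).

```typescript
interface Post { id: number; title: string; }

// Hypothetical in-memory store that counts how often it is hit.
class CountingPostDB {
  public reads = 0;
  private readonly posts: Post[] = [{ id: 1, title: 'Introduction to GraphQL' }];

  public async findById(id: number): Promise<Post | undefined> {
    this.reads += 1;
    return this.posts.find((p) => p.id === id);
  }
}

// Per-request cache: the first lookup for an id hits the DB, later lookups reuse the same promise.
class RequestScopedPostLoader {
  private readonly cache = new Map<number, Promise<Post | undefined>>();
  private readonly db: CountingPostDB;

  constructor(db: CountingPostDB) {
    this.db = db;
  }

  public load(id: number): Promise<Post | undefined> {
    let pending = this.cache.get(id);
    if (!pending) {
      pending = this.db.findById(id);
      this.cache.set(id, pending);
    }
    return pending;
  }
}

const db = new CountingPostDB();
const loader = new RequestScopedPostLoader(db); // create one per incoming request
Promise.all([loader.load(1), loader.load(1)]).then(() => console.log(db.reads)); // prints 1
```

Resolvers would call the loader instead of the DB; because a new loader is created for each request, stale data is not shared across requests.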
For a complex system or complex business, the backend of a GraphQL API system is difficult to design and maintain.
If the GraphQL backend is designed like a RESTful API system (Query.getUser, Query.getUserAge, one by one), it is NOT THE GRAPHQL WAY. It loses the advantages of GraphQL but keeps all the disadvantages.
Furthermore, in some edge cases there is unnecessary IO. Take the query { author(id:1) { posts { id title votes } } }: we only meant to query the posts written by some author, but this query goes to the resolver Query.author first, which means there is an Author IO first even though we don’t need that data in the response. This is also an issue of GraphQL design and needs to be considered carefully.
Network Quality Sensitive
RESTful means lots of small HTTP requests. GraphQL means fewer but bigger HTTP requests.
When an HTTP request is lost or fails, for RESTful only a small piece is broken, but for GraphQL much more is lost.
3. Compare with RESTful
Item | GraphQL | RESTful |
---|---|---|
Query Size / One Request | The query content is a string, so it’s bigger than RESTful | Smaller |
Query Size / Whole Business | Fewer requests than RESTful means a smaller size in total | More requests than GraphQL means more duplicated headers and a bigger size in total |
Tech / Design difficulty | Difficult | Easy |
Developer friendly | Not too good | Better |
Performance | Depends on the design and implementation | Stable |
4. Suggestion
Choose RESTful as the default solution
RESTful has stable performance and is easy to design and implement. It has good tool chain support and is more developer friendly. In all cases, RESTful won’t fail you.
Choose GraphQL when facing specific requirements
GraphQL is good at some specific scenarios:
- making a flexible API system, like some open API platform (e.g facebook API)
- making a middleware inside a Large system, like a gateway
- if your client changes the data it needs frequently, Graphql will be a good idea.
If you choose to use Graphql, take care of the API design
As discussed above, depending on the API design, a GraphQL backend implementation can have completely different performance. You need to focus on THE GRAPHQL WAY. If you think it’s not the correct way to design your GraphQL backend, or it’s not what you need, then you need to ask why you chose GraphQL and not RESTful.
EOF