Aggregating data across subgraphs

If product requirements don't align with a single domain, it may suggest the need for a new domain

A supergraph helps you orchestrate data fetches across multiple domains. However, it doesn't automatically solve some problems commonly encountered in distributed architectures:

  • Searching across a variety of types stored in multiple subgraphs
  • Combining lists of multiple types from multiple sources into a single list, especially when paginated
  • Filtering a list based on attributes defined in various subgraphs
  • Deriving data from an aggregate of multiple subgraphs

Although it's sometimes possible to generate query plans that support these use cases using the @requires directive, it's almost always better to provide this functionality in a new system, such as a search index.

Consider an operation that searches across both books and movies. We want it to return a polymorphic list of the books and movies that match the search term.

query SearchEverything($query: String!) {
  search(query: $query) {
    ... on Book {
      title
      authors {
        name
      }
    }
    ... on Movie {
      title
      directors {
        name
      }
    }
  }
}

If different subgraphs define the Book type and the Movie type, the question is: which subgraph provides the Query.search root field?

For a given operation, the router resolves each root field in a single subgraph. This holds true even if multiple subgraphs define a particular field.

Books subgraph

type Query {
  search(query: String!): [Product] @shareable
}

interface Product {
  title: String
}

type Book implements Product {
  title: String
  authors: [Person]
}

Movies subgraph

type Query {
  search(query: String!): [Product] @shareable
}

interface Product {
  title: String
}

type Movie implements Product {
  title: String
  directors: [Person]
}

The query planner deterministically chooses one subgraph to resolve the Query.search field. (It calculates all valid query plans for the operation and chooses the "cheapest" one.)

If the query planner chooses Query.search within the Books subgraph, that subgraph can provide only books, not movies.
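To make the limitation concrete, here is a minimal Python sketch (not router code) that models each subgraph's Query.search resolver as a function over that subgraph's own data. The sample data, the function names, and the `chosen` parameter that stands in for the query planner's decision are all hypothetical.

```python
# Hypothetical in-memory data owned by each subgraph.
BOOKS = [{"__typename": "Book", "title": "Dune"}]
MOVIES = [{"__typename": "Movie", "title": "Dune: Part Two"}]


def books_search(query: str) -> list[dict]:
    """Query.search as the Books subgraph could resolve it: books only."""
    return [b for b in BOOKS if query.lower() in b["title"].lower()]


def movies_search(query: str) -> list[dict]:
    """Query.search as the Movies subgraph could resolve it: movies only."""
    return [m for m in MOVIES if query.lower() in m["title"].lower()]


SEARCH_RESOLVERS = {"books": books_search, "movies": movies_search}


def route_search(query: str, chosen: str) -> list[dict]:
    # Stand-in for the planner's choice: the root field is resolved by
    # exactly one subgraph, so results from the other are never seen.
    return SEARCH_RESOLVERS[chosen](query)


results = route_search("dune", chosen="books")
print([r["__typename"] for r in results])
```

Whichever subgraph is chosen, the matching items owned by the other subgraph are missing from the result.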

To resolve this, we could expand the Books subgraph to include the Movie entity definition and add the @key directive to create a join to the Movies subgraph, like so:

type Movie implements Product @key(fields: "id") {
  id: ID!
  title: String @external
}

Now the Books subgraph can return Movie instances, but that means it needs access to the data store for movie data. This breaks the separation of concerns we rely on to create dividing lines between domains and subgraphs.

Solution: Create a new subgraph

When product requirements don't fit cleanly into a single domain, it often indicates that we need a new domain. Let's design a system that includes a new search domain. This domain includes a search index and a Search subgraph that provides the Query.search root field.

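[Diagram: the Client sends operations to the Router, which fetches from the Search, Books, and Movies subgraphs. In the Search domain, the Search subgraph reads from the Search Index, which the Books and Movies subgraphs write to. The Books and Movies subgraphs are backed by the Books DB and Movies DB.]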

This pattern works for all the use cases listed above:

  • Search: A search index (such as Elasticsearch) is the most efficient way to search through a variety of data types and return only the most relevant results.
  • Combining lists: A combined index is the most efficient way to list and paginate through a variety of data types. Fetching multiple lists and combining them on the fly usually means overfetching pages of data and throwing data away when it isn't part of the result.
  • Filtering: A data store can contain indices on various attributes of a variety of data types and efficiently filter results on those attributes.
  • Derived aggregates: Many data stores can efficiently compute a derived value such as AVG(products.rating), or we can write a precomputed derived value to a data store.
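The overfetching cost of combining lists on the fly can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an implementation of any router feature: `fetch_page` and `merged_page` are hypothetical helpers, and each source is modeled as a list of integers already sorted by rank.

```python
import heapq
from itertools import islice


def fetch_page(source: list[int], limit: int) -> list[int]:
    """Stand-in for a paginated call to one source's sorted list."""
    return source[:limit]


def merged_page(
    sources: list[list[int]], page: int, size: int
) -> tuple[list[int], int]:
    """Return one page (0-based) of the merged list and the number of items fetched."""
    # Any single source could hold every item up to the end of the
    # requested page, so that many items must be fetched from each source.
    needed = (page + 1) * size
    fetched = [fetch_page(s, needed) for s in sources]
    merged = heapq.merge(*fetched)  # lazy merge of the sorted inputs
    start = page * size
    result = list(islice(merged, start, start + size))
    return result, sum(len(f) for f in fetched)


books = list(range(0, 100, 2))   # hypothetical ranks 0, 2, 4, ...
movies = list(range(1, 100, 2))  # hypothetical ranks 1, 3, 5, ...
items, fetched_count = merged_page([books, movies], page=2, size=5)
print(items, fetched_count)
```

In this run, 30 items are fetched to return a page of 5, and the waste grows with the page number. A combined index can serve the same page directly.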

We can remove the Query.search root field from both the Books and Movies subgraphs and instead add it to our new Search subgraph:

Search subgraph

type Query {
  search(query: String!): [Product]
}

interface Product {
  id: ID!
  title: String
}

type Book implements Product @key(fields: "id") {
  id: ID!
  title: String @shareable
}

type Movie implements Product @key(fields: "id") {
  id: ID!
  title: String @shareable
}

The query plan first uses the Search subgraph to return a polymorphic list of Books and Movies with their id and title fields. Then, in parallel, it joins with the Books subgraph to fetch Book.authors and with the Movies subgraph to fetch Movie.directors.
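The two-step shape of that plan can be sketched in Python with asyncio. This is a simplified model of the data flow, not the router's actual execution engine; the function names, the in-memory data, and the way results are merged are all assumptions made for illustration.

```python
import asyncio

# Hypothetical data. The search index holds only id and title.
SEARCH_INDEX = [
    {"__typename": "Book", "id": "b1", "title": "Dune"},
    {"__typename": "Movie", "id": "m1", "title": "Dune: Part Two"},
]
BOOK_AUTHORS = {"b1": [{"name": "Frank Herbert"}]}
MOVIE_DIRECTORS = {"m1": [{"name": "Denis Villeneuve"}]}


async def search_subgraph(query: str) -> list[dict]:
    """Step 1: Query.search returns typed hits with id and title."""
    q = query.lower()
    return [dict(hit) for hit in SEARCH_INDEX if q in hit["title"].lower()]


async def books_subgraph(ids: list[str]) -> dict[str, dict]:
    """Stand-in for an entity lookup on Book by its key field, id."""
    return {i: {"authors": BOOK_AUTHORS[i]} for i in ids}


async def movies_subgraph(ids: list[str]) -> dict[str, dict]:
    """Stand-in for an entity lookup on Movie by its key field, id."""
    return {i: {"directors": MOVIE_DIRECTORS[i]} for i in ids}


async def execute_search(query: str) -> list[dict]:
    hits = await search_subgraph(query)
    book_ids = [h["id"] for h in hits if h["__typename"] == "Book"]
    movie_ids = [h["id"] for h in hits if h["__typename"] == "Movie"]
    # Step 2: the two entity fetches are independent, so they run concurrently.
    books, movies = await asyncio.gather(
        books_subgraph(book_ids), movies_subgraph(movie_ids)
    )
    for hit in hits:
        extra = books if hit["__typename"] == "Book" else movies
        hit.update(extra[hit["id"]])
    return hits


results = asyncio.run(execute_search("dune"))
print(results)
```

The search step never touches author or director data. It only returns enough (type, id, and title) for the other subgraphs to complete each result.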

Note that the Search subgraph provides a minimal set of fields. We don't need to duplicate the entire Book and Movie types in our Search domain, just the fields we want to search on.

Tradeoffs

As you'd expect, adding an entirely new domain to your supergraph has its tradeoffs:

The question of ownership

In the Search example, we've introduced multiple new services that require ongoing development, maintenance, and support. It doesn't make sense for our existing Books and Movies teams to take on this extra burden. Usually, we want to spin up an entirely new team to hold the pager and deployment keys for our new subgraphs and data stores.

Eventual consistency

Replicating data from the canonical Books and Movies databases into the Search index inevitably involves replication lag, leading to eventually consistent results between our subgraphs.

How we handle that lag depends on our business requirements. If the results from Query.search must be "internally" consistent, we can denormalize data using @shareable and @provides. We demonstrated this by providing the title field from the Search subgraph and its backing index.

If the results must be accurate with respect to our canonical data stores, we can make sure the query plan fetches those fields from their respective subgraphs. The Book.authors and Movie.directors fields exemplify that pattern.
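The difference between the two choices can be sketched with a toy replication model in Python. The stores, the `replicate` step, and the field values are hypothetical. The point is only to show which fields can lag and which cannot.

```python
# Hypothetical canonical store (Books DB) and search index.
books_db = {"b1": {"title": "Old Title", "authors": ["A. Author"]}}
search_index: dict[str, dict] = {}


def replicate(book_id: str) -> None:
    """Copy only the searchable field (title) into the index."""
    search_index[book_id] = {"title": books_db[book_id]["title"]}


def search_result(book_id: str) -> dict:
    return {
        "id": book_id,
        # Served by the Search subgraph: matches what the index searched on,
        # but may lag behind the canonical store.
        "title": search_index[book_id]["title"],
        # Served by the Books subgraph: always read from the canonical store.
        "authors": books_db[book_id]["authors"],
    }


replicate("b1")
books_db["b1"]["title"] = "New Title"                    # canonical write
books_db["b1"]["authors"] = ["A. Author", "B. Author"]  # canonical write

during_lag = search_result("b1")   # index has not caught up yet
replicate("b1")                    # replication catches up
after_sync = search_result("b1")
print(during_lag, after_sync)
```

During the lag window, the title comes from the index and is stale, while the authors list is current. After replication catches up, both agree with the canonical store.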

© 2024 Apollo Graph Inc.
