Design and Architecture of Nebula Graph

This document walks you through how Nebula Graph is designed and why. It is divided into two parts: Industry Overview and Nebula Graph Architecture.

Industry Overview

OLAP & OLTP in Graph

[Figure: graph systems arranged along an axis of query latency, from offline analytics to online serving]

The axis in the above picture shows different requirements for query latency. Like traditional databases, graph databases can be divided into two camps: OLAP and OLTP. OLAP cares more about offline analysis, while OLTP prefers online processing. On the OLAP side, graph computing frameworks analyze data based on the graph structure. They are similar to OLAP in traditional databases, but offer features traditional databases lack, one being iterative algorithms over the graph. A typical example is Google's PageRank algorithm, which derives the relevance of web pages through repeated iterations; another is the commonly used label propagation algorithm (LPA).
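As a concrete illustration of such iterative computing, here is a minimal, self-contained PageRank sketch in Go; the graph and constants are made up for the example and have nothing to do with Nebula Graph's APIs.

```go
package main

import "fmt"

// pageRank runs `iters` rounds of the power-iteration update on an
// adjacency list `out`, where out[v] lists the vertices v links to.
func pageRank(out [][]int, iters int, damping float64) []float64 {
	n := len(out)
	rank := make([]float64, n)
	for v := range rank {
		rank[v] = 1.0 / float64(n) // uniform initial rank
	}
	for i := 0; i < iters; i++ {
		next := make([]float64, n)
		for v, targets := range out {
			if len(targets) == 0 {
				continue // dangling vertex: contribution dropped for brevity
			}
			share := rank[v] / float64(len(targets))
			for _, t := range targets {
				next[t] += share
			}
		}
		for v := range next {
			next[v] = (1-damping)/float64(n) + damping*next[v]
		}
		rank = next
	}
	return rank
}

func main() {
	// A tiny example graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
	out := [][]int{{1, 2}, {2}, {0}}
	fmt.Println(pageRank(out, 20, 0.85))
}
```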

Moving right along the axis, we reach the graph streaming field, which combines graph computing with stream computing. A relational network is not a static structure; it constantly changes with the business, in both graph structure and graph properties. Computing in this field is often triggered by events, and its latency is on the order of seconds.

Right beside graph streaming sits the online response system, whose latency requirement is far stricter: answers must come back within milliseconds.

The rightmost field is the graph database, whose use cases are completely different from those on the left. Nebula Graph, the system we are building, lives here on the OLTP side.

Native vs. Multi-Model

Graph databases can be classified into two kinds: native graph databases and multi-model graph databases. Built with a graph-first design, native graph databases are specifically optimized for graph storage and processing, so they tend to perform queries faster, scale further, and run more efficiently while calling for much less hardware. Multi-model products, by contrast, get their storage from an outside source, such as a relational, columnar, or other NoSQL database. They use generic algorithms to store data about vertices and edges, and because their storage layer is not optimized for graphs, queries can come back with higher latency.

Data Stored in Graph Database

In a graph database, data is stored as a graph. Modeling data as a graph is natural and has the nice benefit of staying legible even to non-technical people. The data model Nebula Graph handles is the directed property graph: edges are directional, and both vertices and edges can carry properties. It can be represented as:

G = < V, E, PV, PE >

Here V is the set of vertices (also called nodes), E is the set of directional edges, PV is the set of properties on vertices, and PE is the set of properties on edges.
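Expressed as code, the model might look like the following sketch in Go; the type names are illustrative, not Nebula Graph's internal definitions.

```go
package graph

// Props maps a property name to its value (PV for vertices, PE for edges).
type Props map[string]interface{}

// Vertex is an element of V together with its properties PV.
type Vertex struct {
	ID    int64
	Props Props
}

// Edge is a directional element of E together with its properties PE.
type Edge struct {
	Src, Dst int64  // direction: Src -> Dst
	Type     string // e.g. "follows", "likes"
	Props    Props
}

// Graph bundles the four components of G = <V, E, PV, PE>.
type Graph struct {
	Vertices map[int64]Vertex
	Edges    []Edge
}
```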

Nebula Graph Architecture

Designed around the features above, Nebula Graph is an open-source, distributed, lightning-fast graph database. It is composed of four components: the storage service, the meta service, the query engine, and the clients.

[Figure: Nebula Graph architecture, with stateless query engines above the dashed line, the storage service below it, and the meta service to the right]

The dashed line in the above picture divides computing and storage into two independent parts. The upper half is the computing layer: each machine or virtual machine is stateless and never talks to the others, so it is easy to scale out or in. The lower half is the storage service; it is stateful, since that is where the data lives. The storage service turns graph semantics into key-values and hands them to the KV store beneath it, with the Raft protocol sitting between the two.
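As an illustration of turning graph semantics into key-values, here is a sketch assuming a key layout of partition ID, vertex ID, and tag ID for vertices (and partition ID, source ID, edge type, destination ID for edges); the actual on-disk format is an internal detail of nebula-storaged, so treat this layout as an assumption.

```go
package storage

import "encoding/binary"

// vertexKey packs a vertex into a byte key; its properties become the value.
func vertexKey(partID uint32, vertexID int64, tagID uint32) []byte {
	key := make([]byte, 16)
	binary.BigEndian.PutUint32(key[0:4], partID)
	binary.BigEndian.PutUint64(key[4:12], uint64(vertexID))
	binary.BigEndian.PutUint32(key[12:16], tagID)
	return key
}

// edgeKey packs a directed edge src -(edgeType)-> dst into a byte key.
func edgeKey(partID uint32, src int64, edgeType uint32, dst int64) []byte {
	key := make([]byte, 24)
	binary.BigEndian.PutUint32(key[0:4], partID)
	binary.BigEndian.PutUint64(key[4:12], uint64(src))
	binary.BigEndian.PutUint32(key[12:16], edgeType)
	binary.BigEndian.PutUint64(key[16:24], uint64(dst))
	return key
}
```

Because the partition ID is the key prefix, all data of one partition stays contiguous in the KV store, which is what lets a partition be replicated and migrated as a unit.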

On the right side is the meta service. Similar to the NameNode in HDFS, it stores all metadata, such as schemas, and controls scaling.

Design Thinking: Storage Service

Nebula Graph adopts a shared-nothing distributed architecture in storage: nodes share neither memory nor storage, which means there is no central node in the whole system. The benefits of such a design are:

  • Easy to scale
  • The overall system keeps operating even when individual nodes crash

Another design is the separation of computing and storage, and the benefits are as follows:

  • Scalability. Separating storage from computing keeps the storage service flexible, making it easy to scale out or in.
  • Availability. Recovery from node failure can be performed quickly.

The binary of the storage service is nebula-storaged, which provides a key-value store. Multiple storage engines, such as RocksDB and HBase, are supported, with RocksDB as the default. To build a resilient distributed system, Raft is implemented as the consensus algorithm.

Raft achieves data consensus via an elected leader. On top of that, nebula-storaged makes the following optimizations:

  • Parallel Raft

Replicas of a partition with the same ID across multiple machines form one Raft group, and parallelism comes from running many such Raft groups side by side (see the sketch after this list).

  • Write Path & batch

In the Raft protocol, the leader replicates log entries to all followers and commits them in order. To improve write throughput, Nebula Graph not only runs Raft groups in parallel but also implements dynamic batch replication (also sketched after this list).

  • Load balancing

Partitions on an overworked server can be migrated to relatively idle servers, increasing the availability and capacity of the system.
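To make the first two optimizations concrete, here is a minimal sketch, assuming one Raft group per partition and a timer-driven batch flush; all type and function names are illustrative, not nebula-storaged internals.

```go
package raftsketch

import (
	"sync"
	"time"
)

// RaftGroup stands in for one partition's consensus group.
type RaftGroup struct {
	mu      sync.Mutex
	pending [][]byte // entries waiting to be replicated as one batch
}

// Propose appends an entry to the current batch instead of replicating it
// immediately, trading a little latency for much higher throughput.
func (g *RaftGroup) Propose(entry []byte) {
	g.mu.Lock()
	g.pending = append(g.pending, entry)
	g.mu.Unlock()
}

// flushLoop periodically replicates whatever has accumulated, so many
// proposals share one round of log replication.
func (g *RaftGroup) flushLoop(replicate func(batch [][]byte)) {
	for range time.Tick(5 * time.Millisecond) {
		g.mu.Lock()
		batch := g.pending
		g.pending = nil
		g.mu.Unlock()
		if len(batch) > 0 {
			replicate(batch)
		}
	}
}

// Store routes each write to its partition's group, so independent
// partitions replicate in parallel ("parallel Raft").
type Store struct {
	groups []*RaftGroup
}

func (s *Store) Write(vertexID int64, entry []byte) {
	part := int(uint64(vertexID) % uint64(len(s.groups)))
	s.groups[part].Propose(entry)
}
```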

Design Thinking: Meta Service

The binary of the meta service is nebula-metad. Here is the list of its primary functionalities:

  • User management

    In Nebula Graph, different roles are granted different privileges. The following native roles are provided: Global Admin, Graph Space Admin, User, and Guest.

  • Cluster configuration management

    The meta service manages the servers and partitions in the cluster: it records the locations of partitions, receives heartbeats from servers, and so on. It rebalances partitions and manages communication traffic in case of server failure.

  • Graph space management

    Nebula Graph supports multiple graph spaces, and data in different graph spaces is physically isolated. The meta service stores the metadata of all spaces in the cluster and tracks changes that take place in these spaces, such as adding or dropping a space, or modifying a graph space's configuration (Raft replicas).

  • Schema management

    Nebula Graph is a strongly typed database. Schema management covers (see the sketch after this list):

    • Property types. The types of tag and edge properties are recorded by the meta service. Supported data types include int, double, timestamp, list, and so on.
    • Multi-version management, supporting adding, modifying, and deleting schemas while recording their versions.
    • TTL (time-to-live) management, supporting automatic data deletion and space reclamation.

The meta service is stateful, and just like the storage service, it persists its data to a key-value store.
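As an illustration of schema versioning and TTL, here is a minimal sketch of what a versioned schema entry might look like; the field names are assumptions for illustration, not nebula-metad's actual data structures.

```go
package metasketch

import "time"

// PropType enumerates the property types the meta service records.
type PropType int

const (
	PropInt PropType = iota
	PropDouble
	PropString
	PropTimestamp
)

// SchemaVersion is one version of a tag or edge-type schema. Altering a
// schema appends a new version rather than rewriting the old one, which
// is what makes multi-version management possible.
type SchemaVersion struct {
	Version   int64
	Props     map[string]PropType
	TTLCol    string        // property holding the row's insertion time
	TTLDuring time.Duration // rows older than this are eligible for reclaim
}

// expired reports whether a row written at writtenAt should be
// garbage-collected under this schema's TTL.
func (s SchemaVersion) expired(writtenAt, now time.Time) bool {
	return s.TTLDuring > 0 && now.Sub(writtenAt) > s.TTLDuring
}
```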

Design Thinking: Query Engine

Nebula Graph's query language, nGQL, is a SQL-like declarative language rather than an imperative one. It is composable but not embeddable: instead of embedding, it borrows the shell pipe, so the output of one query acts as the input of the next (see the example after the feature list below). Key features of nGQL are as follows:

  • Primary algorithms are built into the query engine
  • Repeated queries can be avoided through support for user-defined functions (UDFs)
  • Programmable
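Here is a small example of that pipe composition, embedded as a query string the way a client would send it; the follow and serve edge types and the vertex ID are hypothetical.

```go
package main

import "fmt"

func main() {
	// The output of the first GO step (referenced as $-.id) feeds the
	// second step, much like a shell pipe.
	const q = `GO FROM 100 OVER follow YIELD follow._dst AS id ` +
		`| GO FROM $-.id OVER serve`
	fmt.Println(q)
}
```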

The binary of the query engine is nebula-graphd. Each nebula-graphd instance is stateless and never talks to the other instances; it only talks to the storage service and the meta service. That makes it trivial to expand or shrink the query engine cluster.

The query engine accepts a statement from the client and generates an execution plan after lexical analysis (the lexer), parsing and semantic analysis (the parser), and query optimization. The plan is then handed to the execution engine, which interacts with the meta service and the storage service to retrieve schemas and data.
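Reduced to function signatures, that pipeline looks roughly like the sketch below; the type names are illustrative, not the engine's actual classes.

```go
package enginesketch

// AST is the parse tree; Plan is an ordered list of execution steps;
// Rows is the tabular result handed back to the client.
type AST struct{}
type Plan struct{}
type Rows [][]interface{}

func lex(query string) []string           { return nil }          // tokens
func parse(tokens []string) (*AST, error) { return &AST{}, nil }  // syntax + semantics
func optimize(ast *AST) *Plan             { return &Plan{} }      // plan generation
func execute(p *Plan) (Rows, error)       { return nil, nil }     // talks to meta + storage

// Run wires the stages together: query text in, rows out.
func Run(query string) (Rows, error) {
	ast, err := parse(lex(query))
	if err != nil {
		return nil, err
	}
	return execute(optimize(ast))
}
```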

The primary optimizations of the query engine are:

  • Asynchronous and parallel execution

    I/O operations and network transfers are time-consuming, so asynchronous and parallel execution is used throughout the query engine to reduce latency and improve overall throughput. In addition, a separate resource pool is set up for each query, which prevents time-consuming queries from starving the rest (the long-tail effect). A combined sketch of this and the push-down below follows after this list.

  • Pushing down computation

    In a distributed system, shipping large amounts of data across the network dominates overall latency. The Nebula Graph query engine therefore decides to push some filters and aggregations down to the storage service, reducing the amount of data passed back from storage (sketched below).
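The sketch below combines the two optimizations: scan requests fan out to storage partitions concurrently, and each request carries its filter so rows are dropped on the storage side rather than shipped back. All types here are illustrative stand-ins for the real RPCs.

```go
package pushdown

import "sync"

// ScanRequest asks one storage partition for matching rows; the Filter
// travels with the request ("pushed down" to storage).
type ScanRequest struct {
	PartID int
	Filter func(props map[string]interface{}) bool
}

// scanPartition stands in for an RPC to nebula-storaged. The filter is
// evaluated storage-side, so only surviving rows cross the network.
func scanPartition(req ScanRequest) []map[string]interface{} {
	var out []map[string]interface{}
	for _, row := range partitionRows(req.PartID) {
		if req.Filter(row) {
			out = append(out, row)
		}
	}
	return out
}

// Scatter issues all partition scans concurrently and gathers results, so
// total latency tracks the slowest partition, not the sum of all of them.
func Scatter(reqs []ScanRequest) [][]map[string]interface{} {
	results := make([][]map[string]interface{}, len(reqs))
	var wg sync.WaitGroup
	for i, req := range reqs {
		wg.Add(1)
		go func(i int, req ScanRequest) {
			defer wg.Done()
			results[i] = scanPartition(req)
		}(i, req)
	}
	wg.Wait()
	return results
}

// partitionRows is a stand-in data source for the sketch.
func partitionRows(partID int) []map[string]interface{} {
	return []map[string]interface{}{{"part": partID, "age": 30}}
}
```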

Design Thinking: API and SDK

Nebula Graph provides SDKs in C++, Java, and Golang, and uses fbthrift as the RPC framework for communication among servers. Nebula Graph's web console is in progress and will be released soon.
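As a sketch of what client usage looks like, the interface below is a hypothetical stand-in for an SDK's surface, not the published API; consult each SDK's README for the real calls.

```go
package sdksketch

import "fmt"

// GraphClient is a minimal, hypothetical shape for a Nebula SDK: it sends
// nGQL statements to nebula-graphd over the fbthrift-based RPC and returns
// result rows.
type GraphClient interface {
	Execute(stmt string) ([][]string, error)
	Close() error
}

// followees shows typical usage against such a client: issue one nGQL
// statement and hand back the rows.
func followees(c GraphClient, id int64) ([][]string, error) {
	return c.Execute(fmt.Sprintf("GO FROM %d OVER follow", id))
}
```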