Quick Start

This guide walks you through the process of using Nebula Graph:

  • Install
  • Data Schema
  • CRUD
  • Batch inserting
  • Data Import Tools

Installing Nebula Graph

It is recommended to install Nebula Graph with Docker Compose. We have recorded a quick video tutorial on YouTube for your reference.

In addition to Docker, you can also install Nebula Graph by installing source code or rpm/deb packages.

In case you encounter any problem during the installation, be sure to ask us on our official forum. We have developers on call to answer your questions.

Data Schema

In this guide, all operations are based on the NBA dataset below:

image

In the above figure, there are two tags (player, team) and two edge types (serve and follow).

Creating and Using a Graph Space

A graph space in Nebula Graph is similar to an individual database that you create in traditional databases such as MySQL. First, you need to create a space and use it before can do any other operations.

You can create and use a graph space by the following steps:

  1. Check the cluster machine status:

    nebula> SHOW HOSTS;
    ================================================================================================
    | Ip            | Port  | Status | Leader count | Leader distribution | Partition distribution |
    ================================================================================================
    | 192.168.8.210 | 44500 | online |              |                     |                        |
    ------------------------------------------------------------------------------------------------
    | 192.168.8.211 | 44500 | online |              |                     |                        |
    ------------------------------------------------------------------------------------------------
    

    The status online indicates that the storage service storaged has successfully connected to the metadata service process metad.

  2. Enter the following statement to create a graph space:

    nebula> CREATE SPACE nba(partition_num=10, replica_factor=1);
    

    Here:

    • partition_num specifies the number of partitions in one replica. It is usually 5 times the number of hard disks in the cluster.
    • replica_factor specifies the number of replicas in the cluster. It is usually 3 in production, 1 in test. Due to the majority voting principle, it must set to be odd.

    You can also check machine and partition distribution with SHOW HOSTS statement:

    nebula> SHOW HOSTS;
    ================================================================================================
    | Ip            | Port  | Status | Leader count | Leader distribution | Partition distribution |
    ================================================================================================
    | 192.168.8.210 | 44500 | online | 8            | nba: 8              | test: 8                |
    ------------------------------------------------------------------------------------------------
    | 192.168.8.211 | 44500 | online | 2            | nba: 2              | test: 2                |
    ------------------------------------------------------------------------------------------------
    

    If all the machines are online, but the leader distribution is unbalanced (see above), re-distribute partitions with BALANCE LEADER statement:

    nebula> BALANCE LEADER;
    ================================================================================================
    | Ip            | Port  | Status | Leader count | Leader distribution | Partition distribution |
    ================================================================================================
    | 192.168.8.210 | 44500 | online | 5            | nba: 5              | test: 5                |
    ------------------------------------------------------------------------------------------------
    | 192.168.8.211 | 44500 | online | 5            | nba: 5              | test: 5                |
    ------------------------------------------------------------------------------------------------
    

    See here for details.

  3. Enter the following statement to use the graph space:

    nebula> USE nba;
    
  4. Now you can check the graph space you just created:

    nebula> SHOW SPACES;
    

    The following information is returned:

    ========
    | Name |
    ========
    | nba  |
    --------
    

Defining the Schema for Your Data

In Nebula Graph, we classify different vertices with similar properties into one group which is named a tag. The CREATE TAG statement defines a tag with a tag name followed by properties and the property types enclosed in parentheses. The CREATE EDGE statement defines an edge type with a type name, followed by properties and the property types enclosed in parentheses.

You can create tags and edge types by the following steps:

  1. Enter the following statement to create the player tag:

    nebula> CREATE TAG player(name string, age int);
    
  2. Enter the following statement to create the team tag:

    nebula> CREATE TAG team(name string);
    
  3. Enter the following statement to create the follow edge type:

    nebula> CREATE EDGE follow(degree int);
    
  4. Enter the following statement to create the serve edge type:

    nebula> CREATE EDGE serve(start_year int, end_year int);
    
  5. Now you can check the tags and edge types you just created.

    5.1 To show the tags you just created, enter the following statement:

    nebula> SHOW TAGS;
    

    The following information is returned:

    ============
    | Name     |
    ============
    | player   |
    ------------
    | team     |
    ------------
    

    5.2 To show the edge types you just created, enter the following statement:

    nebula> SHOW EDGES;
    

    The following information is returned:

    ==========
    | Name   |
    ==========
    | serve  |
    ----------
    | follow |
    ----------
    

    5.3 To show the properties of the player tag, enter the following statement:

    nebula> DESCRIBE TAG player;
    

    The following information is returned:

    ===================
    | Field  | Type   |
    ===================
    | name   | string |
    -------------------
    | age    | int    |
    -------------------
    

    5.4 To show the properties of the follow edge type, enter the following statement:

    nebula> DESCRIBE EDGE follow;
    

    The following information is returned:

    =====================
    | Field    | Type   |
    =====================
    | degree   | int    |
    ---------------------
    

CRUD

Inserting Data

You can insert vertices and edges based on relations in the illustration figure.

Inserting Vertices

The INSERT VERTEX statement inserts a vertex by specifying the vertex tag, properties, vertex ID and property values.

You can insert some vertices by the following statements:

nebula> INSERT VERTEX player(name, age) VALUES 100:("Tim Duncan", 42);
nebula> INSERT VERTEX player(name, age) VALUES 101:("Tony Parker", 36);
nebula> INSERT VERTEX player(name, age) VALUES 102:("LaMarcus Aldridge", 33);
nebula> INSERT VERTEX team(name) VALUES 200:("Warriors");
nebula> INSERT VERTEX team(name) VALUES 201:("Nuggets");
nebula> INSERT VERTEX player(name, age) VALUES 121:("Useless", 60);
  • In the above vertices inserted, the number after the keyword VALUES is the vertex ID (abbreviated for VID, int64). The VID must be unique in the space.
  • The last vertex (VID: 121)inserted will be deleted in the deleting data section.
  • If you want to insert multiple vertices for the same tag by a single INSERT VERTEX operation, you can enter the following statement:
nebula> INSERT VERTEX player(name, age) VALUES 100:("Tim Duncan", 42), \
101:("Tony Parker", 36), 102:("LaMarcus Aldridge", 33);

Inserting Edges

The INSERT EDGE statement inserts an edge by specifying the edge type name, properties, source vertex VID and target vertex VID, and property values.

You can insert some edges by the following statements:

nebula> INSERT EDGE follow(degree) VALUES 100 -> 101:(95);
nebula> INSERT EDGE follow(degree) VALUES 100 -> 102:(90);
nebula> INSERT EDGE follow(degree) VALUES 102 -> 101:(75);
nebula> INSERT EDGE serve(start_year, end_year) VALUES 100 -> 200:(1997, 2016);
nebula> INSERT EDGE serve(start_year, end_year) VALUES 101 -> 201:(1999, 2018);

Similarly: If you want to insert multiple edges for the same edge type by a single INSERT EDGES operation, you can enter the following statement:

nebula> INSERT EDGE follow(degree) VALUES 100 -> 101:(95),100 -> 102:(90),102 -> 101:(75);

Fetching Data

After you insert some data in Nebula Graph, you can retrieve any data from your graph space.

The FETCH PROP ON statement retrieve data from your graph space. If you want to fetch vertex data, you must specify the vertex tag and vertex VID; if you want to fetch edge data, you must specify the edge type name, source vertex VID and target vertex VID.

To fetch the data of the player whose VID is 100, enter the following statement:

nebula> FETCH PROP ON player 100;

The following information is returned:

=======================================
| VertexID | player.name | player.age |
=======================================
| 100      | Tim Duncan  | 42         |
---------------------------------------

To fetch the data of the serve edge between VID 100 and VID 200, enter the following statement:

nebula> FETCH PROP ON serve 100 -> 200;

The following information is returned:

=============================================================================
| serve._src | serve._dst | serve._rank | serve.start_year | serve.end_year |
=============================================================================
| 100        | 200        | 0           | 1997             | 2016           |
-----------------------------------------------------------------------------

Updating Data

You can update the vertices and edges you just inserted.

Updating Vertices

The UPDATE VERTEX statement updates data for your vertex by selecting the vertex that you want to update and then setting the property value with an equal sign to assign it a new value.

The following example shows you how to change the name value of VID 100 from Tim Duncan to Tim.

Enter the following statement to update the name value:

nebula> UPDATE VERTEX 100 SET player.name = "Tim";

To check whether the name value is updated, enter the following statement:

nebula> FETCH PROP ON player 100;

The following information is displayed:

=======================================
| VertexID | player.name | player.age |
=======================================
| 100      | Tim         | 42         |
---------------------------------------

Updating Edges

The UPDATE EDGE statement updates data for your edge by specifying the source vertex ID and the target vertex ID of the edge and then setting the property value with an equal sign to assign it a new value.

The following example shows you how to change the value of the degree property in the follow edge between VID 100 and VID 101. Now we change the degree property from 95 to 96.

Now we add the degree value by 1:

nebula> UPDATE EDGE 100 -> 101 OF follow SET degree = 96;

To check whether the degree value is updated, enter the following statement:

nebula> FETCH PROP ON follow 100 -> 101;

The following information is returned:

============================================================
| follow._src | follow._dst | follow._rank | follow.degree |
============================================================
| 100         | 101         | 0            | 96            |
------------------------------------------------------------

UPSERT

UPSERT is used to insert a new vertex or edge or update an existing one. If the vertex or edge doesn’t exist it will be created. UPSERT is a combination of INSERT and UPDATE.

Consider the following example:

nebula> INSERT VERTEX player(name, age) VALUES 111:("Ben Simmons", 22); -- Insert a new vertex.
nebula> UPSERT VERTEX 111 SET player.name = "Dwight Howard", player.age = $^.player.age + 11 WHEN $^.player.name == "Ben Simmons" && $^.player.age > 20 YIELD $^.player.name AS Name, $^.player.age AS Age; -- Do upsert on the vertex.
=======================
| Name          | Age |
=======================
| Dwight Howard | 33  |
-----------------------

See UPSERT Doc for details.

Deleting Data

If you have some data that you do not need, you can delete it from your graph space.

Deleting Vertices

You can delete any vertex from your graph space. The DELETE VERTEX statement deletes a vertex by specifying the vertex VID.

To delete a vertex whose VID is 121, enter the following statement:

nebula> DELETE VERTEX 121;

To check whether the vertex is deleted, enter the following statement;

nebula> FETCH PROP ON player 121;

The following information is returned:

Execution succeeded (Time spent: 1571/1910 us)

The above information with an empty return result indicates the query operation is successful but no data is queried from your graph space because the data is deleted.

Deleting Edges

You can delete any edge from your graph space. The DELETE EDGE statement deletes an edge by specifying the edge type name and the source vertex ID and target vertex ID.

To delete a follow edge between VID 100 and VID 200, enter the following statement:

nebula> DELETE EDGE follow 100 -> 200;

NOTE: If you delete a vertex, all the out-going and in-coming edges of this vertex are deleted.

Creating Index

You can add indexes to existing tag/edge-type with the CREATE INDEX keyword.

nebula> CREATE TAG INDEX player_index_0 on player(name);

The above statement creates an index for the name property on all vertices carrying the player tag.

nebula> REBUILD TAG INDEX player_index_0 OFFLINE;

Creating index will affect the write performance. We suggest you import data first and then rebuild the index in batch.

See the Index Documentation for details.

Sample Queries

This section gives more query examples for your reference.

Example 1

Find the vertices that VID 100 follows.

Enter the following statement:

nebula> GO FROM 100 OVER follow;

The following information is returned:

===============
| follow._dst |
===============
| 101         |
---------------
| 102         |
---------------

Example 2

Find the vertex that VID 100 follows, whose age is greater than 35. Return his name and age, and set the column names to Teammate and Age respectively.

Enter the following statement:

nebula> GO FROM 100 OVER follow WHERE $$.player.age >= 35 \
YIELD $$.player.name AS Teammate, $$.player.age AS Age;

The following information is returned:

=====================
| Teammate    | Age |
=====================
| Tony Parker | 36  |
---------------------

Here:

  • YIELD specifies what values or results you want to return from the query.
  • $$ represents the target vertex.
  • \ represents a line break.

Example 3

Find the team which is served by the player who is followed by 100.

There are two ways to get the same result. First, we can use a pipe to retrieve the team. Then we use a temporary variable to retrieve the same team.

  1. Enter the following statement with a pipe:

    nebula> GO FROM 100 OVER follow YIELD follow._dst AS id | \
            GO FROM $-.id OVER serve YIELD $$.team.name AS Team, \
            $^.player.name AS Player;
    

    The following information is returned.

    ===============================
    | Team    | Player            |
    ===============================
    | Nuggets | Tony Parker       |
    -------------------------------
    

    Here:

    $^ indicates the source vertex of the edge.

    | indicates the pipe.

    $- indicates the input flow. The output of the previous query (id) is used as the input of the next query ($-.id).

  2. Enter the following statement with a temporary variable:

    nebula> $var = GO FROM 100 OVER follow YIELD follow._dst AS id; \
            GO FROM $var.id OVER serve YIELD $$.team.name AS Team, \
            $^.player.name AS Player;
    

    The following information is returned.

    ===============================
    | Team    | Player            |
    ===============================
    | Nuggets | Tony Parker       |
    -------------------------------
    

When a composition statement is submitted to the server as a whole, the custom variables in the statement have ended their life cycle.

Example 4

If you have created and rebuilt an index for the data, you can use the LOOKUP ON keyword to query properties.

nebula> CREATE TAG INDEX player_index_0 on player(name);
nebula> REBUILD TAG INDEX player_index_0 OFFLINE;

Refer to the Index Doc to create an index.

The following example returns vertices whose name is Tony Parker and tagged with player.

nebula> LOOKUP ON player WHERE player.name == "Tony Parker" \
YIELD player.name, player.age;
=======================================
| VertexID | player.name | player.age |
=======================================
| 101      | Tony Parker | 36         |
---------------------------------------

See LOOKUP Documentation for details.

Example 5

The SUBMIT JOB COMPACT command triggers the long-term RocksDB compact operation. There are a lot of IO operations during the running of this command. After the command is executed successfully (the job status is stopped, see the example below), the read performance is improved.

For example:

nebula> SUBMIT JOB COMPACT;
==============
| New Job Id |
==============
| 40         |
--------------

The SHOW JOBS statement lists all the jobs that are not expired. For example:

nebula> SHOW JOBS;
=============================================================================
| Job Id | Command       | Status   | Start Time        | Stop Time         |
=============================================================================
| 22     | flush test2   | failed   | 12/06/19 14:46:22 | 12/06/19 14:46:22 |
-----------------------------------------------------------------------------
| 23     | compact test2 | stopped  | 12/06/19 15:07:09 | 12/06/19 15:07:33 |
-----------------------------------------------------------------------------
| 24     | compact test2 | stopped  | 12/06/19 15:07:11 | 12/06/19 15:07:20 |
-----------------------------------------------------------------------------
| 25     | compact test2 | stopped  | 12/06/19 15:07:13 | 12/06/19 15:07:24 |
-----------------------------------------------------------------------------

For more information about the COMPACT and JOB statements, refer to Job manager.

Batch Inserting

To insert multiple data, you can put all the DDL (Data Definition Language) statements in a .ngql file as follows.

CREATE SPACE nba(partition_num=10, replica_factor=1);
USE nba;
CREATE TAG player(name string, age int);
CREATE TAG team(name string);
CREATE EDGE follow(degree int);
CREATE EDGE serve(start_year int, end_year int);
  • If you install Nebula Graph by compiling the source code, you can batch write to console by the following command:
$ cat schema.ngql | ./bin/nebula -u root -p nebula
  • If you are using Nebula Graph by docker-compose, you can batch write to console by the following command:
$ cat nba.ngql | sudo docker run --rm -i --network=host \
vesoft/nebula-console:nightly --addr=127.0.0.1 --port=3699

Here:

  • You must change the IP address and the port number to yours.
  • You can download the nba.ngql file here.

Next to do

Before you start using Nebula Graph, make sure that you read the following documents. They include most of your questions about Nebula Graph. If you still have questions after reading these documents, please head over to our official forum to ask.

Again, if you come across any problem following the steps in this guide, please head over to our official forum and our on-call developers are more than happy to answer your questions!

Finish the steps in this guide and feel Nebula Graph is good? Please star us on GitHub and make our day!