Skip to content

Step 4: Use nGQL (CRUD)

This topic will describe the basic CRUD operations in NebulaGraph.

For more information, see nGQL guide.

Graph space and NebulaGraph schema

A NebulaGraph instance consists of one or more graph spaces. Graph spaces are physically isolated from each other. You can use different graph spaces in the same instance to store different datasets.

NebulaGraph and graph spaces

To insert data into a graph space, define a schema for the graph database. NebulaGraph schema is based on the following components.

Schema component Description
Vertex Represents an entity in the real world. A vertex can have one or more tags.
Tag The type of the same group of vertices. It defines a set of properties that describes the types of vertices.
Edge Represents a directed relationship between two vertices.
Edge type The type of an edge. It defines a group of properties that describes the types of edges.

For more information, see Data modeling.

In this topic, we will use the following dataset to demonstrate basic CRUD operations.

The demo dataset

Check the machine status in the NebulaGraph cluster

Note

First, we recommend that you check the machine status to make sure that all the Storage services are connected to the Meta services. Run SHOW HOSTS as follows.

nebula> SHOW HOSTS;
+-------------+-----------+-----------+--------------+----------------------+------------------------+
| Host        | Port      | Status    | Leader count | Leader distribution  | Partition distribution |
+-------------+-----------+-----------+--------------+----------------------+------------------------+
| "storaged0" | 9779      | "ONLINE"  | 0            | "No valid partition" | "No valid partition"   |
| "storaged1" | 9779      | "ONLINE"  | 0            | "No valid partition" | "No valid partition"   |
| "storaged2" | 9779      | "ONLINE"  | 0            | "No valid partition" | "No valid partition"   |
| "Total"     | __EMPTY__ | __EMPTY__ | 0            | __EMPTY__            | __EMPTY__              |
+-------------+-----------+-----------+--------------+----------------------+------------------------+

From the Status column of the table in the return results, you can see that all the Storage services are online.

Asynchronous implementation of creation and alteration

Caution

In NebulaGraph, the following CREATE or ALTER commands are implemented in an async way and take effect in the next heartbeat cycle. Otherwise, an error will be returned. To make sure the follow-up operations work as expected, Wait for two heartbeat cycles, i.e., 20 seconds.

  • CREATE SPACE
  • CREATE TAG
  • CREATE EDGE
  • ALTER TAG
  • ALTER EDGE
  • CREATE TAG INDEX
  • CREATE EDGE INDEX

Note

The default heartbeat interval is 10 seconds. To change the heartbeat interval, modify the heartbeat_interval_secs parameter in the configuration files for all services.

Create and use a graph space

nGQL syntax

  • Create a graph space:
    CREATE SPACE [IF NOT EXISTS] <graph_space_name> (
    [partition_num = <partition_number>,]
    [replica_factor = <replica_number>,]
    vid_type = {FIXED_STRING(<N>) | INT64}
    )
    [COMMENT = '<comment>'];
    

    For more information on parameters, see CREATE SPACE.

  • List graph spaces and check if the creation is successful:
    nebula> SHOW SPACES;
    
  • Use a graph space:
    USE <graph_space_name>;
    

Examples

  1. Use the following statement to create a graph space named basketballplayer.

    nebula> CREATE SPACE basketballplayer(partition_num=15, replica_factor=1, vid_type=fixed_string(30));
    
  2. Check the partition distribution with SHOW HOSTS to make sure that the partitions are distributed in a balanced way.

    nebula> SHOW HOSTS;
    +-------------+-----------+-----------+--------------+----------------------------------+------------------------+
    | Host        | Port      | Status    | Leader count | Leader distribution              | Partition distribution |
    +-------------+-----------+-----------+--------------+----------------------------------+------------------------+
    | "storaged0" | 9779      | "ONLINE"  | 5            | "basketballplayer:5"             | "basketballplayer:5"   |
    | "storaged1" | 9779      | "ONLINE"  | 5            | "basketballplayer:5"             | "basketballplayer:5"   |
    | "storaged2" | 9779      | "ONLINE"  | 5            | "basketballplayer:5"             | "basketballplayer:5"   |
    | "Total"     |           |           | 15           | "basketballplayer:15"            | "basketballplayer:15"  |
    +-------------+-----------+-----------+--------------+----------------------------------+------------------------+
    

    If the Leader distribution is uneven, use BALANCE LEADER to redistribute the partitions. For more information, see BALANCE.

  3. Use the basketballplayer graph space.

    nebula[(none)]> USE basketballplayer;
    

    You can use SHOW SPACES to check the graph space you created.

    nebula> SHOW SPACES;
    +--------------------+
    | Name               |
    +--------------------+
    | "basketballplayer" |
    +--------------------+
    

Create tags and edge types

nGQL syntax

CREATE {TAG | EDGE} {<tag_name> | <edge_type>}(<property_name> <data_type>
[, <property_name> <data_type> ...])
[COMMENT = '<comment>'];

For more information on parameters, see CREATE TAG and CREATE EDGE.

Examples

Create tags player and team, edge types follow and serve. Descriptions are as follows.

Component name Type Property
player Tag name (string), age (int)
team Tag name (string)
follow Edge type degree (int)
serve Edge type start_year (int), end_year (int)
nebula> CREATE TAG player(name string, age int);

nebula> CREATE TAG team(name string);

nebula> CREATE EDGE follow(degree int);

nebula> CREATE EDGE serve(start_year int, end_year int);

Insert vertices and edges

Users can use the INSERT statement to insert vertices or edges based on existing tags or edge types.

nGQL syntax

  • Insert vertices:
    INSERT VERTEX [IF NOT EXISTS] <tag_name> (<property_name>[, <property_name>...])
    [, <tag_name> (<property_name>[, <property_name>...]), ...]
    {VALUES | VALUE} <vid>: (<property_value>[, <property_value>...])
    [, <vid>: (<property_value>[, <property_value>...];
    

    VID is short for Vertex ID. A VID must be a unique string value in a graph space. For details, see INSERT VERTEX.

  • Insert edges:

    INSERT EDGE [IF NOT EXISTS] <edge_type> (<property_name>[, <property_name>...])
    {VALUES | VALUE} <src_vid> -> <dst_vid>[@<rank>] : (<property_value>[, <property_value>...])
    [, <src_vid> -> <dst_vid>[@<rank>] : (<property_name>[, <property_name>...]), ...];
    

    For more information on parameters, see INSERT EDGE.

Examples

  • Insert vertices representing basketball players and teams:
    nebula> INSERT VERTEX player(name, age) VALUES "player100":("Tim Duncan", 42);
    
    nebula> INSERT VERTEX player(name, age) VALUES "player101":("Tony Parker", 36);
    
    nebula> INSERT VERTEX player(name, age) VALUES "player102":("LaMarcus Aldridge", 33);
    
    nebula> INSERT VERTEX team(name) VALUES "team203":("Trail Blazers"), "team204":("Spurs");
    
  • Insert edges representing the relations between basketball players and teams:
    nebula> INSERT EDGE follow(degree) VALUES "player101" -> "player100":(95);
    
    nebula> INSERT EDGE follow(degree) VALUES "player101" -> "player102":(90);
    
    nebula> INSERT EDGE follow(degree) VALUES "player102" -> "player100":(75);
    
    nebula> INSERT EDGE serve(start_year, end_year) VALUES "player101" -> "team204":(1999, 2018),"player102" -> "team203":(2006,  2015);
    

Read data

  • The GO statement can traverse the database based on specific conditions. A GO traversal starts from one or more vertices, along one or more edges, and returns information in a form specified in the YIELD clause.
  • The FETCH statement is used to get properties from vertices or edges.
  • The LOOKUP statement is based on indexes. It is used together with the WHERE clause to search for the data that meet the specific conditions.
  • The MATCH statement is the most commonly used statement for graph data querying. It can describe all kinds of graph patterns, but it relies on indexes to match data patterns in NebulaGraph. Therefore, its performance still needs optimization.

nGQL syntax

  • GO
    GO [[<M> TO] <N> STEPS ] FROM <vertex_list>
    OVER <edge_type_list> [{REVERSELY | BIDIRECT}]
    [ WHERE <conditions> ]
    [YIELD [DISTINCT] <return_list>]
    [{SAMPLE <sample_list> | LIMIT <limit_list>}]
    [| GROUP BY {col_name | expr | position} YIELD <col_name>]
    [| ORDER BY <expression> [{ASC | DESC}]]
    [| LIMIT [<offset>,] <number_rows>];
    
  • FETCH

    • Fetch properties on tags:

      FETCH PROP ON {<tag_name>[, tag_name ...] | *}
      <vid> [, vid ...]
      [YIELD <return_list> [AS <alias>]];
      
    • Fetch properties on edges:

      FETCH PROP ON <edge_type> <src_vid> -> <dst_vid>[@<rank>] [, <src_vid> -> <dst_vid> ...]
      [YIELD <output>];
      
  • LOOKUP
    LOOKUP ON {<vertex_tag> | <edge_type>}
    [WHERE <expression> [AND <expression> ...]]
    [YIELD <return_list> [AS <alias>]];
    
  • MATCH
    MATCH <pattern> [<WHERE clause>] RETURN <output>;
    

Examples of GO statement

  • Search for the players that the player with VID player101 follows.
    nebula> GO FROM "player101" OVER follow;
    +-------------+
    | follow._dst |
    +-------------+
    | "player100" |
    | "player102" |
    +-------------+
    
  • Filter the players that the player with VID player101 follows whose age is equal to or greater than 35. Rename the corresponding columns in the results with Teammate and Age.
    nebula> GO FROM "player101" OVER follow WHERE $$.player.age >= 35 \
            YIELD properties($$).name AS Teammate, properties($$).age AS Age;
    +--------------+-----+
    | Teammate     | Age |
    +--------------+-----+
    | "Tim Duncan" | 42  |
    +--------------+-----+
    

    | Clause/Sign | Description | |-------------+---------------------------------------------------------------------| | YIELD | Specifies what values or results you want to return from the query. | | $$ | Represents the target vertices. | | \ | A line-breaker. |

  • Search for the players that the player with VID player101 follows. Then Retrieve the teams of the players that the player with VID player100 follows. To combine the two queries, use a pipe or a temporary variable.

    • With a pipe:

      nebula> GO FROM "player101" OVER follow YIELD dst(edge) AS id | \
              GO FROM $-.id OVER serve YIELD properties($$).name AS Team, \
              properties($^).name AS Player;
      +-----------------+---------------------+
      | Team            | Player              |
      +-----------------+---------------------+
      | "Trail Blazers" | "LaMarcus Aldridge" |
      +-----------------+---------------------+
      
      Clause/Sign Description
      $^ Represents the source vertex of the edge.
      | A pipe symbol can combine multiple queries.
      $- Represents the outputs of the query before the pipe symbol.
    • With a temporary variable:

      Note

      Once a composite statement is submitted to the server as a whole, the life cycle of the temporary variables in the statement ends.

      nebula> $var = GO FROM "player101" OVER follow YIELD dst(edge) AS id; \
              GO FROM $var.id OVER serve YIELD properties($$).name AS Team, \
              properties($^).name AS Player;
      +-----------------+---------------------+
      | Team            | Player              |
      +-----------------+---------------------+
      | "Trail Blazers" | "LaMarcus Aldridge" |
      +-----------------+---------------------+
      

Example of FETCH statement

Use FETCH: Fetch the properties of the player with VID player100.

nebula> FETCH PROP ON player "player100";
+----------------------------------------------------+
| vertices_                                          |
+----------------------------------------------------+
| ("player100" :player{age: 42, name: "Tim Duncan"}) |
+----------------------------------------------------+

Note

The examples of LOOKUP and MATCH statements are in indexes.

Update vertices and edges

Users can use the UPDATE or the UPSERT statements to update existing data.

UPSERT is the combination of UPDATE and INSERT. If you update a vertex or an edge with UPSERT, the database will insert a new vertex or edge if it does not exist.

Note

UPSERT operates serially in a partition-based order. Therefore, it is slower than INSERT OR UPDATE. And UPSERT has concurrency only between multiple partitions.

nGQL syntax

  • UPDATE vertices:
    UPDATE VERTEX <vid> SET <properties to be updated>
    [WHEN <condition>] [YIELD <columns>];
    
  • UPDATE edges:
    UPDATE EDGE <source vid> -> <destination vid> [@rank] OF <edge_type>
    SET <properties to be updated> [WHEN <condition>] [YIELD <columns to be output>];
    
  • UPSERT vertices or edges:
    UPSERT {VERTEX <vid> | EDGE <edge_type>} SET <update_columns>
    [WHEN <condition>] [YIELD <columns>];
    

Examples

  • UPDATE the name property of the vertex with VID player100 and check the result with the FETCH statement.
    nebula> UPDATE VERTEX "player100" SET player.name = "Tim";
    
    nebula> FETCH PROP ON player "player100";
    +---------------------------------------------+
    | vertices_                                   |
    +---------------------------------------------+
    | ("player100" :player{age: 42, name: "Tim"}) |
    +---------------------------------------------+
    
  • UPDATE the degree property of an edge and check the result with the FETCH statement.
    nebula> UPDATE EDGE "player101" -> "player100" OF follow SET degree = 96;
    
    nebula> FETCH PROP ON follow "player101" -> "player100";
    +----------------------------------------------------+
    | edges_                                             |
    +----------------------------------------------------+
    | [:follow "player101"->"player100" @0 {degree: 96}] |
    +----------------------------------------------------+
    
  • Insert a vertex with VID player111 and UPSERT it.
    nebula> INSERT VERTEX player(name,age) values "player111":("David West", 38);
    
    nebula> UPSERT VERTEX "player111" SET player.name = "David", player.age = $^.player.age + 11 \
            WHEN $^.player.name == "David West" AND $^.player.age > 20 \
            YIELD $^.player.name AS Name, $^.player.age AS Age;
    +---------+-----+
    | Name    | Age |
    +---------+-----+
    | "David" | 49  |
    +---------+-----+
    

Delete vertices and edges

nGQL syntax

  • Delete vertices:
    DELETE VERTEX <vid1>[, <vid2>...]
    
  • Delete edges:
    DELETE EDGE <edge_type> <src_vid> -> <dst_vid>[@<rank>]
    [, <src_vid> -> <dst_vid>...]
    

Examples

  • Delete vertices:
    nebula> DELETE VERTEX "player111", "team203";
    
  • Delete edges:
    nebula> DELETE EDGE follow "player101" -> "team204";
    

About indexes

Users can add indexes to tags and edge types with the CREATE INDEX statement.

Must-read for using indexes

Both MATCH and LOOKUP statements depend on the indexes. But indexes can dramatically reduce the write performance (as much as 90% or even more). DO NOT use indexes in production environments unless you are fully aware of their influences on your service.

Users MUST rebuild indexes for pre-existing data. Otherwise, the pre-existing data cannot be indexed and therefore cannot be returned in MATCH or LOOKUP statements. For more information, see REBUILD INDEX.

nGQL syntax

  • Create an index:
    CREATE {TAG | EDGE} INDEX [IF NOT EXISTS] <index_name>
    ON {<tag_name> | <edge_name>} ([<prop_name_list>]) [COMMENT = '<comment>'];
    
  • Rebuild an index:
    REBUILD {TAG | EDGE} INDEX <index_name>;
    

Note

Define the index length when creating an index for a variable-length property. In UTF-8 encoding, a non-ascii character occupies 3 bytes. You should set an appropriate index length according to the variable-length property. For example, the index should be 30 bytes for 10 non-ascii characters. For more information, see CREATE INDEX

Examples of LOOKUP and MATCH (index-based)

Make sure there is an index for LOOKUP or MATCH to use. If there is not, create an index first.

Find the information of the vertex with the tag player and its value of the name property is Tony Parker.

This example creates the index player_index_1 on the player name property.

nebula> CREATE TAG INDEX player_index_1 ON player(name(20));

This example rebuilds the index to make sure it takes effect on pre-existing data.

nebula> REBUILD TAG INDEX player_index_1
+------------+
| New Job Id |
+------------+
| 31         |
+------------+
Got 1 rows (time spent 2379/3033 us)

This example uses the LOOKUP statement to retrieve the vertex property.

nebula> LOOKUP ON player WHERE player.name == "Tony Parker" \
        YIELD properties(vertex).name AS name, properties(vertex).age AS age;
+-------------+---------------+-----+
| VertexID    | name          | age |
+-------------+---------------+-----+
| "player101" | "Tony Parker" | 36  |
+-------------+---------------+-----+

This example uses the MATCH statement to retrieve the vertex property.

nebula> MATCH (v:player{name:"Tony Parker"}) RETURN v;
+-----------------------------------------------------+
| v                                                   |
+-----------------------------------------------------+
| ("player101" :player{age: 36, name: "Tony Parker"}) |
+-----------------------------------------------------+

Last update: March 13, 2023