VID¶

In Nebula Graph, a vertex is uniquely identified by its ID, which is called a VID or a Vertex ID.

Features¶

The data types of VIDs are restricted to FIXED_STRING(<N>) or INT64; a graph space can only select one VID type.

A VID in a graph space is unique. It functions just as a primary key in a relational database. VIDs in different graph spaces are independent.

The VID generation method must be set by users, because Nebula Graph does not provide auto increasing ID, or UUID.

Vertices with the same VID will be identified as the same one. For example:
- A VID is the unique identifier of an entity, like a person's ID card number. A tag means the type of an entity, such as driver, and boss. Different tags define two groups of different properties, such as driving license number, driving age, order amount, order taking alt, and job number, payroll, debt ceiling, business phone number.
- When two INSERT statements (neither uses a parameter of IF NOT EXISTS) with the same VID and tag are operated at the same time, the latter INSERT will overwrite the former.
- When two INSERT statements with the same VID but different tags, like TAG A and TAG B, are operated at the same time, the operation of Tag A will not affect Tag B.

VIDs will usually be indexed and stored into memory (in the way of LSM-tree). Thus, direct access to VIDs enjoys peak performance.

Nebula Graph 1.x only supports INT64 while Nebula Graph 2.x supports INT64 and FIXED_STRING(<N>). In CREATE SPACE, VID types can be set via vid_type.

Direct access to vertices statements via VIDs enjoys peak performance, such as DELETE xxx WHERE id(xxx) == "player100" or GO FROM "player100". Finding VIDs via properties and then operating the graph will cause poor performance, such as LOOKUP | GO FROM $-.ids, which will run both LOOKUP and | one more time.

VIDs can be generated via applications. Here are some tips:

(Optimal) Directly take a unique primary key or property as a VID. Property access depends on the VID.

Generate a VID via a unique combination of properties. Property access depends on property index.

Generate a VID via algorithms like snowflake. Property access depends on property index.

If short primary keys greatly outnumber long primary keys, do not enlarge the N of FIXED_STRING(<N>) too much. Otherwise, it will occupy a lot of memory and hard disks, and slow down performance. Generate VIDs via BASE64, MD5, hash by encoding and splicing.

If you generate inte64 VID via hash, the probability of collision is about 1/10 when there are 1 billion vertices. The number of edges has no concern with the probability of collision.

The data type of VIDs must be defined when you create the graph space. Once defined, it cannot be modified.

In most cases, the execution plan of query statements in Nebula Graph (MATCH, GO, and LOOKUP) must query the start vid in a certain way.

There are only two ways to locate start vid:

For example, GO FROM "player100" OVER explicitly indicates in the statement that start vid is "player100".
For example, LOOKUP ON player WHERE player.name == "Tony Parker" or MATCH (v:player {name:"Tony Parker"}) locates start vid by the index of the property player.name.

You cannot perform a global scan without start vid

For example, match (n) return n; returns an error because start vid cannot be located at this time. As a global scan, it is forbidden.

Last update: September 3, 2021