VID
In NebulaGraph, a vertex is uniquely identified by its ID, which is called a VID or a Vertex ID.
Features
- The data types of VIDs are restricted to `FIXED_STRING(<N>)` or `INT64`. One graph space can only select one VID type.
- A VID in a graph space is unique. It functions just as a primary key in a relational database. VIDs in different graph spaces are independent.
- Users must generate VIDs themselves, because NebulaGraph provides neither auto-incrementing IDs nor UUIDs.
- Vertices with the same VID will be identified as the same one. For example:
  - A VID is the unique identifier of an entity, like a person's ID card number. A tag is the type of an entity, such as driver or boss. Different tags define different groups of properties: for example, driving license number, driving age, order amount, and order taking alt for a driver; job number, payroll, debt ceiling, and business phone number for a boss.
  - When two `INSERT` statements (neither of which uses `IF NOT EXISTS`) with the same VID and tag are executed at the same time, the later `INSERT` overwrites the earlier one.
  - When two `INSERT` statements with the same VID but different tags, like `TAG A` and `TAG B`, are executed at the same time, the operation on `TAG A` does not affect `TAG B`.
- VIDs are usually indexed and kept in memory (in an LSM-tree structure), so accessing vertices directly by VID gives the best performance.
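The overwrite rule for `INSERT` described in the list above can be pictured with a small in-memory model. This is only a sketch of the semantics, not NebulaGraph's storage engine: the `Store` type and the `insert_vertex` helper are hypothetical names introduced for illustration, and the tag and property names are made up.

```python
from typing import Dict, Tuple

# Hypothetical toy store keyed by (vid, tag); each value is that tag's properties.
Store = Dict[Tuple[str, str], Dict[str, object]]


def insert_vertex(store: Store, vid: str, tag: str, props: Dict[str, object]) -> None:
    """Mimic INSERT without IF NOT EXISTS: the last write for (vid, tag) wins."""
    store[(vid, tag)] = dict(props)


store: Store = {}
insert_vertex(store, "player100", "driver", {"driving_age": 3})
insert_vertex(store, "player100", "boss", {"job_number": 7})
# Same VID and same tag: this replaces the earlier driver properties.
insert_vertex(store, "player100", "driver", {"driving_age": 5})
```

In this model, the second `driver` insert replaces the first, while the `boss` properties attached to the same VID are left untouched.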
VID Operation
- NebulaGraph 1.x only supports `INT64`, while NebulaGraph 2.x supports `INT64` and `FIXED_STRING(<N>)`. In `CREATE SPACE`, the VID type can be set via `vid_type`.
- The `id()` function can be used to specify or locate a VID.
- `LOOKUP` or `MATCH` statements can be used to find a VID via property index.
- Statements that access vertices directly by VID give the best performance, such as `DELETE xxx WHERE id(xxx) == "player100"` or `GO FROM "player100"`. Finding VIDs via properties and then operating on the graph performs worse, such as `LOOKUP | GO FROM $-.ids`, which has to run both `LOOKUP` and `|` one more time.
VID Generation
VIDs can be generated via applications. Here are some tips:
- (Optimal) Directly take a unique primary key or property as a VID. Property access depends on the VID.
- Generate a VID via a unique combination of properties. Property access depends on property index.
- Generate a VID via algorithms like snowflake. Property access depends on property index.
- If short primary keys greatly outnumber long primary keys, do not enlarge the `N` of `FIXED_STRING(<N>)` too much. Otherwise, VIDs will occupy a lot of memory and disk space and slow down performance. Instead, generate VIDs by encoding and splicing, for example with BASE64, MD5, or a hash.
- If you generate `INT64` VIDs via hash, the probability of collision is about 1/10 when there are 1 billion vertices. The number of edges does not affect the probability of collision.
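The tips above can be sketched in application code. The following is a minimal illustration, not an official NebulaGraph utility: the helper names `string_vid`, `int64_vid`, and `collision_probability` are hypothetical, the choice of MD5 and SHA-256 and the separator character are assumptions, and the collision estimate assumes an idealized, uniformly distributed hash (the standard birthday approximation).

```python
import hashlib
import math
import struct

# Assumed separator so that ("ab", "c") and ("a", "bc") splice differently.
_SEP = "\x1f"


def string_vid(*parts: str, length: int = 32) -> str:
    """Splice the parts and return an MD5 hex digest cut to `length` characters.

    A 32-character digest would fit a space created with FIXED_STRING(32).
    """
    joined = _SEP.join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()[:length]


def int64_vid(*parts: str) -> int:
    """Interpret the first 8 bytes of a SHA-256 digest as a signed 64-bit integer."""
    joined = _SEP.join(parts)
    digest = hashlib.sha256(joined.encode("utf-8")).digest()
    return struct.unpack(">q", digest[:8])[0]


def collision_probability(n: int, bits: int = 64) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    return -math.expm1(-n * (n - 1) / (2.0 * 2.0 ** bits))
```

For example, `string_vid("Tony Parker", "1982-05-17")` always yields the same 32-character VID for the same inputs. Under the idealized assumptions stated here, `collision_probability(10**9)` evaluates to a few percent, the same order of magnitude as the figure quoted above; a real hash over skewed inputs may behave differently, so the application should still handle collisions.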
Define and modify a VID and its data type
The data type of a VID must be defined when you create the graph space. Once defined, it cannot be modified.
A VID is set when you insert a vertex and cannot be modified.
Query `start vid` and global scan
In most cases, the execution plan of a query statement in NebulaGraph (`MATCH`, `GO`, and `LOOKUP`) must locate the `start vid` in a certain way.
There are only two ways to locate the `start vid`:
- For example, `GO FROM "player100" OVER` explicitly indicates in the statement that the `start vid` is "player100".
- For example, `LOOKUP ON player WHERE player.name == "Tony Parker"` or `MATCH (v:player {name:"Tony Parker"})` locates the `start vid` by the index of the property `player.name`.
Caution
For example, `match (n) return n;` returns the error `Scan vertices or edges need to specify a limit number, or limit number can not push down.` because it is a global scan. You must use the `LIMIT` clause to limit the number of returned results.