
Full-text indexes

Full-text indexes are used to do prefix, wildcard, regexp, and fuzzy search on a string property.

You can use the WHERE clause to specify the search strings in LOOKUP statements.

Prerequisite

Before using the full-text index, make sure that you have deployed an Elasticsearch cluster and a Listener cluster. For more information, see Deploy Elasticsearch and Deploy Listener.

Precaution

Before using the full-text index, make sure that you know the restrictions.

Full Text Queries

Full-text queries search analyzed text fields. A parser with strict syntax interprets the query string you provide and returns the matching content. For details, see Query string query.

Syntax

Create full-text indexes

CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} (<prop_name> [,<prop_name>]...) [ANALYZER="<analyzer_name>"];
  • Composite indexes with multiple properties are supported when creating full-text indexes.
  • <analyzer_name> is the name of the analyzer. The default value is standard. To use other analyzers (e.g. IK Analysis), you need to make sure that the corresponding analyzer is installed in Elasticsearch in advance.
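
The syntax above can also be assembled in client code. The following is a minimal Python sketch, not part of NebulaGraph or its clients: the function name `create_fulltext_index` and the conservative identifier check are assumptions made for illustration, and the check is stricter than nGQL's real naming rules.

```python
import re

# Conservative identifier check (an assumption for this sketch; nGQL itself
# accepts more names than this pattern does).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def create_fulltext_index(kind, index_name, schema_name, props, analyzer=None):
    """Build a CREATE FULLTEXT {TAG | EDGE} INDEX statement as a string."""
    kind = kind.upper()
    if kind not in ("TAG", "EDGE"):
        raise ValueError("kind must be TAG or EDGE")
    if not props:
        raise ValueError("at least one property is required")
    names = [index_name, schema_name, *props]
    if analyzer is not None:
        names.append(analyzer)
    for name in names:
        if not _IDENT.match(name):
            raise ValueError(f"unsupported name in this sketch: {name!r}")
    stmt = f"CREATE FULLTEXT {kind} INDEX {index_name} ON {schema_name}({','.join(props)})"
    # The ANALYZER clause is optional; the server default is used when omitted.
    if analyzer is not None:
        stmt += f' ANALYZER="{analyzer}"'
    return stmt + ";"


print(create_fulltext_index("TAG", "fulltext_index_2", "player", ["name", "city"], "standard"))
```

The returned string can be passed to whichever client you use to run nGQL statements.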

Show full-text indexes

SHOW FULLTEXT INDEXES;

Rebuild full-text indexes

REBUILD FULLTEXT INDEX;

Caution

Rebuilding full-text indexes is slow when there is a large amount of data. In that case, you can set snapshot_send_files=false in the configuration file of the Storage service (nebula-storaged.conf).

Drop full-text indexes

DROP FULLTEXT INDEX <index_name>;

Use query options

LOOKUP ON {<tag> | <edge_type>} WHERE ES_QUERY(<index_name>, "<text>") YIELD <return_list> [| LIMIT [<offset>,] <number_rows>];

<return_list>
    <prop_name> [AS <prop_alias>] [, <prop_name> [AS <prop_alias>] ...] [, id(vertex) [AS <prop_alias>]] [, score() AS <score_alias>]
  • index_name: The name of the full-text index.
  • text: Search conditions. The WHERE clause can contain only ES_QUERY, and all filter conditions must be written in the text. For supported syntax, see Query string syntax.
  • score(): The score calculated by doing N-degree expansion for the eligible vertices. The default value is 1.0. A higher score indicates a closer match. By default, results are sorted by score from highest to lowest. For details, see Search and Scoring in Lucene.
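
Because every filter condition lives inside the text argument, client code usually has to embed a query string in the statement. The following is a minimal Python sketch of that step. The helper names `ngql_string` and `lookup_es_query` are hypothetical, and the quoting assumes that nGQL double-quoted strings accept backslash escapes for `\` and `"`.

```python
def ngql_string(text):
    """Quote text as a double-quoted literal (assumes backslash escaping)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lookup_es_query(schema, index_name, text, yields, offset=None, limit=None):
    """Build a LOOKUP ... WHERE ES_QUERY(...) YIELD ... statement as a string."""
    if not yields:
        raise ValueError("YIELD needs at least one item")
    if offset is not None and limit is None:
        raise ValueError("offset requires limit")
    stmt = (
        f"LOOKUP ON {schema} WHERE ES_QUERY({index_name}, {ngql_string(text)}) "
        f"YIELD {', '.join(yields)}"
    )
    # LIMIT is piped after the statement: "| LIMIT [<offset>,] <number_rows>".
    if limit is not None:
        stmt += f" | LIMIT {offset}, {limit}" if offset is not None else f" | LIMIT {limit}"
    return stmt + ";"


print(lookup_es_query("player", "fulltext_index_1", "*b*",
                      ["id(vertex)", "score() AS score"], offset=2, limit=3))
```

The sketch does not validate the schema or index names, so pass only trusted identifiers.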

Examples

// This example creates the graph space.
nebula> CREATE SPACE IF NOT EXISTS basketballplayer (partition_num=3,replica_factor=1, vid_type=fixed_string(30));

// This example signs in to the text service.
nebula> SIGN IN TEXT SERVICE (192.168.8.100:9200, HTTP);

// This example checks the text service status.
nebula> SHOW TEXT SEARCH CLIENTS;
+-----------------+-----------------+------+
| Type            | Host            | Port |
+-----------------+-----------------+------+
| "ELASTICSEARCH" | "192.168.8.100" | 9200 |
+-----------------+-----------------+------+

// This example switches the graph space.
nebula> USE basketballplayer;

// This example adds the listener to the NebulaGraph cluster.
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.100:9789;

// This example checks the listener status. When the status is `ONLINE`, the listener is ready.
nebula> SHOW LISTENER;
+--------+-----------------+------------------------+-------------+
| PartId | Type            | Host                   | Host Status |
+--------+-----------------+------------------------+-------------+
| 1      | "ELASTICSEARCH" | ""192.168.8.100":9789" | "ONLINE"    |
| 2      | "ELASTICSEARCH" | ""192.168.8.100":9789" | "ONLINE"    |
| 3      | "ELASTICSEARCH" | ""192.168.8.100":9789" | "ONLINE"    |
+--------+-----------------+------------------------+-------------+

// This example creates the tag.
nebula> CREATE TAG IF NOT EXISTS player(name string, city string);

// This example creates a single-attribute full-text index.
nebula> CREATE FULLTEXT TAG INDEX fulltext_index_1 ON player(name) ANALYZER="standard";

// This example creates a multi-attribute full-text index.
nebula> CREATE FULLTEXT TAG INDEX fulltext_index_2 ON player(name,city) ANALYZER="standard";

// This example rebuilds the full-text index.
nebula> REBUILD FULLTEXT INDEX;

// This example shows the full-text indexes.
nebula> SHOW FULLTEXT INDEXES;
+--------------------+-------------+-------------+--------------+------------+
| Name               | Schema Type | Schema Name | Fields       | Analyzer   |
+--------------------+-------------+-------------+--------------+------------+
| "fulltext_index_1" | "Tag"       | "player"    | "name"       | "standard" |
| "fulltext_index_2" | "Tag"       | "player"    | "name, city" | "standard" |
+--------------------+-------------+-------------+--------------+------------+

// This example inserts the test data.
nebula> INSERT VERTEX player(name, city) VALUES \
    "Russell Westbrook": ("Russell Westbrook", "Los Angeles"), \
    "Chris Paul": ("Chris Paul", "Houston"),\
    "Boris Diaw": ("Boris Diaw", "Houston"),\
    "David West": ("David West", "Philadelphia"),\
    "Danny Green": ("Danny Green", "Philadelphia"),\
    "Tim Duncan": ("Tim Duncan", "New York"),\
    "James Harden": ("James Harden", "New York"),\
    "Tony Parker": ("Tony Parker", "Chicago"),\
    "Aron Baynes": ("Aron Baynes", "Chicago"),\
    "Ben Simmons": ("Ben Simmons", "Phoenix"),\
    "Blake Griffin": ("Blake Griffin", "Phoenix");

// These examples run test queries.
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"Chris") YIELD id(vertex);
+--------------+
| id(VERTEX)   |
+--------------+
| "Chris Paul" |
+--------------+

nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"Harden") YIELD properties(vertex);
+----------------------------------------------------------------+
| properties(VERTEX)                                             |
+----------------------------------------------------------------+
| {_vid: "James Harden", city: "New York", name: "James Harden"} |
+----------------------------------------------------------------+

nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"Da*") YIELD properties(vertex);
+------------------------------------------------------------------+
| properties(VERTEX)                                               |
+------------------------------------------------------------------+
| {_vid: "David West", city: "Philadelphia", name: "David West"}   |
| {_vid: "Danny Green", city: "Philadelphia", name: "Danny Green"} |
+------------------------------------------------------------------+

nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex);
+---------------------+
| id(VERTEX)          |
+---------------------+
| "Russell Westbrook" |
| "Boris Diaw"        |
| "Aron Baynes"       |
| "Ben Simmons"       |
| "Blake Griffin"     |
+---------------------+

nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex) | LIMIT 2,3;
+-----------------+
| id(VERTEX)      |
+-----------------+
| "Aron Baynes"   |
| "Ben Simmons"   |
| "Blake Griffin" |
+-----------------+
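
The `LIMIT 2,3` clause in the previous query skips the first two rows and returns the next three. As a rough illustration only, the same selection can be modeled with list slicing in Python; `apply_limit` is a hypothetical helper, and the row list is copied from the unlimited `*b*` query above.

```python
def apply_limit(rows, number_rows, offset=0):
    """Model `LIMIT <offset>, <number_rows>`: skip `offset` rows, keep the rest up to `number_rows`."""
    return rows[offset:offset + number_rows]


# Rows returned by the unlimited "*b*" query, in the order shown above.
rows = ["Russell Westbrook", "Boris Diaw", "Aron Baynes", "Ben Simmons", "Blake Griffin"]

print(apply_limit(rows, 3, offset=2))
```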

nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex) | YIELD count(*);
+----------+
| count(*) |
+----------+
| 5        |
+----------+

nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*") YIELD id(vertex), score() AS score;
+---------------------+-------+
| id(VERTEX)          | score |
+---------------------+-------+
| "Russell Westbrook" | 1.0   |
| "Boris Diaw"        | 1.0   |
| "Aron Baynes"       | 1.0   |
| "Ben Simmons"       | 1.0   |
| "Blake Griffin"     | 1.0   |
+---------------------+-------+

// The score of documents that match `*b*` is multiplied by a boost factor of 4, while documents that match `*c*` keep the default boost factor of 1.
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_1,"*b*^4 OR *c*") YIELD id(vertex), score() AS score;
+---------------------+-------+
| id(VERTEX)          | score |
+---------------------+-------+
| "Russell Westbrook" | 4.0   |
| "Boris Diaw"        | 4.0   |
| "Aron Baynes"       | 4.0   |
| "Ben Simmons"       | 4.0   |
| "Blake Griffin"     | 4.0   |
| "Chris Paul"        | 1.0   |
| "Tim Duncan"        | 1.0   |
+---------------------+-------+
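
The scores in the boosted query can be reproduced with a deliberately simplified model. The Python sketch below assumes that each wildcard clause contributes a constant score equal to its boost, that `OR` adds the scores of all matching clauses, and that matching is done on lowercased words. It is an illustration of the result table, not how Elasticsearch or Lucene actually computes scores.

```python
def score(name, clauses):
    """Sum the boost of every clause whose substring occurs in a lowercased word."""
    tokens = name.lower().split()
    return sum(boost for needle, boost in clauses
               if any(needle in token for token in tokens))


# The player names inserted in the test data above.
names = [
    "Russell Westbrook", "Chris Paul", "Boris Diaw", "David West",
    "Danny Green", "Tim Duncan", "James Harden", "Tony Parker",
    "Aron Baynes", "Ben Simmons", "Blake Griffin",
]

# Models the query text "*b*^4 OR *c*" as (substring, boost) pairs.
clauses = [("b", 4.0), ("c", 1.0)]

hits = [(name, score(name, clauses)) for name in names]
hits = [(name, s) for name, s in hits if s > 0]
# Highest score first; the stable sort keeps insertion order within a score.
hits.sort(key=lambda hit: hit[1], reverse=True)

for name, s in hits:
    print(name, s)
```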

// When you query a multi-attribute full-text index, the conditions are matched against all properties of the index.
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,"*h*") YIELD properties(vertex);
+------------------------------------------------------------------+
| properties(VERTEX)                                               |
+------------------------------------------------------------------+
| {_vid: "Chris Paul", city: "Houston", name: "Chris Paul"}        |
| {_vid: "Boris Diaw", city: "Houston", name: "Boris Diaw"}        |
| {_vid: "David West", city: "Philadelphia", name: "David West"}   |
| {_vid: "James Harden", city: "New York", name: "James Harden"}   |
| {_vid: "Tony Parker", city: "Chicago", name: "Tony Parker"}      |
| {_vid: "Aron Baynes", city: "Chicago", name: "Aron Baynes"}      |
| {_vid: "Ben Simmons", city: "Phoenix", name: "Ben Simmons"}      |
| {_vid: "Blake Griffin", city: "Phoenix", name: "Blake Griffin"}  |
| {_vid: "Danny Green", city: "Philadelphia", name: "Danny Green"} |
+------------------------------------------------------------------+

// When you query a multi-attribute full-text index, you can specify different search text for different properties.
nebula> LOOKUP ON player WHERE ES_QUERY(fulltext_index_2,"name:*b* AND city:Houston") YIELD properties(vertex);
+-----------------------------------------------------------+
| properties(VERTEX)                                        |
+-----------------------------------------------------------+
| {_vid: "Boris Diaw", city: "Houston", name: "Boris Diaw"} |
+-----------------------------------------------------------+
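
Per-property conditions such as `name:*b* AND city:Houston` can be composed in client code. The Python sketch below uses two hypothetical helpers: `field_query` joins `property:pattern` pairs, and `escape_term` backslash-escapes the characters that the Elasticsearch query string syntax reserves, for cases where user input should be matched literally. The characters `<` and `>` cannot be escaped in that syntax, so the sketch drops them.

```python
import re

# Reserved characters and operators in the Elasticsearch query string syntax.
_RESERVED = re.compile(r'(&&|\|\||[+\-=!(){}\[\]^"~*?:\\/])')


def escape_term(term):
    """Escape literal text so that it is not parsed as query syntax."""
    term = term.replace("<", "").replace(">", "")  # cannot be escaped, so removed
    return _RESERVED.sub(r"\\\1", term)


def field_query(conditions, operator="AND"):
    """Join {property: pattern} pairs; patterns are used as-is so wildcards stay active."""
    return f" {operator} ".join(f"{prop}:{pattern}" for prop, pattern in conditions.items())


print(field_query({"name": "*b*", "city": "Houston"}))
print(field_query({"name": escape_term("AC/DC")}))
```

Patterns are passed through unescaped on purpose; call `escape_term` only on the parts that come from untrusted or literal input.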

// This example drops the single-attribute full-text index.
nebula> DROP FULLTEXT INDEX fulltext_index_1;

Last update: October 24, 2023