Full-text indexes¶
Full-text indexes are used to do prefix, wildcard, regexp, and fuzzy search on a string property.
You can use the WHERE clause to specify the search strings in LOOKUP statements.
Prerequisite¶
Before using the full-text index, make sure that you have deployed an Elasticsearch cluster and a Listener cluster. For more information, see Deploy Elasticsearch and Deploy Listener.
Precaution¶
Before using the full-text index, make sure that you know the restrictions.
Natural language full-text search¶
A natural language search interprets the search string as a phrase in natural human language. The search is case-insensitive. By default, each substring (separated by spaces) is searched separately. For example, suppose there are three vertices with the tag player, and the tag player contains the property name. The names of these three vertices are Kevin Durant, Tim Duncan, and David Beckham. Once the full-text index on player.name is established, the prefix search statement LOOKUP ON player WHERE PREFIX(player.name,"d"); returns all three vertices, because each name contains a word that starts with "d".
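To make the word-by-word behavior concrete, the following Python sketch models the matching rule described above. It is only an illustration under stated assumptions: the real matching is performed by Elasticsearch, and the function name prefix_search and the whitespace-based word splitting are hypothetical stand-ins, not part of NebulaGraph.

```python
def prefix_search(names, prefix):
    """Return the names in which any space-separated word starts with
    `prefix`, ignoring case (a simplified model, not the real engine)."""
    needle = prefix.lower()
    return [
        name
        for name in names
        if any(word.startswith(needle) for word in name.lower().split())
    ]


# The three player names from the paragraph above.
players = ["Kevin Durant", "Tim Duncan", "David Beckham"]

matches = prefix_search(players, "d")
print(matches)
```

Under this model, "d" matches Durant, Duncan, and David, so all three names are returned even though only one of them begins with "D".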
Syntax¶
Create full-text indexes¶
CREATE FULLTEXT {TAG | EDGE} INDEX <index_name> ON {<tag_name> | <edge_name>} ([<prop_name_list>]);
Show full-text indexes¶
SHOW FULLTEXT INDEXES;
Rebuild full-text indexes¶
REBUILD FULLTEXT INDEX;
Drop full-text indexes¶
DROP FULLTEXT INDEX <index_name>;
Use query options¶
LOOKUP ON {<tag> | <edge_type>} WHERE <expression> [YIELD <return_list>];
<expression> ::=
PREFIX | WILDCARD | REGEXP | FUZZY
<return_list> ::=
<prop_name> [AS <prop_alias>] [, <prop_name> [AS <prop_alias>] ...]
- PREFIX(schema_name.prop_name, prefix_string, row_limit, timeout)
- WILDCARD(schema_name.prop_name, wildcard_string, row_limit, timeout)
- REGEXP(schema_name.prop_name, regexp_string, row_limit, timeout)
- FUZZY(schema_name.prop_name, fuzzy_string, fuzziness, operator, row_limit, timeout)
  - fuzziness (optional): Maximum edit distance allowed for matching. The default value is AUTO. For other valid values and more information, see the Elasticsearch documentation.
  - operator (optional): Boolean logic used to interpret the text. Valid values are OR (default) and AND.
- row_limit (optional): Specifies the number of rows to return. The default value is 100.
- timeout (optional): Specifies the timeout. The default value is 200ms.
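The interaction between fuzziness and operator can be sketched in Python. This is a simplified model, not the Elasticsearch implementation. It assumes the AUTO rule documented by Elasticsearch with its default thresholds (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two), plain Levenshtein distance, and whitespace-separated, lowercased words. The helper names edit_distance, auto_fuzziness, and fuzzy_match are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def auto_fuzziness(term):
    """Edits allowed for a term under the assumed AUTO rule."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(value, query, operator="OR"):
    """OR: at least one query word matches some word of `value`.
    AND: every query word must match some word of `value`."""
    words = value.lower().split()
    hits = [
        any(edit_distance(term, word) <= auto_fuzziness(term) for word in words)
        for term in query.lower().split()
    ]
    return any(hits) if operator == "OR" else all(hits)


print(edit_distance("dunncan", "duncan"))
print(fuzzy_match("Tim Duncan", "Tim Dunncan", "OR"))
print(fuzzy_match("Tony Parker", "Tim Dunncan", "OR"))
```

In this model the misspelled "Dunncan" is one edit away from "Duncan", which is within the two edits that AUTO allows for a seven-character term. With AND, a value must match every word of the search string; with OR, one matching word is enough.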
Examples¶
// This example creates the graph space.
nebula> CREATE SPACE IF NOT EXISTS basketballplayer (partition_num=3,replica_factor=1, vid_type=fixed_string(30));
// This example signs in the text service.
nebula> SIGN IN TEXT SERVICE (127.0.0.1:9200, HTTP);
// This example switches the graph space.
nebula> USE basketballplayer;
// This example adds the listener to the NebulaGraph cluster.
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.5:9789;
// This example creates the tag.
nebula> CREATE TAG IF NOT EXISTS player(name string, age int);
// This example creates the native index.
nebula> CREATE TAG INDEX IF NOT EXISTS name ON player(name(20));
// This example rebuilds the native index.
nebula> REBUILD TAG INDEX;
// This example creates the full-text index. The index name starts with "nebula".
nebula> CREATE FULLTEXT TAG INDEX nebula_index_1 ON player(name);
// This example rebuilds the full-text index.
nebula> REBUILD FULLTEXT INDEX;
// This example shows the full-text index.
nebula> SHOW FULLTEXT INDEXES;
+------------------+-------------+-------------+--------+
| Name | Schema Type | Schema Name | Fields |
+------------------+-------------+-------------+--------+
| "nebula_index_1" | "Tag" | "player" | "name" |
+------------------+-------------+-------------+--------+
// This example inserts the test data.
nebula> INSERT VERTEX player(name, age) VALUES \
"Russell Westbrook": ("Russell Westbrook", 30), \
"Chris Paul": ("Chris Paul", 33),\
"Boris Diaw": ("Boris Diaw", 36),\
"David West": ("David West", 38),\
"Danny Green": ("Danny Green", 31),\
"Tim Duncan": ("Tim Duncan", 42),\
"James Harden": ("James Harden", 29),\
"Tony Parker": ("Tony Parker", 36),\
"Aron Baynes": ("Aron Baynes", 32),\
"Ben Simmons": ("Ben Simmons", 22),\
"Blake Griffin": ("Blake Griffin", 30);
// These examples run test queries.
nebula> LOOKUP ON player WHERE PREFIX(player.name, "B") YIELD id(vertex);
+-----------------+
| id(VERTEX) |
+-----------------+
| "Boris Diaw" |
| "Ben Simmons" |
| "Blake Griffin" |
+-----------------+
nebula> LOOKUP ON player WHERE WILDCARD(player.name, "*ri*") YIELD player.name, player.age;
+-----------------+-----+
| name | age |
+-----------------+-----+
| "Chris Paul" | 33 |
| "Boris Diaw" | 36 |
| "Blake Griffin" | 30 |
+-----------------+-----+
nebula> LOOKUP ON player WHERE WILDCARD(player.name, "*ri*") | YIELD count(*);
+----------+
| count(*) |
+----------+
| 3 |
+----------+
nebula> LOOKUP ON player WHERE REGEXP(player.name, "R.*") YIELD player.name, player.age;
+---------------------+-----+
| name | age |
+---------------------+-----+
| "Russell Westbrook" | 30 |
+---------------------+-----+
nebula> LOOKUP ON player WHERE REGEXP(player.name, ".*") YIELD id(vertex);
+---------------------+
| id(VERTEX) |
+---------------------+
| "Danny Green" |
| "David West" |
...
nebula> LOOKUP ON player WHERE FUZZY(player.name, "Tim Dunncan", AUTO, OR) YIELD player.name;
+--------------+
| name |
+--------------+
| "Tim Duncan" |
+--------------+
// This example drops the full-text index.
nebula> DROP FULLTEXT INDEX nebula_index_1;
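The patterns used in the WILDCARD and REGEXP queries above can be tried locally before running them against the cluster. The Python sketch below is only an approximation: it assumes the pattern is applied to the whole property value, it uses fnmatchcase as a stand-in for wildcard matching (fnmatch also accepts [seq] character classes, which Elasticsearch wildcards do not), and it uses re.fullmatch because Elasticsearch regular expressions are anchored to the entire value. The list names simply repeats the test data inserted above.

```python
import re
from fnmatch import fnmatchcase

# Same test data as the INSERT VERTEX statement in the examples.
names = [
    "Russell Westbrook", "Chris Paul", "Boris Diaw", "David West",
    "Danny Green", "Tim Duncan", "James Harden", "Tony Parker",
    "Aron Baynes", "Ben Simmons", "Blake Griffin",
]

# "*ri*": any value that contains "ri" anywhere.
wildcard_hits = [n for n in names if fnmatchcase(n, "*ri*")]

# "R.*": the whole value must start with "R" (anchored match).
regexp_hits = [n for n in names if re.fullmatch(r"R.*", n)]

print(wildcard_hits)
print(regexp_hits)
```

For this data set, the sketch selects the same names as the WILDCARD(player.name, "*ri*") and REGEXP(player.name, "R.*") queries shown in the examples.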