Export data from NebulaGraph¶
This topic uses an example to illustrate how to use Exchange to export data from NebulaGraph to a CSV file.
Enterprise only
Only Exchange Enterprise Edition supports exporting data from NebulaGraph to a CSV file.
Note
SSL encryption is not supported when exporting data from NebulaGraph.
Preparation¶
This example was completed on a virtual machine running Linux. Prepare the following hardware and software before exporting data.
Hardware¶
Type | Information |
---|---|
CPU | 4 Intel(R) Xeon(R) Platinum 8260 CPU @ 2.30GHz |
Memory | 16G |
Hard disk | 50G |
System¶
CentOS 7.9.2009
Software¶
Name | Version |
---|---|
JDK | 1.8.0 |
Hadoop | 2.10.1 |
Scala | 2.12.11 |
Spark | 2.4.7 |
NebulaGraph | 3.0.1 |
Dataset¶
In this example, NebulaGraph is the data source and stores the basketballplayer dataset. Its schema elements are as follows.
Element | Name | Property |
---|---|---|
Tag | player | name string, age int |
Tag | team | name string |
Edge type | follow | degree int |
Edge type | serve | start_year int, end_year int |
Steps¶
-
Get the JAR file of Exchange Enterprise Edition from the NebulaGraph Enterprise Edition Package.
-
Modify the configuration file.
Exchange Enterprise Edition provides the configuration template export_application.conf for exporting NebulaGraph data. For details, see Exchange parameters. The core content of the configuration file used in this example is as follows:

...

  # Processing tags
  # There are tag config examples for different dataSources.
  tags: [
    # export NebulaGraph tag data to csv, only support export to CSV for now.
    {
      name: player
      type: {
        source: Nebula
        sink: CSV
      }
      # the path to save the NebulaGraph data, make sure the path doesn't exist.
      path:"hdfs://192.168.8.177:9000/vertex/player"
      # if no need to export any properties when export NebulaGraph tag data
      # if noField is configured true, just export vertexId
      noField:false
      # define properties to export from NebulaGraph tag data
      # if return.fields is configured as empty list, then export all properties
      return.fields:[]
      # nebula space partition number
      partition:10
    }

    ...

  ]

  # Processing edges
  # There are edge config examples for different dataSources.
  edges: [
    # export NebulaGraph edge data to csv, only support export to CSV for now.
    {
      name: follow
      type: {
        source: Nebula
        sink: CSV
      }
      # the path to save the NebulaGraph data, make sure the path doesn't exist.
      path:"hdfs://192.168.8.177:9000/edge/follow"
      # if no need to export any properties when export NebulaGraph edge data
      # if noField is configured true, just export src,dst,rank
      noField:false
      # define properties to export from NebulaGraph edge data
      # if return.fields is configured as empty list, then export all properties
      return.fields:[]
      # nebula space partition number
      partition:10
    }

    ...

  ]
}
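When several tags and edge types are exported, the per-element blocks in this example differ only in the name and the output path. The following Python sketch renders such blocks using only the keys shown in the configuration above. The helper name `render_export_block`, the `HDFS_BASE` constant, and the assumption that other tags and edge types follow the same `/vertex/<name>` and `/edge/<name>` path layout are illustrative and are not taken from the Exchange template.

```python
# Sketch: render per-tag / per-edge export blocks from the keys shown above.
# HDFS_BASE and the path layout for other elements are assumptions.
HDFS_BASE = "hdfs://192.168.8.177:9000"

BLOCK_TEMPLATE = """\
{{
  name: {name}
  type: {{
    source: Nebula
    sink: CSV
  }}
  path:"{path}"
  noField:false
  return.fields:[]
  partition:{partition}
}}"""


def render_export_block(name: str, kind: str, partition: int = 10) -> str:
    """Return one export block; kind is 'vertex' for tags or 'edge' for edge types."""
    if kind not in ("vertex", "edge"):
        raise ValueError(f"unknown kind: {kind}")
    path = f"{HDFS_BASE}/{kind}/{name}"
    return BLOCK_TEMPLATE.format(name=name, path=path, partition=partition)


if __name__ == "__main__":
    # Schema elements of the basketballplayer dataset used in this topic.
    for tag in ("player", "team"):
        print(render_export_block(tag, "vertex"))
    for edge in ("follow", "serve"):
        print(render_export_block(edge, "edge"))
```

The rendered text still has to be pasted into the `tags` and `edges` lists of a real configuration file by hand; the sketch does not read or write export_application.conf.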
-
Export data from NebulaGraph with the following command.
<spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-x.y.z.jar_path> -c <export_application.conf_path>
The command used in this example is as follows.
$ ./spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \
  ~/exchange-ent/nebula-exchange-ent-3.0.0.jar -c ~/exchange-ent/export_application.conf
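The configuration comments ask that the output paths do not exist yet, so it can help to check them before submitting the job. The Python sketch below assumes the `hadoop` CLI is on the `PATH` and relies on `hadoop fs -test -e`, which exits with status 0 when the path exists. The function names and the injectable `run` parameter are hypothetical and only serve the illustration.

```python
import subprocess
from typing import Callable, List


def path_exists(hdfs_path: str, run: Callable = subprocess.run) -> bool:
    """Return True if `hadoop fs -test -e` reports that hdfs_path exists."""
    result = run(["hadoop", "fs", "-test", "-e", hdfs_path])
    return result.returncode == 0


def build_submit_command(spark_submit: str, jar_path: str, conf_path: str) -> List[str]:
    """Build the spark-submit argument list shown in this topic."""
    return [
        spark_submit,
        "--master", "local",
        "--class", "com.vesoft.nebula.exchange.Exchange",
        jar_path,
        "-c", conf_path,
    ]


def export(paths, spark_submit, jar_path, conf_path, run=subprocess.run):
    """Submit the export job only if none of the target paths exists."""
    existing = [p for p in paths if path_exists(p, run)]
    if existing:
        raise RuntimeError(f"output paths already exist: {existing}")
    return run(build_submit_command(spark_submit, jar_path, conf_path))
```

Because the argument list is passed without a shell, a leading `~` in a path is not expanded; pass absolute paths or expand them first with `os.path.expanduser`.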
-
Check the exported data.
-
Check whether the CSV file is successfully generated under the target path.
$ hadoop fs -ls /vertex/player
Found 11 items
-rw-r--r--   3 nebula supergroup          0 2021-11-05 07:36 /vertex/player/_SUCCESS
-rw-r--r--   3 nebula supergroup        160 2021-11-05 07:36 /vertex/player/part-00000-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        163 2021-11-05 07:36 /vertex/player/part-00001-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        172 2021-11-05 07:36 /vertex/player/part-00002-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        172 2021-11-05 07:36 /vertex/player/part-00003-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        144 2021-11-05 07:36 /vertex/player/part-00004-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        173 2021-11-05 07:36 /vertex/player/part-00005-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        160 2021-11-05 07:36 /vertex/player/part-00006-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        148 2021-11-05 07:36 /vertex/player/part-00007-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        125 2021-11-05 07:36 /vertex/player/part-00008-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
-rw-r--r--   3 nebula supergroup        119 2021-11-05 07:36 /vertex/player/part-00009-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
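In this example the listing contains an empty `_SUCCESS` marker plus ten `part-*.csv` files, which matches `partition:10` in the configuration. Whether the number of part files always equals the configured partition number is not stated in this topic, so treat that as an observation rather than a guarantee. A minimal Python sketch for summarizing captured `hadoop fs -ls` output follows; the function name `summarize_listing` is hypothetical, and the parser assumes paths without spaces.

```python
def summarize_listing(ls_output: str) -> dict:
    """Summarize `hadoop fs -ls` output: _SUCCESS marker and CSV part files."""
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        # Entry lines start with a permission string such as -rw-r--r-- and
        # have eight columns; the "Found N items" header is skipped.
        if len(fields) >= 8 and fields[0][0] in "-d":
            paths.append(fields[-1])
    names = [p.rsplit("/", 1)[-1] for p in paths]
    return {
        "success": "_SUCCESS" in names,
        "part_files": sorted(
            n for n in names if n.startswith("part-") and n.endswith(".csv")
        ),
    }
```

A caller could, for instance, compare `len(summary["part_files"])` with the expected count and check that `summary["success"]` is true before reading the files.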
-
Check the contents of the CSV file to ensure that the data export is successful.
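One way to inspect the contents is to copy the export directory to the local disk, for example with `hadoop fs -get /vertex/player ./player`, and count the rows in each part file. The Python sketch below does only that. The function name `count_exported_rows` and the local directory name are assumptions, the counts include a header line if the files contain one, and no particular column layout is assumed.

```python
import csv
from pathlib import Path


def count_exported_rows(local_dir: str) -> dict:
    """Count CSV rows in each part file of a locally downloaded export directory."""
    counts = {}
    for part in sorted(Path(local_dir).glob("part-*.csv")):
        with part.open(newline="", encoding="utf-8") as handle:
            counts[part.name] = sum(1 for _ in csv.reader(handle))
    return counts


if __name__ == "__main__":
    # "./player" is the hypothetical local copy of /vertex/player.
    for name, rows in count_exported_rows("./player").items():
        print(f"{name}: {rows} rows")
```

Comparing the summed row count with the number of vertices or edges expected in the space gives a quick sanity check on the export.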
-