Export data from NebulaGraph

This topic uses an example to illustrate how to use Exchange to export data from NebulaGraph to a CSV file.

Enterprise only

Only Exchange Enterprise Edition supports exporting data from NebulaGraph to a CSV file.

Note

SSL encryption is not supported when exporting data from NebulaGraph.

Preparation

This example is completed on a virtual machine running Linux. The hardware and software you need to prepare before exporting data are as follows.

Hardware

| Type      | Information                                    |
| --------- | ---------------------------------------------- |
| CPU       | 4 Intel(R) Xeon(R) Platinum 8260 CPU @ 2.30GHz |
| Memory    | 16 GB                                          |
| Hard disk | 50 GB                                          |

System

CentOS 7.9.2009

Software

| Name        | Version |
| ----------- | ------- |
| JDK         | 1.8.0   |
| Hadoop      | 2.10.1  |
| Scala       | 2.12.11 |
| Spark       | 2.4.7   |
| NebulaGraph | 3.1.0   |

Dataset

In this example, NebulaGraph is the data source and stores the basketballplayer dataset. The Schema elements of the dataset are shown below.

| Element   | Name   | Property                     |
| --------- | ------ | ---------------------------- |
| Tag       | player | name string, age int         |
| Tag       | team   | name string                  |
| Edge type | follow | degree int                   |
| Edge type | serve  | start_year int, end_year int |
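
For reference, the Schema above corresponds roughly to the following statements. This is a minimal sketch: the space name matches the dataset, but the partition number, replica factor, and VID type are assumptions based on the standard basketballplayer quick-start dataset and may differ in your environment.

    CREATE SPACE basketballplayer(partition_num=10, replica_factor=1, vid_type=fixed_string(30));
    USE basketballplayer;
    CREATE TAG player(name string, age int);
    CREATE TAG team(name string);
    CREATE EDGE follow(degree int);
    CREATE EDGE serve(start_year int, end_year int);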

Steps

  1. Get the JAR file of Exchange Enterprise Edition from the NebulaGraph Enterprise Edition Package.

  2. Modify the configuration file.

    Exchange Enterprise Edition provides the configuration template export_application.conf for exporting NebulaGraph data. For details, see Exchange parameters. The core content of the configuration file used in this example is as follows:

    ...
    
      # Processing tags
      # There are tag config examples for different dataSources.
      tags: [
        # Export NebulaGraph tag data to CSV. Only exporting to CSV is supported for now.
        {
          name: player
          type: {
            source: Nebula
            sink: CSV
          }
          # The path to save the NebulaGraph data. Make sure the path does not exist.
          path:"hdfs://192.168.8.177:9000/vertex/player"
          # Whether to skip exporting properties of the tag data.
          # If noField is set to true, only the vertex IDs are exported.
          noField:false
          # The properties to export from the NebulaGraph tag data.
          # If return.fields is an empty list, all properties are exported.
          return.fields:[]
          # The partition number of the NebulaGraph space.
          partition:10
        }
    
    ...
    
      ]
    
      # Processing edges
      # There are edge config examples for different dataSources.
      edges: [
        # export NebulaGraph edge data to CSV. Only exporting to CSV is supported for now.
        {
          name: follow
          type: {
            source: Nebula
            sink: CSV
          }
          # The path to save the NebulaGraph data. Make sure the path does not exist.
          path:"hdfs://192.168.8.177:9000/edge/follow"
          # Whether to skip exporting properties of the edge data.
          # If noField is set to true, only the source vertex ID, destination vertex ID, and rank are exported.
          noField:false
          # The properties to export from the NebulaGraph edge data.
          # If return.fields is an empty list, all properties are exported.
          return.fields:[]
          # The partition number of the NebulaGraph space.
          partition:10
        }
    
    ...
    
      ]
    }
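
    Because the target paths must not exist before the export (see the path comments above), you can optionally verify this on HDFS first. This is a minimal sketch using standard hadoop fs commands and the paths configured in this example:

      $ hadoop fs -test -e /vertex/player && echo "/vertex/player already exists, choose another path"
      $ hadoop fs -test -e /edge/follow && echo "/edge/follow already exists, choose another path"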
    
  3. Export data from NebulaGraph with the following command.

    <spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-x.y.z.jar_path> -c <export_application.conf_path>
    

    The command used in this example is as follows.

    $ ./spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange \
      ~/exchange-ent/nebula-exchange-ent-3.0.0.jar -c ~/exchange-ent/export_application.conf
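
    The example above runs Spark in local mode. To submit the job to an existing Spark cluster instead, only the --master option needs to change. The following is a sketch that assumes a standalone Spark master reachable at <master_ip>:7077 and that the JAR and configuration file are readable by the driver:

    $ ./spark-submit --master "spark://<master_ip>:7077" --class com.vesoft.nebula.exchange.Exchange \
      ~/exchange-ent/nebula-exchange-ent-3.0.0.jar -c ~/exchange-ent/export_application.conf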
    
  4. Check the exported data.

    1. Check whether the CSV file is successfully generated under the target path.

      $ hadoop fs -ls /vertex/player
      Found 11 items
      -rw-r--r--   3 nebula supergroup          0 2021-11-05 07:36 /vertex/player/_SUCCESS
      -rw-r--r--   3 nebula supergroup        160 2021-11-05 07:36 /vertex/player/part-00000-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        163 2021-11-05 07:36 /vertex/player/part-00001-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        172 2021-11-05 07:36 /vertex/player/part-00002-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        172 2021-11-05 07:36 /vertex/player/part-00003-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        144 2021-11-05 07:36 /vertex/player/part-00004-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        173 2021-11-05 07:36 /vertex/player/part-00005-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        160 2021-11-05 07:36 /vertex/player/part-00006-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        148 2021-11-05 07:36 /vertex/player/part-00007-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        125 2021-11-05 07:36 /vertex/player/part-00008-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      -rw-r--r--   3 nebula supergroup        119 2021-11-05 07:36 /vertex/player/part-00009-17293020-ba2e-4243-b834-34495c0536b3-c000.csv
      
    2. Check the contents of the CSV file to ensure that the data export is successful.
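
      For example, you can print the beginning of the generated files directly from HDFS. This is a minimal sketch using standard hadoop fs commands; adjust the paths to the ones configured in export_application.conf:

      $ hadoop fs -cat "/vertex/player/part-*.csv" | head -n 10
      $ hadoop fs -cat "/edge/follow/part-*.csv" | head -n 10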

