Options for import

After editing the configuration file, run the following commands to import the specified source data into the NebulaGraph database.

First import

<spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path>

Import the reload file

If some data fails to be imported during the first import, the failed data will be stored in the reload file. Use the parameter -r to import the reload file.

<spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> -r "<reload_file_path>"

Note

The version number in the JAR file name depends on the JAR file that you actually compiled. Replace 2.x.y in the commands above accordingly.
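
The two steps above (first import, then import of the reload file) can be chained in a small wrapper. The following is a minimal sketch, not part of Exchange: the function name `import_with_reload` and the `SPARK_SUBMIT` variable are hypothetical, and it assumes that the reload path is on the local file system so that its existence can be tested with `[ -e ]`.

```shell
# Sketch only: run the first import, then import the reload file if one exists.
# import_with_reload and SPARK_SUBMIT are hypothetical names, not part of Exchange.
import_with_reload() {
  jar="$1"      # path to the compiled nebula-exchange JAR
  conf="$2"     # path to application.conf
  reload="$3"   # path of the reload file (assumed to be a local path)
  submit="${SPARK_SUBMIT:-spark-submit}"

  # First import.
  "$submit" --master "local" --class com.vesoft.nebula.exchange.Exchange \
    "$jar" -c "$conf"

  # Second pass with -r, only when the reload path actually exists.
  if [ -e "$reload" ]; then
    "$submit" --master "local" --class com.vesoft.nebula.exchange.Exchange \
      "$jar" -c "$conf" -r "$reload"
  fi
}
```

A call would look like `import_with_reload <nebula-exchange-2.x.y.jar_path> <application.conf_path> "<reload_file_path>"`. Set `SPARK_SUBMIT` to `<spark_install_path>/bin/spark-submit` if `spark-submit` is not on the `PATH`.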

FAQ

If you submit a job in yarn-cluster mode, use the following command as a reference. Note in particular the two '--conf' options in the example.

$SPARK_HOME/bin/spark-submit --master yarn-cluster \
    --class com.vesoft.nebula.exchange.Exchange \
    --files application.conf \
    --conf spark.driver.extraClassPath=./ \
    --conf spark.executor.extraClassPath=./ \
    nebula-exchange-3.5.0.jar \
    -c application.conf

The following table lists command parameters.

Parameter	Required	Default value	Description
`--class`	Yes	-	Specify the main class of the driver.
`--master`	Yes	-	Specify the URL of the master process in a Spark cluster. For more information, see the master URLs section of the Spark documentation.
`-c` / `--config`	Yes	-	Specify the path of the configuration file.
`-h` / `--hive`	No	`false`	Specify whether to enable support for importing Hive data.
`-D` / `--dry`	No	`false`	Check only whether the format of the configuration file meets the requirements. It does not check whether the `tags` and `edges` configuration items are correct. Do not add this parameter when you actually import data.
`-r` / `--reload`	No	-	Specify the path of the reload file to import.

For more Spark parameter configurations, see Spark Configuration.
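
Because `-D` only checks the configuration file and must not be present during a real import, one possible workflow is to run the check first and start the import only if the check passes. The sketch below illustrates this under stated assumptions: `check_then_import` and `SPARK_SUBMIT` are hypothetical names, and the script assumes that a failed check makes `spark-submit` exit with a non-zero status, which this page does not confirm.

```shell
# Sketch only: validate the configuration with -D, then import without -D.
# check_then_import and SPARK_SUBMIT are hypothetical names, not part of Exchange.
check_then_import() {
  jar="$1"    # path to the compiled nebula-exchange JAR
  conf="$2"   # path to application.conf
  submit="${SPARK_SUBMIT:-spark-submit}"

  # Dry run. Assumption: a failed check returns a non-zero exit status.
  "$submit" --master "local" --class com.vesoft.nebula.exchange.Exchange \
    "$jar" -c "$conf" -D || {
    echo "configuration check failed; import skipped" >&2
    return 1
  }

  # Real import: same command, without -D.
  "$submit" --master "local" --class com.vesoft.nebula.exchange.Exchange \
    "$jar" -c "$conf"
}
```

A call would look like `check_then_import <nebula-exchange-2.x.y.jar_path> <application.conf_path>`. Keep in mind that a passing check does not validate the `tags` and `edges` items, so the import itself can still fail.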

Last update: December 21, 2022