Options for import

After editing the configuration file, run one of the following commands to import the specified source data into the NebulaGraph database.

  • First import

    <spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> 
    
  • Import the reload file

    If some data fails to be imported during the first import, the failed data is stored in the reload file. Use the -r parameter to import the reload file.

    <spark_install_path>/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange <nebula-exchange-2.x.y.jar_path> -c <application.conf_path> -r "<reload_file_path>" 
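
    The two steps above can be chained in a small wrapper. The following is only a sketch under stated assumptions: the default paths (/opt/spark, /tmp/nebula-reload) and the helper name next_step are hypothetical, and it assumes the reload path exists only when some data failed to import. It prints the reload command for review instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical locations; replace them with the paths used in your environment.
SPARK_INSTALL_PATH="${SPARK_INSTALL_PATH:-/opt/spark}"
EXCHANGE_JAR="${EXCHANGE_JAR:-nebula-exchange-2.x.y.jar}"
APP_CONF="${APP_CONF:-application.conf}"
RELOAD_PATH="${RELOAD_PATH:-/tmp/nebula-reload}"

# next_step (hypothetical helper): print the reload command only when the
# reload path exists. Assumption: the path is present only after a failed import.
next_step() {
  if [ -e "$1" ]; then
    echo "${SPARK_INSTALL_PATH}/bin/spark-submit --master local" \
         "--class com.vesoft.nebula.exchange.Exchange ${EXCHANGE_JAR}" \
         "-c ${APP_CONF} -r \"$1\""
  else
    echo "nothing to reload: $1 not found"
  fi
}

next_step "${RELOAD_PATH}"
```

    Review the printed command, then run it manually once the first import has finished.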
    

Note

The version number in the JAR file name depends on the JAR file that you actually compiled. Replace 2.x.y with that version.

Note

To submit a job in yarn-cluster mode, use the following command:

$SPARK_HOME/bin/spark-submit --master yarn-cluster \
--class com.vesoft.nebula.exchange.Exchange \
--files application.conf \
--conf spark.driver.extraClassPath=./ \
--conf spark.executor.extraClassPath=./ \
nebula-exchange-3.0.0.jar \
-c application.conf

The following table lists the command parameters.

Parameter     | Required | Default value | Description
--class       | Yes      | -             | Specify the main class of the driver.
--master      | Yes      | -             | Specify the URL of the master process in a Spark cluster. For more information, see master-urls.
-c / --config | Yes      | -             | Specify the path of the configuration file.
-h / --hive   | No       | false         | Indicate support for importing Hive data.
-D / --dry    | No       | false         | Check whether the format of the configuration file meets the requirements. It does not check whether the configuration items of tags and edges are correct. Do not add this parameter when importing data.
-r / --reload | No       | -             | Specify the path of the reload file that needs to be reloaded.
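
As an illustration of how the Exchange parameters combine, the sketch below assembles a dry-run command (-D) and a separate import command from the same settings. The default paths and the helper name exchange_cmd are hypothetical, not part of Exchange. The script only prints the commands, so it can be tried without a Spark installation.

```shell
#!/usr/bin/env bash
# Hypothetical defaults; replace them with your own paths.
SPARK_INSTALL_PATH="${SPARK_INSTALL_PATH:-/opt/spark}"
EXCHANGE_JAR="${EXCHANGE_JAR:-nebula-exchange-2.x.y.jar}"
APP_CONF="${APP_CONF:-application.conf}"

# exchange_cmd (hypothetical helper): print a spark-submit command line.
# Any extra Exchange parameters (for example -D or -h) are appended after -c.
exchange_cmd() {
  printf '%s ' \
    "${SPARK_INSTALL_PATH}/bin/spark-submit" \
    --master local \
    --class com.vesoft.nebula.exchange.Exchange \
    "${EXCHANGE_JAR}" \
    -c "${APP_CONF}" \
    "$@"
  printf '\n'
}

# Step 1: check the format of the configuration file only (no data is imported).
exchange_cmd -D

# Step 2: the actual import, without -D.
exchange_cmd
```

Keeping the dry run and the import as two separate invocations follows the rule in the table that -D cannot be added when importing data.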

For more Spark parameter configurations, see Spark Configuration.


Last update: February 1, 2023