Skip to content

Storage Service configurations

Nebula Graph provides two initial configuration files for the Storage Service: nebula-storaged.conf.default and nebula-storaged.conf.production. You can use them in different scenarios. The default file path is /usr/local/nebula/etc/.

NOTE: Raft Listener is different from the Storage Service. For more information, see Raft Listener.

How to use the configuration files

The Storage Service gets its configuration from the nebula-storaged.conf file. You have to remove the suffix .default or .production from an initial configuration file for the Storage Service to apply the configuration defined in it.

If you have modified the configuration in the file and want the new configuration to take effect, add --local_conf=true at the top of the file. Otherwise, Nebula Graph reads the cached configuration.

About parameter values

If a parameter is not set in the configuration file, Nebula Graph uses its default value.

NOTE: The default value of a parameter in Nebula Graph may be different from the predefined value in the .default and .production files.

The predefined parameter in nebula-storaged.conf.default and nebula-storaged.conf.production are different. And not all parameters are predefined. This topic uses the parameters in nebula-storaged.conf.default.

Basic configurations

Name Predefine Value Descriptions
daemonize true When set to true, the process is a daemon process.
pid_file pids/nebula-storaged.pid File to host the process ID.
timezone_name - Specifies the Nebula Graph time zone. This parameter is not predefined in the initial configuration files. You can manually set it if you need it. The system default value is UTC+00:00:00. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=CST-8 represents the GMT+8 time zone.

NOTE:

  • While inserting time-type property values except timestamps, Nebula Graph transforms them to a UTC time according to the time zone specified with the timezone_name parameter in the configuration files. The time-type values returned by nGQL queries are all UTC time.
  • timezone_name is only used to transform the data stored in Nebula Graph. Other time-related data of the Nebula Graph processes still uses the default time zone of the host, such as the log printing time.

Logging configurations

Name Predefine Value Descriptions
log_dir logs Directory to the Storage Service log. We recommend that you put logs on a different hard disk from the data_path.
minloglevel 0 Specifies the minimum log level. Available values are 0-3. 0, 1, 2, and 3 are INFO, WARNING, ERROR, and FATAL. We suggest that you set minloglevel to 0 for debug, 1 for production. When you set it to 4, Nebula Graph does not print any logs.
v 0 Specifies the verbose log level. Available values are 0-4. The larger the value, the more verbose the log.
logbufsecs 0 Specifies the maximum time to buffer the logs. The configuration is measured in seconds.
redirect_stdout true When set to true, stdout and stderr are redirected.
stdout_log_file storaged-stdout.log Specifies the filename for the stdout log.
stderr_log_file storaged-stderr.log Specifies the filename for the stderr log.
stderrthreshold 2 Specifies the minimum level to copy the log messages to stderr. Available values are 0-3. 0, 1, 2, and 3 are INFO, WARNING, ERROR, and FATAL.

Networking configurations

Name Predefine Value Descriptions
meta_server_addrs 127.0.0.1:9559 Specifies the IP addresses and ports of all Meta Services. Separate multiple addresses with commas.
local_ip 127.0.0.1 Specifies the local IP for the Storage Service.
port 9779 Specifies RPC daemon listening port. The external port for Storage Service is predefined to 9779. The internal ports are predefined to port -2, port -1, and port + 1, i.e., 9777, 9778, and 9780. Nebula Graph uses the internal ports for multi-replica interactions.
ws_ip 0.0.0.0 Specifies the IP address for the HTTP service.
ws_http_port 19779 Specifies the port for the HTTP service.
ws_h2_port 19780 Specifies the port for the HTTP2 service.
heartbeat_interval_secs 10 Specifies the default heartbeat interval in seconds. Make sure the heartbeat_interval_secs values for all services are the same, otherwise Nebula Graph CANNOT work normally.

NOTE: We recommend that you use the real IP address in your configuration because sometimes 127.0.0.1 can not be parsed correctly.

Raft configurations

Name Predefine Value Descriptions
raft_heartbeat_interval_secs 30 Specifies the timeout for the Raft election. The configuration is measured in seconds.
raft_rpc_timeout_ms 500 Specifies the timeout for the Raft RPC. The configuration is measured in milliseconds.
wal_ttl 14400 Specifies the recycle RAFT wal time. The configuration is measured in seconds.

Disk configurations

Name Predefine Value Descriptions
data_path data/storage Specifies the root data path. Separate multiple paths with commas.
rocksdb_batch_size 4096 Specifies the block cache for a batch operation. The configuration is measured in bytes.
rocksdb_block_cache 4 Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes.
engine_type rocksdb Specifies the engine type.
rocksdb_compression lz4 Specifies the compression algorithm for RocksDB. Available values are no, snappy, lz4, lz4hc, zlib, bzip2, and zstd.
rocksdb_compression_per_level \ Specifies compression for each level.
enable_rocksdb_statistics false When set to false, RocksDB statistics is disabled.
rocksdb_stats_level kExceptHistogramOrTimers Specifies the stats level for RocksDB. Available values are kExceptHistogramOrTimers, kExceptTimers, kExceptDetailedTimers, kExceptTimeForMutex, and kAll.
enable_rocksdb_prefix_filtering false When set to true, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter reduces memory usage.
rocksdb_filtering_prefix_length 12 Specifies the prefix length for each key. Available values are 12 and 16.

RocksDB options

The format of the RocksDB options is {"<option_name>":"<option_value>"}. Multiple options are separated with commas.

Name Predefine Value Descriptions
rocksdb_db_options {} Specifies the RocksDB options.
rocksdb_column_family_options {"write_buffer_size":"67108864",
"max_write_buffer_number":"4",
"max_bytes_for_level_base":"268435456"}
Specifies the RocksDB column family options.
rocksdb_block_based_table_options {"block_size":"8192"} Specifies the RocksDB block based table options.

Available rocksdb_db_options and rocksdb_column_family_options are listed as follows.

  • rocksdb_db_options
    max_total_wal_size
    delete_obsolete_files_period_micros
    max_background_jobs
    stats_dump_period_sec
    compaction_readahead_size
    writable_file_max_buffer_size
    bytes_per_sync
    wal_bytes_per_sync
    delayed_write_rate
    avoid_flush_during_shutdown
    max_open_files
    stats_persist_period_sec
    stats_history_buffer_size
    strict_bytes_per_sync
    enable_rocksdb_prefix_filtering
    enable_rocksdb_whole_key_filtering
    rocksdb_filtering_prefix_length
    num_compaction_threads
    rate_limit
    
  • rocksdb_column_family_options

    write_buffer_size
    max_write_buffer_number
    level0_file_num_compaction_trigger
    level0_slowdown_writes_trigger
    level0_stop_writes_trigger
    target_file_size_base
    target_file_size_multiplier
    max_bytes_for_level_base
    max_bytes_for_level_multiplier
    disable_auto_compactions 
    
    For more information about RocksDB configuration, see RocksDB official documentation

For super-Large vertices

For super vertex with a large number of edges, currently there are two truncation strategies:

  1. Truncate directly. Set the enable_reservoir_sampling parameter to false. A certain number of edges specified in the Max_edge_returned_per_vertex parameter are truncated by default.

  2. Truncate with the reservoir sampling algorithm. Based on the algorithm, a certain number of edges specified in the Max_edge_returned_per_vertex parameter are truncated with equal probability from the total n edges. Equal probability sampling is useful in some business scenarios. However, the performance is affected compared to direct truncation due to the probability calculation.

Storage configuration for large dataset

When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter parameter to true. For example, 100 vertices + 100 edges require 300 key-values. Each key takes 10bit in memory. Then you can calculate your own memory usage.


Last update: April 13, 2021
Back to top