Storage Configurations

This document introduces the storaged configuration file. The default directory is /usr/local/nebula/etc/. If you have customized your Nebula Graph installation directory, your configuration file path is $pwd/nebula/etc/.

  • The *.default files are used for daily debugging. When you start the services, they are the default configuration files.
  • The *.production files are used for the recommended production. When they are used for production, the .production suffix must be removed.

Basic Configurations

Property Default Value Default Value
daemonize true Run as daemon thread
pid_file "pids/nebula-metad.pid" File to hold the process ID.

Logging Configurations

Property Default Value Descriptions Dynamic Modification
log_dir logs (i.e. /usr/local/nebula/logs) Directory to storaged log. It is recommended to put it on a different hard disk from data_path.
minloglevel 0 The corresponding log levels are INFO(DEBUG), WARNING, ERROR and FATAL. Usually specified as 0 in debug, 1 in production. The minloglevel to 4 prints no logs. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.
v 0 0-4: when minloglevel is set to 0, you can further set the severity level of the debug log. The larger the value, the more detailed the log. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.
slow_op_threshhold_ms 50 (ms) default threshhold for slow operation Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.

For example, change the storage log level to v=1 with the following command.

nebula> UPDATE CONFIGS storage:v=1;

Networking Configurations

Property Default Value Descriptions Dynamic Modification
meta_server_addrs "127.0.0.1:45500" List of meta server addresses, the format looks like ip1:port1, ip2:port2, ip3:port3.
port 44500 RPC daemon's listen port. The external port for the Storage service is 44500. The internal port+1, namely 44501, is used for the multi-replica interactions.
reuse_port true Whether to turn on the SO_REUSEPORT option.
ws_http_port 12000 HTTP Protocol daemon port. (For internal use))
ws_h2_port 12002 HTTP/2 Protocol daemon port. (For internal use)
ws_ip "127.0.0.1" web service to bind to
heartbeat_interval_secs 10 (seconds) Seconds between each heartbeat. The same as the parameter in the nebula-storage.conf file. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.
raft_heartbeat_interval_secs 5 (seconds) RAFT seconds between each heartbeat. Modify the configuration file and restart service.
raft_rpc_timeout_ms 500 (ms) RPC timeout for raft client. Modify the configuration file and restart service.

NOTE: We recommend you using the actual IP in the meta_server_addrs parameter because sometimes 127.0.0.1 will not be parsed correctly.

Data persistence setting for storage

Property Default Value Descriptions
data_path data/storage (i.e. /usr/local/nebula/data/storage/) The root directory for the local data persistence. If multiple directories exist, use commas to separate the directories. For RocksDB engine, one path one instance.
auto_remove_invalid_space false Whether to remove data from a deleted graph space when restarting the services.

Separate directories when using multiple hard disks. Each directory corresponds to a RocksDB instance for better concurrency. For example:

--data_path=/disk1/storage/,/disk2/storage/,/disk3/storage/

RocksDB Options

Property Default Value Descriptions Dynamic Modification
rocksdb_batch_size 4096 (B) Batch Write
rocksdb_block_cache 1024 (MB) block cache siez. Suggest set to 1/3 of the machine memory
rocksdb_disable_wal true Whether to disable the WAL in RocksDB.
wal_ttl 14400 (seconds) RAFT wal time Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.
rocksdb_db_options {} jJson string of DBOptions, all keys and values are string. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately. Overwrite all json
rocksdb_column_family_options {} Json string of ColumnFamilyOptions, all keys and values are string. Details see below. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately. Overwrite all json
rocksdb_block_based_table_options {} Json string of BlockBasedTableOptions, all keys and values are string. Details see below. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately. Overwrite all json

rocksdb_db_options

max_total_wal_size
delete_obsolete_files_period_micros
max_background_jobs
stats_dump_period_sec
compaction_readahead_size
writable_file_max_buffer_size
bytes_per_sync
wal_bytes_per_sync
delayed_write_rate
avoid_flush_during_shutdown
max_open_files
stats_persist_period_sec
stats_history_buffer_size
strict_bytes_per_sync
enable_rocksdb_prefix_filtering
enable_rocksdb_whole_key_filtering
rocksdb_filtering_prefix_length
num_compaction_threads
rate_limit

The parameters above can either be dynamically modified by the UPDATE CONFIGS syntax, or written in the local configuration file. Please refer to the RocksDB manual for specific functions and whether restarting is needed.

rocksdb_column_family_options

write_buffer_size
max_write_buffer_number
level0_file_num_compaction_trigger
level0_slowdown_writes_trigger
level0_stop_writes_trigger
target_file_size_base
target_file_size_multiplier
max_bytes_for_level_base
max_bytes_for_level_multiplier
disable_auto_compactions       -- Compact automatically when writing data is stopped, default value is false. Dynamic modification takes effect immediately.

The preceding parameters can either be dynamically modified by UPDATE CONFIGS syntax, or written in the local configuration file. Please refer to the RocksDB manual for specific functions and whether restarting is needed.

The preceding parameters can be set via the command line as follows:

nebula> UPDATE CONFIGS storage:rocksdb_column_family_options = \
        { disable_auto_compactions = false, level0_file_num_compaction_trigger = 10 };
        -- The command overwrites rocksdb_column_family_options. Please note whether other sub-items will be overwritten

nebula> UPDATE CONFIGS storage:rocksdb_db_options  = \
        { max_subcompactions = 10, max_background_jobs = 10};
nebula> UPDATE CONFIGS storage:max_edge_returned_per_vertex = 10; -- The parameter is explained below

We recommend the following configuration:

    rocksdb_db_options = {"stats_dump_period_sec":"200", "write_thread_max_yield_usec":"600"}
    rocksdb_column_family_options = {"max_write_buffer_number":"4", "min_write_buffer_number_to_merge":"2", "max_write_buffer_number_to_maintain":"1"}
    rocksdb_block_based_table_options = {"block_restart_interval":"2"}

Description on Super-Large Vertices

For super vertex with a large number of edges, currently there are two truncation strategies:

  1. Truncate directly. Set the enable_reservoir_sampling parameter to false. A certain number of edges specified in the Max_edge_returned_per_vertex parameter are truncated by default.

  2. Truncate with the reservoir sampling algorithm. Based on the algorithm, a certain number of edges specified in the Max_edge_returned_per_vertex parameter are truncated with equal probability from the total n edges. Equal probability sampling is useful in some business scenarios. However, the performance is effected compared to direct truncation due to the probability calculation.

For example:

nebula> UPDATE CONFIGS storage:enable_reservoir_sampling = false;

Truncating Directly

Property Default Value Descriptions Dynamic Modification
max_edge_returned_per_vertex 2147483647 The max returned edges of each super-large vertex. The excess edges are truncated and not returned.
Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.

Reservoir Sampling Truncation

Property Default Value Descriptions Dynamic Modification
enable_reservoir_sampling false Truncated with equal probability from the total n edges. Modified with UPDATE CONFIGS syntax. The modification takes effect immediately.

Storage Configuration When You Have a Lot of Data

If you have a lot of data (in the RocksDB directory) and your memory is tight, we recommend you setting the enable_partitioned_index_filter parameter in the storage configuration to true. For example, 100 vertices + 100 edges occupies 300 keys. Each key is 10bit or 3k. Then you can calculate your own data storage.