Storage Service configurations¶

Nebula Graph provides two initial configuration files for the Storage Service: nebula-storaged.conf.default and nebula-storaged.conf.production. You can use them in different scenarios. The default file path is /usr/local/nebula/etc/.

NOTE: Raft Listener is different from the Storage Service. For more information, see Raft Listener.

How to use the configuration files¶

The Storage Service gets its configuration from the nebula-storaged.conf file. You have to remove the suffix .default or .production from an initial configuration file for the Storage Service to apply the configuration defined in it.

If you have modified the configuration in the file and want the new configuration to take effect, add --local_conf=true at the top of the file. Otherwise, Nebula Graph reads the cached configuration.

About parameter values¶

If a parameter is not set in the configuration file, Nebula Graph uses its default value.

NOTE: The default value of a parameter in Nebula Graph may be different from the predefined value in the .default and .production files.

The predefined parameter in nebula-storaged.conf.default and nebula-storaged.conf.production are different. And not all parameters are predefined. This topic uses the parameters in nebula-storaged.conf.default.

Basic configurations¶

Name	Predefine Value	Descriptions
`daemonize`	`true`	When set to `true`, the process is a daemon process.
`pid_file`	`pids/nebula-storaged.pid`	File to host the process ID.
`timezone_name`	-	Specifies the Nebula Graph time zone. This parameter is not predefined in the initial configuration files. You can manually set it if you need it. The system default value is `UTC+00:00:00`. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, `--timezone_name=CST-8` represents the GMT+8 time zone.

NOTE:

While inserting time-type property values except timestamps, Nebula Graph transforms them to a UTC time according to the time zone specified with the timezone_name parameter in the configuration files. The time-type values returned by nGQL queries are all UTC time.

timezone_name is only used to transform the data stored in Nebula Graph. Other time-related data of the Nebula Graph processes still uses the default time zone of the host, such as the log printing time.

Logging configurations¶

Name	Predefine Value	Descriptions
`log_dir`	`logs`	Directory to the Storage Service log. We recommend that you put logs on a different hard disk from the `data_path`.
`minloglevel`	`0`	Specifies the minimum log level. Available values are 0-3. `0`, `1`, `2`, and `3` are `INFO`, `WARNING`, `ERROR`, and `FATAL`. We suggest that you set `minloglevel` to `0` for debug, `1` for production. When you set it to `4`, Nebula Graph does not print any logs.
`v`	`0`	Specifies the verbose log level. Available values are 0-4. The larger the value, the more verbose the log.
`logbufsecs`	`0`	Specifies the maximum time to buffer the logs. The configuration is measured in seconds.
`redirect_stdout`	`true`	When set to `true`, stdout and stderr are redirected.
`stdout_log_file`	`storaged-stdout.log`	Specifies the filename for the stdout log.
`stderr_log_file`	`storaged-stderr.log`	Specifies the filename for the stderr log.
`stderrthreshold`	`2`	Specifies the minimum level to copy the log messages to stderr. Available values are 0-3. `0`, `1`, `2`, and `3` are `INFO`, `WARNING`, `ERROR`, and `FATAL`.

Networking configurations¶

Name	Predefine Value	Descriptions
`meta_server_addrs`	`127.0.0.1:9559`	Specifies the IP addresses and ports of all Meta Services. Separate multiple addresses with commas.
`local_ip`	`127.0.0.1`	Specifies the local IP for the Storage Service.
`port`	`9779`	Specifies RPC daemon listening port. The external port for Storage Service is predefined to `9779`. The internal ports are predefined to `port -2`, `port -1`, and `port + 1`, i.e., `9777`, `9778`, and `9780`. Nebula Graph uses the internal ports for multi-replica interactions.
`ws_ip`	`0.0.0.0`	Specifies the IP address for the HTTP service.
`ws_http_port`	`19779`	Specifies the port for the HTTP service.
`ws_h2_port`	`19780`	Specifies the port for the HTTP2 service.
`heartbeat_interval_secs`	`10`	Specifies the default heartbeat interval in seconds. Make sure the `heartbeat_interval_secs` values for all services are the same, otherwise Nebula Graph CANNOT work normally.

NOTE: We recommend that you use the real IP address in your configuration because sometimes 127.0.0.1 can not be parsed correctly.

Raft configurations¶

Name	Predefine Value	Descriptions
`raft_heartbeat_interval_secs`	`30`	Specifies the timeout for the Raft election. The configuration is measured in seconds.
`raft_rpc_timeout_ms`	`500`	Specifies the timeout for the Raft RPC. The configuration is measured in milliseconds.
`wal_ttl`	`14400`	Specifies the recycle RAFT wal time. The configuration is measured in seconds.

Disk configurations¶

Name	Predefine Value	Descriptions
`data_path`	`data/storage`	Specifies the root data path. Separate multiple paths with commas.
`rocksdb_batch_size`	`4096`	Specifies the block cache for a batch operation. The configuration is measured in bytes.
`rocksdb_block_cache`	`4`	Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes.
`engine_type`	`rocksdb`	Specifies the engine type.
`rocksdb_compression`	`lz4`	Specifies the compression algorithm for RocksDB. Available values are `no`, `snappy`, `lz4`, `lz4hc`, `zlib`, `bzip2`, and `zstd`.
`rocksdb_compression_per_level`	\	Specifies compression for each level.
`enable_rocksdb_statistics`	`false`	When set to `false`, RocksDB statistics is disabled.
`rocksdb_stats_level`	`kExceptHistogramOrTimers`	Specifies the stats level for RocksDB. Available values are `kExceptHistogramOrTimers`, `kExceptTimers`, `kExceptDetailedTimers`, `kExceptTimeForMutex`, and `kAll`.
`enable_rocksdb_prefix_filtering`	`false`	When set to `true`, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter reduces memory usage.
`rocksdb_filtering_prefix_length`	`12`	Specifies the prefix length for each key. Available values are `12` and `16`.

RocksDB options¶

The format of the RocksDB options is {"<option_name>":"<option_value>"}. Multiple options are separated with commas.

Name	Predefine Value	Descriptions
`rocksdb_db_options`	`{}`	Specifies the RocksDB options.
`rocksdb_column_family_options`	`{"write_buffer_size":"67108864",` `"max_write_buffer_number":"4",` `"max_bytes_for_level_base":"268435456"}`	Specifies the RocksDB column family options.
`rocksdb_block_based_table_options`	`{"block_size":"8192"}`	Specifies the RocksDB block based table options.

Available rocksdb_db_options and rocksdb_column_family_options are listed as follows.

rocksdb_db_options

max_total_wal_size
delete_obsolete_files_period_micros
max_background_jobs
stats_dump_period_sec
compaction_readahead_size
writable_file_max_buffer_size
bytes_per_sync
wal_bytes_per_sync
delayed_write_rate
avoid_flush_during_shutdown
max_open_files
stats_persist_period_sec
stats_history_buffer_size
strict_bytes_per_sync
enable_rocksdb_prefix_filtering
enable_rocksdb_whole_key_filtering
rocksdb_filtering_prefix_length
num_compaction_threads
rate_limit

rocksdb_column_family_options

write_buffer_size
max_write_buffer_number
level0_file_num_compaction_trigger
level0_slowdown_writes_trigger
level0_stop_writes_trigger
target_file_size_base
target_file_size_multiplier
max_bytes_for_level_base
max_bytes_for_level_multiplier
disable_auto_compactions

For more information about RocksDB configuration, see RocksDB official documentation。

For super-Large vertices¶

For super vertex with a large number of edges, currently there are two truncation strategies:

Truncate directly. Set the enable_reservoir_sampling parameter to false. A certain number of edges specified in the Max_edge_returned_per_vertex parameter are truncated by default.
Truncate with the reservoir sampling algorithm. Based on the algorithm, a certain number of edges specified in the Max_edge_returned_per_vertex parameter are truncated with equal probability from the total n edges. Equal probability sampling is useful in some business scenarios. However, the performance is affected compared to direct truncation due to the probability calculation.

Storage configuration for large dataset¶

When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter parameter to true. For example, 100 vertices + 100 edges require 300 key-values. Each key takes 10bit in memory. Then you can calculate your own memory usage.

Last update: April 13, 2021