Skip to content

Storage Service configurations

NebulaGraph provides two initial configuration files for the Storage Service, nebula-storaged.conf.default and nebula-storaged.conf.production. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/.

Caution

  • It is not recommended to modify the value of local_config to false. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.
  • It is not recommended to modify the configurations that are not introduced in this topic, unless you are familiar with the source code and fully understand the function of configurations.

How to use the configuration files

To use the initial configuration file, choose one of the above two files and delete the suffix .default or .production from the initial configuration file for the Meta Service to apply the configurations defined in it.

About parameter values

If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default. For parameters that are not included in nebula-metad.conf.default, see nebula-storaged.conf.production.

Note

The configurations of the Raft Listener and the Storage service are different. For details, see Deploy Raft listener.

For all parameters and their current values, see Configurations.

Basics configurations

Name Predefined value Description
daemonize true When set to true, the process is a daemon process.
pid_file pids/nebula-storaged.pid The file that records the process ID.
timezone_name - Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. The system default value is UTC+00:00:00. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00 represents the GMT+8 time zone.
local_config true When set to true, the process gets configurations from the configuration files.

Note

  • While inserting property values of time types, NebulaGraph transforms time types (except TIMESTAMP) to the corresponding UTC according to the time zone specified by timezone_name. The time-type values returned by nGQL queries are all UTC.
  • timezone_name is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.

Logging configurations

Name Predefined value Description
log_dir logs The directory that stores the Meta Service log. It is recommended to put logs on a different hard disk from the data.
minloglevel 0 Specifies the minimum level of the log. That is, no logs below this level will be printed. Optional values are 0 (INFO), 1 (WARNING), 2 (ERROR), 3 (FATAL). It is recommended to set it to 0 during debugging and 1 in a production environment. If it is set to 4, NebulaGraph will not print any logs.
v 0 Specifies the detailed level of the log. The larger the value, the more detailed the log is. Optional values are 0, 1, 2, 3.
logbufsecs 0 Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0 means real-time output. This configuration is measured in seconds.
redirect_stdout true When set to true, the process redirects thestdout and stderr to separate output files.
stdout_log_file graphd-stdout.log Specifies the filename for the stdout log.
stderr_log_file graphd-stderr.log Specifies the filename for the stderr log.
stderrthreshold 2 Specifies the minloglevel to be copied to the stderr log.
timestamp_in_logfile_name true Specifies if the log file name contains a timestamp. true indicates yes, false indicates no.

Networking configurations

Name Predefined value Description
meta_server_addrs 127.0.0.1:9559 Specifies the IP addresses and ports of all Meta Services. Multiple addresses are separated with commas.
local_ip 127.0.0.1 Specifies the local IP for the Storage Service. The local IP address is used to identify the nebula-storaged process. If it is a distributed cluster or requires remote access, modify it to the corresponding address.
port 9779 Specifies RPC daemon listening port of the Storage service. The external port for the Meta Service is predefined to 9779. The internal port is predefined to 9777, 9778, and 9780. Nebula Graph uses the internal port for multi-replica interactions.
ws_ip 0.0.0.0 Specifies the IP address for the HTTP service.
ws_http_port 19779 Specifies the port for the HTTP service.
heartbeat_interval_secs 10 Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds.

Caution

The real IP address must be used in the configuration file. Otherwise, 127.0.0.1/0.0.0.0 cannot be parsed correctly in some cases.

Raft configurations

Name Predefined value Description
raft_heartbeat_interval_secs 30 Specifies the time to expire the Raft election. The configuration is measured in seconds.
raft_rpc_timeout_ms 500 Specifies the time to expire the Raft RPC. The configuration is measured in milliseconds.
wal_ttl 14400 Specifies the lifetime of the RAFT WAL. The configuration is measured in seconds.

Disk configurations

Name Predefined value Description
data_path data/storage Specifies the data storage path. Multiple paths are separated with commas. One RocksDB example corresponds to one path.
minimum_reserved_bytes 268435456 Specifies the minimum remaining space of each data storage path. When the value is lower than this standard, the cluster data writing may fail. This configuration is measured in bytes.
rocksdb_batch_size 4096 Specifies the block cache for a batch operation. The configuration is measured in bytes.
rocksdb_block_cache 4 Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes.
disable_page_cache false Enables or disables the operating system's page cache for NebulaGraph. By default, the parameter value is false and page cache is enabled. If the value is set to true, page cache is disabled and sufficient block cache space must be configured for NebulaGraph.
engine_type rocksdb Specifies the engine type.
rocksdb_compression lz4 Specifies the compression algorithm for RocksDB. Optional values are no, snappy, lz4, lz4hc, zlib, bzip2, and zstd.
rocksdb_compression_per_level \ Specifies the compression algorithm for each level.
enable_rocksdb_statistics false When set to false, RocksDB statistics is disabled.
rocksdb_stats_level kExceptHistogramOrTimers Specifies the stats level for RocksDB. Optional values are kExceptHistogramOrTimers, kExceptTimers, kExceptDetailedTimers, kExceptTimeForMutex, and kAll.
enable_rocksdb_prefix_filtering true When set to true, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter makes the graph traversal faster but occupies more memory.
enable_rocksdb_whole_key_filtering false When set to true, the whole key bloom filter for RocksDB is enabled.
rocksdb_filtering_prefix_length 12 Specifies the prefix length for each key. Optional values are 12 and 16. The configuration is measured in bytes.
enable_partitioned_index_filter - When set to true, it reduces the amount of memory used by the bloom filter. But in some random-seek situations, it may reduce the read performance.

Key-Value separation configurations

Name Predefined value Description
rocksdb_enable_kv_separation false Whether or not to enable BlobDB (RocksDB key-value separation support). This function improves query performance.
rocksdb_kv_separation_threshold 100 RocksDB key value separation threshold. Values at or above this threshold will be written to blob files during flush or compaction. Unit: bytes.
rocksdb_blob_compression lz4 Compression algorithm for BlobDB. Optional values are no, snappy, lz4, lz4hc, zlib, bzip2, and zstd.
rocksdb_enable_blob_garbage_collection true Whether to perform BlobDB garbage collection during compaction.

misc configurations

Caution

The configuration snapshot in the following table is different from the snapshot in NebulaGraph. The snapshot here refers to the stock data on the leader when synchronizing Raft.

Name Predefined value Description
auto_remove_invalid_space true After executing DROP SPACE, the specified graph space will be deleted. This parameter sets whether to delete all the data in the specified graph space at the same time. When the value is true, all the data in the specified graph space will be deleted at the same time.
num_io_threads 16 The number of network I/O threads used to send RPC requests and receive responses.
num_worker_threads 32 The number of worker threads for one RPC-based Storage service.
max_concurrent_subtasks 10 The maximum number of concurrent subtasks to be executed by the task manager.
snapshot_part_rate_limit 10485760 The rate limit when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes/s.
snapshot_batch_size 1048576 The amount of data sent in each batch when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes.
rebuild_index_part_rate_limit 4194304 The rate limit when the Raft leader synchronizes the index data rate with other members of the Raft group during the index rebuilding process. Unit: bytes/s.
rebuild_index_batch_size 1048576 The amount of data sent in each batch when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes.

RocksDB options

Name Predefined value Description
rocksdb_db_options {} Specifies the RocksDB database options.
rocksdb_column_family_options {"write_buffer_size":"67108864",
"max_write_buffer_number":"4",
"max_bytes_for_level_base":"268435456"}
Specifies the RocksDB column family options.
rocksdb_block_based_table_options {"block_size":"8192"} Specifies the RocksDB block based table options.

The format of the RocksDB option is {"<option_name>":"<option_value>"}. Multiple options are separated with commas.

Supported options of rocksdb_db_options and rocksdb_column_family_options are listed as follows.

  • rocksdb_db_options
    max_total_wal_size
    delete_obsolete_files_period_micros
    max_background_jobs
    stats_dump_period_sec
    compaction_readahead_size
    writable_file_max_buffer_size
    bytes_per_sync
    wal_bytes_per_sync
    delayed_write_rate
    avoid_flush_during_shutdown
    max_open_files
    stats_persist_period_sec
    stats_history_buffer_size
    strict_bytes_per_sync
    enable_rocksdb_prefix_filtering
    enable_rocksdb_whole_key_filtering
    rocksdb_filtering_prefix_length
    num_compaction_threads
    rate_limit
    
  • rocksdb_column_family_options
    write_buffer_size
    max_write_buffer_number
    level0_file_num_compaction_trigger
    level0_slowdown_writes_trigger
    level0_stop_writes_trigger
    target_file_size_base
    target_file_size_multiplier
    max_bytes_for_level_base
    max_bytes_for_level_multiplier
    disable_auto_compactions 
    

For more information, see RocksDB official documentation.

Storage cache configurations

Enterpriseonly

Only available for the NebulaGraph Enterprise Edition.

Name Predefined value Description
enable_storage_cache false Whether or not to cache Storage data.
storage_cache_capacity 0 The size of memory reserved for Storage caches. The value must be slightly greater than the sum of vertex_pool_capacity and empty_key_pool_capacity. The configuration is measured in MB.
storage_cache_buckets_power 20 The number of buckets. The value is a logarithm with a base of 2. Optional values are 0~32. For example, the value 20 indicates that the number of buckets is 2\(^{20}\). The recommended value is ceil(log2(cacheEntries * 1.6)). cacheEntries indicates the total number of cache items.
storage_cache_locks_power 10 The number of locks. The value is a logarithm with a base of 2. Optional values are 0~32. For example, the value 10 indicates that the number of locks is 2\(^{10}\). The recommended value is max(1, storage_cache_buckets_power - 10).
enable_vertex_pool false Whether or not to add a vertex cache pool. Only valid when the storage cache feature is enabled.
vertex_pool_capacity 50 The size of the vertex cache pool. The configuration is measured in MB.
vertex_item_ttl 300 The TTL of vertex cache pool items. The configuration is measured in seconds.
enable_empty_key_pool false Whether or not to add an empty_key pool in the cache. Only valid when the storage cache is enabled. The empty_key indicates a key that was queried but does not actually exist.
empty_key_pool_capacity 50 The size of the empty_key cache pool. The configuration is measured in MB.
empty_key_item_ttl 300 The TTL of the empty_key cache pool items. The configuration is measured in seconds.

For super-Large vertices

When the query starting from each vertex gets an edge, truncate it directly to avoid too many neighboring edges on the super-large vertex, because a single query occupies too much hard disk and memory. Or you can truncate a certain number of edges specified in the Max_edge_returned_per_vertex parameter. Excess edges will not be returned. This parameter applies to all spaces.

Property name Default value Description
max_edge_returned_per_vertex 2147483647 Specifies the maximum number of edges returned for each dense vertex. Excess edges are truncated and not returned. This parameter is not predefined in the configuration files.

Storage configurations for large dataset

When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter parameter to true. The performance is affected because RocksDB indexes are cached.


Last update: February 19, 2024