Skip to content

Storage Service configurations

NebulaGraph provides two initial configuration files for the Storage Service, nebula-storaged.conf.default and nebula-storaged.conf.production. Users can use them in different scenarios conveniently. The default file path is /usr/local/nebula/etc/.

Caution

  • It is not recommended to modify the value of local_config to false. If modified, the NebulaGraph service will first read the cached configurations, which may cause configuration inconsistencies between clusters and cause unknown risks.
  • It is not recommended to modify the configurations that are not introduced in this topic, unless you are familiar with the source code and fully understand the function of configurations.

How to use the configuration files

To use the initial configuration file, choose one of the above two files and delete the suffix .default or .production from the initial configuration file for the Meta Service to apply the configurations defined in it.

About parameter values

If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined. And the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-metad.conf.default. For parameters that are not included in nebula-metad.conf.default, see nebula-storaged.conf.production.

Caution

Some parameter values in the configuration file can be dynamically modified during runtime. We label these parameters as Yes that supports runtime dynamic modification in this article. When the local_config value is set to true, the dynamically modified configuration is not persisted, and the configuration will be restored to the initial configuration after the service is restarted. For more information, see Modify configurations.

Note

The configurations of the Raft Listener and the Storage service are different. For details, see Deploy Raft listener.

For all parameters and their current values, see Configurations.

Basics configurations

Name Predefined value Description Whether supports runtime dynamic modifications
daemonize true When set to true, the process is a daemon process. No
pid_file pids/nebula-storaged.pid The file that records the process ID. No
timezone_name UTC+00:00:00 Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00 represents the GMT+8 time zone. No
local_config true When set to true, the process gets configurations from the configuration files. No

Note

  • While inserting property values of time types, NebulaGraph transforms time types (except TIMESTAMP) to the corresponding UTC according to the time zone specified by timezone_name. The time-type values returned by nGQL queries are all UTC.
  • timezone_name is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.

Logging configurations

Name Predefined value Description Whether supports runtime dynamic modifications
log_dir logs The directory that stores the Storage service log. It is recommended to put logs on a different hard disk from the data. No
minloglevel 0 Specifies the minimum level of the log. That is, log messages at or above this level. Optional values are 0 (INFO), 1 (WARNING), 2 (ERROR), 3 (FATAL). It is recommended to set it to 0 during debugging and 1 in a production environment. If it is set to 4, NebulaGraph will not print any logs. Yes
v 0 Specifies the detailed level of VLOG. That is, log all VLOG messages less or equal to the level. Optional values are 0, 1, 2, 3, 4, 5. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v. For details, see Verbose Logging. Yes
logbufsecs 0 Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. 0 means real-time output. This configuration is measured in seconds. No
redirect_stdout true When set to true, the process redirects thestdout and stderr to separate output files. No
stdout_log_file graphd-stdout.log Specifies the filename for the stdout log. No
stderr_log_file graphd-stderr.log Specifies the filename for the stderr log. No
stderrthreshold 3 Specifies the minloglevel to be copied to the stderr log. No
timestamp_in_logfile_name true Specifies if the log file name contains a timestamp. true indicates yes, false indicates no. No

Networking configurations

Name Predefined value Description Whether supports runtime dynamic modifications
meta_server_addrs 127.0.0.1:9559 Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. No
local_ip 127.0.0.1 Specifies the local IP (or hostname) for the Storage Service. The local IP address is used to identify the nebula-storaged process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. No
port 9779 Specifies RPC daemon listening port of the Storage service. The neighboring ports -1 (9778) and +1 (9780) are also used.
9778: The port used by the Admin service, which receives Meta commands for Storage.
9780: The port used for Raft communication between Storage services.
No
ws_ip 0.0.0.0 Specifies the IP address for the HTTP service. No
ws_http_port 19779 Specifies the port for the HTTP service. No
heartbeat_interval_secs 10 Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. Yes

Caution

It is recommended to use a real IP when using IP address. Otherwise, 127.0.0.1/0.0.0.0 cannot be parsed correctly in some cases.

Raft configurations

Name Predefined value Description Whether supports runtime dynamic modifications
raft_heartbeat_interval_secs 30 Specifies the time to expire the Raft election. The configuration is measured in seconds. Yes
raft_rpc_timeout_ms 500 Specifies the time to expire the Raft RPC. The configuration is measured in milliseconds. Yes
wal_ttl 14400 Specifies the lifetime of the RAFT WAL. The configuration is measured in seconds. Yes

Disk configurations

Name Predefined value Description Whether supports runtime dynamic modifications
data_path data/storage Specifies the data storage path. Multiple paths are separated with commas. For NebulaGraph of the community edition, one RocksDB instance corresponds to one path. No
minimum_reserved_bytes 268435456 Specifies the minimum remaining space of each data storage path. When the value is lower than this standard, the cluster data writing may fail. This configuration is measured in bytes. No
rocksdb_batch_size 4096 Specifies the block cache for a batch operation. The configuration is measured in bytes. No
rocksdb_block_cache 4 Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes. No
disable_page_cache false Enables or disables the operating system's page cache for NebulaGraph. By default, the parameter value is false and page cache is enabled. If the value is set to true, page cache is disabled and sufficient block cache space must be configured for NebulaGraph. No
engine_type rocksdb Specifies the engine type. No
rocksdb_compression lz4 Specifies the compression algorithm for RocksDB. Optional values are no, snappy, lz4, lz4hc, zlib, bzip2, and zstd.
This parameter modifies the compression algorithm for each level. If you want to set different compression algorithms for each level, use the parameter rocksdb_compression_per_level.
No
rocksdb_compression_per_level \ Specifies the compression algorithm for each level. The priority is higher than rocksdb_compression. For example, no:no:lz4:lz4:snappy:zstd:snappy.
You can also not set certain levels of compression algorithms, for example, no:no:lz4:lz4::zstd, level L4 and L6 use the compression algorithm of rocksdb_compression.
No
enable_rocksdb_statistics false When set to false, RocksDB statistics is disabled. No
rocksdb_stats_level kExceptHistogramOrTimers Specifies the stats level for RocksDB. Optional values are kExceptHistogramOrTimers, kExceptTimers, kExceptDetailedTimers, kExceptTimeForMutex, and kAll. No
enable_rocksdb_prefix_filtering true When set to true, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter makes the graph traversal faster but occupies more memory. No
enable_rocksdb_whole_key_filtering false When set to true, the whole key bloom filter for RocksDB is enabled.
rocksdb_filtering_prefix_length 12 Specifies the prefix length for each key. Optional values are 12 and 16. The configuration is measured in bytes. No
enable_partitioned_index_filter false When set to true, it reduces the amount of memory used by the bloom filter. But in some random-seek situations, it may reduce the read performance. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. No

RocksDB options

Name Predefined value Description Whether supports runtime dynamic modifications
rocksdb_db_options {} Specifies the RocksDB database options. No
rocksdb_column_family_options {"write_buffer_size":"67108864",
"max_write_buffer_number":"4",
"max_bytes_for_level_base":"268435456"}
Specifies the RocksDB column family options. No
rocksdb_block_based_table_options {"block_size":"8192"} Specifies the RocksDB block based table options. No

The format of the RocksDB option is {"<option_name>":"<option_value>"}. Multiple options are separated with commas.

Supported options of rocksdb_db_options and rocksdb_column_family_options are listed as follows.

  • rocksdb_db_options
    max_total_wal_size
    delete_obsolete_files_period_micros
    max_background_jobs
    stats_dump_period_sec
    compaction_readahead_size
    writable_file_max_buffer_size
    bytes_per_sync
    wal_bytes_per_sync
    delayed_write_rate
    avoid_flush_during_shutdown
    max_open_files
    stats_persist_period_sec
    stats_history_buffer_size
    strict_bytes_per_sync
    enable_rocksdb_prefix_filtering
    enable_rocksdb_whole_key_filtering
    rocksdb_filtering_prefix_length
    num_compaction_threads
    rate_limit
    
  • rocksdb_column_family_options
    write_buffer_size
    max_write_buffer_number
    level0_file_num_compaction_trigger
    level0_slowdown_writes_trigger
    level0_stop_writes_trigger
    target_file_size_base
    target_file_size_multiplier
    max_bytes_for_level_base
    max_bytes_for_level_multiplier
    disable_auto_compactions 
    

For more information, see RocksDB official documentation.

Misc configurations

Caution

The configuration snapshot in the following table is different from the snapshot in NebulaGraph. The snapshot here refers to the stock data on the leader when synchronizing Raft.

Name Predefined value Description Whether supports runtime dynamic modifications
query_concurrently true Whether to turn on multi-threaded queries. Enabling it can improve the latency performance of individual queries, but it will reduce the overall throughput under high pressure. Yes
auto_remove_invalid_space true After executing DROP SPACE, the specified graph space will be deleted. This parameter sets whether to delete all the data in the specified graph space at the same time. When the value is true, all the data in the specified graph space will be deleted at the same time. Yes
num_io_threads 16 The number of network I/O threads used to send RPC requests and receive responses. No
num_max_connections 0 Max active connections for all networking threads. 0 means no limit.
Max connections for each networking thread = num_max_connections / num_netio_threads
No
num_worker_threads 32 The number of worker threads for one RPC-based Storage service. No
max_concurrent_subtasks 10 The maximum number of concurrent subtasks to be executed by the task manager. No
snapshot_part_rate_limit 10485760 The rate limit when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes/s. Yes
snapshot_batch_size 1048576 The amount of data sent in each batch when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes. Yes
rebuild_index_part_rate_limit 4194304 The rate limit when the Raft leader synchronizes the index data rate with other members of the Raft group during the index rebuilding process. Unit: bytes/s. Yes
rebuild_index_batch_size 1048576 The amount of data sent in each batch when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes. Yes

Memory Tracker configurations

Note

Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Storage service.

  1. Create the directory /sys/fs/cgroup/storaged/, and then add and configure the memory.max file under the directory.
  2. Add the following configurations to etc/nebula-storaged.conf.

    --containerized=true
    --cgroup_v2_controllers=/sys/fs/cgroup/graphd/cgroup.controllers
    --cgroup_v2_memory_stat_path=/sys/fs/cgroup/graphd/memory.stat
    --cgroup_v2_memory_max_path=/sys/fs/cgroup/graphd/memory.max
    --cgroup_v2_memory_current_path=/sys/fs/cgroup/graphd/memory.current
    

For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.

Name Predefined value Description Whether supports runtime dynamic modifications
memory_tracker_limit_ratio 0.8 The value of this parameter can be set to (0, 1], 2, and 3.
(0, 1]: The percentage of available memory. Formula: Percentage of available memory = Available memory / (Total memory - Reserved memory).
When an ongoing query results in memory usage exceeding the configured limit, the query fails and subsequently the memory is released.
Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, the value of memory_tracker_limit_ratio should be set to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than 0.5.
2: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory.
Note: This feature is experimental. As memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur to handle large memory allocations.
3: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded.
Yes
memory_tracker_untracked_reserved_memory_mb 50 The reserved memory that is not tracked by the Memory Tracker. Unit: MB. Yes
memory_tracker_detail_log false Whether to enable the Memory Tracker log. When the value is true, the Memory Tracker log is generated. Yes
memory_tracker_detail_log_interval_ms 60000 The time interval for generating the Memory Tracker log. Unit: Millisecond. memory_tracker_detail_log is true when this parameter takes effect. Yes
memory_purge_enabled true Whether to enable the memory purge feature. When the value is true, the memory purge feature is enabled. Yes
memory_purge_interval_seconds 10 The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if memory_purge_enabled is set to true. Yes

For super-Large vertices

When the query starting from each vertex gets an edge, truncate it directly to avoid too many neighboring edges on the super-large vertex, because a single query occupies too much hard disk and memory. Or you can truncate a certain number of edges specified in the Max_edge_returned_per_vertex parameter. Excess edges will not be returned. This parameter applies to all spaces.

Property name Default value Description Whether supports runtime dynamic modifications
max_edge_returned_per_vertex 2147483647 Specifies the maximum number of edges returned for each dense vertex. Excess edges are truncated and not returned. This parameter is not predefined in the initial configuration files, if you need to use this parameter, add it manually. No

Storage configurations for large dataset

Warning

One graph space takes up at least about 300 MB of memory.

When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter parameter to true. The performance is affected because RocksDB indexes are cached.


Last update: April 19, 2024