Storage Service configurations

NebulaGraph provides two initial configuration files for the Storage Service: nebula-storaged.conf.default and nebula-storaged.conf.production. Choose the one that suits your scenario. By default, both files are located in /usr/local/nebula/etc/.

Caution

Changing the value of local_config to false is not recommended. If it is set to false, the NebulaGraph service reads the cached configurations first, which may cause configuration inconsistencies between clusters and lead to unknown risks.
Modifying configurations that are not introduced in this topic is not recommended, unless you are familiar with the source code and fully understand what the configurations do.

How to use the configuration files

To use an initial configuration file, choose one of the two files above and delete the suffix .default or .production from its name so that the Storage Service applies the configurations defined in it.
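The step above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the helper name `activate_config` is hypothetical, and it copies the template instead of renaming it, which is a design choice made here so that the original file stays available for comparison.

```python
import shutil
from pathlib import Path


def activate_config(etc_dir, variant="default"):
    """Copy nebula-storaged.conf.<variant> to nebula-storaged.conf.

    Hypothetical helper: etc_dir is the directory holding the initial
    configuration files (/usr/local/nebula/etc/ by default).
    """
    if variant not in ("default", "production"):
        raise ValueError("variant must be 'default' or 'production'")
    etc = Path(etc_dir)
    source = etc / f"nebula-storaged.conf.{variant}"
    target = etc / "nebula-storaged.conf"
    # Copying, rather than renaming, keeps the template file intact.
    shutil.copyfile(source, target)
    return target
```

For example, `activate_config("/usr/local/nebula/etc", "production")` would create nebula-storaged.conf from the production template. The Storage Service must still be restarted to read the file.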

About parameter values

If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined, and the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-storaged.conf.default. For parameters that are not included in nebula-storaged.conf.default, see nebula-storaged.conf.production.

Caution

Some parameter values in the configuration file can be modified dynamically at runtime. In this topic, these parameters are marked Yes in the column that indicates support for runtime dynamic modification. When local_config is set to true, dynamically modified configurations are not persisted, and the configuration reverts to the initial values after the service restarts. For more information, see Modify configurations.

Note

The configurations of the Raft Listener and the Storage service are different. For details, see Deploy Raft listener.

For all parameters and their current values, see Configurations.

Basic configurations

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`daemonize`	`true`	When set to `true`, the process is a daemon process.	No
`pid_file`	`pids/nebula-storaged.pid`	The file that records the process ID.	No
`timezone_name`	`UTC+00:00:00`	Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. To use it, add it manually. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, `--timezone_name=UTC+08:00` represents the GMT+8 time zone.	No
`local_config`	`true`	When set to `true`, the process gets configurations from the configuration files.	No

Note

While inserting property values of time types, NebulaGraph transforms time types (except TIMESTAMP) to the corresponding UTC according to the time zone specified by timezone_name. The time-type values returned by nGQL queries are all UTC.
timezone_name is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes still uses the default time zone of the host, such as the log printing time.

Logging configurations

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`log_dir`	`logs`	The directory that stores the Storage service log. It is recommended to put logs on a different hard disk from the data.	No
`minloglevel`	`0`	Specifies the minimum level of the log. Messages at or above this level are logged. Optional values are `0` (INFO), `1` (WARNING), `2` (ERROR), and `3` (FATAL). It is recommended to set it to `0` during debugging and `1` in a production environment. If it is set to `4`, NebulaGraph will not print any logs.	Yes
`v`	`0`	Specifies the verbosity level of VLOG. All VLOG messages with a level less than or equal to this value are logged. Optional values are `0`, `1`, `2`, `3`, `4`, and `5`. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter `v`. For details, see Verbose Logging.	Yes
`logbufsecs`	`0`	Specifies the maximum time to buffer the logs. If there is a timeout, it will output the buffered log to the log file. `0` means real-time output. This configuration is measured in seconds.	No
`redirect_stdout`	`true`	When set to `true`, the process redirects `stdout` and `stderr` to separate output files.	No
`stdout_log_file`	`storaged-stdout.log`	Specifies the filename for the `stdout` log.	No
`stderr_log_file`	`storaged-stderr.log`	Specifies the filename for the `stderr` log.	No
`stderrthreshold`	`3`	Specifies the minimum log level at which messages are also copied to the `stderr` log.	No
`timestamp_in_logfile_name`	`true`	Specifies if the log file name contains a timestamp. `true` indicates yes, `false` indicates no.	No
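The filtering behavior of `minloglevel` and `v` described in the table can be sketched as two small predicates. This is an illustrative model of the rules stated above, not NebulaGraph or glog code, and the function names are hypothetical.

```python
# Severity numbers as listed for minloglevel in the table above.
SEVERITIES = {"INFO": 0, "WARNING": 1, "ERROR": 2, "FATAL": 3}


def is_logged(severity, minloglevel=0):
    """Return True when a message of this severity passes --minloglevel."""
    return SEVERITIES[severity] >= minloglevel


def is_vlog_logged(verbosity, v=0):
    """Return True when a VLOG(verbosity) message passes --v."""
    return verbosity <= v
```

With `minloglevel=1`, INFO messages are dropped and WARNING and above are kept. With `minloglevel=4`, no severity passes, which matches the note that no logs are printed.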

Networking configurations

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`meta_server_addrs`	`127.0.0.1:9559`	Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas.	No
`local_ip`	`127.0.0.1`	Specifies the local IP (or hostname) for the Storage Service. The local IP address is used to identify the nebula-storaged process. If it is a distributed cluster or requires remote access, modify it to the corresponding address.	No
`port`	`9779`	Specifies the RPC daemon listening port of the Storage service. The neighboring ports `-1` (`9778`) and `+1` (`9780`) are also used. `9778`: The port used by the Admin service, which receives Meta commands for Storage. `9780`: The port used for Raft communication between Storage services.	No
`ws_ip`	`0.0.0.0`	Specifies the IP address for the HTTP service.	No
`ws_http_port`	`19779`	Specifies the port for the HTTP service.	No
`heartbeat_interval_secs`	`10`	Specifies the default heartbeat interval. Make sure the `heartbeat_interval_secs` values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds.	Yes

Caution

When specifying an IP address, it is recommended to use a real IP address. In some cases, 127.0.0.1 and 0.0.0.0 cannot be parsed correctly.
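Because the Storage service also uses the two ports adjacent to `port`, it can help to list all three when planning firewall rules. The sketch below only encodes the `-1` and `+1` relationship from the table. The function name and the dictionary keys are illustrative assumptions.

```python
def storage_ports(port=9779):
    """Return the ports used by one storaged process, derived from --port."""
    return {
        "admin": port - 1,  # Admin service: receives Meta commands for Storage
        "rpc": port,        # RPC daemon listening port
        "raft": port + 1,   # Raft communication between Storage services
    }
```

For the predefined value, `storage_ports()` yields 9778, 9779, and 9780. If you change `port`, all three ports shift together.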

Raft configurations

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`raft_heartbeat_interval_secs`	`30`	Specifies the time to expire the Raft election. The configuration is measured in seconds.	Yes
`raft_rpc_timeout_ms`	`500`	Specifies the time to expire the Raft RPC. The configuration is measured in milliseconds.	Yes
`wal_ttl`	`14400`	Specifies the lifetime of the Raft WAL. The configuration is measured in seconds.	Yes

Disk configurations

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`data_path`	`data/storage`	Specifies the data storage path. Multiple paths are separated with commas. For NebulaGraph of the community edition, one RocksDB instance corresponds to one path.	No
`minimum_reserved_bytes`	`268435456`	Specifies the minimum remaining space of each data storage path. When the value is lower than this standard, the cluster data writing may fail. This configuration is measured in bytes.	No
`rocksdb_batch_size`	`4096`	Specifies the default number of bytes reserved for one batch operation. The configuration is measured in bytes.	No
`rocksdb_block_cache`	`4`	Specifies the block cache size for BlockBasedTable. The configuration is measured in megabytes.	No
`disable_page_cache`	`false`	Enables or disables the operating system's page cache for NebulaGraph. By default, the parameter value is `false` and page cache is enabled. If the value is set to `true`, page cache is disabled and sufficient block cache space must be configured for NebulaGraph.	No
`engine_type`	`rocksdb`	Specifies the engine type.	No
`rocksdb_compression`	`lz4`	Specifies the compression algorithm for RocksDB. Optional values are `no`, `snappy`, `lz4`, `lz4hc`, `zlib`, `bzip2`, and `zstd`. This parameter modifies the compression algorithm for each level. If you want to set different compression algorithms for each level, use the parameter `rocksdb_compression_per_level`.	No
`rocksdb_compression_per_level`	\	Specifies the compression algorithm for each level. The priority is higher than `rocksdb_compression`. For example, `no:no:lz4:lz4:snappy:zstd:snappy`. You can also not set certain levels of compression algorithms, for example, `no:no:lz4:lz4::zstd`, level L4 and L6 use the compression algorithm of `rocksdb_compression`.	No
`enable_rocksdb_statistics`	`false`	When set to `false`, RocksDB statistics is disabled.	No
`rocksdb_stats_level`	`kExceptHistogramOrTimers`	Specifies the stats level for RocksDB. Optional values are `kExceptHistogramOrTimers`, `kExceptTimers`, `kExceptDetailedTimers`, `kExceptTimeForMutex`, and `kAll`.	No
`enable_rocksdb_prefix_filtering`	`true`	When set to `true`, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter makes the graph traversal faster but occupies more memory.	No
`enable_rocksdb_whole_key_filtering`	`false`	When set to `true`, the whole key bloom filter for RocksDB is enabled.	No
`rocksdb_filtering_prefix_length`	`12`	Specifies the prefix length for each key. Optional values are `12` and `16`. The configuration is measured in bytes.	No
`enable_partitioned_index_filter`	`false`	When set to `true`, it reduces the amount of memory used by the bloom filter, but in some random-seek situations it may reduce the read performance. This parameter is not predefined in the initial configuration files. To use it, add it manually.	No
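The interaction between `rocksdb_compression_per_level` and `rocksdb_compression` can be illustrated with a small resolver. This sketch only models the fallback rule described in the table: empty or missing levels use the value of `rocksdb_compression`. The function name is hypothetical, and the level count of 7 is an assumption taken from the seven-entry example in the table.

```python
def resolve_compression(per_level, default="lz4", num_levels=7):
    """Return the effective compression algorithm for levels L0..L(num_levels-1).

    per_level: value of rocksdb_compression_per_level, e.g. "no:no:lz4:lz4::zstd".
    default:   value of rocksdb_compression, used for empty or missing levels.
    """
    fields = per_level.split(":") if per_level else []
    resolved = []
    for level in range(num_levels):
        value = fields[level] if level < len(fields) else ""
        # An empty field means "not set", so fall back to rocksdb_compression.
        resolved.append(value or default)
    return resolved
```

With `resolve_compression("no:no:lz4:lz4::zstd", default="snappy")`, levels L4 and L6 resolve to `snappy`, matching the example in the table.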

RocksDB options

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`rocksdb_db_options`	`{}`	Specifies the RocksDB database options.	No
`rocksdb_column_family_options`	`{"write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456"}`	Specifies the RocksDB column family options.	No
`rocksdb_block_based_table_options`	`{"block_size":"8192"}`	Specifies the RocksDB block based table options.	No

The format of the RocksDB option is {"<option_name>":"<option_value>"}. Multiple options are separated with commas.
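As a sketch of this format, the helper below renders a dictionary of options into a flag line with every value quoted as a string, as in the predefined values shown in the table. The helper name `rocksdb_flag` is hypothetical. It does not check option names against the supported list.

```python
import json


def rocksdb_flag(flag_name, options):
    """Render a RocksDB option flag line in the {"name":"value"} format."""
    # Values are converted to strings because the predefined values quote them.
    payload = json.dumps(
        {name: str(value) for name, value in options.items()},
        separators=(",", ":"),
    )
    return f"--{flag_name}={payload}"
```

For example, `rocksdb_flag("rocksdb_block_based_table_options", {"block_size": 8192})` produces a line that can be placed in the configuration file.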

Supported options of rocksdb_db_options and rocksdb_column_family_options are listed as follows.

rocksdb_db_options

max_total_wal_size
delete_obsolete_files_period_micros
max_background_jobs
stats_dump_period_sec
compaction_readahead_size
writable_file_max_buffer_size
bytes_per_sync
wal_bytes_per_sync
delayed_write_rate
avoid_flush_during_shutdown
max_open_files
stats_persist_period_sec
stats_history_buffer_size
strict_bytes_per_sync
enable_rocksdb_prefix_filtering
enable_rocksdb_whole_key_filtering
rocksdb_filtering_prefix_length
num_compaction_threads
rate_limit

rocksdb_column_family_options

write_buffer_size
max_write_buffer_number
level0_file_num_compaction_trigger
level0_slowdown_writes_trigger
level0_stop_writes_trigger
target_file_size_base
target_file_size_multiplier
max_bytes_for_level_base
max_bytes_for_level_multiplier
disable_auto_compactions

For more information, see RocksDB official documentation.

Misc configurations

Caution

The term snapshot in the following table differs from a snapshot in NebulaGraph. Here, snapshot refers to the stock data on the leader that is sent to other members during Raft synchronization.

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`query_concurrently`	`true`	Specifies whether to enable multi-threaded queries. Enabling it can reduce the latency of individual queries, but it reduces the overall throughput under high pressure.	Yes
`auto_remove_invalid_space`	`true`	After executing `DROP SPACE`, the specified graph space will be deleted. This parameter sets whether to delete all the data in the specified graph space at the same time. When the value is `true`, all the data in the specified graph space will be deleted at the same time.	Yes
`num_io_threads`	`16`	The number of network I/O threads used to send RPC requests and receive responses.	No
`num_max_connections`	`0`	The maximum number of active connections for all networking threads. `0` means no limit. Maximum connections for each networking thread = `num_max_connections` / `num_io_threads`.	No
`num_worker_threads`	`32`	The number of worker threads for one RPC-based Storage service.	No
`max_concurrent_subtasks`	`10`	The maximum number of concurrent subtasks to be executed by the task manager.	No
`snapshot_part_rate_limit`	`10485760`	The rate limit when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes/s.	Yes
`snapshot_batch_size`	`1048576`	The amount of data sent in each batch when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes.	Yes
`rebuild_index_part_rate_limit`	`4194304`	The rate limit when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes/s.	Yes
`rebuild_index_batch_size`	`1048576`	The amount of data sent in each batch when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes.	Yes
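The per-thread connection cap implied by `num_max_connections` can be worked out as below. This is a sketch under two assumptions: the divisor is the number of network I/O threads (`num_io_threads` in the table above), and the division rounds down. The function name is hypothetical.

```python
def max_connections_per_thread(num_max_connections, num_io_threads=16):
    """Return the connection cap per networking thread, or None for no limit."""
    if num_io_threads <= 0:
        raise ValueError("num_io_threads must be positive")
    if num_max_connections == 0:
        return None  # 0 means no limit
    # Assumption: the quotient is rounded down to a whole connection count.
    return num_max_connections // num_io_threads
```

For instance, a total of 1600 connections over 16 network I/O threads gives each thread a cap of 100.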

Memory Tracker configurations

Note

Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Storage service.

Create the directory /sys/fs/cgroup/storaged/, and then add and configure the memory.max file under the directory.

Add the following configurations to etc/nebula-storaged.conf.

--containerized=true
--cgroup_v2_controllers=/sys/fs/cgroup/storaged/cgroup.controllers
--cgroup_v2_memory_stat_path=/sys/fs/cgroup/storaged/memory.stat
--cgroup_v2_memory_max_path=/sys/fs/cgroup/storaged/memory.max
--cgroup_v2_memory_current_path=/sys/fs/cgroup/storaged/memory.current

For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.

Name	Predefined value	Description	Whether supports runtime dynamic modifications
`memory_tracker_limit_ratio`	`0.8`	The value of this parameter can be set to `(0, 1]`, `2`, or `3`. `(0, 1]`: The percentage of available memory. Formula: `Percentage of available memory = Available memory / (Total memory - Reserved memory)`. When an ongoing query causes memory usage to exceed the configured limit, the query fails and its memory is then released. Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, set `memory_tracker_limit_ratio` to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than `0.5`. `2`: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory. Note: This feature is experimental. Because memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur when handling large memory allocations. `3`: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded.	Yes
`memory_tracker_untracked_reserved_memory_mb`	`50`	The reserved memory that is not tracked by the Memory Tracker. Unit: MB.	Yes
`memory_tracker_detail_log`	`false`	Whether to enable the Memory Tracker log. When the value is `true`, the Memory Tracker log is generated.	Yes
`memory_tracker_detail_log_interval_ms`	`60000`	The time interval for generating the Memory Tracker log. Unit: millisecond. This parameter takes effect only when `memory_tracker_detail_log` is `true`.	Yes
`memory_purge_enabled`	`true`	Whether to enable the memory purge feature. When the value is `true`, the memory purge feature is enabled.	Yes
`memory_purge_interval_seconds`	`10`	The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if `memory_purge_enabled` is set to true.	Yes
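Rearranging the formula given for `memory_tracker_limit_ratio` yields the amount of memory a ratio in `(0, 1]` allows. The sketch below assumes that "Reserved memory" in the formula corresponds to `memory_tracker_untracked_reserved_memory_mb` and that 1 MB is 1024 × 1024 bytes. Neither is stated in this topic, and the function name is hypothetical.

```python
def memory_limit_bytes(total_bytes, ratio=0.8, reserved_mb=50):
    """Return the memory limit implied by a ratio in (0, 1].

    Available memory = ratio * (Total memory - Reserved memory)
    """
    if not 0 < ratio <= 1:
        # The values 2 and 3 select special modes and are not percentages.
        raise ValueError("ratio must be in (0, 1]")
    reserved_bytes = reserved_mb * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
    return int(ratio * (total_bytes - reserved_bytes))
```

For example, with 1050 MB of total memory, 50 MB reserved, and a ratio of 0.5, the limit is 500 MB.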

For super-large vertices

A super-large vertex has so many neighboring edges that a single query starting from it can occupy too much disk and memory. To avoid this, the edges fetched for each vertex can be truncated: at most the number of edges specified by the max_edge_returned_per_vertex parameter are returned, and excess edges are not returned. This parameter applies to all spaces.

Property name	Default value	Description	Whether supports runtime dynamic modifications
`max_edge_returned_per_vertex`	`2147483647`	Specifies the maximum number of edges returned for each dense vertex. Excess edges are truncated and not returned. This parameter is not predefined in the initial configuration files. To use it, add it manually.	No
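The truncation can be pictured as cutting off each vertex's edge list at the configured limit. The sketch below is only a conceptual model, not the Storage service's implementation. It says nothing about which edges are kept beyond "the first ones encountered", which is an assumption.

```python
from itertools import islice


def truncate_edges(edges, max_edge_returned_per_vertex=2147483647):
    """Return at most max_edge_returned_per_vertex edges for one vertex."""
    # islice stops reading once the limit is reached, so excess edges
    # of a dense vertex are never materialized.
    return list(islice(edges, max_edge_returned_per_vertex))
```

With a limit of 3, a vertex with ten edges returns only three of them. With the default value, ordinary vertices are unaffected.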

Storage configurations for large datasets

Warning

One graph space takes up at least about 300 MB of memory.

When you have a large dataset (in the RocksDB directory) and memory is tight, we suggest setting the enable_partitioned_index_filter parameter to true. Performance is affected because RocksDB indexes are cached.

Last update: September 6, 2024