Storage Service configurations
NebulaGraph provides two initial configuration files for the Storage Service, nebula-storaged.conf.default and nebula-storaged.conf.production, so that users can pick the one that suits their scenario. The default file path is /usr/local/nebula/etc/.
Caution
- It is not recommended to change the value of local_config to false. If it is changed, the NebulaGraph service first reads the cached configurations, which may cause configuration inconsistencies between clusters and unknown risks.
- It is not recommended to modify the configurations that are not introduced in this topic, unless you are familiar with the source code and fully understand the function of the configurations.
How to use the configuration files
To use an initial configuration file, choose one of the two files above and delete the suffix .default or .production from its name so that the Storage Service applies the configurations defined in it.
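The renaming step can be sketched in Python as follows. This is a minimal illustration, not an official tool: the helper name activate_config is hypothetical, and it copies the template instead of renaming it so that the original file is kept, which is a choice made here and not something the text requires. The demonstration runs in a temporary directory rather than /usr/local/nebula/etc/.

```python
import shutil
import tempfile
from pathlib import Path


def activate_config(template: Path) -> Path:
    """Copy e.g. nebula-storaged.conf.default to nebula-storaged.conf (hypothetical helper)."""
    if template.suffix not in (".default", ".production"):
        raise ValueError(f"not an initial configuration file: {template.name}")
    # with_suffix("") drops only the last suffix (.default or .production).
    target = template.with_suffix("")
    shutil.copyfile(template, target)
    return target


# Demonstration in a temporary directory, so nothing on the host is touched.
with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "nebula-storaged.conf.default"
    template.write_text("--local_config=true\n")
    active_name = activate_config(template).name
    print(active_name)
```

After the file is in place, the Storage Service has to be restarted for the file-based configuration to be read.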
About parameter values
If a parameter is not set in the configuration file, NebulaGraph uses the default value. Not all parameters are predefined, and the predefined parameters in the two initial configuration files are different. This topic uses the parameters in nebula-storaged.conf.default. For parameters that are not included in nebula-storaged.conf.default, see nebula-storaged.conf.production.
Caution
Some parameter values in the configuration file can be dynamically modified during runtime. In this topic, these parameters are marked Yes in the column that indicates support for runtime dynamic modification. When the local_config value is set to true, the dynamically modified configuration is not persisted, and the configuration is restored to the initial configuration after the service is restarted. For more information, see Modify configurations.
Note
The configurations of the Raft Listener and the Storage service are different. For details, see Deploy Raft listener.
For all parameters and their current values, see Configurations.
Basic configurations
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
daemonize | true | When set to true, the process is a daemon process. | No |
pid_file | pids/nebula-storaged.pid | The file that records the process ID. | No |
timezone_name | UTC+00:00:00 | Specifies the NebulaGraph time zone. This parameter is not predefined in the initial configuration files. To use it, add it manually. For the format of the parameter value, see Specifying the Time Zone with TZ. For example, --timezone_name=UTC+08:00 represents the GMT+8 time zone. | No |
local_config | true | When set to true, the process gets configurations from the configuration files. | No |
Note
- While inserting property values of time types, NebulaGraph transforms time types (except TIMESTAMP) to the corresponding UTC according to the time zone specified by timezone_name. The time-type values returned by nGQL queries are all UTC.
- timezone_name is only used to transform the data stored in NebulaGraph. Other time-related data of the NebulaGraph processes, such as the log printing time, still uses the default time zone of the host.
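To make the conversion described in the note concrete, here is a small Python sketch of what transforming a local time to UTC means for a time zone of GMT+8. It only illustrates the arithmetic and says nothing about how NebulaGraph implements it internally. The function name to_utc and the sample timestamp are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local: datetime, utc_offset_hours: int) -> datetime:
    """Interpret a naive datetime in a fixed-offset zone and convert it to UTC."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=zone).astimezone(timezone.utc)


# A value written as 17:53:59 in a GMT+8 zone corresponds to 09:53:59 UTC.
stored = to_utc(datetime(2021, 3, 17, 17, 53, 59), 8)
print(stored.isoformat())  # 2021-03-17T09:53:59+00:00
```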
Logging configurations
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
log_dir | logs | The directory that stores the Storage service log. It is recommended to put logs on a different hard disk from the data. | No |
minloglevel | 0 | Specifies the minimum level of the log. That is, only log messages at or above this level are recorded. Optional values are 0 (INFO), 1 (WARNING), 2 (ERROR), 3 (FATAL). It is recommended to set it to 0 during debugging and 1 in a production environment. If it is set to 4, NebulaGraph will not print any logs. | Yes |
v | 0 | Specifies the verbosity level of VLOG. That is, all VLOG messages at a level less than or equal to this value are logged. Optional values are 0, 1, 2, 3, 4, 5. The VLOG macro provided by glog allows users to define their own numeric logging levels and control verbose messages that are logged with the parameter v. For details, see Verbose Logging. | Yes |
logbufsecs | 0 | Specifies the maximum time to buffer the logs. When this time elapses, the buffered log is written to the log file. 0 means real-time output. This configuration is measured in seconds. | No |
redirect_stdout | true | When set to true, the process redirects the stdout and stderr to separate output files. | No |
stdout_log_file | storaged-stdout.log | Specifies the filename for the stdout log. | No |
stderr_log_file | storaged-stderr.log | Specifies the filename for the stderr log. | No |
stderrthreshold | 3 | Specifies the minimum log level (see minloglevel) at which messages are also copied to the stderr log. | No |
timestamp_in_logfile_name | true | Specifies if the log file name contains a timestamp. true indicates yes, false indicates no. | No |
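The effect of minloglevel can be sketched as a simple threshold check. The following Python snippet only models the rule stated in the table (messages at or above the level are kept, and 4 suppresses everything). It is not glog's implementation, and the names SEVERITIES and logged_levels are illustrative.

```python
# Severity numbers as listed for minloglevel in the table.
SEVERITIES = {0: "INFO", 1: "WARNING", 2: "ERROR", 3: "FATAL"}


def logged_levels(minloglevel: int) -> list:
    """Return the severity names that pass the minloglevel threshold."""
    return [name for level, name in SEVERITIES.items() if level >= minloglevel]


print(logged_levels(0))  # debugging: every severity is kept
print(logged_levels(1))  # production recommendation: INFO is dropped
print(logged_levels(4))  # nothing passes the threshold
```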
Networking configurations
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
meta_server_addrs | 127.0.0.1:9559 | Specifies the IPs (or hostnames) and ports of all Meta Services. Multiple addresses are separated with commas. | No |
local_ip | 127.0.0.1 | Specifies the local IP (or hostname) for the Storage Service. The local IP address is used to identify the nebula-storaged process. If it is a distributed cluster or requires remote access, modify it to the corresponding address. | No |
port | 9779 | Specifies the RPC daemon listening port of the Storage service. The neighboring ports -1 (9778) and +1 (9780) are also used. 9778: The port used by the Admin service, which receives Meta commands for Storage. 9780: The port used for Raft communication between Storage services. | No |
ws_ip | 0.0.0.0 | Specifies the IP address for the HTTP service. | No |
ws_http_port | 19779 | Specifies the port for the HTTP service. | No |
heartbeat_interval_secs | 10 | Specifies the default heartbeat interval. Make sure the heartbeat_interval_secs values for all services are the same, otherwise NebulaGraph CANNOT work normally. This configuration is measured in seconds. | Yes |
Caution
It is recommended to use a real IP address when specifying an IP. Otherwise, 127.0.0.1 or 0.0.0.0 cannot be parsed correctly in some cases.
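Because the port parameter implicitly reserves its two neighbors, it can help to compute all three ports when planning firewall rules. The following Python sketch does this from the description of port in the table. The function name storage_ports and the dictionary keys are illustrative.

```python
def storage_ports(port: int = 9779) -> dict:
    """Derive the three ports used by one Storage service from its port setting."""
    return {
        "admin": port - 1,  # receives Meta commands for Storage
        "rpc": port,        # RPC daemon listening port
        "raft": port + 1,   # Raft communication between Storage services
    }


print(storage_ports())      # {'admin': 9778, 'rpc': 9779, 'raft': 9780}
print(storage_ports(9879))  # {'admin': 9878, 'rpc': 9879, 'raft': 9880}
```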
Raft configurations
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
raft_heartbeat_interval_secs | 30 | Specifies the time to expire the Raft election. The configuration is measured in seconds. | Yes |
raft_rpc_timeout_ms | 500 | Specifies the time to expire the Raft RPC. The configuration is measured in milliseconds. | Yes |
wal_ttl | 14400 | Specifies the lifetime of the RAFT WAL. The configuration is measured in seconds. | Yes |
Disk configurations
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
data_path | data/storage | Specifies the data storage path. Multiple paths are separated with commas. For NebulaGraph of the community edition, one RocksDB instance corresponds to one path. | No |
minimum_reserved_bytes | 268435456 | Specifies the minimum remaining space of each data storage path. When the remaining space is lower than this value, writing data to the cluster may fail. This configuration is measured in bytes. | No |
rocksdb_batch_size | 4096 | Specifies the block cache for a batch operation. The configuration is measured in bytes. | No |
rocksdb_block_cache | 4 | Specifies the block cache for BlockBasedTable. The configuration is measured in megabytes. | No |
disable_page_cache | false | Enables or disables the operating system's page cache for NebulaGraph. By default, the parameter value is false and page cache is enabled. If the value is set to true, page cache is disabled and sufficient block cache space must be configured for NebulaGraph. | No |
engine_type | rocksdb | Specifies the engine type. | No |
rocksdb_compression | lz4 | Specifies the compression algorithm for RocksDB. Optional values are no, snappy, lz4, lz4hc, zlib, bzip2, and zstd. This parameter sets the same compression algorithm for every level. If you want to set different compression algorithms for different levels, use the parameter rocksdb_compression_per_level. | No |
rocksdb_compression_per_level | | Specifies the compression algorithm for each level. The priority is higher than rocksdb_compression. For example, no:no:lz4:lz4:snappy:zstd:snappy. You can also leave the compression algorithm of certain levels unset. For example, with no:no:lz4:lz4::zstd, levels L4 and L6 use the compression algorithm of rocksdb_compression. | No |
enable_rocksdb_statistics | false | When set to false, RocksDB statistics are disabled. | No |
rocksdb_stats_level | kExceptHistogramOrTimers | Specifies the stats level for RocksDB. Optional values are kExceptHistogramOrTimers, kExceptTimers, kExceptDetailedTimers, kExceptTimeForMutex, and kAll. | No |
enable_rocksdb_prefix_filtering | true | When set to true, the prefix bloom filter for RocksDB is enabled. Enabling prefix bloom filter makes the graph traversal faster but occupies more memory. | No |
enable_rocksdb_whole_key_filtering | false | When set to true, the whole key bloom filter for RocksDB is enabled. | |
rocksdb_filtering_prefix_length | 12 | Specifies the prefix length for each key. Optional values are 12 and 16. The configuration is measured in bytes. | No |
enable_partitioned_index_filter | false | When set to true, it reduces the amount of memory used by the bloom filter. But in some random-seek situations, it may reduce the read performance. This parameter is not predefined in the initial configuration files. To use it, add it manually. | No |
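The fallback rule for rocksdb_compression_per_level can be checked with a short Python sketch. It follows the rule as described in the table (an empty or missing level falls back to rocksdb_compression) and is not NebulaGraph's own parser. The function name resolve_compression is hypothetical, and the count of seven levels (L0 to L6) is an assumption taken from the seven-entry example in the table. The fallback value snappy is used here only to make the fallback visible.

```python
def resolve_compression(per_level: str, default: str, num_levels: int = 7) -> list:
    """Resolve the effective compression algorithm for each level L0..L(num_levels - 1)."""
    entries = per_level.split(":") if per_level else []
    resolved = []
    for level in range(num_levels):
        # A level that is left empty or not listed uses rocksdb_compression.
        value = entries[level] if level < len(entries) else ""
        resolved.append(value or default)
    return resolved


# Assumed settings: rocksdb_compression=snappy, per-level string from the table.
levels = resolve_compression("no:no:lz4:lz4::zstd", default="snappy")
for index, algorithm in enumerate(levels):
    print(f"L{index}: {algorithm}")
```

With these inputs, L4 and L6 resolve to snappy, which matches the statement that those two levels use the algorithm of rocksdb_compression.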
RocksDB options
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
rocksdb_db_options | {} | Specifies the RocksDB database options. | No |
rocksdb_column_family_options | {"write_buffer_size":"67108864", "max_write_buffer_number":"4", "max_bytes_for_level_base":"268435456"} | Specifies the RocksDB column family options. | No |
rocksdb_block_based_table_options | {"block_size":"8192"} | Specifies the RocksDB block based table options. | No |
The format of the RocksDB option is {"<option_name>":"<option_value>"}. Multiple options are separated with commas.
Supported options of rocksdb_db_options and rocksdb_column_family_options are listed as follows.
rocksdb_db_options
max_total_wal_size delete_obsolete_files_period_micros max_background_jobs stats_dump_period_sec compaction_readahead_size writable_file_max_buffer_size bytes_per_sync wal_bytes_per_sync delayed_write_rate avoid_flush_during_shutdown max_open_files stats_persist_period_sec stats_history_buffer_size strict_bytes_per_sync enable_rocksdb_prefix_filtering enable_rocksdb_whole_key_filtering rocksdb_filtering_prefix_length num_compaction_threads rate_limit
rocksdb_column_family_options
write_buffer_size max_write_buffer_number level0_file_num_compaction_trigger level0_slowdown_writes_trigger level0_stop_writes_trigger target_file_size_base target_file_size_multiplier max_bytes_for_level_base max_bytes_for_level_multiplier disable_auto_compactions
For more information, see RocksDB official documentation.
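Since every option value is written as a quoted string and the sizes are given in bytes, generating the option line from readable numbers can prevent mistakes. The Python sketch below builds the predefined rocksdb_column_family_options value in the {"<option_name>":"<option_value>"} format. The helper name rocksdb_option_flag is hypothetical, and writing the flag as --name=value assumes the same flag syntax that the configuration file uses elsewhere in this topic.

```python
import json

MIB = 1024 * 1024  # bytes per mebibyte


def rocksdb_option_flag(flag: str, options: dict) -> str:
    """Render a RocksDB option flag with every value quoted as a string."""
    payload = {name: str(value) for name, value in options.items()}
    return f"--{flag}=" + json.dumps(payload, separators=(",", ":"))


line = rocksdb_option_flag(
    "rocksdb_column_family_options",
    {
        "write_buffer_size": 64 * MIB,          # 67108864 bytes
        "max_write_buffer_number": 4,
        "max_bytes_for_level_base": 256 * MIB,  # 268435456 bytes
    },
)
print(line)
```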
Misc configurations
Caution
The configuration snapshot in the following table is different from the snapshot in NebulaGraph. The snapshot here refers to the stock data on the leader when synchronizing Raft.
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
query_concurrently | true | Whether to turn on multi-threaded queries. Enabling it can improve the latency performance of individual queries, but it will reduce the overall throughput under high pressure. | Yes |
auto_remove_invalid_space | true | After executing DROP SPACE, the specified graph space will be deleted. This parameter sets whether to delete all the data in the specified graph space at the same time. When the value is true, all the data in the specified graph space will be deleted at the same time. | Yes |
num_io_threads | 16 | The number of network I/O threads used to send RPC requests and receive responses. | No |
num_max_connections | 0 | Max active connections for all networking threads. 0 means no limit. Max connections for each networking thread = num_max_connections / num_io_threads. | No |
num_worker_threads | 32 | The number of worker threads for one RPC-based Storage service. | No |
max_concurrent_subtasks | 10 | The maximum number of concurrent subtasks to be executed by the task manager. | No |
snapshot_part_rate_limit | 10485760 | The rate limit when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes/s. | Yes |
snapshot_batch_size | 1048576 | The amount of data sent in each batch when the Raft leader synchronizes the stock data with other members of the Raft group. Unit: bytes. | Yes |
rebuild_index_part_rate_limit | 4194304 | The rate limit when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes/s. | Yes |
rebuild_index_batch_size | 1048576 | The amount of data sent in each batch when the Raft leader synchronizes the index data with other members of the Raft group during the index rebuilding process. Unit: bytes. | Yes |
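Two of the rows above are simple arithmetic, which the following Python sketch works through. It makes several assumptions that the table does not state: the divisor for the per-thread connection limit is taken to be the number of network I/O threads (num_io_threads in this table), the division is rounded down, and the snapshot estimate treats the rate limit as applying to a single transfer and ignores all other overhead, so it is only a lower bound. The function names are illustrative.

```python
import math


def per_thread_connections(num_max_connections: int, num_io_threads: int):
    """Connection limit per networking thread, or None when there is no limit."""
    if num_max_connections == 0:
        return None  # 0 means no limit
    return num_max_connections // num_io_threads  # rounding down is an assumption


def snapshot_estimate(total_bytes: int, rate_limit: int = 10485760, batch_size: int = 1048576):
    """Return (number of batches, minimum seconds) for sending total_bytes of stock data."""
    batches = math.ceil(total_bytes / batch_size)
    min_seconds = total_bytes / rate_limit
    return batches, min_seconds


print(per_thread_connections(1600, 16))    # 100
batches, seconds = snapshot_estimate(1024 ** 3)  # 1 GiB with the predefined values
print(batches, round(seconds, 1))          # 1024 102.4
```

Under these assumptions, raising snapshot_part_rate_limit shortens the transfer at the cost of more disk and network pressure on the leader.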
Memory Tracker configurations
Note
Memory Tracker is a memory management tool designed to monitor and limit memory usage. For large-scale queries, Memory Tracker can prevent Out Of Memory (OOM) issues. If you're using Memory Tracker in a containerized environment, you need to add the relevant configurations to the configuration file of the Storage service.
- Create the directory /sys/fs/cgroup/storaged/, and then add and configure the memory.max file under the directory.
- Add the following configurations to etc/nebula-storaged.conf.
    --containerized=true
    --cgroup_v2_controllers=/sys/fs/cgroup/storaged/cgroup.controllers
    --cgroup_v2_memory_stat_path=/sys/fs/cgroup/storaged/memory.stat
    --cgroup_v2_memory_max_path=/sys/fs/cgroup/storaged/memory.max
    --cgroup_v2_memory_current_path=/sys/fs/cgroup/storaged/memory.current
For more details, see Memory Tracker: Memory Management Practice in NebulaGraph Database.
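The five containerized flags listed above all point into the same cgroup directory, so they can be generated from that directory to keep the paths consistent. The Python sketch below does this. The function name cgroup_v2_flags is hypothetical, while the flag names and file names are the ones given in the steps above.

```python
from pathlib import PurePosixPath


def cgroup_v2_flags(cgroup_dir: str) -> list:
    """Build the Memory Tracker flags for a containerized Storage service."""
    base = PurePosixPath(cgroup_dir)
    return [
        "--containerized=true",
        f"--cgroup_v2_controllers={base / 'cgroup.controllers'}",
        f"--cgroup_v2_memory_stat_path={base / 'memory.stat'}",
        f"--cgroup_v2_memory_max_path={base / 'memory.max'}",
        f"--cgroup_v2_memory_current_path={base / 'memory.current'}",
    ]


flags = cgroup_v2_flags("/sys/fs/cgroup/storaged/")
print("\n".join(flags))
```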
Name | Predefined value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
memory_tracker_limit_ratio | 0.8 | The value of this parameter can be set to (0, 1], 2, or 3. (0, 1]: The percentage of available memory. Formula: Percentage of available memory = Available memory / (Total memory - Reserved memory). When an ongoing query results in memory usage exceeding the configured limit, the query fails and the memory is subsequently released. Note: For the hybrid deployment of a cluster with cloud-based and on-premises nodes, the value of memory_tracker_limit_ratio should be set to a lower value. For example, when the graphd is expected to occupy only 50% of memory, the value can be set to less than 0.5. 2: Dynamic Self Adaptive mode. MemoryTracker dynamically adjusts the available memory based on the system's current available memory. Note: This feature is experimental. As memory usage cannot be monitored in real time in dynamic adaptive mode, an OOM error may still occur when handling large memory allocations. 3: Disable MemoryTracker. MemoryTracker only logs memory usage and does not interfere with executions even if the limit is exceeded. | Yes |
memory_tracker_untracked_reserved_memory_mb | 50 | The reserved memory that is not tracked by the Memory Tracker. Unit: MB. | Yes |
memory_tracker_detail_log | false | Whether to enable the Memory Tracker log. When the value is true, the Memory Tracker log is generated. | Yes |
memory_tracker_detail_log_interval_ms | 60000 | The time interval for generating the Memory Tracker log. Unit: Millisecond. This parameter takes effect only when memory_tracker_detail_log is set to true. | Yes |
memory_purge_enabled | true | Whether to enable the memory purge feature. When the value is true, the memory purge feature is enabled. | Yes |
memory_purge_interval_seconds | 10 | The time interval for the memory purge feature to purge memory. Unit: Second. This parameter only takes effect if memory_purge_enabled is set to true. | Yes |
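For a ratio in (0, 1], the formula in the table can be rearranged to estimate how much memory queries may use before they fail. The Python sketch below does that rearrangement. It assumes that the reserved memory in the formula corresponds to memory_tracker_untracked_reserved_memory_mb, which the table suggests but does not state, and the 16384 MB host size is an invented example. The function name memory_limit_mb is illustrative.

```python
def memory_limit_mb(total_mb: float, ratio: float = 0.8, reserved_mb: float = 50) -> float:
    """Available memory = ratio * (total memory - reserved memory), in MB."""
    if not 0 < ratio <= 1:
        raise ValueError("this formula only applies to ratios in (0, 1]")
    return ratio * (total_mb - reserved_mb)


# Hypothetical host with 16384 MB of memory and the predefined values.
limit = memory_limit_mb(16384)
print(round(limit, 1))  # 13067.2
```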
For super-large vertices
A super-large vertex has so many neighboring edges that a single query starting from it can occupy too much hard disk and memory. To avoid this, the edges fetched for each vertex can be truncated: at most the number of edges specified in the max_edge_returned_per_vertex parameter is returned. Excess edges will not be returned. This parameter applies to all spaces.
Property name | Default value | Description | Whether supports runtime dynamic modifications |
---|---|---|---|
max_edge_returned_per_vertex | 2147483647 | Specifies the maximum number of edges returned for each dense vertex. Excess edges are truncated and not returned. This parameter is not predefined in the initial configuration files. To use it, add it manually. | No |
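The truncation behavior can be pictured with a few lines of Python. This is only a model of the idea: the text does not say which edges are kept, so keeping the first N edges in iteration order is an assumption, and the function name truncate_edges is illustrative.

```python
from itertools import islice

DEFAULT_MAX_EDGES = 2147483647  # default value of max_edge_returned_per_vertex


def truncate_edges(edges, max_edge_returned_per_vertex: int = DEFAULT_MAX_EDGES) -> list:
    """Return at most max_edge_returned_per_vertex edges; the rest are dropped."""
    return list(islice(edges, max_edge_returned_per_vertex))


dense_vertex_edges = range(10)  # stand-in for the edges of one vertex
print(truncate_edges(dense_vertex_edges, 3))  # [0, 1, 2]
```

The default value equals the largest signed 32-bit integer, so in practice nothing is truncated unless the parameter is set to a smaller number.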
Storage configurations for large dataset
Warning
One graph space takes up at least about 300 MB of memory.
When you have a large dataset (in the RocksDB directory) and your memory is tight, we suggest that you set the enable_partitioned_index_filter parameter to true. The performance is affected because RocksDB indexes are cached.