Operating Configuration Requirements
Production Environment
Production Environment Deployment Method
- 3 metadata service processes (metad)
- At least 3 storage service processes (storaged)
- At least 3 query engine service processes (graphd)
None of these processes needs a dedicated machine. For example, a cluster of 5 machines (A, B, C, D, E) can be deployed as follows:
- A: metad, storaged, graphd
- B: metad, storaged, graphd
- C: metad, storaged, graphd
- D: storaged, graphd
- E: storaged, graphd
Do not deploy a single cluster across two IDCs. Each metad process automatically creates and maintains a replica of the metadata, so 3 metad processes are usually enough. The number of storaged processes does not affect the replica count of a graph space.
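The process counts above can be checked mechanically before a rollout. The following is a minimal Python sketch under stated assumptions: the layout dictionary and the function names are illustrative only and are not part of Nebula Graph or its tooling.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical description of the 5-machine example: host -> services on it.
PRODUCTION_LAYOUT: Dict[str, List[str]] = {
    "A": ["metad", "storaged", "graphd"],
    "B": ["metad", "storaged", "graphd"],
    "C": ["metad", "storaged", "graphd"],
    "D": ["storaged", "graphd"],
    "E": ["storaged", "graphd"],
}


def count_services(layout: Dict[str, List[str]]) -> Counter:
    """Count how many processes of each service the layout starts."""
    return Counter(svc for services in layout.values() for svc in services)


def check_production(layout: Dict[str, List[str]]) -> List[str]:
    """Return a list of violated production rules (empty if none)."""
    counts = count_services(layout)
    problems = []
    if counts["metad"] != 3:
        problems.append(f"expected 3 metad processes, found {counts['metad']}")
    if counts["storaged"] < 3:
        problems.append(f"expected at least 3 storaged, found {counts['storaged']}")
    if counts["graphd"] < 3:
        problems.append(f"expected at least 3 graphd, found {counts['graphd']}")
    return problems


if __name__ == "__main__":
    for problem in check_production(PRODUCTION_LAYOUT):
        print("WARNING:", problem)
```

The check only counts processes. It does not verify that all machines sit in the same IDC, which still has to be confirmed separately.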
Server Configuration Requirements (Standard)
Take AWS EC2 c5d.12xlarge as an example:
- CPU: 48 cores
- Memory: 96 GB
- Storage: 2 * 900 GB, NVMe SSD
- Linux kernel: 3.9 or higher; check with the command: uname -r
- glibc: 2.12 or higher; check with the command: ldd --version
Please refer to the Kernel Configuration Doc for details.
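The two version checks can also be scripted. The sketch below is one possible approach, not an official tool: it uses Python's standard platform module, and the helper names parse_version and meets are made up for this example. Versions are compared as integer tuples because a plain string comparison would wrongly rank "2.5" above "2.12".

```python
import platform
import re
from typing import Tuple

MIN_KERNEL = (3, 9)   # from the requirement list above
MIN_GLIBC = (2, 12)   # from the requirement list above


def parse_version(text: str) -> Tuple[int, int]:
    """Extract the leading major.minor pair, e.g. '5.15.0-91-generic' -> (5, 15)."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"cannot parse version from {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets(text: str, minimum: Tuple[int, int]) -> bool:
    """True if the version string is at least the given (major, minor)."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    kernel = platform.release()                 # same string that `uname -r` prints
    libc_name, libc_version = platform.libc_ver()
    print("kernel", kernel, "ok" if meets(kernel, MIN_KERNEL) else "too old")
    if libc_name == "glibc" and libc_version:
        print("glibc", libc_version,
              "ok" if meets(libc_version, MIN_GLIBC) else "too old")
    else:
        print("glibc version not detected; run `ldd --version` manually")
```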
Test Environment
- 1 metadata service process (metad)
- At least 1 storage service process (storaged)
- At least 1 query engine service process (graphd)
For example, a cluster of 3 machines (A, B, C) can be deployed as follows:
- A: metad, storaged, graphd
- B: storaged, graphd
- C: storaged, graphd
Server Configuration Requirements (Minimum)
Take AWS EC2 c5d.xlarge as an example:
- CPU: 4 cores
- Memory: 8 GB
- Storage: 100 GB, SSD
Resource Estimation (Three Replicas)
- Storage space (full cluster): total number of vertices and edges * average bytes of attributes * 6
- Memory (full cluster): total number of vertices and edges * 15 bytes + number of RocksDB instances * (write_buffer_size * max_write_buffer_number) + rocksdb_block_cache * number of storaged processes. Each directory in the --data_path item of the etc/nebula-storaged.conf file corresponds to one RocksDB instance. To reduce the memory used by bloom filters, set the enable_partitioned_index_filter parameter to true. The number of storaged processes usually equals the number of machines in the cluster.
- Number of partitions for a graph space: number of disks in the cluster * (2 to 10). The better the disk performance, the larger the multiplier.
- Reserve 20% of memory and disk space as a buffer.
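The estimation rules above translate directly into arithmetic. The following Python sketch is illustrative only: the ClusterPlan class and its field names are hypothetical, all sizes are assumed to be already converted to bytes, and the 20% reserve is interpreted as adding 20% on top of the estimate, which is one possible reading of the rule.

```python
from dataclasses import dataclass
from typing import Tuple

HEADROOM_PERCENT = 20  # assumption: reserve is added on top of the estimate


@dataclass
class ClusterPlan:
    vertices: int
    edges: int
    avg_property_bytes: int       # average bytes of attributes per vertex/edge
    rocksdb_instances: int        # total --data_path directories in the cluster
    write_buffer_size: int        # bytes
    max_write_buffer_number: int
    block_cache_bytes: int        # rocksdb_block_cache, converted to bytes
    storaged_processes: int
    disks: int


def storage_bytes(p: ClusterPlan) -> int:
    """Full-cluster disk estimate for three replicas."""
    return (p.vertices + p.edges) * p.avg_property_bytes * 6


def memory_bytes(p: ClusterPlan) -> int:
    """Full-cluster memory estimate."""
    per_element = (p.vertices + p.edges) * 15
    write_buffers = (p.rocksdb_instances
                     * p.write_buffer_size * p.max_write_buffer_number)
    block_cache = p.block_cache_bytes * p.storaged_processes
    return per_element + write_buffers + block_cache


def partition_range(p: ClusterPlan) -> Tuple[int, int]:
    """Lower and upper bound for the partition number of a graph space."""
    return p.disks * 2, p.disks * 10


def with_headroom(n: int) -> int:
    """Add the reserve, rounding up, using integer arithmetic only."""
    return (n * (100 + HEADROOM_PERCENT) + 99) // 100


if __name__ == "__main__":
    plan = ClusterPlan(vertices=1_000_000, edges=9_000_000,
                       avg_property_bytes=100, rocksdb_instances=6,
                       write_buffer_size=64 * 2**20, max_write_buffer_number=4,
                       block_cache_bytes=4 * 2**30, storaged_processes=3,
                       disks=6)
    print("disk bytes to provision:", with_headroom(storage_bytes(plan)))
    print("memory bytes to provision:", with_headroom(memory_bytes(plan)))
    print("partition number range:", partition_range(plan))
```

The sample numbers in the main block are invented for illustration and do not describe any measured workload.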
About HDD and Gigabit Networks
Nebula Graph is designed for NVMe SSDs and 10 Gigabit networks, and has no special adaptation for HDDs or Gigabit networks. When running on such hardware, tune the following parameters:
- etc/nebula-storaged.conf:
  - --raft_rpc_timeout_ms = 5000 to 10000
  - --rocksdb_batch_size = 4096 to 16384
  - --heartbeat_interval_secs = 30 to 60
  - --raft_heartbeat_interval_secs = 30 to 60
- etc/nebula-metad.conf:
  - --heartbeat_interval_secs: use the same value as in etc/nebula-storaged.conf
- Spark Writer:
    rate: {
      timeout: 5000 to 10000
    }
- go-importer:
  - batchSize: 10 to 50
  - concurrency: 1 to 10
  - channelBufferSize: 100 to 500
- Partition number: 2 * the number of HDDs in the cluster
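As a rough illustration of how the storage-service ranges above might be applied, the Python sketch below validates chosen values against those ranges and renders them as flag lines. The flag names and ranges come from the list above; the helper names render_flags and hdd_partitions are hypothetical, and the sketch does not read or write any real configuration file.

```python
from typing import Dict, List, Tuple

# Recommended ranges for HDD / Gigabit deployments, taken from the list above.
HDD_RANGES: Dict[str, Tuple[int, int]] = {
    "raft_rpc_timeout_ms": (5000, 10000),
    "rocksdb_batch_size": (4096, 16384),
    "heartbeat_interval_secs": (30, 60),
    "raft_heartbeat_interval_secs": (30, 60),
}


def render_flags(chosen: Dict[str, int]) -> List[str]:
    """Validate each chosen value and format it as a '--name=value' line."""
    lines = []
    for name, value in chosen.items():
        low, high = HDD_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} to {high}")
        lines.append(f"--{name}={value}")
    return lines


def hdd_partitions(hdd_count: int) -> int:
    """Partition number for an HDD cluster: 2 * number of HDDs."""
    return 2 * hdd_count


if __name__ == "__main__":
    for line in render_flags({"raft_rpc_timeout_ms": 8000,
                              "rocksdb_batch_size": 8192,
                              "heartbeat_interval_secs": 30,
                              "raft_heartbeat_interval_secs": 30}):
        print(line)
    print("partitions for 6 HDDs:", hdd_partitions(6))
```

The values picked in the main block are arbitrary points inside the ranges, not tuned recommendations.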