Operating Configuration Requirements

Production Environment

Production Environment Deployment Method

3 metadata service processes metad
At least 3 storage service processes storaged
At least 3 query engine service processes graphd

None of these processes needs a dedicated machine. For example, a cluster of 5 machines (A, B, C, D, E) can be deployed as follows:

A: metad, storaged, graphd
B: metad, storaged, graphd
C: metad, storaged, graphd
D: storaged, graphd
E: storaged, graphd
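
The layout above can be written down as data and checked against the process counts listed for production. This is a minimal Python sketch; the `layout` dict and the `check_layout` helper are hypothetical names used for illustration and are not part of Nebula Graph.

```python
from collections import Counter

# Hypothetical description of the 5-machine example: machine -> processes.
layout = {
    "A": ["metad", "storaged", "graphd"],
    "B": ["metad", "storaged", "graphd"],
    "C": ["metad", "storaged", "graphd"],
    "D": ["storaged", "graphd"],
    "E": ["storaged", "graphd"],
}


def check_layout(layout):
    """Return a list of problems against the production counts (empty if none)."""
    counts = Counter(proc for procs in layout.values() for proc in procs)
    problems = []
    # Production calls for exactly 3 metad processes.
    if counts["metad"] != 3:
        problems.append(f"expected 3 metad, found {counts['metad']}")
    # storaged and graphd each need at least 3 processes.
    for name in ("storaged", "graphd"):
        if counts[name] < 3:
            problems.append(f"expected at least 3 {name}, found {counts[name]}")
    return problems


print(check_layout(layout))
```

An empty list means the layout satisfies the counts. The sketch only counts processes; it does not check which IDC each machine belongs to.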

Do not deploy a single cluster across two IDCs. Each metad process automatically creates and maintains a replica of the metadata, so 3 metad processes are usually enough. The number of storaged processes does not affect the replica count of a graph space.

Server Configuration Requirements (Standard)

Take AWS EC2 c5d.12xlarge as an example:

CPU: 48 cores
Memory: 96 GB
Storage: 2 * 900 GB, NVMe SSD
Linux kernel: 3.9 or higher, check with the command uname -r
glibc: 2.12 or higher, check with the command ldd --version
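
The two version checks can also be scripted. The sketch below is one possible way to do it in Python, not an official tool: `version_tuple` and `meets_minimum` are hypothetical helper names. It compares versions as integer tuples, because a plain string comparison would rank "3.10" below "3.9".

```python
import platform
import re


def version_tuple(text):
    """Extract the leading 'major.minor' of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"cannot parse a version from {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(found, minimum):
    """True if `found` is at least `minimum`, comparing major and minor numerically."""
    return version_tuple(found) >= version_tuple(minimum)


if __name__ == "__main__" and platform.system() == "Linux":
    kernel = platform.release()  # the same string that `uname -r` prints
    print("kernel", kernel, "ok" if meets_minimum(kernel, "3.9") else "too old")

    libc_name, libc_version = platform.libc_ver()
    if libc_name == "glibc" and libc_version:
        status = "ok" if meets_minimum(libc_version, "2.12") else "too old"
        print("glibc", libc_version, status)
    else:
        print("glibc version could not be detected; run ldd --version instead")
```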

Please refer to the Kernel Configuration Doc for details.

Test Environment

1 metadata service process metad
At least 1 storage service process storaged
At least 1 query engine service process graphd

For example, a cluster with 3 machines: A, B, C can be deployed as follows:

A: metad, storaged, graphd
B: storaged, graphd
C: storaged, graphd

Server Configuration Requirements (Minimum)

Take AWS EC2 c5d.xlarge as an example:

CPU: 4 cores
Memory: 8 GB
Storage: 100 GB, SSD

Resource Estimation (Three Replicas)

Storage space (full cluster): (number of vertices + number of edges) * average bytes of attributes * 6
Memory (full cluster): (number of vertices + number of edges) * 15 bytes + number of RocksDB instances * (write_buffer_size * max_write_buffer_number) + rocksdb_block_cache * number of storaged processes. Each directory listed in the --data_path item of the etc/nebula-storaged.conf file corresponds to one RocksDB instance. To reduce the memory used by bloom filters, set the enable_partitioned_index_filter parameter to true. The number of storaged processes usually equals the number of machines in the cluster.
Number of partitions for a graph space: number of disks in the cluster * (2 to 10). The better the disk performance, the larger the multiplier can be.
Reserve 20% of memory and disk space as a buffer.
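
The estimates above are plain arithmetic, so they are easy to put into a small calculator. The Python sketch below rests on several assumptions that the formulas do not state: all sizes are in bytes (convert first if your configuration expresses the block cache in MB), the RocksDB instance count is the total across the cluster, and the 20% reserve is added on top of the estimate. `ClusterInputs` and the function names are hypothetical, and the sample workload is invented for illustration.

```python
from dataclasses import dataclass

RESERVE_PERCENT = 20  # assumption: "reserve 20%" is read as 20% added on top


@dataclass
class ClusterInputs:
    vertices: int
    edges: int
    avg_property_bytes: int       # average bytes of attributes per vertex or edge
    rocksdb_instances: int        # total --data_path directories in the cluster
    write_buffer_size: int        # bytes
    max_write_buffer_number: int
    rocksdb_block_cache: int      # bytes, per storaged process
    storaged_processes: int
    disks: int                    # total disks in the cluster


def storage_bytes(c):
    """Storage space for the full cluster, three replicas."""
    return (c.vertices + c.edges) * c.avg_property_bytes * 6


def memory_bytes(c):
    """Memory for the full cluster: graph data + write buffers + block cache."""
    graph = (c.vertices + c.edges) * 15
    write_buffers = c.rocksdb_instances * c.write_buffer_size * c.max_write_buffer_number
    block_cache = c.rocksdb_block_cache * c.storaged_processes
    return graph + write_buffers + block_cache


def partition_range(c):
    """Lowest and highest suggested partition counts (disks * 2 to disks * 10)."""
    return c.disks * 2, c.disks * 10


def with_reserve(n):
    """Add the reserve on top of an estimate, rounding up to a whole byte."""
    return n + (n * RESERVE_PERCENT + 99) // 100


if __name__ == "__main__":
    gib = 1024 ** 3
    # Invented sample workload: 3 machines with 2 data disks each.
    example = ClusterInputs(
        vertices=100_000_000,
        edges=1_000_000_000,
        avg_property_bytes=100,
        rocksdb_instances=6,
        write_buffer_size=64 * 1024 ** 2,
        max_write_buffer_number=4,
        rocksdb_block_cache=4 * gib,
        storaged_processes=3,
        disks=6,
    )
    print(f"storage: {with_reserve(storage_bytes(example)) / gib:.1f} GiB")
    print(f"memory:  {with_reserve(memory_bytes(example)) / gib:.1f} GiB")
    print("partitions: %d to %d" % partition_range(example))
```

The reserve is computed with integer arithmetic so that the result does not depend on floating-point rounding.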

About HDD and Gigabit Networks

Nebula Graph is designed for NVMe SSDs and 10 Gigabit networks, and has no special adaptation for HDDs or Gigabit networks. If you run on such hardware, tune the following parameters:

etc/nebula-storaged.conf:
- --raft_rpc_timeout_ms = 5000 to 10000
- --rocksdb_batch_size = 4096 to 16384
- --heartbeat_interval_secs = 30 to 60
- --raft_heartbeat_interval_secs = 30 to 60
etc/nebula-metad.conf:
- --heartbeat_interval_secs: use the same value as in etc/nebula-storaged.conf
Spark Writer:

rate: {
  timeout: 5000 to 10000
}

go-importer:
- batchSize: 10 to 50
- concurrency: 1 to 10
- channelBufferSize: 100 to 500
Number of partitions for a graph space: 2 * number of HDDs in the cluster
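
As a rough aid, the suggested ranges for the storaged flags can be kept in a table and used to flag values that fall outside them. This is a minimal Python sketch under the assumption that the settings have already been read into a dict; `HDD_RANGES`, `out_of_range` and `hdd_partitions` are hypothetical names, and the sketch does not parse the configuration files itself.

```python
# Suggested ranges for HDD and Gigabit networks, copied from the list above.
HDD_RANGES = {
    "raft_rpc_timeout_ms": (5000, 10000),
    "rocksdb_batch_size": (4096, 16384),
    "heartbeat_interval_secs": (30, 60),
    "raft_heartbeat_interval_secs": (30, 60),
}


def out_of_range(settings, ranges=HDD_RANGES):
    """Return {name: (value, low, high)} for settings outside the suggested range.

    Names that have no suggested range are ignored.
    """
    flagged = {}
    for name, value in settings.items():
        if name in ranges:
            low, high = ranges[name]
            if not low <= value <= high:
                flagged[name] = (value, low, high)
    return flagged


def hdd_partitions(hdd_count):
    """Partition count for an HDD cluster: 2 * number of HDDs."""
    return 2 * hdd_count


if __name__ == "__main__":
    # Hypothetical values as they might appear in a storaged configuration.
    current = {"raft_rpc_timeout_ms": 500, "heartbeat_interval_secs": 30}
    print(out_of_range(current))
    print(hdd_partitions(12))
```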

Last update: April 8, 2021