Operating Configuration Requirements

Production Environment

Deployment Method

  • 3 metadata service processes (metad)
  • At least 3 storage service processes (storaged)
  • At least 3 query engine service processes (graphd)

None of these processes needs a dedicated machine; they can share hosts. For example, a cluster of five machines A, B, C, D, and E can be deployed as follows:

  • A: metad, storaged, graphd
  • B: metad, storaged, graphd
  • C: metad, storaged, graphd
  • D: storaged, graphd
  • E: storaged, graphd

Do not deploy a single cluster across two IDCs. Each metad process automatically creates and maintains a replica of the metadata, so three metad processes are usually sufficient. Note that the number of storaged processes does not affect the number of replicas of a graph space. A minimal sketch for sanity-checking a planned topology follows.
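
The following is a minimal sketch (in Python, with a hypothetical check_topology helper) for verifying that a planned layout meets the minimums above; it uses the five-machine example from this section:

  # topology_check.py -- hypothetical helper, not part of Nebula Graph
  def check_topology(plan):
      """plan maps a machine name to the set of processes it runs."""
      counts = {"metad": 0, "storaged": 0, "graphd": 0}
      for processes in plan.values():
          for p in processes:
              counts[p] += 1
      assert counts["metad"] == 3, "production clusters usually run exactly 3 metad"
      assert counts["storaged"] >= 3, "at least 3 storaged processes required"
      assert counts["graphd"] >= 3, "at least 3 graphd processes required"
      return counts

  plan = {
      "A": {"metad", "storaged", "graphd"},
      "B": {"metad", "storaged", "graphd"},
      "C": {"metad", "storaged", "graphd"},
      "D": {"storaged", "graphd"},
      "E": {"storaged", "graphd"},
  }
  print(check_topology(plan))  # {'metad': 3, 'storaged': 5, 'graphd': 5}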

Server Configuration Requirements (Standard)

Take AWS EC2 c5d.12xlarge as an example:

  • CPU: 48 cores
  • Memory: 96 GB
  • Storage: 2 * 900 GB, NVMe SSD
  • Linux kernel: 3.9 or higher; check with the command uname -r
  • glibc: 2.12 or higher; check with the command ldd --version

Please refer to the Kernel Configuration Doc for details. A sketch that automates the two version checks above follows.
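
This is a hypothetical helper that assumes the usual output formats of uname -r and ldd --version, which can vary by distribution:

  # version_check.py -- hypothetical helper for the kernel and glibc checks
  import re
  import subprocess

  def version_tuple(text):
      """Extract the first X.Y version number found in text."""
      m = re.search(r"(\d+)\.(\d+)", text)
      return (int(m.group(1)), int(m.group(2))) if m else (0, 0)

  kernel = subprocess.check_output(["uname", "-r"], text=True).strip()
  glibc = subprocess.check_output(["ldd", "--version"], text=True).splitlines()[0]

  assert version_tuple(kernel) >= (3, 9), f"kernel too old: {kernel}"
  assert version_tuple(glibc) >= (2, 12), f"glibc too old: {glibc}"
  print("kernel and glibc meet the minimum requirements")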

Test Environment

  • 1 metadata service process (metad)
  • At least 1 storage service process (storaged)
  • At least 1 query engine service process (graphd)

For example, a cluster of three machines A, B, and C can be deployed as follows:

  • A: metad, storaged, graphd
  • B: storaged, graphd
  • C: storaged, graphd

Server Configuration Requirements (Minimum)

Take AWS EC2 c5d.xlarge as an example:

  • CPU: 4 cores
  • Memory: 8 GB
  • Storage: 100 GB, SSD

Resource Estimation (Three Replicas)

  • Storage space (full cluster): number of edges and vertices * average bytes of attributes * 6
  • Memory (full cluster): number of edges and vertices * 15 bytes + number of RocksDB instances * (write_buffer_size * max_write_buffer_number) + rocksdb_block_cache * number of storaged processes. Each directory listed in the --data_path item of the etc/nebula-storaged.conf file corresponds to one RocksDB instance, and the number of storaged processes usually equals the number of machines in the cluster. You can reduce the memory used by the bloom filter by setting the enable_partitioned_index_filter parameter to true.
  • Number of partitions of a graph space: number of disks in the cluster * (2 to 10); the better the disk performance, the larger the value can be.
  • Reserve 20% of memory and disk as headroom. A worked sizing sketch follows this list.
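
The following sketch turns the formulas above into a rough calculator. Every input value is a hypothetical example chosen only to show the arithmetic, not a recommendation:

  # sizing_sketch.py -- rough resource estimate for a three-replica cluster
  GiB = 1024 ** 3

  n = 1_000_000_000                 # total number of edges and vertices
  avg_attr_bytes = 100              # average bytes of attributes
  num_rocksdb_instances = 10        # one per --data_path directory, cluster-wide
  write_buffer_size = 64 * 2**20    # RocksDB write_buffer_size, in bytes
  max_write_buffer_number = 4
  rocksdb_block_cache = 4 * GiB     # block cache per storaged process, in bytes
  num_storaged = 5                  # usually one per machine
  num_disks = 10                    # disks in the whole cluster

  storage_bytes = n * avg_attr_bytes * 6
  memory_bytes = (n * 15
                  + num_rocksdb_instances * write_buffer_size * max_write_buffer_number
                  + rocksdb_block_cache * num_storaged)
  partitions = num_disks * 4        # factor of 2 to 10; larger for faster disks

  # Size capacity so the estimate occupies at most 80%, leaving 20% headroom.
  print(f"disk:   {storage_bytes / 0.8 / GiB:.0f} GiB")
  print(f"memory: {memory_bytes / 0.8 / GiB:.0f} GiB")
  print(f"partitions per graph space: {partitions}")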

About HDD and Gigabit Networks

Nebula Graph is designed for NVMe SSDs and 10-Gigabit networks, with no special adaptation for HDDs or Gigabit networks. If you must run on such hardware, start by tuning the following parameters:

  • etc/nebula-storaged.conf:
    • --raft_rpc_timeout_ms = 5000 to 10000
    • --rocksdb_batch_size = 4096 to 16384
    • --heartbeat_interval_secs = 30 to 60
    • --raft_heartbeat_interval_secs = 30 to 60
  • etc/nebula-metad.conf:
    • --heartbeat_interval_secs: set to the same value as in etc/nebula-storaged.conf
  • Spark Writer:
    rate: {
      timeout: 5000 to 10000
    }
  • go-importer:
    • batchSize: 10 to 50
    • concurrency: 1 to 10
    • channelBufferSize: 100 to 500
  • Number of partitions of a graph space: 2 * the number of HDDs in the cluster. A sketch showing the resulting flag syntax follows.
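
To illustrate the flag syntax, the following sketch prints the storaged overrides above, taking the midpoint of each suggested range. The midpoints are illustrative assumptions, not tested recommendations:

  # hdd_overrides.py -- prints example lines for etc/nebula-storaged.conf
  overrides = {
      "raft_rpc_timeout_ms": 8000,         # suggested range: 5000 to 10000
      "rocksdb_batch_size": 8192,          # suggested range: 4096 to 16384
      "heartbeat_interval_secs": 45,       # suggested range: 30 to 60
      "raft_heartbeat_interval_secs": 45,  # suggested range: 30 to 60
  }
  for flag, value in overrides.items():
      print(f"--{flag}={value}")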