Prepare resources for compiling, installing, and running Nebula Graph¶
This topic describes the requirements and suggestions for compiling and installing Nebula Graph, as well as how to estimate the resource you need to reserve for running a Nebula Graph cluster.
Reading guide¶
If you are reading this topic with the questions listed below, click them to jump to their answers.
- What do I need to compile Nebula Graph?
- What do I need to run Nebula Graph in a test environment?
- What do I need to run Nebula Graph in a production environment?
- How much memory and disk space do I need to reserve for my Nebula Graph cluster?
Requirements for compiling the Nebula Graph source code¶
Hardware requirements for compiling Nebula Graph¶
Item | Requirement |
---|---|
CPU architecture | x86_64 |
Memory | 4 GB |
Disk | 10 GB, SSD |
Supported operating systems for compiling Nebula Graph¶
For now, we can only compile Nebula Graph in the Linux system. We recommend that you use any Linux system with kernel version 2.6.32 or above.
Software requirements for compiling Nebula Graph¶
You must have the correct version of the software listed below to compile Nebula Graph. If they are not as required or you are not sure, follow the steps in Prepare software for compiling Nebula Graph to get them ready.
Software | Version | Note |
---|---|---|
glibc | 2.12 or above | You can run ldd --version to check the glibc version. |
make | Any stable version | - |
m4 | Any stable version | - |
git | Any stable version | - |
wget | Any stable version | - |
unzip | Any stable version | - |
xz | Any stable version | - |
readline-devel | Any stable version | - |
ncurses-devel | Any stable version | - |
zlid-devel | Any stable version | - |
gcc | 7.1.0 or above | You can run gcc -v to check the gcc version. |
gcc-c++ | Any stable version | - |
cmake | 3.5.0 or above | You can run cmake --version to check the cmake version. |
gettext | Any stable version | - |
curl | Any stable version | - |
redhat-lsb-core | Any stable version | - |
libstdc++-static | Any stable version | Only needed in CentOS 8+, RedHat 8+, and Fedora systems. |
libasan | Any stable version | Only needed in CentOS 8+, RedHat 8+, and Fedora systems. |
Other third-party software will be automatically downloaded and installed to the build directory at the configure (cmake) stage.
Prepare software for compiling Nebula Graph¶
This section guides you through the downloading and installation of software required for compiling Nebula Graph.
-
Install dependencies.
- For CentOS, RedHat, and Fedora users, run the following commands.
```bash $ yum update $ yum install -y make \ m4 \ git \ wget \ unzip \ xz \ readline-devel \ ncurses-devel \ zlib-devel \ gcc \ gcc-c++ \ cmake \ gettext \ curl \ redhat-lsb-core // For CentOS 8+, RedHat 8+, and Fedora, install libstdc++-static, libasan as well $ yum install -y libstdc++-static libasan ```
- For Debian and Ubuntu users, run the following commands.
```bash $ apt-get update $ apt-get install -y make \ m4 \ git \ wget \ unzip \ xz-utils \ curl \ lsb-core \ build-essential \ libreadline-dev \ ncurses-dev \ cmake \ gettext ```
- For CentOS, RedHat, and Fedora users, run the following commands.
-
Check if the GCC and cmake on your host are in the right version. See Software requirements for compiling Nebula Graph for the required versions.
$ g++ --version $ cmake --version
If your GCC and CMake are in the right version, then you are all set. If they are not, follow the sub-steps as follows.
1. Clone the nebula-common repository to your host.
```bash $ git clone https://github.com/vesoft-inc/nebula-common.git ``` The source code of Nebula Graph versions such as v2.0.0 is stored in particular branches. You can use the `--branch` or `-b` option to specify the branch to be cloned. For example, for 2.0.0, run the following command. ```bash $ git clone --branch v2.0.0 https://github.com/vesoft-inc/nebula-common.git ```
2. Make
nebula-common
the current working directory.```bash $ cd nebula-common ```
3. Run the following commands to install and enable CMake and GCC.
```bash // Install CMake. $ ./third-party/install-cmake.sh cmake-install // Enable CMake $ source cmake-install/bin/enable-cmake.sh // Install GCC. Installing GCC to /opt requires root privilege, you can change it to other locations. $ sudo ./third-party/install-gcc.sh --prefix=/opt // Enable GCC. $ source /opt/vesoft/toolset/gcc/7.5.0/enable ```
Requirements and suggestions for installing Nebula Graph in test environments¶
Hardware requirements for test environments¶
Item | Requirement |
---|---|
CPU architecture | x86_64 |
Number of CPU core | 4 |
Memory | 8 GB |
Disk | 100 GB, SSD |
Supported operating systems for test environments¶
For now, we can only install Nebula Graph in the Linux system. To install Nebula Graph in a test environment, we recommend that you use any Linux system with kernel version 3.9 or above.
Suggested service architecture for test environments¶
Process | Suggested number |
---|---|
metad (the metadata service process) | 1 |
storaged (the storage service process) | 1 or more |
graphd (the query engine service process) | 1 or more |
For example, for a single-machine environment, you can deploy 1 metad, 1 storaged, and 1 graphd processes in the machine.
For a more common environment, such as a cluster of 3 machines (named as A, B, and C), you can deploy Nebula Graph as follows:
Machine name | Number of metad | Number of storaged | Number of graphd |
---|---|---|---|
A | 1 | 1 | 1 |
B | None | 1 | 1 |
C | None | 1 | 1 |
Requirements and suggestions for installing Nebula Graph in production environments¶
Hardware requirements for production environments¶
Item | Requirement |
---|---|
CPU architecture | x86_64 |
Number of CPU core | 48 |
Memory | 96 GB |
Disk | 2 * 900 GB, NVMe SSD |
Supported operating systems for production environments¶
For now, we can only install Nebula Graph in the Linux system. To install Nebula Graph in a production environment, we recommend that you use any Linux system with kernel version 3.9 or above.
You can adjust some of the kernel parameters to better accommodate the need for running Nebula Graph. For more information, see kernel configuration.
Suggested service architecture for production environments¶
Process | Suggested number |
---|---|
metad (the metadata service process) | 3 |
storaged (the storage service process) | 3 or more |
graphd (the query engine service process) | 3 or more |
Each metad process automatically creates and maintains a copy of the metadata. Usually, you only need 3 metad processes. The number of storaged processes does not affect the number of graph space copies.
You can deploy multiple processes on a single machine. For example, on a cluster of 5 machines (named as A, B, C, D, and E), you can deploy Nebula Graph as follows:
WARNING: Do not deploy a cluster across IDCs.
Machine name | Number of metad | Number of storaged | Number of graphd |
---|---|---|---|
A | 1 | 1 | 1 |
B | 1 | 1 | 1 |
C | 1 | 1 | 1 |
D | None | 1 | 1 |
E | None | 1 | 1 |
Capacity requirements for running a Nebula Graph cluster¶
You can estimate the memory, disk space, and partition number needed for a Nebula Graph cluster of 3 replicas as follows.
Resource | Unit | How to estimate | Description |
---|---|---|---|
Disk space for a cluster | Bytes | the_sum_of_edge_number_and_vertex_number * average_bytes_of_attributes * 6 * 120% |
- |
Memory for a cluster | Bytes | [the_sum_of_edge_number_and_vertex_number * 15 + the_number_of_RocksDB_instances * (write_buffer_size * max_write_buffer_number ) + rocksdb_block_cache ] * 120% |
write_buffer_size and max_write_buffer_number are RocksDB parameters, for more information, see MemTable. For details about rocksdb_block_cache , see Memory usage in RocksDB. |
Number of partitions for a graph space | - | the_number_of_disks_in_the_cluster * disk_partition_num_multiplier |
disk_partition_num_multiplier is an integer between 2 and 10 (both including). It's value depends on the disk performance. Use 2 for HDD. |
- Question 1: Why do we multiply the disk space and memory by 120%?
Answer: The extra 20% is for buffer.
- Question 2: How to get the number of RocksDB instances?
Answer: Each directory in the
--data_path
item in theetc/nebula-storaged.conf
file corresponds to a RocksDB instance. Count the number of directories to get the RocksDB instance number.NOTE: You can decrease the memory size occupied by the bloom filter by adding
--enable_partitioned_index_filter=true
inetc/nebula-storaged.conf
. But it may decrease the read performance in some random-seek cases.
About storage devices¶
Nebula Graph is designed and implemented for NVMe SSD. All default parameters are optimized for the SSD devices.
Due to the poor IOPS capability and long random seek latency, HDD is not recommended. You may encounter many problems when using HDD.
And remote storage devices, such as NAS or SAN, are not recommended/tested as well.
Use local SSD device.