Balance storage data after scaling out

Enterprise only

This feature is for NebulaGraph Enterprise Edition only.

After the Storage service is scaled out, you can decide whether to balance the data in the Storage service.

NebulaGraph's Storage service scales out in two stages. In the first stage, the status of all pods changes to Ready. In the second stage, the BALANCE DATA and BALANCE LEADER commands are executed to balance the data. Splitting the work this way decouples the controller's replica scaling from data balancing, so you can run the balancing operation during a low-traffic period and reduce the impact of data migration on online services.
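As a rough illustration of the first stage, the sketch below raises the number of Storage replicas without opening an editor. It assumes that scaling out is done by increasing spec.storaged.replicas, the field that appears in the example manifest under Steps. The helper name scale_storaged is hypothetical and is not part of NebulaGraph or kubectl.

```shell
# Hypothetical helper: request a new replica count for the Storage service.
# Assumption: scaling out is driven by spec.storaged.replicas on the
# NebulaCluster resource, as in the example manifest in this topic.
scale_storaged() {
  cluster="$1"    # cluster name, for example: nebula
  replicas="$2"   # desired number of Storage replicas, for example: 5

  # Reject anything that is not a plain non-negative integer.
  case "$replicas" in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer" >&2
      return 1
      ;;
  esac

  # Custom resources accept JSON merge patches (--type merge).
  kubectl patch nebulaclusters.apps.nebula-graph.io "$cluster" \
    --type merge \
    -p "{\"spec\":{\"storaged\":{\"replicas\":${replicas}}}}"
}
```

Calling scale_storaged nebula 5 would only trigger the first stage. Whether the second stage then runs automatically depends on enableAutoBalance.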

To control whether data is balanced automatically, set the enableAutoBalance parameter in the configuration file of the CR instance of the cluster you created.

Prerequisites

You have created a NebulaGraph cluster. For how to create a cluster with Kubectl, see Create a cluster with Kubectl.

Steps

The following example uses a cluster named nebula and the cluster's configuration file named nebula_cluster.yaml to show how to set enableAutoBalance.

Run the following command to open the configuration of the nebula cluster for editing.
```
kubectl edit nebulaclusters.apps.nebula-graph.io nebula
```

Add enableAutoBalance and set its value to true under spec.storaged.

```
apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaCluster
metadata:
  name: nebula
spec:
  graphd:
    image: vesoft/nebula-graphd
    logVolumeClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
    replicas: 1
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 500Mi
    version: v3.1.0
  imagePullPolicy: IfNotPresent
  metad:
    dataVolumeClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
    image: vesoft/nebula-metad
    logVolumeClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
    replicas: 1
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 500Mi
    version: v3.1.0
  nodeSelector:
    nebula: cloud
  reference:
    name: statefulsets.apps
    version: v1
  schedulerName: default-scheduler
  storaged:
    enableAutoBalance: true   # true means storage data is balanced automatically after the Storage service is scaled out.
    dataVolumeClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
    image: vesoft/nebula-storaged
    logVolumeClaim:
      resources:
        requests:
          storage: 2Gi
      storageClassName: fast-disks
    replicas: 3
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 500Mi
    version: v3.1.0
...
```

When the value of enableAutoBalance is set to true, the Storage data will be automatically balanced after the Storage service is scaled out.

When the value of enableAutoBalance is set to false, the Storage data will not be automatically balanced after the Storage service is scaled out.

When the enableAutoBalance parameter is not set, the system will not automatically balance Storage data by default after the Storage service is scaled out.
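If you prefer a non-interactive change over the edit page, the same field can in principle be set with a JSON merge patch. The following is a minimal sketch, assuming the field path spec.storaged.enableAutoBalance described above. The helper name set_auto_balance is hypothetical and is not part of NebulaGraph or kubectl.

```shell
# Hypothetical helper: set spec.storaged.enableAutoBalance on a NebulaCluster.
set_auto_balance() {
  cluster="$1"   # cluster name, for example: nebula
  value="$2"     # must be the literal true or false

  # Only accept the two boolean literals so the JSON patch stays valid.
  case "$value" in
    true|false) ;;
    *)
      echo "value must be true or false" >&2
      return 1
      ;;
  esac

  # Custom resources accept JSON merge patches (--type merge).
  kubectl patch nebulaclusters.apps.nebula-graph.io "$cluster" \
    --type merge \
    -p "{\"spec\":{\"storaged\":{\"enableAutoBalance\":${value}}}}"
}
```

Calling set_auto_balance nebula true would enable automatic balancing for the cluster named nebula, and set_auto_balance nebula false would disable it again.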

Run kubectl apply -f nebula_cluster.yaml to push your configuration changes to the cluster.
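To confirm that the change reached the cluster, you can read the field back. The sketch below assumes the same field path, spec.storaged.enableAutoBalance. The helper name get_auto_balance is hypothetical.

```shell
# Hypothetical helper: print the current enableAutoBalance setting of a cluster.
get_auto_balance() {
  # jsonpath prints the field's value, or nothing if the field is absent.
  value="$(kubectl get nebulaclusters.apps.nebula-graph.io "$1" \
    -o jsonpath='{.spec.storaged.enableAutoBalance}')" || return 1

  # An unset field means data is not balanced automatically (the default).
  if [ -z "$value" ]; then
    echo "unset (no automatic balancing)"
  else
    echo "$value"
  fi
}
```

For the example in this topic, get_auto_balance nebula should report true once the configuration has been applied.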

Last update: March 13, 2023