
Restore data

NebulaGraph Operator supports restoring data from cloud storage services (such as GCS, AWS S3, Minio, etc.). This topic describes how to use NebulaGraph Operator to restore data.

Prerequisites

To restore data using NebulaGraph Operator, the following conditions must be met:

  • NebulaGraph Operator version is 1.8.0 or higher.
  • A NebulaGraph cluster is running on Kubernetes.
  • Access credentials for cloud storage services are prepared for data recovery.
  • Sufficient computing resources are available in the cluster to restore data.
  • In the cluster's YAML file, spec.enableBR is set to true and Agent is configured, as shown in the partial configuration below; a quick verification example follows it. For more information, see Create a cluster.

    Partial configuration of the cluster
    apiVersion: apps.nebula-graph.io/v1alpha1
    kind: NebulaCluster
    metadata:
      name: nebula
    spec:  
      enableBR: true # Set to true to enable the backup and recovery feature. The default value is false.
      agent:         # A component used for backup and recovery.
        env:         # Configure the certificate for Agent.
        - name: CA_CERT_PATH
          value: /usr/local/certs/root.crt
        - name: CLIENT_CERT_PATH
          value: /usr/local/certs/client.crt
        - name: CLIENT_KEY_PATH
          value: /usr/local/certs/client.key
        image: reg.vesoft-inc.com/cloud-dev/nebula-agent # Agent image address. The default value is vesoft/nebula-agent.
        version: latest                                  # Agent image version. The default value is latest.
        resources:                  
          requests:
            cpu: "100m"             # Minimum CPU usage.
            memory: "128Mi"         # Minimum memory usage.
          limits:
            cpu: "1"                # Maximum CPU usage.
            memory: "1Gi"           # Maximum memory usage.
        # Limit the speed of file upload and download, in Mbps. The default value is 0, indicating no limit.
        # rateLimit: 0
        # The connection timeout between the Agent and metad, in seconds. The default value is 60.
        # heartbeatInterval: 60
        volumeMounts:               # Mount path for the certificate.
        - mountPath: /usr/local/certs
          name: credentials
    ...
    
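To confirm that the running cluster has the backup and recovery feature enabled before creating a restore job, you can query the NebulaCluster object directly. The following is a minimal check, assuming the cluster is named nebula and runs in the default namespace:

    # Print the value of spec.enableBR; it should be "true".
    kubectl get nc nebula -n default -o jsonpath='{.spec.enableBR}'

    # Print the configured Agent image and version.
    kubectl get nc nebula -n default -o jsonpath='{.spec.agent.image}:{.spec.agent.version}'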

Steps

The following example restores data from a backup file named BACKUP_2023_02_12_10_04_16 and creates all resource objects in the default namespace.

  1. Create a YAML file for data recovery, such as restore_file_name.yaml. The examples below show a GCS configuration and an S3 configuration; a note on encoding the Secret values follows them.

    Example configuration for GCS
    apiVersion: v1
    kind: Secret                                           
    metadata:
      name: gcs-secret                                     # Name of the Secret for accessing the GCS storage service.
    type: Opaque
    data:
      credentials: <GOOGLE_APPLICATION_CREDENTIALS_JSON>
    ---
    apiVersion: apps.nebula-graph.io/v1alpha1
    kind: NebulaRestore
    metadata:
      name: restore1
    spec:
      config:
        clusterName: nebula                                # Name of the backup cluster. Data is restored based on this cluster and a new cluster is created. The name of the new cluster is automatically generated by the system.
        backupName: "BACKUP_2023_02_12_10_04_16"           # Name of the backup file. Data is restored based on this backup file.
        concurrency: 5                                     # Used to control the number of concurrent downloads of files during data recovery. The default value is 5.
        gs:                                              
          location: us-central1                            # Geographic region where the GCS storage service is located.
          bucket: "nebula-br-test"                         # Name of the GCS storage service bucket for storing backup data.
          secretName: "gcs-secret"                         # Name of the Secret for accessing the GCS storage service.
    
    Example configuration for S3
    apiVersion: v1
    kind: Secret                   
    metadata:
      name: aws-s3-secret                                   # Name of the Secret for accessing the S3 storage service.
    type: Opaque
    data:
      access_key: QVNJQVE0WFlxxx
      secret_key: ZFJ6OEdNcDdxenMwVGxxx
    ---
    apiVersion: apps.nebula-graph.io/v1alpha1
    kind: NebulaRestore
    metadata:
      name: restore1
    spec:
      br:
        clusterName: nebula                                 # Name of the backup cluster. Data is restored based on this cluster and a new cluster is created. The name of the new cluster is automatically generated by the system.
        backupName: "BACKUP_2023_02_12_10_04_16"            # Name of the backup file. Data is restored based on this backup file.
        concurrency: 5                                      # Used to control the number of concurrent downloads of files during data recovery. The default value is 5.
        s3:
          region: "us-west-2"                               # Geographic region where the S3 bucket is located.
          bucket: "nebula-br-test"                          # Name of the S3 storage service bucket for storing backup data.
          endpoint: "https://s3.us-west-2.amazonaws.com"    # Access address of the S3 storage service bucket.
          secretName: "aws-s3-secret"                       # Name of the Secret for accessing the S3 storage service.
    
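    The values under data in a Kubernetes Secret must be base64-encoded. The following sketch shows one way to produce them, assuming the GCS service account key is stored in a file named credentials.json and the S3 keys are plain strings:

    # Encode the GCS service account key file (-w 0 disables line wrapping in GNU coreutils).
    base64 -w 0 credentials.json

    # Encode the S3 access key and secret key.
    echo -n '<access_key>' | base64
    echo -n '<secret_key>' | base64

    # Alternatively, let kubectl do the encoding and create the Secret in one step.
    kubectl create secret generic gcs-secret -n default --from-file=credentials=credentials.json
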
  2. Start data recovery.

    kubectl create -f restore_file_name.yaml
    
  3. Check the status of the NebulaRestore object.

    kubectl get nr <NebulaRestore_name> 
    

    Output:

    NAME       STATUS     STARTED   COMPLETED   AGE
    restore1   Complete   67m       59m         67m
    
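    If STATUS does not reach Complete, you can inspect the object for progress and events; this is generic kubectl usage rather than anything specific to NebulaGraph Operator:

    # Watch the restore status until it changes.
    kubectl get nr <NebulaRestore_name> -w

    # Show the full status and recent events of the restore object.
    kubectl describe nr <NebulaRestore_name>
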
  4. Check the status of the new cluster.

    kubectl get nc
    

    Output:

    NAME     GRAPHD-DESIRED   GRAPHD-READY   METAD-DESIRED   METAD-READY   STORAGED-DESIRED   STORAGED-READY   AGE
    nebula   1                1              1               1             3                  3                2d3h
    ngxvsm   1                1              1               1             3                  3                92m  # New cluster
    

    After the restore job is completed, a new NebulaGraph cluster is created with a system-generated name. The old cluster is not deleted; you can decide whether to keep it or delete it.
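
    Once you have verified the data in the new cluster, you can delete the old cluster if it is no longer needed. For example, assuming the old cluster is named nebula and is in the default namespace:

    # Delete the old NebulaCluster object; this removes the workloads it owns, so make sure you no longer need it.
    kubectl delete nc nebula -n default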

