
Restore data

NebulaGraph Operator supports restoring data from cloud storage services (such as GCS, AWS S3, Minio, etc.). This topic describes how to use NebulaGraph Operator to restore data.

Prerequisites

To restore data using NebulaGraph Operator, the following conditions must be met:

  • NebulaGraph Operator version is 1.8.0 or higher.
  • A NebulaGraph cluster is running on Kubernetes.
  • Access credentials for cloud storage services are prepared for data recovery.
  • Sufficient computing resources are available in the cluster to restore data.
  • In the cluster's YAML file, spec.enableBR is set to true and Agent is configured, as shown in the partial configuration below; a quick verification example follows it. For more information, see Create a cluster.

    Partial configuration of the cluster
    apiVersion: apps.nebula-graph.io/v1alpha1
    kind: NebulaCluster
    metadata:
      name: nebula
    spec:  
      enableBR: true # Set to true to enable the backup and recovery feature. The default value is false.
      agent:         # A component used for backup and recovery.
        env:         # Configure the certificate for Agent.
        - name: CA_CERT_PATH
          value: /usr/local/certs/root.crt
        - name: CLIENT_CERT_PATH
          value: /usr/local/certs/client.crt
        - name: CLIENT_KEY_PATH
          value: /usr/local/certs/client.key
        image: reg.vesoft-inc.com/cloud-dev/nebula-agent # Agent image address. The default value is vesoft/nebula-agent.
        version: latest                                  # Agent image version. The default value is latest.
        resources:                  
          requests:
            cpu: "100m"             # Minimum CPU usage.
            memory: "128Mi"         # Minimum memory usage.
          limits:
            cpu: "1"                # Maximum CPU usage.
            memory: "1Gi"           # Maximum memory usage.
        # Limit the speed of file upload and download, in Mbps. The default value is 0, indicating no limit.
        # rateLimit: 0
        # The connection timeout between the Agent and metad, in seconds. The default value is 60.
        # heartbeatInterval: 60
        volumeMounts:               # Mount path for the certificate.
        - mountPath: /usr/local/certs
          name: credentials
    ...
    
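To confirm that the running cluster has the backup and recovery feature enabled before creating a restore job, you can query the NebulaCluster object directly. The following is a minimal check, assuming the cluster is named nebula and runs in the default namespace:

    # Print the value of spec.enableBR; it should be "true".
    kubectl get nc nebula -n default -o jsonpath='{.spec.enableBR}'

    # Print the configured Agent image and version.
    kubectl get nc nebula -n default -o jsonpath='{.spec.agent.image}:{.spec.agent.version}'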

Steps

The following example restores data from a backup file named BACKUP_2023_02_12_10_04_16 and creates all resource objects in the default namespace.

  1. Create a YAML file for data recovery, such as restore_file_name.yaml. The examples below show a GCS configuration and an S3 configuration; a note on encoding the Secret values follows them.

    Example configuration for GCS
    apiVersion: v1
    kind: Secret                                           
    metadata:
      name: gcs-secret                                     # Name of the Secret for accessing the GCS storage service.
    type: Opaque
    data:
      credentials: <GOOGLE_APPLICATION_CREDENTIALS_JSON>
    ---
    apiVersion: apps.nebula-graph.io/v1alpha1
    kind: NebulaRestore
    metadata:
      name: restore1
    spec:
      config:
        clusterName: nebula                                # Name of the backup cluster. Data is restored based on this cluster and a new cluster is created. The name of the new cluster is automatically generated by the system.
        backupName: "BACKUP_2023_02_12_10_04_16"           # Name of the backup file. Data is restored based on this backup file.
        concurrency: 5                                     # Used to control the number of concurrent downloads of files during data recovery. The default value is 5.
        gs:                                              
          location: us-central1                            # Geographic region where the GCS storage service is located.
          bucket: "nebula-br-test"                         # Name of the GCS storage service bucket for storing backup data.
          secretName: "gcs-secret"                         # Name of the Secret for accessing the GCS storage service.
    
    Example configuration for S3
    apiVersion: v1
    kind: Secret                   
    metadata:
      name: aws-s3-secret                                   # Name of the Secret for accessing the S3 storage service.
    type: Opaque
    data:
      access_key: QVNJQVE0WFlxxx
      secret_key: ZFJ6OEdNcDdxenMwVGxxx
    ---
    apiVersion: apps.nebula-graph.io/v1alpha1
    kind: NebulaRestore
    metadata:
      name: restore1
    spec:
      br:
        clusterName: nebula                                 # Name of the backup cluster. Data is restored based on this cluster and a new cluster is created. The name of the new cluster is automatically generated by the system.
        backupName: "BACKUP_2023_02_12_10_04_16"            # Name of the backup file. Data is restored based on this backup file.
        concurrency: 5                                      # Used to control the number of concurrent downloads of files during data recovery. The default value is 5.
        s3:
          region: "us-west-2"                               # Geographic region where the S3 bucket is located.
          bucket: "nebula-br-test"                          # Name of the S3 storage service bucket for storing backup data.
          endpoint: "https://s3.us-west-2.amazonaws.com"    # Access address of the S3 storage service bucket.
          secretName: "aws-s3-secret"                       # Name of the Secret for accessing the S3 storage service.
    
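    The values under data in a Kubernetes Secret must be base64-encoded. The following sketch shows one way to produce them, assuming the GCS service account key is stored in a file named credentials.json and the S3 keys are plain strings:

    # Encode the GCS service account key file (-w 0 disables line wrapping in GNU coreutils).
    base64 -w 0 credentials.json

    # Encode the S3 access key and secret key.
    echo -n '<access_key>' | base64
    echo -n '<secret_key>' | base64

    # Alternatively, let kubectl do the encoding and create the Secret in one step.
    kubectl create secret generic gcs-secret -n default --from-file=credentials=credentials.json
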
  2. Start data recovery.

    kubectl create -f restore_file_name.yaml
    
  3. Check the status of the NebulaRestore object.

    kubectl get nr <NebulaRestore_name> 
    

    Output:

    NAME       STATUS     STARTED   COMPLETED   AGE
    restore1   Complete   67m       59m         67m
    
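    If STATUS does not reach Complete, you can inspect the object for progress and events; this is generic kubectl usage rather than anything specific to NebulaGraph Operator:

    # Watch the restore status until it changes.
    kubectl get nr <NebulaRestore_name> -w

    # Show the full status and recent events of the restore object.
    kubectl describe nr <NebulaRestore_name>
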
  4. Check the status of the new cluster.

    kubectl get nc
    

    Output:

    NAME     GRAPHD-DESIRED   GRAPHD-READY   METAD-DESIRED   METAD-READY   STORAGED-DESIRED   STORAGED-READY   AGE
    nebula   1                1              1               1             3                  3                2d3h
    ngxvsm   1                1              1               1             3                  3                92m  # New cluster
    

    After the restore job is completed, a new NebulaGraph cluster is created with a system-generated name. The old cluster is not deleted; you can decide whether to keep it or delete it.
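
    Once you have verified the data in the new cluster, you can delete the old cluster if it is no longer needed. For example, assuming the old cluster is named nebula and is in the default namespace:

    # Delete the old NebulaCluster object; this removes the workloads it owns, so make sure you no longer need it.
    kubectl delete nc nebula -n default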

