Self-healing
NebulaGraph Operator calls the interface provided by NebulaGraph clusters to dynamically sense cluster service status. Once an exception is detected (for example, a component in a NebulaGraph cluster stops running), NebulaGraph Operator automatically performs fault tolerance. This topic shows how NebulaGraph Operator performs self-healing by simulating a cluster failure: deleting one Storage service Pod in a NebulaGraph cluster.
Prerequisites
Steps
- Create a NebulaGraph cluster. For more information, see Deploy NebulaGraph clusters with Kubectl or Deploy NebulaGraph clusters with Helm.
- After all Pods are in the Running status, delete the Pod named <cluster-name>-storaged-2:

      kubectl delete pod <cluster-name>-storaged-2 --now

  <cluster-name> is the name of your NebulaGraph cluster.
- NebulaGraph Operator automatically recreates the Pod named <cluster-name>-storaged-2 to perform self-healing. Run the kubectl get pods command to check the status of the Pod <cluster-name>-storaged-2.

  Shortly after the deletion, the Pod is being recreated:

      ...
      nebula-cluster-storaged-1   1/1   Running             0   5d23h
      nebula-cluster-storaged-2   0/1   ContainerCreating   0   1s
      ...

  A few minutes later, the Pod is running again:

      ...
      nebula-cluster-storaged-1   1/1   Running   0   5d23h
      nebula-cluster-storaged-2   1/1   Running   0   4m2s
      ...

  When the status of <cluster-name>-storaged-2 changes from ContainerCreating to Running, self-healing has succeeded.
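The status check in the last step can also be scripted instead of running kubectl get pods repeatedly by hand. The following is a minimal sketch, not part of NebulaGraph Operator. The function names pod_is_running and wait_for_pod_running, the 300-second timeout, and the 5-second polling interval are assumptions chosen for illustration. The sketch reads the default kubectl get pods columns (NAME, READY, STATUS) and, slightly more strictly than the manual check, also requires every container in the Pod to be ready.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names) for watching a Pod recover.

# Read `kubectl get pods` output on stdin and succeed only if the given Pod
# is listed with STATUS "Running" and a READY column of the form n/n.
pod_is_running() {
  awk -v pod="$1" '
    $1 == pod {
      split($2, ready, "/")
      found = ($3 == "Running" && ready[1] == ready[2])
      exit
    }
    END { exit(found ? 0 : 1) }
  '
}

# Poll until the Pod is running again, or give up after the timeout.
# Assumes kubectl already points at the namespace of the NebulaGraph cluster.
# Arguments: pod name, timeout in seconds (default 300), interval (default 5).
wait_for_pod_running() {
  local pod="$1" timeout="${2:-300}" interval="${3:-5}" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if kubectl get pods --no-headers 2>/dev/null | pod_is_running "$pod"; then
      echo "${pod} is Running"
      return 0
    fi
    echo "${pod} is not Running yet (${elapsed}s elapsed)"
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "${pod} did not reach Running within ${timeout}s" >&2
  return 1
}
```

After sourcing these functions, a typical run would be kubectl delete pod <cluster-name>-storaged-2 --now followed by wait_for_pod_running <cluster-name>-storaged-2. Polling is used rather than a single blocking wait because the Pod may be briefly absent from the listing between deletion and recreation.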