Nebula Operator calls the interfaces provided by Nebula Graph clusters to dynamically sense the status of cluster services. Once an exception is detected (for example, a component in a Nebula Graph cluster stops running), Nebula Operator automatically performs fault tolerance. This topic shows how Nebula Operator performs self-healing by simulating a cluster failure: deleting one Storage service Pod in a Nebula Graph cluster.
Delete the Pod named <cluster-name>-storaged-2 after all Pods are in the Running status:
kubectl delete pod <cluster-name>-storaged-2 --now
<cluster-name> is the name of your Nebula Graph cluster.
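The precondition in this step (all Pods Running before the deletion) can also be checked in a script. The following is a minimal sketch, not part of Nebula Operator: the helper names all_pods_running and simulate_storage_failure are hypothetical, and it assumes that kubectl's current context and namespace point at the cluster and that the namespace contains only the cluster's Pods.

```shell
#!/usr/bin/env bash
# Sketch: delete a Storage Pod only after every Pod reports Running.
# The helper names and the cluster name in the usage line are illustrative
# assumptions, not part of Nebula Operator.

# Succeeds only if `kubectl get pods` lists at least one Pod and every
# listed Pod shows "Running" in the STATUS column (third column).
all_pods_running() {
  kubectl get pods --no-headers |
    awk '$3 != "Running" { bad = 1 } END { if (NR == 0 || bad) exit 1 }'
}

# Deletes <cluster-name>-storaged-2 once all Pods are Running.
simulate_storage_failure() {
  local cluster_name="$1"
  if ! all_pods_running; then
    echo "not all Pods are Running; skipping deletion" >&2
    return 1
  fi
  kubectl delete pod "${cluster_name}-storaged-2" --now
}

# Usage (hypothetical cluster name):
#   simulate_storage_failure nebula-cluster
```

The check reads the STATUS column rather than the READY column, which mirrors the condition stated in this step. A stricter script might also require READY to show 1/1.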
Nebula Operator automatically recreates the Pod named <cluster-name>-storaged-2 to perform self-healing.
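Run the kubectl get pods command to check the status of the Pod <cluster-name>-storaged-2.

Immediately after the deletion, the recreated Pod is in the ContainerCreating status:

...
nebula-cluster-storaged-1 1/1 Running 0 5d23h
nebula-cluster-storaged-2 0/1 ContainerCreating 0 1s
...

After a few minutes, the Pod is in the Running status:

...
nebula-cluster-storaged-1 1/1 Running 0 5d23h
nebula-cluster-storaged-2 1/1 Running 0 4m2s
...

When the status of <cluster-name>-storaged-2 changes from ContainerCreating to Running, self-healing has completed successfully.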
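Instead of rerunning kubectl get pods by hand, the check can be polled. The sketch below is one possible approach under stated assumptions: the function names pod_phase and wait_for_self_healing, the default timeout of 300 seconds, and the 5-second interval are all hypothetical choices. It reads the Pod's phase field, which is Pending while kubectl's STATUS column shows ContainerCreating, and becomes Running once the containers have started.

```shell
#!/usr/bin/env bash
# Sketch: poll until the recreated Storage Pod reaches the Running phase.
# Function names, timeout, and interval are illustrative assumptions.

# Prints the Pod's phase, or nothing if the Pod does not exist yet
# (for example, in the brief gap between deletion and recreation).
pod_phase() {
  kubectl get pod "$1" -o jsonpath='{.status.phase}' 2>/dev/null
}

# Usage: wait_for_self_healing <pod-name> [timeout-seconds] [interval-seconds]
wait_for_self_healing() {
  local pod="$1" timeout="${2:-300}" interval="${3:-5}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if [ "$(pod_phase "$pod")" = "Running" ]; then
      echo "$pod is Running"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out waiting for $pod" >&2
  return 1
}

# Usage (hypothetical cluster name):
#   wait_for_self_healing nebula-cluster-storaged-2
```

A polling loop is used here because the Pod may not exist for a moment right after the deletion, and a lookup during that gap returns an error rather than a phase. Note that a Running phase does not by itself guarantee that the READY column shows 1/1.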