Self-healing¶
Nebula Operator calls the interface provided by Nebula Graph clusters to monitor the status of cluster services. When it detects an exception (for example, a component in a Nebula Graph cluster stops running), Nebula Operator automatically recovers from the fault. This topic shows how Nebula Operator performs self-healing by simulating a cluster failure: deleting one Storage service Pod in a Nebula Graph cluster.
Prerequisites¶
Steps¶
-
Create a Nebula Graph cluster. For more information, see Deploy Nebula Graph clusters with Kubectl or Deploy Nebula Graph clusters with Helm.
-
Delete the Pod named <cluster-name>-storaged-2 after all Pods are in the Running status.
kubectl delete pod <cluster-name>-storaged-2 --now
<cluster-name> is the name of your Nebula Graph cluster.
-
Nebula Operator automatically recreates the Pod named <cluster-name>-storaged-2 to perform self-healing.
Run the kubectl get pods command to check the status of the Pod <cluster-name>-storaged-2.
...
nebula-cluster-storaged-1   1/1   Running             0   5d23h
nebula-cluster-storaged-2   0/1   ContainerCreating   0   1s
...

...
nebula-cluster-storaged-1   1/1   Running   0   5d23h
nebula-cluster-storaged-2   1/1   Running   0   4m2s
...
When the status of <cluster-name>-storaged-2 changes from ContainerCreating to Running, self-healing has succeeded.
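
The steps above can also be scripted instead of checked by hand. The following is a minimal sketch, not part of Nebula Operator: the function names (pod_status, all_running, wait_for_running, simulate_failure) and the default retry values are illustrative assumptions. It also assumes that kubectl already points at the namespace that holds the cluster and that this namespace contains only the cluster's Pods. Only the kubectl delete pod and kubectl get pods commands come from the procedure itself.

```shell
#!/bin/sh
# Sketch only: helper functions for repeating the self-healing check.
# All function names and default values here are hypothetical.

# Print the STATUS column for one Pod, reading `kubectl get pods` rows from stdin.
pod_status() {
  awk -v name="$1" '$1 == name { print $3 }'
}

# Succeed only if stdin has at least one row and every row has STATUS Running.
all_running() {
  awk 'NF { n++ } NF && $3 != "Running" { bad = 1 } END { exit (bad || n == 0) }'
}

# Poll until the Pod reports Running; fail after the given number of attempts.
# Usage: wait_for_running <pod> [attempts] [delay_seconds]
wait_for_running() (
  pod="$1"
  attempts="${2:-60}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(kubectl get pods --no-headers 2>/dev/null | pod_status "$pod")"
    if [ "$status" = "Running" ]; then
      exit 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1
)

# Check that all Pods are Running, delete the Storage Pod, then wait for
# Nebula Operator to recreate it.
# Usage: simulate_failure <cluster-name>
simulate_failure() (
  pod="$1-storaged-2"
  kubectl get pods --no-headers | all_running || exit 1
  kubectl delete pod "$pod" --now || exit 1
  wait_for_running "$pod"
)
```

After sourcing these functions, running simulate_failure nebula-cluster would delete nebula-cluster-storaged-2 and return success once the recreated Pod reports Running. The check uses only the STATUS column, as the procedure does; a stricter variant could also compare the READY column.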