Query NebulaGraph metrics
NebulaGraph supports querying the monitoring metrics through HTTP ports.
Metrics structure
Each metric of NebulaGraph consists of three fields: name, type, and time range. The fields are separated by periods, for example, num_queries.sum.600. Different NebulaGraph services (Graph, Storage, or Meta) support different metrics. The detailed description is as follows.
| Field | Example | Description | 
|---|---|---|
| Metric name | num_queries | Indicates the function of the metric. | 
| Metric type | sum | Indicates how the metrics are collected. Supported types are SUM, AVG, RATE, and the P-th sample quantiles such as P75, P95, P99, and P99.9. | 
| Time range | 600 | The time range in seconds for the metric collection. Supported values are 5, 60, 600, and 3600, representing the last 5 seconds, 1 minute, 10 minutes, and 1 hour. | 
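As an illustration of the naming scheme above, a metric identifier can be split back into its three fields. The following is a minimal Python sketch rather than anything shipped with NebulaGraph: the helper `split_metric` and the `MetricId` type are hypothetical names, and the sketch assumes that metric names contain no periods, so that a type such as p99.9 stays in one piece.

```python
from typing import NamedTuple

# Time ranges listed in the table above, in seconds.
SUPPORTED_RANGES = {5, 60, 600, 3600}


class MetricId(NamedTuple):
    name: str        # e.g. "num_queries"
    type: str        # e.g. "sum", "avg", "rate", "p99"
    time_range: int  # seconds


def split_metric(metric: str) -> MetricId:
    """Split '<name>.<type>.<time_range>' into its three fields.

    Assumption: the metric name contains no periods, so the first period
    ends the name and the last period starts the time range.
    """
    name, _, rest = metric.partition(".")
    metric_type, _, time_range = rest.rpartition(".")
    if not name or not metric_type or not time_range.isdecimal():
        raise ValueError(f"not a '<name>.<type>.<time_range>' metric: {metric!r}")
    seconds = int(time_range)
    if seconds not in SUPPORTED_RANGES:
        raise ValueError(f"unsupported time range: {seconds}")
    return MetricId(name, metric_type, seconds)
```

For example, `split_metric("num_queries.sum.600")` yields the name `num_queries`, the type `sum`, and a time range of 600 seconds.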
Space-level metrics
The Graph service supports a set of space-level metrics that record the information of different graph spaces separately.
To enable space-level metrics, set the value of enable_space_level_metrics to true in the Graph service configuration file before starting NebulaGraph. For details about how to modify the configuration, see Configuration Management.
Note
Space-level metrics can be queried only by querying all metrics. For example, run curl -G "http://192.168.8.40:19559/stats" to show all metrics. The returned result contains the graph space name in the form of '{space=space_name}', such as num_active_queries{space=basketballplayer}.sum.5=0.
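Because space-level metrics only appear in the full listing, a client has to pick them out by the {space=...} label. The sketch below shows one way to do that in Python. The function name `parse_space_metric` and the regular expression are illustrative assumptions derived only from the sample line above, not from a documented grammar.

```python
import re
from typing import Optional, Tuple

# Matches lines such as: num_active_queries{space=basketballplayer}.sum.5=0
# Assumption: the label always has the form {space=<name>} as in the sample.
_SPACE_METRIC = re.compile(
    r"^(?P<name>\w+)\{space=(?P<space>[^}]+)\}\.(?P<rest>[^=]+)=(?P<value>.+)$"
)


def parse_space_metric(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (space, metric, value) for a space-level line, else None."""
    match = _SPACE_METRIC.match(line.strip())
    if match is None:
        return None  # service-level metric or unrelated line
    metric = f"{match['name']}.{match['rest']}"
    return match["space"], metric, match["value"]
```

Lines without a space label, such as `num_queries.sum.600=400`, return `None`, so the same function can be used to filter a full /stats listing.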
Query metrics over HTTP
Syntax
curl -G "http://<ip>:<port>/stats?stats=<metric_name_list>[&format=json]"
| Parameter | Description | 
|---|---|
| ip | The IP address of the server. You can find it in the configuration file in the installation directory. | 
| port | The HTTP port of the server. You can find it in the configuration file in the installation directory. The default ports are 19559 (Meta), 19669 (Graph), and 19779 (Storage). | 
| metric_name_list | The metric names. Multiple metrics are separated by commas (,). | 
| &format=json | Optional. Returns the result in the JSON format. | 
Note
If NebulaGraph is deployed with Docker Compose, run docker-compose ps to check which host ports are mapped to the service ports inside the containers, and then query through those host ports.
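The syntax above can also be assembled programmatically. The following Python sketch builds the request URL from the same parameters. The function name `build_stats_url` and its arguments are hypothetical, and the sketch only constructs the URL without sending a request.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def build_stats_url(ip: str, port: int,
                    metrics: Optional[Iterable[str]] = None,
                    as_json: bool = False) -> str:
    """Build a /stats URL following the syntax above.

    With no metrics, the 'stats' parameter is omitted, which requests
    all metrics of the service.
    """
    params = {}
    if metrics:
        params["stats"] = ",".join(metrics)
    if as_json:
        params["format"] = "json"
    url = f"http://{ip}:{port}/stats"
    if params:
        # Keep the commas between metric names unescaped, as in the syntax.
        url += "?" + urlencode(params, safe=",")
    return url
```

The resulting string can be passed to curl -G or to any HTTP client.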
Examples
- Query a single metric

  Query the number of queries in the Graph service in the last 10 minutes.

      $ curl -G "http://192.168.8.40:19669/stats?stats=num_queries.sum.600"
      num_queries.sum.600=400

- Query multiple metrics

  Query the following metrics together:

  - The average heartbeat latency in the last 1 minute.
  - The P99 heartbeat latency in the last 10 minutes, that is, the latency that only the slowest 1% of heartbeats exceed.

      $ curl -G "http://192.168.8.40:19559/stats?stats=heartbeat_latency_us.avg.60,heartbeat_latency_us.p99.600"
      heartbeat_latency_us.avg.60=281
      heartbeat_latency_us.p99.600=985

- Return a JSON result

  Query the number of new vertices in the Storage service in the last 10 minutes and return the result in the JSON format.

      $ curl -G "http://192.168.8.40:19779/stats?stats=num_add_vertices.sum.600&format=json"
      [{"value":1,"name":"num_add_vertices.sum.600"}]

- Query all metrics in a service

  If no metric is specified in the query, NebulaGraph returns all metrics in the service.

      $ curl -G "http://192.168.8.40:19559/stats"
      heartbeat_latency_us.avg.5=304
      heartbeat_latency_us.avg.60=308
      heartbeat_latency_us.avg.600=299
      heartbeat_latency_us.avg.3600=285
      heartbeat_latency_us.p75.5=652
      heartbeat_latency_us.p75.60=669
      heartbeat_latency_us.p75.600=651
      heartbeat_latency_us.p75.3600=642
      heartbeat_latency_us.p95.5=930
      heartbeat_latency_us.p95.60=963
      heartbeat_latency_us.p95.600=933
      heartbeat_latency_us.p95.3600=929
      heartbeat_latency_us.p99.5=986
      heartbeat_latency_us.p99.60=1409
      heartbeat_latency_us.p99.600=989
      heartbeat_latency_us.p99.3600=986
      num_heartbeats.rate.5=0
      num_heartbeats.rate.60=0
      num_heartbeats.rate.600=0
      num_heartbeats.rate.3600=0
      num_heartbeats.sum.5=2
      num_heartbeats.sum.60=40
      num_heartbeats.sum.600=394
      num_heartbeats.sum.3600=2364
      ...
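The two response formats shown in these examples can be turned into dictionaries with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the plain-text parser assumes whitespace-separated name=value entries as in the sample output, and values are kept as strings in the plain-text case because their numeric type is not specified here.

```python
import json
from typing import Dict


def parse_text_stats(body: str) -> Dict[str, str]:
    """Parse the default response made of 'name=value' entries."""
    stats = {}
    for entry in body.split():
        # Split on the last '=' so that space-level names such as
        # num_active_queries{space=basketballplayer}.sum.5 stay intact.
        name, sep, value = entry.rpartition("=")
        if sep and name:  # skip tokens that are not name=value pairs
            stats[name] = value
    return stats


def parse_json_stats(body: str) -> Dict[str, object]:
    """Parse the format=json response: a list of {"name", "value"} objects."""
    return {item["name"]: item["value"] for item in json.loads(body)}
```

Either function accepts the response body as returned by curl or by an HTTP client of your choice.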
Metric description
Graph
| Parameter | Description | 
|---|---|
| num_active_queries | The number of queries currently being executed. | 
| num_active_sessions | The number of currently active sessions. | 
| num_aggregate_executors | The number of executions for the Aggregation operator. | 
| num_auth_failed_sessions_bad_username_password | The number of sessions where authentication failed due to an incorrect username or password. | 
| num_auth_failed_sessions_out_of_max_allowed | The number of sessions that failed login authentication because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONS was exceeded. | 
| num_auth_failed_sessions | The number of sessions in which login authentication failed. | 
| num_indexscan_executors | The number of executions for index scan operators. | 
| num_killed_queries | The number of killed queries. | 
| num_opened_sessions | The number of sessions connected to the server. | 
| num_queries | The number of queries. | 
| num_query_errors_leader_changes | The number of query errors caused by Raft leader changes. | 
| num_query_errors | The number of query errors. | 
| num_reclaimed_expired_sessions | The number of expired sessions actively reclaimed by the server. | 
| num_rpc_sent_to_metad_failed | The number of failed RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_metad | The number of RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_storaged_failed | The number of failed RPC requests that the Graphd service sent to the Storaged service. | 
| num_rpc_sent_to_storaged | The number of RPC requests that the Graphd service sent to the Storaged service. | 
| num_sentences | The number of statements received by the Graphd service. | 
| num_slow_queries | The number of slow queries. | 
| num_sort_executors | The number of executions for the Sort operator. | 
| optimizer_latency_us | The latency of executing optimizer statements. | 
| query_latency_us | The average latency of queries. | 
| slow_query_latency_us | The average latency of slow queries. | 
| num_queries_hit_memory_watermark | The number of queries that reached the memory watermark. | 
Meta
| Parameter | Description | 
|---|---|
| commit_log_latency_us | The latency of committing logs in Raft. | 
| commit_snapshot_latency_us | The latency of committing snapshots in Raft. | 
| heartbeat_latency_us | The latency of heartbeats. | 
| num_heartbeats | The number of heartbeats. | 
| num_raft_votes | The number of votes in Raft. | 
| transfer_leader_latency_us | The latency of transferring the Raft leader. | 
| num_agent_heartbeats | The number of heartbeats for the AgentHBProcessor. | 
| agent_heartbeat_latency_us | The average latency of the AgentHBProcessor. | 
| replicate_log_latency_us | The latency of replicating the log record to most nodes by Raft. | 
| num_send_snapshot | The number of times that Raft sends snapshots to other nodes. | 
| append_log_latency_us | The latency of replicating the log record to a single node by Raft. | 
| append_wal_latency_us | The Raft write latency for a single WAL. | 
| num_grant_votes | The number of times that Raft votes for other nodes. | 
| num_start_elect | The number of times that Raft starts an election. | 
Storage
| Parameter | Description | 
|---|---|
| add_edges_atomic_latency_us | The average latency of adding a single edge. | 
| add_edges_latency_us | The average latency of adding edges. | 
| add_vertices_latency_us | The average latency of adding vertices. | 
| commit_log_latency_us | The latency of committing logs in Raft. | 
| commit_snapshot_latency_us | The latency of committing snapshots in Raft. | 
| delete_edges_latency_us | The average latency of deleting edges. | 
| delete_vertices_latency_us | The average latency of deleting vertices. | 
| get_neighbors_latency_us | The average latency of querying neighbor vertices. | 
| num_get_prop | The number of executions for the GetPropProcessor. | 
| num_get_neighbors_errors | The number of execution errors for the GetNeighborsProcessor. | 
| get_prop_latency_us | The average latency of executions for the GetPropProcessor. | 
| num_edges_deleted | The number of deleted edges. | 
| num_edges_inserted | The number of inserted edges. | 
| num_raft_votes | The number of votes in Raft. | 
| num_rpc_sent_to_metad_failed | The number of failed RPC requests that the Storaged service sent to the Metad service. | 
| num_rpc_sent_to_metad | The number of RPC requests that the Storaged service sent to the Metad service. | 
| num_tags_deleted | The number of deleted tags. | 
| num_vertices_deleted | The number of deleted vertices. | 
| num_vertices_inserted | The number of inserted vertices. | 
| transfer_leader_latency_us | The latency of transferring the Raft leader. | 
| lookup_latency_us | The average latency of executions for the LookupProcessor. | 
| num_lookup_errors | The number of execution errors for the LookupProcessor. | 
| num_scan_vertex | The number of executions for the ScanVertexProcessor. | 
| num_scan_vertex_errors | The number of execution errors for the ScanVertexProcessor. | 
| update_edge_latency_us | The average latency of executions for the UpdateEdgeProcessor. | 
| num_update_vertex | The number of executions for the UpdateVertexProcessor. | 
| num_update_vertex_errors | The number of execution errors for the UpdateVertexProcessor. | 
| kv_get_latency_us | The average latency of executions for the GetProcessor. | 
| kv_put_latency_us | The average latency of executions for the PutProcessor. | 
| kv_remove_latency_us | The average latency of executions for the RemoveProcessor. | 
| num_kv_get_errors | The number of execution errors for the GetProcessor. | 
| num_kv_get | The number of executions for the GetProcessor. | 
| num_kv_put_errors | The number of execution errors for the PutProcessor. | 
| num_kv_put | The number of executions for the PutProcessor. | 
| num_kv_remove_errors | The number of execution errors for the RemoveProcessor. | 
| num_kv_remove | The number of executions for the RemoveProcessor. | 
| forward_tranx_latency_us | The average latency of transmission. | 
| scan_edge_latency_us | The average latency of executions for the ScanEdgeProcessor. | 
| num_scan_edge_errors | The number of execution errors for the ScanEdgeProcessor. | 
| num_scan_edge | The number of executions for the ScanEdgeProcessor. | 
| scan_vertex_latency_us | The latency of executions for the ScanVertexProcessor. | 
| num_add_edges | The number of times that edges are added. | 
| num_add_edges_errors | The number of errors when adding edges. | 
| num_add_vertices | The number of times that vertices are added. | 
| num_start_elect | The number of times that Raft starts an election. | 
| num_add_vertices_errors | The number of errors when adding vertices. | 
| num_delete_vertices_errors | The number of errors when deleting vertices. | 
| append_log_latency_us | The latency of replicating the log record to a single node by Raft. | 
| num_grant_votes | The number of times that Raft votes for other nodes. | 
| replicate_log_latency_us | The latency of replicating the log record to most nodes by Raft. | 
| num_delete_tags | The number of times that tags are deleted. | 
| num_delete_tags_errors | The number of errors when deleting tags. | 
| num_delete_edges | The number of edge deletions. | 
| num_delete_edges_errors | The number of errors when deleting edges. | 
| num_send_snapshot | The number of times that snapshots are sent. | 
| update_vertex_latency_us | The latency of executions for the UpdateVertexProcessor. | 
| append_wal_latency_us | The Raft write latency for a single WAL. | 
| num_update_edge | The number of executions for the UpdateEdgeProcessor. | 
| delete_tags_latency_us | The average latency of deleting tags. | 
| num_update_edge_errors | The number of execution errors for the UpdateEdgeProcessor. | 
| num_get_neighbors | The number of executions for the GetNeighborsProcessor. | 
| num_get_prop_errors | The number of execution errors for the GetPropProcessor. | 
| num_delete_vertices | The number of times that vertices are deleted. | 
| num_lookup | The number of executions for the LookupProcessor. | 
| num_sync_data | The number of times that the Storage service synchronizes data from the drainer. | 
| num_sync_data_errors | The number of errors that occur when the Storage service synchronizes data from the drainer. | 
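Many counters in the table above come in pairs, such as num_kv_get and num_kv_get_errors. As a rough illustration of how such a pair might be combined, the Python sketch below computes an error ratio over one time range. The function name `error_ratio` is hypothetical, the input is assumed to be a dictionary of already-fetched numeric values, and the sketch assumes that the base counter includes failed executions, which this page does not state.

```python
from typing import Dict, Optional


def error_ratio(stats: Dict[str, float], counter: str,
                time_range: int = 600) -> Optional[float]:
    """Return errors / executions for a counter pair over one time range.

    Assumption: '<counter>.sum.<range>' counts every execution, including
    failed ones, so the ratio lies between 0 and 1.
    """
    total = stats.get(f"{counter}.sum.{time_range}", 0)
    errors = stats.get(f"{counter}_errors.sum.{time_range}", 0)
    if total == 0:
        return None  # no executions in the window, so the ratio is undefined
    return errors / total
```

With hypothetical values of 200 for num_kv_get.sum.600 and 50 for num_kv_get_errors.sum.600, `error_ratio(stats, "num_kv_get")` returns 0.25.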
Graph space
| Parameter | Description | 
|---|---|
| num_active_queries | The number of queries currently being executed. | 
| num_queries | The number of queries. | 
| num_sentences | The number of statements received by the Graphd service. | 
| optimizer_latency_us | The latency of executing optimizer statements. | 
| query_latency_us | The average latency of queries. | 
| num_slow_queries | The number of slow queries. | 
| num_query_errors | The number of query errors. | 
| num_query_errors_leader_changes | The number of query errors caused by Raft leader changes. | 
| num_killed_queries | The number of killed queries. | 
| num_aggregate_executors | The number of executions for the Aggregation operator. | 
| num_sort_executors | The number of executions for the Sort operator. | 
| num_indexscan_executors | The number of executions for index scan operators. | 
| num_oom_queries | The number of queries that caused memory to run out. | 
| num_auth_failed_sessions_bad_username_password | The number of sessions where authentication failed due to an incorrect username or password. | 
| num_auth_failed_sessions | The number of sessions in which login authentication failed. | 
| num_opened_sessions | The number of sessions connected to the server. | 
| num_queries_hit_memory_watermark | The number of queries that reached the memory watermark. | 
| num_reclaimed_expired_sessions | The number of expired sessions actively reclaimed by the server. | 
| num_rpc_sent_to_metad_failed | The number of failed RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_metad | The number of RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_storaged_failed | The number of failed RPC requests that the Graphd service sent to the Storaged service. | 
| num_rpc_sent_to_storaged | The number of RPC requests that the Graphd service sent to the Storaged service. | 
| slow_query_latency_us | The average latency of slow queries. |