Metrics¶
This topic will describe the monitoring metrics in NebulaGraph Dashboard.
Machine¶
Note
- All the machine metrics listed below are for the Linux operating system.
- The default unit in Disk and Network is byte. The unit will change with the data magnitude as the page displays. For example, when the flow is less than 1 KB/s, the unit will be Bytes/s.
- For versions of Dashboard Community Edition greater than v1.0.2, the memory occupied by Buff and Cache will not be counted in the memory usage.
CPU¶
| Parameter | Description | 
|---|---|
| cpu_utilization | The percentage of used CPU. | 
| cpu_idle | The percentage of idled CPU. | 
| cpu_wait | The percentage of CPU waiting for IO operations. | 
| cpu_user | The percentage of CPU used by users. | 
| cpu_system | The percentage of CPU used by the system. | 
Memory¶
| Parameter | Description | 
|---|---|
| memory_utilization | The percentage of used memory. | 
| memory_used | The memory space used (not including caches). | 
| memory_free | The memory space available. | 
Load¶
| Parameter | Description | 
|---|---|
| load_1m | The average load of the system in the last 1 minute. | 
| load_5m | The average load of the system in the last 5 minutes. | 
| load_15m | The average load of the system in the last 15 minutes. | 
Disk¶
| Parameter | Description | 
|---|---|
| disk_used_percentage | The disk utilization percentage. | 
| disk_used | The disk space used. | 
| disk_free | The disk space available. | 
| disk_readbytes | The number of bytes that the system reads in the disk per second. | 
| disk_writebytes | The number of bytes that the system writes in the disk per second. | 
| disk_readiops | The number of read queries that the disk receives per second. | 
| disk_writeiops | The number of write queries that the disk receives per second. | 
| inode_utilization | The percentage of used inode. | 
Network¶
| Parameter | Description | 
|---|---|
| network_in_rate | The number of bytes that the network card receives per second. | 
| network_out_rate | The number of bytes that the network card sends out per second. | 
| network_in_errs | The number of wrong bytes that the network card receives per second. | 
| network_out_errs | The number of wrong bytes that the network card sends out per second. | 
| network_in_packets | The number of data packages that the network card receives per second. | 
| network_out_packets | The number of data packages that the network card sends out per second. | 
Service¶
Period¶
The period is the time range of counting metrics. It currently supports 5 seconds, 60 seconds, 600 seconds, and 3600 seconds, which respectively represent the last 5 seconds, the last 1 minute, the last 10 minutes, and the last 1 hour.
Metric methods¶
| Parameter | Description | 
|---|---|
| rate | The average rate of operations per second in a period. | 
| sum | The sum of operations in the period. | 
| avg | The average latency in the cycle. | 
| P75 | The 75th percentile latency. | 
| P95 | The 95th percentile latency. | 
| P99 | The 99th percentile latency. | 
| P999 | The 99.9th percentile latency. | 
Note
Dashboard collects the following metrics from the NebulaGraph core, but only shows the metrics that are important to it.
Graph¶
| Parameter | Description | 
|---|---|
| num_active_queries | The number of changes in the number of active queries. Formula: The number of started queries minus the number of finished queries within a specified time. | 
| num_active_sessions | The number of changes in the number of active sessions. Formula: The number of logged in sessions minus the number of logged out sessions within a specified time. For example, when querying num_active_sessions.sum.5, if there were 10 sessions logged in and 30 sessions logged out in the last 5 seconds, the value of this metric is-20(10-30). | 
| num_aggregate_executors | The number of executions for the Aggregation operator. | 
| num_auth_failed_sessions_bad_username_password | The number of sessions where authentication failed due to incorrect username and password. | 
| num_auth_failed_sessions_out_of_max_allowed | The number of sessions that failed to authenticate logins because the value of the parameter FLAG_OUT_OF_MAX_ALLOWED_CONNECTIONSwas exceeded. | 
| num_auth_failed_sessions | The number of sessions in which login authentication failed. | 
| num_indexscan_executors | The number of executions for index scan operators. | 
| num_killed_queries | The number of killed queries. | 
| num_opened_sessions | The number of sessions connected to the server. | 
| num_queries | The number of queries. | 
| num_query_errors_leader_changes | The number of the raft leader changes due to query errors. | 
| num_query_errors | The number of query errors. | 
| num_reclaimed_expired_sessions | The number of expired sessions actively reclaimed by the server. | 
| num_rpc_sent_to_metad_failed | The number of failed RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_metad | The number of RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_storaged_failed | The number of failed RPC requests that the Graphd service sent to the Storaged service. | 
| num_rpc_sent_to_storaged | The number of RPC requests that the Graphd service sent to the Storaged service. | 
| num_sentences | The number of statements received by the Graphd service. | 
| num_slow_queries | The number of slow queries. | 
| num_sort_executors | The number of executions for the Sort operator. | 
| optimizer_latency_us | The latency of executing optimizer statements. | 
| query_latency_us | The latency of queries. | 
| slow_query_latency_us | The latency of slow queries. | 
| num_queries_hit_memory_watermark | The number of queries reached the memory watermark. | 
Meta¶
| Parameter | Description | 
|---|---|
| commit_log_latency_us | The latency of committing logs in Raft. | 
| commit_snapshot_latency_us | The latency of committing snapshots in Raft. | 
| heartbeat_latency_us | The latency of heartbeats. | 
| num_heartbeats | The number of heartbeats. | 
| num_raft_votes | The number of votes in Raft. | 
| transfer_leader_latency_us | The latency of transferring the raft leader. | 
| num_agent_heartbeats | The number of heartbeats for the AgentHBProcessor. | 
| agent_heartbeat_latency_us | The latency of the AgentHBProcessor. | 
| replicate_log_latency_us | The latency of replicating the log record to most nodes by Raft. | 
| num_send_snapshot | The number of times that Raft sends snapshots to other nodes. | 
| append_log_latency_us | The latency of replicating the log record to a single node by Raft. | 
| append_wal_latency_us | The Raft write latency for a single WAL. | 
| num_grant_votes | The number of times that Raft votes for other nodes. | 
| num_start_elect | The number of times that Raft starts an election. | 
Storage¶
| Parameter | Description | 
|---|---|
| add_edges_atomic_latency_us | The latency of adding atomic edge after TOSS is enabled. | 
| add_edges_latency_us | The latency of adding edges. | 
| add_vertices_latency_us | The latency of adding vertices. | 
| commit_log_latency_us | The latency of committing logs in Raft. | 
| commit_snapshot_latency_us | The latency of committing snapshots in Raft. | 
| delete_edges_latency_us | The latency of deleting edges. | 
| delete_vertices_latency_us | The latency of deleting vertices. | 
| get_neighbors_latency_us | The latency of querying neighbor vertices. | 
| num_get_prop | The number of executions for the GetPropProcessor. | 
| num_get_neighbors_errors | The number of execution errors for the GetNeighborsProcessor. | 
| get_prop_latency_us | The latency of executions for the GetPropProcessor. | 
| num_edges_deleted | The number of deleted edges. | 
| num_edges_inserted | The number of inserted edges. | 
| num_raft_votes | The number of votes in Raft. | 
| num_rpc_sent_to_metad_failed | The number of failed RPC requests that the Storage service sent to the Meta service. | 
| num_rpc_sent_to_metad | The number of RPC requests that the Storaged service sent to the Metad service. | 
| num_tags_deleted | The number of deleted tags. | 
| num_vertices_deleted | The number of deleted vertices. | 
| num_vertices_inserted | The number of inserted vertices. | 
| transfer_leader_latency_us | The latency of transferring the raft leader. | 
| lookup_latency_us | The latency of executions for the LookupProcessor. | 
| num_lookup_errors | The number of execution errors for the LookupProcessor. | 
| num_scan_vertex | The number of executions for the ScanVertexProcessor. | 
| num_scan_vertex_errors | The number of execution errors for the ScanVertexProcessor. | 
| update_edge_latency_us | The latency of executions for the UpdateEdgeProcessor. | 
| num_update_vertex | The number of executions for the UpdateVertexProcessor. | 
| num_update_vertex_errors | The number of execution errors for the UpdateVertexProcessor. | 
| kv_get_latency_us | The latency of executions for the Getprocessor. | 
| kv_put_latency_us | The latency of executions for the PutProcessor. | 
| kv_remove_latency_us | The latency of executions for the RemoveProcessor. | 
| num_kv_get_errors | The number of execution errors for the GetProcessor. | 
| num_kv_get | The number of executions for the GetProcessor. | 
| num_kv_put_errors | The number of execution errors for the PutProcessor. | 
| num_kv_put | The number of executions for the PutProcessor. | 
| num_kv_remove_errors | The number of execution errors for the RemoveProcessor. | 
| num_kv_remove | The number of executions for the RemoveProcessor. | 
| forward_tranx_latency_us | The latency of transmission. | 
| scan_edge_latency_us | The latency of executions for the ScanEdgeProcessor. | 
| num_scan_edge_errors | The number of execution errors for the ScanEdgeProcessor. | 
| num_scan_edge | The number of executions for the ScanEdgeProcessor. | 
| scan_vertex_latency_us | The latency of executions for the ScanVertexProcessor. | 
| num_add_edges | The number of times that edges are added. | 
| num_add_edges_errors | The number of errors when adding edges. | 
| num_add_vertices | The number of times that vertices are added. | 
| num_start_elect | The number of times that Raft starts an election. | 
| num_add_vertices_errors | The number of errors when adding vertices. | 
| num_delete_vertices_errors | The number of errors when deleting vertices. | 
| append_log_latency_us | The latency of replicating the log record to a single node by Raft. | 
| num_grant_votes | The number of times that Raft votes for other nodes. | 
| replicate_log_latency_us | The latency of replicating the log record to most nodes by Raft. | 
| num_delete_tags | The number of times that tags are deleted. | 
| num_delete_tags_errors | The number of errors when deleting tags. | 
| num_delete_edges | The number of edge deletions. | 
| num_delete_edges_errors | The number of errors when deleting edges | 
| num_send_snapshot | The number of times that snapshots are sent. | 
| update_vertex_latency_us | The latency of executions for the UpdateVertexProcessor. | 
| append_wal_latency_us | The Raft write latency for a single WAL. | 
| num_update_edge | The number of executions for the UpdateEdgeProcessor. | 
| delete_tags_latency_us | The latency of deleting tags. | 
| num_update_edge_errors | The number of execution errors for the UpdateEdgeProcessor. | 
| num_get_neighbors | The number of executions for the GetNeighborsProcessor. | 
| num_get_prop_errors | The number of execution errors for the GetPropProcessor. | 
| num_delete_vertices | The number of times that vertices are deleted. | 
| num_lookup | The number of executions for the LookupProcessor. | 
| num_sync_data | The number of times the Storage service synchronizes data from the Drainer. | 
| num_sync_data_errors | The number of errors that occur when the Storage service synchronizes data from the Drainer. | 
| sync_data_latency_us | The latency of the Storage service synchronizing data from the Drainer. | 
Graph space¶
| Parameter | Description | 
|---|---|
| num_active_queries | The number of queries currently being executed. | 
| num_queries | The number of queries. | 
| num_sentences | The number of statements received by the Graphd service. | 
| optimizer_latency_us | The latency of executing optimizer statements. | 
| query_latency_us | The latency of queries. | 
| num_slow_queries | The number of slow queries. | 
| num_query_errors | The number of query errors. | 
| num_query_errors_leader_changes | The number of raft leader changes due to query errors. | 
| num_killed_queries | The number of killed queries. | 
| num_aggregate_executors | The number of executions for the Aggregation operator. | 
| num_sort_executors | The number of executions for the Sort operator. | 
| num_indexscan_executors | The number of executions for index scan operators. | 
| num_oom_queries | The number of queries that caused memory to run out. | 
| num_auth_failed_sessions_bad_username_password | The number of sessions where authentication failed due to incorrect username and password. | 
| num_auth_failed_sessions | The number of sessions in which login authentication failed. | 
| num_opened_sessions | The number of sessions connected to the server. | 
| num_queries_hit_memory_watermark | The number of queries reached the memory watermark. | 
| num_reclaimed_expired_sessions | The number of expired sessions actively reclaimed by the server. | 
| num_rpc_sent_to_metad_failed | The number of failed RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_metad | The number of RPC requests that the Graphd service sent to the Metad service. | 
| num_rpc_sent_to_storaged_failed | The number of failed RPC requests that the Graphd service sent to the Storaged service. | 
| num_rpc_sent_to_storaged | The number of RPC requests that the Graphd service sent to the Storaged service. | 
| slow_query_latency_us | The average latency of slow queries. | 
  
    
      Last update:
      February 19, 2024