Aggregating functions¶
This topic describes the aggregating functions supported by NebulaGraph.
avg()¶
avg() returns the average value of the argument.
Syntax: avg(<expression>)
- Result type: Double
Example:
nebula> MATCH (v:player) RETURN avg(v.player.age);
+--------------------+
| avg(v.player.age) |
+--------------------+
| 33.294117647058826 |
+--------------------+
count()¶
count() returns the number of records.
- (Native nGQL) You can use
count()
andGROUP BY
together to group and count the number of parameters. UseYIELD
to return.
- (OpenCypher style) You can use
count()
andRETURN
.GROUP BY
is not necessary.
Syntax: count({<expression> | *})
- count(*) returns the number of rows (including NULL).
- Result type: Int
Example:
nebula> WITH [NULL, 1, 1, 2, 2] As a UNWIND a AS b \
RETURN count(b), count(*), count(DISTINCT b);
+----------+----------+-------------------+
| count(b) | count(*) | count(distinct b) |
+----------+----------+-------------------+
| 4 | 5 | 2 |
+----------+----------+-------------------+
# The statement in the following example searches for the people whom `player101` follows and people who follow `player101`, i.e. a bidirectional query.
# Group and count the number of parameters.
nebula> GO FROM "player101" OVER follow BIDIRECT \
YIELD properties($$).name AS Name \
| GROUP BY $-.Name YIELD $-.Name, count(*);
+---------------------+----------+
| $-.Name | count(*) |
+---------------------+----------+
| "LaMarcus Aldridge" | 2 |
| "Tim Duncan" | 2 |
| "Marco Belinelli" | 1 |
| "Manu Ginobili" | 1 |
| "Boris Diaw" | 1 |
| "Dejounte Murray" | 1 |
+---------------------+----------+
# Count the number of parameters.
nebula> MATCH (v1:player)-[:follow]-(v2:player) \
WHERE id(v1)== "player101" \
RETURN v2.player.name AS Name, count(*) as cnt ORDER BY cnt DESC;
+---------------------+-----+
| Name | cnt |
+---------------------+-----+
| "LaMarcus Aldridge" | 2 |
| "Tim Duncan" | 2 |
| "Boris Diaw" | 1 |
| "Manu Ginobili" | 1 |
| "Dejounte Murray" | 1 |
| "Marco Belinelli" | 1 |
+---------------------+-----+
The preceding example retrieves two columns:
$-.Name
: the names of the people.
count(*)
: how many times the names show up.
Because there are no duplicate names in the basketballplayer
dataset, the number 2
in the column count(*)
shows that the person in that row and player101
have followed each other.
# a: The statement in the following example retrieves the age distribution of the players in the dataset.
nebula> LOOKUP ON player \
YIELD player.age As playerage \
| GROUP BY $-.playerage \
YIELD $-.playerage as age, count(*) AS number \
| ORDER BY $-.number DESC, $-.age DESC;
+-----+--------+
| age | number |
+-----+--------+
| 34 | 4 |
| 33 | 4 |
| 30 | 4 |
| 29 | 4 |
| 38 | 3 |
+-----+--------+
...
# b: The statement in the following example retrieves the age distribution of the players in the dataset.
nebula> MATCH (n:player) \
RETURN n.player.age as age, count(*) as number \
ORDER BY number DESC, age DESC;
+-----+--------+
| age | number |
+-----+--------+
| 34 | 4 |
| 33 | 4 |
| 30 | 4 |
| 29 | 4 |
| 38 | 3 |
+-----+--------+
...
# The statement in the following example counts the number of edges that Tim Duncan relates.
nebula> MATCH (v:player{name:"Tim Duncan"}) -[e]- (v2) \
RETURN count(e);
+----------+
| count(e) |
+----------+
| 13 |
+----------+
# The statement in the following example counts the number of edges that Tim Duncan relates and returns two columns (no DISTINCT and DISTINCT) in multi-hop queries.
nebula> MATCH (n:player {name : "Tim Duncan"})-[]->(friend:player)-[]->(fof:player) \
RETURN count(fof), count(DISTINCT fof);
+------------+---------------------+
| count(fof) | count(distinct fof) |
+------------+---------------------+
| 4 | 3 |
+------------+---------------------+
max()¶
max() returns the maximum value.
Syntax: max(<expression>)
- Result type: Same as the original argument.
Example:
nebula> MATCH (v:player) RETURN max(v.player.age);
+-------------------+
| max(v.player.age) |
+-------------------+
| 47 |
+-------------------+
min()¶
min() returns the minimum value.
Syntax: min(<expression>)
- Result type: Same as the original argument.
Example:
nebula> MATCH (v:player) RETURN min(v.player.age);
+-------------------+
| min(v.player.age) |
+-------------------+
| 20 |
+-------------------+
collect()¶
collect() returns a list containing the values returned by an expression. Using this function aggregates data by merging multiple records or values into a single list.
Syntax: collect(<expression>)
- Result type: List
Example:
nebula> UNWIND [1, 2, 1] AS a \
RETURN a;
+---+
| a |
+---+
| 1 |
| 2 |
| 1 |
+---+
nebula> UNWIND [1, 2, 1] AS a \
RETURN collect(a);
+------------+
| collect(a) |
+------------+
| [1, 2, 1] |
+------------+
nebula> UNWIND [1, 2, 1] AS a \
RETURN a, collect(a), size(collect(a));
+---+------------+------------------+
| a | collect(a) | size(collect(a)) |
+---+------------+------------------+
| 2 | [2] | 1 |
| 1 | [1, 1] | 2 |
+---+------------+------------------+
# The following examples sort the results in descending order, limit output rows to 3, and collect the output into a list.
nebula> UNWIND ["c", "b", "a", "d" ] AS p \
WITH p AS q \
ORDER BY q DESC LIMIT 3 \
RETURN collect(q);
+-----------------+
| collect(q) |
+-----------------+
| ["d", "c", "b"] |
+-----------------+
nebula> WITH [1, 1, 2, 2] AS coll \
UNWIND coll AS x \
WITH DISTINCT x \
RETURN collect(x) AS ss;
+--------+
| ss |
+--------+
| [1, 2] |
+--------+
nebula> MATCH (n:player) \
RETURN collect(n.player.age);
+---------------------------------------------------------------+
| collect(n.player.age) |
+---------------------------------------------------------------+
| [32, 32, 34, 29, 41, 40, 33, 25, 40, 37, ...
...
# The following example aggregates all the players' names by their ages.
nebula> MATCH (n:player) \
RETURN n.player.age AS age, collect(n.player.name);
+-----+--------------------------------------------------------------------------+
| age | collect(n.player.name) |
+-----+--------------------------------------------------------------------------+
| 24 | ["Giannis Antetokounmpo"] |
| 20 | ["Luka Doncic"] |
| 25 | ["Joel Embiid", "Kyle Anderson"] |
+-----+--------------------------------------------------------------------------+
...
nebula> GO FROM "player100" OVER serve \
YIELD properties($$).name AS name \
| GROUP BY $-.name \
YIELD collect($-.name) AS name;
+-----------+
| name |
+-----------+
| ["Spurs"] |
+-----------+
nebula> LOOKUP ON player \
YIELD player.age As playerage \
| GROUP BY $-.playerage \
YIELD collect($-.playerage) AS playerage;
+------------------+
| playerage |
+------------------+
| [22] |
| [47] |
| [43] |
| [25, 25] |
+------------------+
...
std()¶
std() returns the population standard deviation.
Syntax: std(<expression>)
- Result type: Double
Example:
nebula> MATCH (v:player) RETURN std(v.player.age);
+-------------------+
| std(v.player.age) |
+-------------------+
| 6.423895701687502 |
+-------------------+
sum()¶
sum() returns the sum value.
Syntax: sum(<expression>)
- Result type: Same as the original argument.
Example:
nebula> MATCH (v:player) RETURN sum(v.player.age);
+-------------------+
| sum(v.player.age) |
+-------------------+
| 1698 |
+-------------------+
Aggregating example¶
nebula> GO FROM "player100" OVER follow YIELD dst(edge) AS dst, properties($$).age AS age \
| GROUP BY $-.dst \
YIELD \
$-.dst AS dst, \
toInteger((sum($-.age)/count($-.age)))+avg(distinct $-.age+1)+1 AS statistics;
+-------------+------------+
| dst | statistics |
+-------------+------------+
| "player125" | 84.0 |
| "player101" | 74.0 |
+-------------+------------+