ADQL does not have an explicit GROUP BY clause. However, queries containing field names in the SELECT clause behave as GROUP BY statements if the query contains metric functions or bucketing functions

Group by queries are limited to containing up to five terms. Terms can be either fields (which use an implicit GROUP BY) or bucketing functions, such as series.

Aggregation Examples

In this example, the query groups the avg(responseTime) aggregation based on the non-aggregation field, application. So the result contains the average response time calculated for each application.

SELECT application, avg(responseTime) FROM transactions

Sample results:

The implicit grouping can be extended by using multiple non-aggregation fields preceding an aggregation field. For example:

SELECT application, segments.tier, avg(responseTime) FROM transactions

By adding tier as a second non-aggregation field, the query groups the avg(responseTime) aggregation based on the non-aggregation fields, application, and tierName. Because there are multiple grouping fields, the second non-aggregation field, tier name, is sub-grouped under the application grouping. So each combination of application and tier name together forms a group over which the average response time is calculated.

Sample results:

For SQL users, these two statements produce similar results in their respective environments:

  • ADQL query:  SELECT application, segments.tier, avg(responseTime) FROM transactions 
  • SQL query:  SELECT application, segments.tier, avg(responseTime) FROM transactions GROUP BY applicationName, segments.tier

Bucketing Function Example

Bucketing functions such as series and range define groups over which other aggregations can be calculated. The buckets are specified by the parameters of the respective functions and form groups over which subsequent aggregations are calculated.

For example, the following query calculates the average response time over two-minute intervals (or buckets):

SELECT series(eventTimestamp, '2m'), avg(responseTime)

Sample results:

You can also combine an aggregation with a bucketing function and implicit GROUP BY. The following query returns the average response time grouped by tier over the series of six-hour intervals.

SELECT series(eventTimestamp, '6h'), segments.tier, avg(responseTime) FROM transactions

Sample Results: