Aggregations and Analytics
Series: Elasticsearch 101 | Article: 05
Overview
Aggregations are Elasticsearch's answer to SQL GROUP BY, COUNT, SUM, AVG, and histogram queries. Unlike those SQL equivalents, aggregations run in the same request as a search query — you can simultaneously retrieve the top 10 matching documents and build the facet counts for every filter dimension without a second round trip.
This article covers the aggregation types I use most often and how to combine them with queries.
Aggregation Structure
Aggregations live under the top-level aggs key (alias: aggregations). Each aggregation has a user-defined name and a type:
{
  "query": { ... },
  "aggs": {
    "<agg-name>": {
      "<agg-type>": { ... }
    }
  }
}
Aggregations fall into three families:
Bucket: Group documents into buckets (categories, ranges, date histograms)
Metric: Compute numeric values over documents in a bucket (avg, sum, min, max)
Pipeline: Compute values from the output of other aggregations
Metric Aggregations
Metric aggregations compute numeric values from the documents in scope. Most return a single value; stats returns several at once.
value_count
Count the values extracted from a field. For a single-valued field, this equals the number of documents that have a non-null value for it:
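A minimal sketch of such a request, written as a plain Python dict that serializes to the JSON body. The field name `view_count` and the aggregation name `articles_with_views` are assumptions chosen for illustration.

```python
import json

# Hypothetical field and aggregation names, for illustration only.
value_count_body = {
    "size": 0,  # return no hits, only the aggregation result
    "aggs": {
        "articles_with_views": {
            "value_count": {"field": "view_count"}
        }
    },
}

# This JSON string is what would be sent as the search request body.
print(json.dumps(value_count_body, indent=2))
```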
Setting "size": 0 tells Elasticsearch not to return document hits, because we only want the aggregation result.
avg, sum, min, max
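These four single-value metrics share the same shape, so one request can carry all of them. The sketch below assumes a hypothetical numeric field `view_count`; the helper `single_metric` is my own illustration, not part of any client library.

```python
import json

ALLOWED = ("avg", "sum", "min", "max")

def single_metric(agg_type: str, field: str) -> dict:
    """Build one single-value metric clause, e.g. {"avg": {"field": ...}}."""
    if agg_type not in ALLOWED:
        raise ValueError(f"unsupported metric type: {agg_type}")
    return {agg_type: {"field": field}}

# One named aggregation per metric type, all over the assumed field.
metrics_body = {
    "size": 0,
    "aggs": {f"{t}_views": single_metric(t, "view_count") for t in ALLOWED},
}

print(json.dumps(metrics_body, indent=2))
```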
stats
Returns count, min, max, avg, and sum in one aggregation:
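As a sketch, the request differs from a single metric only in the aggregation type. The field `view_count` and the name `view_stats` are assumed for illustration.

```python
import json

# Hypothetical names; "stats" returns count, min, max, avg, and sum together.
stats_body = {
    "size": 0,
    "aggs": {
        "view_stats": {
            "stats": {"field": "view_count"}
        }
    },
}

print(json.dumps(stats_body, indent=2))
```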
Bucket Aggregations
Bucket aggregations group documents into buckets. Metrics can nest inside buckets.
terms
Group by distinct values of a keyword field. This is the most common aggregation for facets:
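A minimal sketch of a terms request, assuming a hypothetical keyword field `tags` and an aggregation named `by_tag`.

```python
import json

# Hypothetical names; "size" here is the number of buckets, not hits.
terms_body = {
    "size": 0,
    "aggs": {
        "by_tag": {
            "terms": {"field": "tags", "size": 10}
        }
    },
}

print(json.dumps(terms_body, indent=2))
```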
Response:
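A terms result arrives as a list of buckets, each with a `key` and a `doc_count`. The fragment below uses made-up values purely to show the shape (it is not real output), and `facet_counts` is a hypothetical helper for turning buckets into a lookup table.

```python
# Illustrative response fragment with invented values, not real output.
sample_terms_response = {
    "aggregations": {
        "by_tag": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 12,
            "buckets": [
                {"key": "golang", "doc_count": 42},
                {"key": "search", "doc_count": 17},
            ],
        }
    }
}

def facet_counts(response: dict, agg_name: str) -> dict:
    """Map each bucket key to its document count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}

counts = facet_counts(sample_terms_response, "by_tag")
print(counts)
```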
size controls how many buckets are returned (not document count). The default is 10. For tag clouds or full category lists, you may need size: 100 or more — but large terms aggregations are expensive, so size them deliberately.
date_histogram
Group by time intervals, which is useful for activity charts:
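A sketch of a monthly histogram, assuming a hypothetical date field `published_at` and an aggregation named `articles_per_month`.

```python
import json

# Hypothetical names; one bucket per calendar month.
date_histogram_body = {
    "size": 0,
    "aggs": {
        "articles_per_month": {
            "date_histogram": {
                "field": "published_at",
                "calendar_interval": "month",
                "min_doc_count": 0,  # keep empty months in the series
            }
        }
    },
}

print(json.dumps(date_histogram_body, indent=2))
```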
calendar_interval options: minute, hour, day, week, month, quarter, year. For fixed intervals (every 7 days), use fixed_interval: "7d" instead.
min_doc_count: 0 includes months with zero articles — useful for complete time series in charts.
range
Group documents into explicit numeric ranges:
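A sketch with three ranges over an assumed `view_count` field. In a range aggregation, `from` is inclusive and `to` is exclusive, so the ranges below do not overlap. The thresholds are arbitrary illustration values.

```python
import json

# Hypothetical field and thresholds; "from" is inclusive, "to" is exclusive.
range_body = {
    "size": 0,
    "aggs": {
        "views_ranges": {
            "range": {
                "field": "view_count",
                "ranges": [
                    {"to": 100},                # fewer than 100 views
                    {"from": 100, "to": 1000},  # 100 up to 999 views
                    {"from": 1000},             # 1000 views or more
                ],
            }
        }
    },
}

print(json.dumps(range_body, indent=2))
```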
filter aggregation
Compute a metric over a specific subset of the documents in scope, without changing the main query or the hits it returns:
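A sketch that averages views only for documents tagged `golang`, while the hits still come from the full query. The field names, the tag value, and the aggregation names are assumptions for illustration.

```python
import json

# Hypothetical names; the filter narrows only this aggregation's scope.
filter_body = {
    "size": 0,
    "aggs": {
        "golang_articles": {
            "filter": {"term": {"tags": "golang"}},
            "aggs": {
                "avg_views": {"avg": {"field": "view_count"}}
            },
        }
    },
}

print(json.dumps(filter_body, indent=2))
```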
Nested Aggregations
Aggregations nest by placing an aggs key inside a bucket aggregation. This is how you build "top tags by average view count":
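A sketch of that request, assuming hypothetical fields `tags` and `view_count`. Ordering the terms buckets by the nested metric is what makes it "top tags by average views" rather than by document count.

```python
import json

# Hypothetical names; buckets are sorted by the nested avg metric.
nested_body = {
    "size": 0,
    "aggs": {
        "top_tags": {
            "terms": {
                "field": "tags",
                "size": 10,
                "order": {"avg_views": "desc"},
            },
            "aggs": {
                "avg_views": {"avg": {"field": "view_count"}}
            },
        }
    },
}

print(json.dumps(nested_body, indent=2))
```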
Response shape:
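Each bucket carries the nested metric under the name it was given in the request, with the number in a `value` field. The fragment below uses invented values only to show the structure (it is not real output), and `avg_views_by_tag` is a hypothetical helper.

```python
# Illustrative response fragment with invented values, not real output.
sample_nested_response = {
    "aggregations": {
        "top_tags": {
            "buckets": [
                {"key": "golang", "doc_count": 42, "avg_views": {"value": 1530.5}},
                {"key": "search", "doc_count": 17, "avg_views": {"value": 980.0}},
            ]
        }
    }
}

def avg_views_by_tag(response: dict, agg_name: str = "top_tags") -> list:
    """Return (tag, average views) pairs in the order the buckets arrived."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["avg_views"]["value"]) for b in buckets]

pairs = avg_views_by_tag(sample_nested_response)
print(pairs)
```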
Combining Search + Aggregations
This is the core pattern for faceted search. A single request returns:
Matching documents (for display)
Facet counts (for the sidebar filters)
Any analytics you need
One request, three pieces of data. No database joins, no sequential queries.
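A sketch of such a combined request, assuming a hypothetical text field `title` plus the `tags` and `view_count` fields; the search term is arbitrary. The `query` and `size` drive the hits, while the two aggregations supply facets and analytics.

```python
import json

# Hypothetical names and search term, for illustration only.
faceted_body = {
    "size": 10,  # top 10 matching documents for display
    "query": {"match": {"title": "elasticsearch"}},
    "aggs": {
        # Facet counts for the sidebar filters
        "by_tag": {"terms": {"field": "tags", "size": 20}},
        # An extra analytic over the same result set
        "avg_views": {"avg": {"field": "view_count"}},
    },
}

print(json.dumps(faceted_body, indent=2))
```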
Global Aggregation
By default, aggregations are scoped to the documents matched by the query. Use global to aggregate over the entire index, regardless of the current query:
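A sketch that computes the same average twice: once inside the query scope and once under `global`. The field names, tag value, and aggregation names are assumptions. Note that a `global` aggregation has an empty body and can only be placed at the top level of `aggs`.

```python
import json

# Hypothetical names; "global" ignores the query and must be top-level.
global_body = {
    "size": 0,
    "query": {"term": {"tags": "golang"}},
    "aggs": {
        # Scoped to documents matching the query
        "avg_views_filtered": {"avg": {"field": "view_count"}},
        # Scoped to every document in the index
        "all_articles": {
            "global": {},
            "aggs": {
                "avg_views_all": {"avg": {"field": "view_count"}}
            },
        },
    },
}

print(json.dumps(global_body, indent=2))
```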
This gives you the average view count for the current filtered result set alongside the global average — useful for showing "this set vs all" comparisons.
Cardinality Aggregation
Count distinct values — the approximate equivalent of SELECT COUNT(DISTINCT field):
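A sketch that counts distinct authors, assuming a hypothetical keyword field `author_id`. The `precision_threshold` shown is optional and is set here only to make the knob visible.

```python
import json

# Hypothetical field name; precision_threshold trades memory for accuracy.
cardinality_body = {
    "size": 0,
    "aggs": {
        "unique_authors": {
            "cardinality": {
                "field": "author_id",
                "precision_threshold": 3000,
            }
        }
    },
}

print(json.dumps(cardinality_body, indent=2))
```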
Cardinality is approximate (it uses HyperLogLog++ internally). Accuracy is configurable via precision_threshold: counts below the threshold are expected to be close to exact, and a higher threshold costs more memory. The default threshold is 3000 and the maximum is 40000, which is sufficient for most UI use cases.
Pipeline Aggregations
Pipeline aggregations operate on the output of other aggregations rather than documents. A common use case is computing a moving average or cumulative sum over a date histogram:
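A sketch of a cumulative sum over a monthly histogram, assuming hypothetical fields `published_at` and `view_count`. The pipeline aggregation sits next to the metric it reads and points at it by name through `buckets_path`.

```python
import json

# Hypothetical names; cumulative_sum reads the sibling "monthly_views" metric.
pipeline_body = {
    "size": 0,
    "aggs": {
        "per_month": {
            "date_histogram": {
                "field": "published_at",
                "calendar_interval": "month",
            },
            "aggs": {
                "monthly_views": {"sum": {"field": "view_count"}},
                "running_views": {
                    "cumulative_sum": {"buckets_path": "monthly_views"}
                },
            },
        }
    },
}

print(json.dumps(pipeline_body, indent=2))
```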
Performance Notes
Aggregations on keyword fields are efficient because they read from doc values, an on-disk columnar structure. Aggregations on text fields require fielddata: true in the mapping; avoid this, because fielddata is loaded into heap memory.
Large terms aggregations with high cardinality (e.g., aggregating on user_id across millions of users) are expensive. Use cardinality for counts and limit bucket size.
Date histograms are generally cheap because dates are stored as numeric values and bucketed by simple rounding.
Run aggregation-heavy queries with "size": 0 to skip fetching documents when you only need the aggregation result.
Summary
Bucket aggregations group documents; metric aggregations compute values; pipelines compute across bucket results.
terms is the standard facet aggregation on keyword fields.
date_histogram powers time-series charts.
Nest metrics inside buckets to get "per-category stats" in a single query.
Combine search queries and aggregations in one request for faceted search.
Use "size": 0 when you only need aggregation results.
Previous: Search Queries Deep Dive | Next: Go Backend Integration