$group (aggregation)
Definition
$group
- Groups input documents by the specified
_id
expression and foreach distinct grouping, outputs a document. The_id
field of eachoutput document contains the unique group by value. The outputdocuments can also contain computed fields that hold the values ofsome accumulator expression.
Note
$group
does not order its output documents.
The $group
stage has the following prototype form:
- {
- $group:
- {
- _id: <expression>, // Group By Expression
- <field1>: { <accumulator1> : <expression1> },
- ...
- }
- }
FieldDescriptionid
_Required. If you specify an id
value of null, or anyother constant value, the $group
stage calculatesaccumulated values for all the input documents as a whole.See example of Group by Null.field
_Optional. Computed using theaccumulator operators.
The _id
and the accumulator operatorscan accept any valid expression
. For more information onexpressions, see Expressions.
Considerations
Accumulator Operator
The <accumulator>
operator must be one of the following accumulatoroperators:
Name | Description |
---|---|
$addToSet | Returns an array of unique expression values for eachgroup. Order of the array elements is undefined. |
$avg | Returns an average of numerical values. Ignores non-numeric values. |
$first | Returns a value from the first document for each group. Orderis only defined if the documents are in a defined order. |
$last | Returns a value from the last document for each group. Orderis only defined if the documents are in a defined order. |
$max | Returns the highest expression value for each group. |
$mergeObjects | Returns a document created by combining the input documentsfor each group. |
$min | Returns the lowest expression value for each group. |
$push | Returns an array of expression values for each group. |
$stdDevPop | Returns the population standard deviation of the input values. |
$stdDevSamp | Returns the sample standard deviation of the input values. |
$sum | Returns a sum of numerical values. Ignores non-numeric values. |
$group Operator and Memory
The $group
stage has a limit of 100 megabytes of RAM. Bydefault, if the stage exceeds this limit, $group
returns anerror. To allow for the handling of large datasets, set theallowDiskUse
option totrue
. This flag enables $group
operations to write totemporary files. For more information, see thedb.collection.aggregate()
method and theaggregate
command.
Examples
Count the Number of Documents in a Collection
From the mongo
shell, create a sample collection namedsales
with the following documents:
- db.sales.insertMany([
- { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
- { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
- { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
- { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
- { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
- { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
- { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
- { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
- ])
The following aggregation operation uses the $group
stageto count the number of documents in the sales
collection:
- db.sales.aggregate( [
- {
- $group: {
- _id: null,
- count: { $sum: 1 }
- }
- }
- ] )
The operation returns the following result:
- { "_id" : null, "count" : 8 }
This aggregation operation is equivalent to the following SQL statement:
- SELECT COUNT(*) AS count FROM sales
See also
Retrieve Distinct Values
The following aggregation operation uses the $group
stageto retrieve the distinct item values from the sales
collection:
- db.sales.aggregate( [ { $group : { _id : "$item" } } ] )
The operation returns the following result:
- { "_id" : "abc" }
- { "_id" : "jkl" }
- { "_id" : "def" }
- { "_id" : "xyz" }
Group by Item Having
The following aggregation operation groups documents by the item
field, calculating the total sale amount per item and returning onlythe items with total sale amount greater than or equal to 100:
- db.sales.aggregate(
- [
- // First Stage
- {
- $group :
- {
- _id : "$item",
- totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
- }
- },
- // Second Stage
- {
- $match: { "totalSaleAmount": { $gte: 100 } }
- }
- ]
- )
- First Stage:
- The
$group
stage groups the documents byitem
toretrieve the distinct item values. This stage returns thetotalSaleAmount
for each item. - Second Stage:
- The
$match
stage filters the resulting documents to onlyreturn items with atotalSaleAmount
greater than or equal to 100.
The operation returns the following result:
- { "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }
- { "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }
- { "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }
This aggregation operation is equivalent to the following SQL statement:
- SELECT item,
- Sum(( price * quantity )) AS totalSaleAmount
- FROM sales
- GROUP BY item
- HAVING totalSaleAmount >= 100
See also
Calculate Count, Sum, and Average
From the mongo
shell, create a sample collection namedsales
with the following documents:
- db.sales.insertMany([
- { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
- { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
- { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
- { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
- { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
- { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
- { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
- { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
- ])
Group by Day of the Year
The following pipeline calculates the total sales amount, average salesquantity, and sale count for each day in the year 2014:
- db.sales.aggregate([
- // First Stage
- {
- $match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } }
- },
- // Second Stage
- {
- $group : {
- _id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
- totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
- averageQuantity: { $avg: "$quantity" },
- count: { $sum: 1 }
- }
- },
- // Third Stage
- {
- $sort : { totalSaleAmount: -1 }
- }
- ])
- First Stage:
- The
$match
stage filters the documents to only passdocuments from the year 2014 to the next stage. - Second Stage:
- The
$group
stage groups the documents by date andcalculates the total sale amount, average quantity, and total count of thedocuments in each group. - Third Stage:
- The
$sort
stage sorts the results by the totalsale amount for each group in descending order.
The operation returns the following results:
- { "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 }
- { "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 }
- { "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }
This aggregation operation is equivalent to the following SQL statement:
- SELECT date,
- Sum(( price * quantity )) AS totalSaleAmount,
- Avg(quantity) AS averageQuantity,
- Count(*) AS Count
- FROM sales
- GROUP BY Date(date)
- ORDER BY totalSaleAmount DESC
See also
$match
$sort
db.collection.countDocuments()
which wraps the$group
aggregation stage with a$sum
expression.
Group by null
The following aggregation operation specifies a group id
ofnull
, calculating the total sale amount, average quantity, and count of_all documents in the collection.
- db.sales.aggregate([
- {
- $group : {
- _id : null,
- totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
- averageQuantity: { $avg: "$quantity" },
- count: { $sum: 1 }
- }
- }
- ])
The operation returns the following result:
- {
- "_id" : null,
- "totalSaleAmount" : NumberDecimal("452.5"),
- "averageQuantity" : 7.875,
- "count" : 8
- }
This aggregation operation is equivalent to the following SQL statement:
- SELECT Sum(price * quantity) AS totalSaleAmount,
- Avg(quantity) AS averageQuantity,
- Count(*) AS Count
- FROM sales
See also
$count
db.collection.countDocuments()
which wraps the$group
aggregation stage with a$sum
expression.
Pivot Data
From the mongo
shell, create a sample collection namedbooks
with the following documents:
- db.books.insertMany([
- { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
- { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
- { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
- { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
- { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
- ])
Group title by author
The following aggregation operation pivots the data in the books
collection to have titles grouped by authors.
- db.books.aggregate([
- { $group : { _id : "$author", books: { $push: "$title" } } }
- ])
The operation returns the following documents:
- { "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
- { "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }
Group Documents by author
The following aggregation operation groups documents by author
:
- db.books.aggregate([
- // First Stage
- {
- $group : { _id : "$author", books: { $push: "$$ROOT" } }
- },
- // Second Stage
- {
- $addFields:
- {
- totalCopies : { $sum: "$books.copies" }
- }
- }
- ])
- First Stage:
$group
uses the$$ROOT
system variable to group the entire documents by authors. This stagepasses the following documents to the next stage:
- { "_id" : "Homer",
- "books" :
- [
- { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
- { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
- ]
- },
- { "_id" : "Dante",
- "books" :
- [
- { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
- { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
- { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
- ]
- }
- Second Stage:
$addFields
adds a field to the output containingthe total copies of books for each author.
Note
The resulting documents must not exceed theBSON Document Size
limit of 16 megabytes.
The operation returns the following documents:
- {
- "_id" : "Homer",
- "books" :
- [
- { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
- { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
- ],
- "totalCopies" : 20
- }
- {
- "_id" : "Dante",
- "books" :
- [
- { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
- { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
- { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
- ],
- "totalCopies" : 5
- }
See also
Additional Resources
The Aggregation with the Zip Code Data Settutorial provides an extensive example of the $group
operator in a common use case.