$match and $group
By Satyam Singh Rajput
BCA (Cloud & Cybersecurity Specialization), SRI BALAJI UNIVERSITY, Pune
Introduction
MongoDB is a widely used NoSQL database known for its
flexibility and ease of use. One of its powerful features is the aggregation
framework, which helps in performing data analysis and transformations. As part
of my hands-on learning, I explored two essential aggregation pipeline stages: $match
and $group. This blog explains how I used them through a simple example.
What are $match and $group?
- $match filters documents based on specific conditions. It is similar to
the WHERE clause in SQL.
- $group groups documents by the expression given as _id and applies
accumulators such as $sum, $avg, $min, or $max to each group. It is similar
to GROUP BY in SQL.
These two stages are often used together to first narrow
down the dataset and then summarize or analyze it.
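To make the "filter, then summarize" idea concrete before touching the database, here is a minimal plain-JavaScript sketch that runs in Node.js. It is only a mental model, not how MongoDB works internally. The `docs` array and its `team` and `score` fields are hypothetical and exist only for this illustration.

```javascript
// Hypothetical input documents, used only for this illustration.
const docs = [
  { team: "A", score: 30 },
  { team: "B", score: 20 },
  { team: "A", score: 10 },
];

// Filter first: the role $match plays (like WHERE team = 'A' in SQL).
const matched = docs.filter((doc) => doc.team === "A");

// Then summarize what is left: the role $group plays with its accumulators.
const scores = matched.map((doc) => doc.score);
const total = scores.reduce((sum, value) => sum + value, 0);
const summary = {
  _id: "A",
  sum: total,                 // what $sum would compute
  avg: total / scores.length, // what $avg would compute
  min: Math.min(...scores),   // what $min would compute
  max: Math.max(...scores),   // what $max would compute
};

console.log(summary); // { _id: 'A', sum: 40, avg: 20, min: 10, max: 30 }
```

Because the filter runs first, the document for team "B" never reaches the summary step. That is the same reason $match is usually placed before $group.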
Creating a Practical Dataset
To make this exercise more relevant, I created a fictional
dataset representing internship hours logged by students in different domains
(Cloud and Cybersecurity) and companies.
Sample Collection: internships
json
[
  { "student": "Satyam Singh Rajput", "domain": "Cloud", "hours": 30, "company": "AWS Educate" },
  { "student": "Satyam Singh Rajput", "domain": "Cybersecurity", "hours": 20, "company": "CodeFirst" },
  { "student": "Ravi Verma", "domain": "Cloud", "hours": 25, "company": "AWS Educate" },
  { "student": "Aarav Mehta", "domain": "Cybersecurity", "hours": 15, "company": "CodeFirst" },
  { "student": "Satyam Singh Rajput", "domain": "Cloud", "hours": 10, "company": "CodeFirst" }
]
This structure allowed me to practice filtering and grouping
data in a realistic context.
Step-by-Step Execution in mongosh
Step 1: Create and Use Database
mongosh
use satyamAggregationDB
Step 2: Insert the Documents
javascript
db.internships.insertMany([ /* the five documents from the sample collection above */ ])
Step 3: Apply $match
Objective: Filter documents where the domain is
"Cloud".
javascript
db.internships.aggregate([
  { $match: { domain: "Cloud" } }
])
This returns only the three internship records in the Cloud domain: two for
Satyam Singh Rajput (30 and 10 hours) and one for Ravi Verma (25 hours).
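To check my understanding of $match without a database, I modelled it in plain JavaScript. This is a rough sketch that runs in Node.js. The `matchStage` helper is hypothetical (it is not a MongoDB API) and handles only simple equality conditions like the one used in this step.

```javascript
// The same five documents as the sample collection.
const internships = [
  { student: "Satyam Singh Rajput", domain: "Cloud", hours: 30, company: "AWS Educate" },
  { student: "Satyam Singh Rajput", domain: "Cybersecurity", hours: 20, company: "CodeFirst" },
  { student: "Ravi Verma", domain: "Cloud", hours: 25, company: "AWS Educate" },
  { student: "Aarav Mehta", domain: "Cybersecurity", hours: 15, company: "CodeFirst" },
  { student: "Satyam Singh Rajput", domain: "Cloud", hours: 10, company: "CodeFirst" },
];

// Hypothetical helper: keeps a document only if every field in `query`
// equals the corresponding field in the document.
function matchStage(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
}

const cloudOnly = matchStage(internships, { domain: "Cloud" });
console.log(cloudOnly.map((doc) => `${doc.student}: ${doc.hours}`));
// [ 'Satyam Singh Rajput: 30', 'Ravi Verma: 25', 'Satyam Singh Rajput: 10' ]
```

The real $match stage supports many more query operators than equality, so this model covers only the case shown here.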
Step 4: Apply $group
Objective: Calculate the total hours completed by
each student.
javascript
db.internships.aggregate([
  { $group: { _id: "$student", totalHours: { $sum: "$hours" } } }
])
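This summarizes the total internship hours per student across all domains:
60 for Satyam Singh Rajput, 25 for Ravi Verma, and 15 for Aarav Mehta.
$group does not guarantee the order of its output documents.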
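The grouping step can be modelled the same way. The sketch below is plain JavaScript for Node.js, and `groupSum` is a hypothetical helper, not a MongoDB API. It assumes the grouping key and the summed value are both top-level fields, as they are in this dataset.

```javascript
// The same five documents as the sample collection.
const internships = [
  { student: "Satyam Singh Rajput", domain: "Cloud", hours: 30, company: "AWS Educate" },
  { student: "Satyam Singh Rajput", domain: "Cybersecurity", hours: 20, company: "CodeFirst" },
  { student: "Ravi Verma", domain: "Cloud", hours: 25, company: "AWS Educate" },
  { student: "Aarav Mehta", domain: "Cybersecurity", hours: 15, company: "CodeFirst" },
  { student: "Satyam Singh Rajput", domain: "Cloud", hours: 10, company: "CodeFirst" },
];

// Hypothetical helper: one running total per distinct value of `keyField`,
// which is what { _id: "$student", totalHours: { $sum: "$hours" } } expresses.
function groupSum(docs, keyField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) || 0) + doc[sumField]);
  }
  // Shape each entry like a $group output document.
  return [...totals].map(([_id, totalHours]) => ({ _id, totalHours }));
}

const totalsByStudent = groupSum(internships, "student", "hours");
console.log(totalsByStudent);
```

In this sketch the groups come out in first-seen order because a Map remembers insertion order. MongoDB makes no such promise, so add a $sort stage if the order matters.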
Step 5: Combine $match and $group
Objective: Calculate internship hours per student for
the Cloud domain only.
javascript
db.internships.aggregate([
  { $match: { domain: "Cloud" } },
  { $group: { _id: "$student", cloudHours: { $sum: "$hours" } } }
])
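This gives a clear view of each student’s total internship hours in the
Cloud domain: 40 for Satyam Singh Rajput and 25 for Ravi Verma. Aarav Mehta
does not appear because he has no Cloud records.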
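What makes this a pipeline is that the output of one stage becomes the input of the next. The sketch below models that idea in plain JavaScript for Node.js. The names `matchCloud`, `groupByStudent`, and `runPipeline` are hypothetical and are not part of MongoDB.

```javascript
// The same five documents as the sample collection.
const internships = [
  { student: "Satyam Singh Rajput", domain: "Cloud", hours: 30, company: "AWS Educate" },
  { student: "Satyam Singh Rajput", domain: "Cybersecurity", hours: 20, company: "CodeFirst" },
  { student: "Ravi Verma", domain: "Cloud", hours: 25, company: "AWS Educate" },
  { student: "Aarav Mehta", domain: "Cybersecurity", hours: 15, company: "CodeFirst" },
  { student: "Satyam Singh Rajput", domain: "Cloud", hours: 10, company: "CodeFirst" },
];

// Each stage is modelled as a function from an array of documents to a new array.
const matchCloud = (docs) => docs.filter((doc) => doc.domain === "Cloud");

const groupByStudent = (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.student, (totals.get(doc.student) || 0) + doc.hours);
  }
  return [...totals].map(([_id, cloudHours]) => ({ _id, cloudHours }));
};

// Hypothetical runner: feeds each stage's output into the next stage.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const cloudHoursByStudent = runPipeline(internships, [matchCloud, groupByStudent]);
console.log(cloudHoursByStudent);
```

Swapping the two stages in this model would break the result, because the grouped documents no longer have a `domain` field to filter on. The same reasoning applies to the real pipeline.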
Key Learnings
- $match helps to filter and focus the dataset.
- $group allows summarizing information in a meaningful way.
- Combining both stages provides targeted analysis.
- MongoDB makes it easy to perform analytical tasks with minimal setup and code.
Application in Real Projects
As a student specializing in Cloud and Cybersecurity, I see
practical uses of these operations in various areas, such as analyzing server
logs, processing IoT data, and generating reports from user activity.
Conclusion
Practicing with MongoDB’s aggregation pipeline has helped me
understand how data can be processed and summarized efficiently without
switching to another tool. This knowledge is essential for building backend
logic in cloud-based applications and real-time analytics systems.
Author:
Satyam Singh Rajput
BCA (Cloud & Cybersecurity Specialization)
SRI BALAJI UNIVERSITY, Pune.