Aggregate Functions in Pandas: Beginner’s Guide with Examples
Aggregate functions in Pandas are one of the most crucial ideas to grasp when you first begin using Python for data analysis.
These functions let you summarize large datasets quickly, for example by finding the average store sales, the total of students’ grades, or the highest and lowest costs.
This guide will teach you how to use the most common aggregate functions in Pandas with examples from a real dataset called score.csv.
By the end, you’ll be able to compute totals, averages, and a lot more, both row-wise and column-wise.
The Dataset (score.csv)
We’ll use a sample file called score.csv.
Let’s start with a basic dataset:
| Name | Math | Science | English | History |
|---|---|---|---|---|
| Alice | 78 | 82 | 75 | 68 |
| Bob | 85 | 79 | 80 | 74 |
| Charlie | 92 | 95 | 85 | 89 |
| David | 65 | 70 | 60 | 55 |
| Eva | 88 | 90 | 78 | 82 |
| Frank | 85 | 79 | 80 | 74 |
| Grace | 92 | 90 | 85 | 82 |
Students and their grades in various subjects are displayed in each row.
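To follow along without the file, the table above can be rebuilt inline. This is a minimal sketch: the variable names `df` and `subjects` are assumptions used for illustration, and with the real file you would pass the path `"score.csv"` to `pd.read_csv` instead of the in-memory text.

```python
import io
import pandas as pd

# Inline copy of the table above. With the real file on disk you would
# call pd.read_csv("score.csv") instead of reading from a string.
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""

df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical helper list: the numeric columns used in later examples.
subjects = ["Math", "Science", "English", "History"]

print(df.shape)        # (7, 5): seven students, name plus four subjects
print(df[subjects].dtypes)  # the subject columns are read as integers
```

Keeping the subject names in one list makes it easy to exclude the Name column when aggregating row-wise.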
1. Sum Function in Pandas
Purpose:
All of the numerical values in a column or row are added together using the sum() function.
Example: Sum of a Single Column
Output:
585
Here, Pandas adds up every student’s Math score to give the class total.
Example: Sum of Several Columns
Output:
| Math | Science |
|---|---|
| 585 | 585 |
This shows how many points all the students got in Math and Science.
Example: Sum Row-wise
This makes a new column called “Total” that shows each student’s total score in all subjects.
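The three sum() examples in this section can be sketched as follows. The data is rebuilt inline so the block runs on its own; the names `df`, `subjects`, `math_total`, and `subject_totals` are assumptions for illustration.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Single column: returns one number.
math_total = df["Math"].sum()
print(math_total)  # 585

# Several columns: returns one total per column.
subject_totals = df[["Math", "Science"]].sum()
print(subject_totals)  # Math 585, Science 585

# Row-wise: axis=1 adds across the subject columns for each student.
df["Total"] = df[subjects].sum(axis=1)
print(df[["Name", "Total"]])
```

Selecting `df[subjects]` before calling `sum(axis=1)` keeps the Name column out of the row totals.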
In real life:
- To find out how much money a company makes in a month, it adds up its daily sales.
- To figure out how well a student did, a teacher adds up their grades on assignments.
- To get the total cost of running a hospital, it adds up the costs from all of its departments.
2. Max Function in Pandas
Purpose:
The maximum value in a row or column is returned by the max() function.
Example: Highest Score in a Subject
Output:
92
Charlie and Grace share the highest Math score.
Example: Maximum Value in Several Columns
Output:
| Math | Science |
|---|---|
| 92 | 95 |
This shows the highest score in each subject.
Example: Highest Value per Student
The best subject score for each student is now displayed in each row.
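A minimal sketch of the max() examples above, with the data rebuilt inline. The column name "Highest" and the variable names are assumptions for illustration, not names taken from the original code.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Highest score in one column.
math_max = df["Math"].max()
print(math_max)  # 92

# Which students reached that score (there can be more than one).
top_math = df.loc[df["Math"] == math_max, "Name"].tolist()
print(top_math)  # ['Charlie', 'Grace']

# Highest score in several columns at once.
subject_max = df[["Math", "Science"]].max()
print(subject_max)  # Math 92, Science 95

# Row-wise: each student's best score across all subjects.
df["Highest"] = df[subjects].max(axis=1)
print(df[["Name", "Highest"]])
```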
Real-world application:
- A retailer determines which day has the highest sales.
- Every athlete’s best performance during a match is determined by a sports coach.
- Each customer’s highest order value is determined by a business analyst.
3. Min Function in Pandas
Purpose:
The smallest value is found by the min() function.
Example: Minimum Score in a Column
Output:
65
In math, David received the lowest score.
Example: Minimum in Several Columns
Output:
| Math | Science |
|---|---|
| 65 | 70 |
Example: Minimum Value per Student
Each row now shows the lowest score that student received in any subject.
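The min() examples can be sketched the same way. Note that min(axis=1) returns the lowest score itself; to get the name of the subject, this sketch also uses idxmin(axis=1), which is an addition not shown in the original text. The names "Lowest" and "Weakest" are assumptions.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Lowest score in one column.
math_min = df["Math"].min()
print(math_min)  # 65

# Lowest score in several columns at once.
subject_min = df[["Math", "Science"]].min()
print(subject_min)  # Math 65, Science 70

# Row-wise: each student's lowest score across all subjects.
df["Lowest"] = df[subjects].min(axis=1)

# idxmin(axis=1) returns the column label where that lowest score occurs.
df["Weakest"] = df[subjects].idxmin(axis=1)
print(df[["Name", "Lowest", "Weakest"]])
```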
Real-world application:
- Each driver’s shortest delivery time is verified by a delivery company.
- The lowest monthly expense is determined by a finance team.
- A website tracks its lowest daily visitor count to identify low-traffic days.
4. Count Function in Pandas
Purpose:
The count() function returns the number of non-null (non-missing) entries.
Example: Count the Values in a Column
Output:
7
The Math column has seven non-null scores, one for each of the seven students.
Example: Count for Every Subject
Output:
| Math | Science | English | History |
|---|---|---|---|
| 7 | 7 | 7 | 7 |
This verifies that no marks are missing.
Example: Count Row-wise
This shows the number of subjects for which each student has a recorded score.
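A sketch of the count() examples follows. Because score.csv has no missing marks, the last part blanks out one score in a copy to show that count() skips missing values; that modified copy (`with_gap`) is a made-up illustration, not part of the dataset.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Non-null entries in one column.
math_count = df["Math"].count()
print(math_count)  # 7

# Non-null entries in every subject column.
subject_counts = df[subjects].count()
print(subject_counts)  # 7 for each subject

# Row-wise: number of recorded scores per student.
per_student = df[subjects].count(axis=1)
print(per_student.tolist())  # [4, 4, 4, 4, 4, 4, 4]

# Hypothetical gap: remove Alice's Math score in a float copy of the data.
with_gap = df[subjects].astype("float64")
with_gap.loc[0, "Math"] = float("nan")
print(with_gap["Math"].count())        # 6
print(with_gap.count(axis=1).iloc[0])  # 3
```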
Usage in reality:
- Count the number of reviews that customers have submitted.
- Determine how many days have sales data available.
- Count the number of employees who were marked present during a week.
5. Mean Function in Pandas
Purpose:
By dividing the total value by the number of items, the mean() function determines the average.
Example: Mean Score per Subject
Output:
| Math | Science |
|---|---|
| 83.57 | 83.57 |
Example: Average per Student
This creates a new column that displays the average score for each student across all subjects.
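The mean() examples above can be sketched as follows, with the data rebuilt inline. The column name "Average" and the variable names are assumptions for illustration.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Column-wise mean: 585 points / 7 students for both subjects.
subject_means = df[["Math", "Science"]].mean()
print(subject_means.round(2))  # Math 83.57, Science 83.57

# Row-wise mean: each student's average across the four subjects.
df["Average"] = df[subjects].mean(axis=1)
print(df[["Name", "Average"]])
```

Rounding is applied only when printing, so the stored values keep full precision for later calculations.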
Use in real life:
- Average monthly sales.
- Average score for each item.
- Average amount of time spent on each task.
- Average salary for each worker in a department.
Advice:
The mean is susceptible to outliers or extreme values. The mean may not accurately reflect the “typical” value if some numbers are significantly higher or lower than others.
6. Median Function in Pandas
Purpose:
Once all the numbers have been sorted, the median() function determines the middle value.
In contrast to the mean, it is less impacted by outliers.
Example: Median of a Column
Output:
85
Half of the students scored 85 or lower in Math, and the other half scored 85 or higher.
Example: Median per Student
This adds a column showing the middle score for every student. With four subjects, that is the average of the two middle scores.
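A sketch of the median() examples, followed by a small comparison with the mean. The comparison replaces David's Math score of 65 with a made-up value of 5 to show how an outlier moves the mean but not the median; that altered series is hypothetical and not part of score.csv.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Column median: the middle of the seven sorted Math scores.
math_median = df["Math"].median()
print(math_median)  # 85.0

# Row-wise median: with four scores, the average of the two middle ones.
df["Median"] = df[subjects].median(axis=1)
print(df[["Name", "Median"]])

# Hypothetical outlier: pretend David scored 5 instead of 65.
math_with_outlier = df["Math"].replace(65, 5)
print(math_with_outlier.mean())    # 75.0 (pulled down from about 83.57)
print(math_with_outlier.median())  # 85.0 (unchanged)
```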
Practical application:
- HR teams can better understand what most employees make by looking at the median salary.
- The median delivery time illustrates how long customers usually have to wait.
- Governments use the median income to gauge economic inequality.
7. Mode Function in Pandas
Purpose:
The value that appears the most frequently is found using the mode() function.
Example: Most Frequent Math Scores
| Mode |
|---|
| 85 |
| 92 |
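In the Math column, 85 and 92 each appear twice, more often than any other score, so mode() returns both.
Example: Mode per Student
The row-wise mode indicates whether a student received the same grade in several subjects. When all of a student’s scores are different, every score ties as a mode.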
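A sketch of the mode() examples. In this dataset no student has the same score in two subjects, so the row-wise mode simply returns each student's four scores in sorted order. The nunique(axis=1) check at the end is an alternative added here for illustration, not something from the original text.

```python
import io
import pandas as pd

# Inline copy of score.csv (assumption: same values as the table above).
csv_text = """Name,Math,Science,English,History
Alice,78,82,75,68
Bob,85,79,80,74
Charlie,92,95,85,89
David,65,70,60,55
Eva,88,90,78,82
Frank,85,79,80,74
Grace,92,90,85,82
"""
df = pd.read_csv(io.StringIO(csv_text))
subjects = ["Math", "Science", "English", "History"]

# Column mode: mode() returns a Series because ties are possible.
math_modes = df["Math"].mode().tolist()
print(math_modes)  # [85, 92]

# Row-wise mode: one row per student, one column per tied value.
row_modes = df[subjects].mode(axis=1)
print(row_modes)

# Alternative check for repeated scores: fewer unique values than subjects.
has_repeat = df[subjects].nunique(axis=1) < len(subjects)
print(has_repeat.any())  # False: no student repeats a score
```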
Practical application:
- The best-selling item in an online store.
- The most frequently given customer feedback score.
- The most frequently recorded temperature in a city over a week.
8. Practical Assignments (Practice Exercises)
You can now apply everything you’ve learned to other datasets.
Column-wise (axis = 0)
Each column is treated separately, which is ideal for comparing subjects or time periods.
1. Find the total sales per quarter using .sum(axis=0)
2. Find the maximum and minimum sales with .max(axis=0) and .min(axis=0)
3. Calculate the average and median using .mean(axis=0) and .median(axis=0)
4. Count the number of available entries using .count(axis=0)
5. Identify which quarter had the highest total sales
6. Calculate the range (max − min) for each column
Row-wise (axis = 1)
Each row represents an individual record, such as a student or store.
1. Find the total yearly sales per store using .sum(axis=1)
2. Find the best and worst quarters with .max(axis=1) and .min(axis=1)
3. Calculate the average quarterly sales using .mean(axis=1)
4. Find the median per store with .median(axis=1)
5. Count how many quarters each store reported data for using .count(axis=1)
6. Identify which store achieved the highest total sales
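As a starting point for the exercises, here is a sketch on made-up quarterly sales figures. The store names, quarter labels, and numbers are all hypothetical. It covers the two tasks that go slightly beyond the earlier sections: finding which column or row has the highest total, and computing the range.

```python
import pandas as pd

# Hypothetical sales data: one row per store, one column per quarter.
sales = pd.DataFrame(
    {
        "Q1": [120, 90, 150],
        "Q2": [135, 110, 140],
        "Q3": [128, 95, 160],
        "Q4": [150, 130, 155],
    },
    index=["Store A", "Store B", "Store C"],
)

# Column-wise: total per quarter, then the label of the largest total.
quarter_totals = sales.sum(axis=0)
best_quarter = quarter_totals.idxmax()
print(best_quarter)  # Q4

# Range (max - min) for each quarter.
quarter_range = sales.max(axis=0) - sales.min(axis=0)
print(quarter_range)  # Q1 60, Q2 30, Q3 65, Q4 25

# Row-wise: total per store, then the label of the largest total.
store_totals = sales.sum(axis=1)
best_store = store_totals.idxmax()
print(best_store)  # Store C
```

idxmax() returns the index label of the largest value, which is why the stores are kept in the index rather than in a regular column here.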
Final Thoughts
Aggregate functions in Pandas underpin most data analysis tasks.
They help condense raw data into useful summaries.
1. Use sum to get totals
2. Use max and min to find extremes
3. Use count to measure how much data you have
4. Use mean, median, and mode to understand trends and patterns
By mastering these operations, you’ll be able to handle most real-world data analysis problems confidently, whether in education, business, or research.
For more detail, see the official Pandas documentation:
https://pandas.pydata.org/docs/

