Aggregate Functions in Pandas: Beginner’s Guide with Examples

Aggregate functions are among the most important Pandas concepts to learn when you start using Python for data analysis.
These functions let you summarize large datasets quickly, for example by finding the average store sales, the total of a student’s grades, or the highest and lowest prices.

This guide shows how to use the most common aggregate functions in Pandas, with examples based on a small sample dataset called score.csv.

By the end, you’ll be able to compute totals, averages, and more, both row-wise and column-wise.


The Dataset (score.csv)

We’ll use a sample file called score.csv. Load it and display its contents:

import pandas as pd
df = pd.read_csv('score.csv')
df
Name     Math  Science  English  History
Alice      78       82       75       68
Bob        85       79       80       74
Charlie    92       95       85       89
David      65       70       60       55
Eva        88       90       78       82
Frank      85       79       80       74
Grace      92       90       85       82

Each row shows one student and their marks in four subjects.
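If score.csv is not at hand, the same table can be built directly in code. This is a minimal sketch that copies the values shown above; it keeps the variable name df used throughout this guide.

```python
import pandas as pd

# Same values as the score.csv table shown above, typed in by hand
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})

print(df.shape)   # (7, 5): seven students, one name column plus four subjects
print(df.dtypes)  # the subject columns should be numeric for aggregation to work
```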


1. Sum Function in Pandas

Purpose:

The sum() function adds up all the numerical values in a column or row.

Example: Sum of a Single Column

df['Math'].sum()

Output:

585

Here, Pandas adds up the Math marks of all seven students, giving 585.

Example: Total of Several Columns

df[['Math', 'Science']].sum()

Output:

Math       585
Science    585
dtype: int64

This shows the combined marks of all students in Math and in Science. In this dataset both subjects happen to total 585.

Example: Sum Row-wise

df['Total'] = df.iloc[:, 1:5].sum(axis=1)
df

This makes a new column called “Total” that shows each student’s total score in all subjects.
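Selecting columns by position with iloc works, but it picks up different columns if the column order ever changes. Here is a small sketch of the same totals using column names instead. The subjects list is a helper name introduced here, and the table is rebuilt inline so the snippet runs without score.csv.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})

subjects = ["Math", "Science", "English", "History"]  # helper list, not in the original

# Column-wise (default axis=0): one total per subject
column_totals = df[subjects].sum()

# Row-wise (axis=1): one total per student
df["Total"] = df[subjects].sum(axis=1)

print(column_totals)
print(df[["Name", "Total"]])
```

Because the subject columns are named explicitly, the new Total column is never included by accident when the code is run a second time.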

In real life:

  • To find out how much money a company makes in a month, it adds up its daily sales.
  • To figure out how well a student did, a teacher adds up their grades on assignments.
  • To get the total cost of running a hospital, it adds up the costs from all of its departments.

2. Max Function in Pandas

Purpose:

The max() function returns the largest value in a row or column.

Example: Highest Score in a Subject

df['Math'].max()

Output:

92

In math, Charlie and Grace had the highest scores.

Example: Maximum Value in Several Columns

df[['Math', 'Science']].max()

Output:

Math       92
Science    95
dtype: int64

This shows the highest score in each subject.

Example: Highest Value per Student

df['Max Score'] = df.iloc[:, 1:5].max(axis=1)

The best subject score for each student is now displayed in each row.
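max() returns only the value, not who achieved it. The sketch below shows one way to recover the names, assuming the same table as above (rebuilt inline so it runs on its own). The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})

top_math = df["Math"].max()

# A boolean mask keeps every student tied for the top mark
top_students = df.loc[df["Math"] == top_math, "Name"].tolist()
print(top_math, top_students)  # 92 ['Charlie', 'Grace']

# idxmax() returns only the first row label with the maximum, so ties are lost
first_top = df.loc[df["Math"].idxmax(), "Name"]
print(first_top)  # Charlie
```

The mask is the safer choice when ties matter, as they do here.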

Real-world application:

  • A retailer finds the day with the highest sales.
  • A sports coach identifies each athlete’s best performance in a match.
  • A business analyst finds each customer’s highest order value.

3. Min Function in Pandas

Purpose:

The min() function returns the smallest value in a row or column.

Example: Lowest Score in a Subject

df['Math'].min()

Output:

65

In math, David received the lowest score.

Example: Minimum in Several Columns

df[['Math','Science']].min()

Output:

Math       65
Science    70
dtype: int64

Example: Minimum Score per Student

df['Min Score'] = df.iloc[:, 1:5].min(axis=1)

Each row now shows the student’s lowest subject score.
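The Min Score column holds the lowest mark but not the subject it belongs to. As a sketch, idxmin(axis=1) returns the column label of each row's smallest value. The "Weakest Subject" column name is made up for this example, and the table is rebuilt inline.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})

subjects = ["Math", "Science", "English", "History"]

# Lowest mark per student (a number)
df["Min Score"] = df[subjects].min(axis=1)

# Subject in which that lowest mark occurred (a column label)
df["Weakest Subject"] = df[subjects].idxmin(axis=1)

print(df[["Name", "Min Score", "Weakest Subject"]])
```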

Real-world application:

  • A delivery company checks each driver’s shortest delivery time.
  • A finance team finds the month with the lowest expenses.
  • A website finds its lowest daily visitor count to identify low-traffic days.

4. Count Function in Pandas

Purpose:

The count() function returns the number of non-missing (non-null) entries. Missing values such as NaN are skipped, but an empty string still counts as an entry.

Example: Count the Values in a Column

df['Name'].count()

Output:

7

The dataset contains seven students.

Example: Count for Every Subject

df.iloc[:, 1:5].count()

Output:

Math       7
Science    7
English    7
History    7
dtype: int64

Every subject has seven entries, which confirms that no marks are missing.

Example: Count Row-wise

df['Total Exams'] = df.iloc[:, 1:5].count(axis=1)

This shows the number of subjects for which each student has a recorded mark.
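score.csv has no gaps, so count() looks the same as the number of rows. The difference only shows up with missing data. The sketch below uses a small hypothetical table (the missing History mark for Bob is invented for illustration) to show what count() skips.

```python
import numpy as np
import pandas as pd

# Hypothetical excerpt: Bob's History mark is missing
marks = pd.DataFrame({
    "Name": ["Alice", "Bob", "David"],
    "Math": [78, 85, 65],
    "History": [68, np.nan, 55],
})

per_column = marks[["Math", "History"]].count()    # non-missing entries per subject
per_row = marks[["Math", "History"]].count(axis=1) # non-missing entries per student
missing_history = marks["History"].isna().sum()    # number of gaps in one column

print(len(marks), per_column["History"], missing_history)  # 3 2 1

# An empty string is not a missing value, so it is still counted; None is skipped
text_count = pd.Series(["", None, "x"]).count()
print(text_count)  # 2
```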

Real-world application:

  • Count the number of reviews customers have submitted.
  • Count the number of days for which sales data is available.
  • Count the employees who reported for work in a given week.

5. Mean Function in Pandas

Purpose:

The mean() function calculates the average by dividing the sum of the values by the number of values.

Example: Mean Score in a Subject

df['Math'].mean()

Output:

83.57

The class average in Math is about 83.57 (585 / 7, rounded to two decimal places).

Example: Average of Several Columns

df[['Math','Science']].mean()

Output:

Math       83.571429
Science    83.571429
dtype: float64

Example: Average per Student

df['Average Marks'] = df.iloc[:, 1:5].mean(axis=1)

This creates a new column that displays the average score for each student across all subjects.

Real-world application:

  • Average monthly sales.
  • Average rating for each product.
  • Average time spent on each task.
  • Average salary of the workers in a department.

Advice:

The mean is susceptible to outliers or extreme values. The mean may not accurately reflect the “typical” value if some numbers are significantly higher or lower than others.
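To make the effect of an outlier concrete, here is a minimal sketch. It takes the seven Math marks from the table and appends one hypothetical extra score of 5, which is not in score.csv, then compares the mean and the median before and after.

```python
import pandas as pd

math = pd.Series([78, 85, 92, 65, 88, 85, 92])

# Hypothetical: an eighth student with a very low mark of 5
with_outlier = pd.Series(math.tolist() + [5])

print(round(math.mean(), 2), math.median())    # 83.57 85.0
print(with_outlier.mean(), with_outlier.median())  # 73.75 85.0
```

One extreme value pulls the mean down by almost ten points, while the median stays at 85.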


6. Median Function in Pandas

Purpose:

The median() function returns the middle value once the numbers have been sorted. When there is an even number of values, it returns the average of the two middle values.
Unlike the mean, it is not strongly affected by outliers.

Example: Median of a Column

df['Math'].median()

Output:

85.0

85 is the middle value once the seven Math scores are sorted: at least half of the students scored 85 or lower, and at least half scored 85 or higher.

Example: Median for Every Student

df['Median Score'] = df.iloc[:, 1:5].median(axis=1)

This adds a column with each student’s median score across the four subjects.
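Each student has four marks, an even number, so there is no single middle value and the median is the average of the two middle marks. The sketch below checks this by hand for Alice. The table is rebuilt inline and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})

subjects = ["Math", "Science", "English", "History"]
df["Median Score"] = df[subjects].median(axis=1)

# Manual check for Alice (row 0): sort her marks and average the middle two
alice_sorted = sorted(df[subjects].iloc[0].tolist())   # [68, 75, 78, 82]
alice_manual = (alice_sorted[1] + alice_sorted[2]) / 2  # (75 + 78) / 2

print(alice_manual, df.loc[0, "Median Score"])  # 76.5 76.5
```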

Practical application:

  • HR teams can better understand what most employees make by looking at the median salary.
  • The median delivery time illustrates how long customers usually have to wait.
  • Governments use the median income to gauge economic inequality.

7. Mode Function in Pandas

Purpose:

The mode() function returns the value that appears most frequently. If several values tie, it returns all of them.

Example: Most Common Math Marks

df['Math'].mode()

Output:

0    85
1    92
Name: Math, dtype: int64

Both 85 and 92 appear twice in the Math column, more often than any other mark, so both are returned.

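Example: Mode for Every Student

df.iloc[:, 1:5].mode(axis=1)

For each student, this returns the mark that occurs most often across subjects, which reveals repeated marks. When every mark in a row is different, as for all seven students here, every value counts as a mode and Pandas returns all of them in sorted order.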
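A short sketch of how mode() behaves with ties and with all-unique values. value_counts() is used here as a cross-check, and the last example uses a hypothetical student with a repeated mark who is not in score.csv.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})
subjects = ["Math", "Science", "English", "History"]

# Column mode: 85 and 92 tie, so both come back
math_modes = df["Math"].mode().tolist()
counts = df["Math"].value_counts()  # how often each mark occurs
print(math_modes, counts[85], counts[92])

# Row-wise mode: no student repeats a mark, so each row lists all four marks, sorted
row_modes = df[subjects].mode(axis=1)
print(row_modes.iloc[0].tolist())  # Alice's marks in ascending order

# Hypothetical student with a repeated mark: only the repeated value is returned
repeated = pd.Series([80, 80, 75, 90]).mode().tolist()
print(repeated)
```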

Practical application:

  • The best-selling item in an online store.
  • The most frequently given customer feedback score.
  • The most frequently recorded daily temperature in a city during a month.

8. Practical Assignments (Practice Exercises)

You can now apply everything you’ve learned to other datasets. The exercises below assume a sales dataset with one row per store and one column per quarter.

Column-wise (axis = 0)

Each column is treated separately, which is ideal for comparing subjects or time periods.

1. Find the total sales per quarter using .sum(axis=0)
2. Find the maximum and minimum sales with .max(axis=0) and .min(axis=0)
3. Calculate the average and median using .mean(axis=0) and .median(axis=0)
4. Count the number of available entries using .count(axis=0)
5. Identify which quarter had the highest total sales
6. Calculate the range (max − min) for each column

Row-wise (axis = 1)

Each row represents an individual record, such as a student or store.

1. Find the total yearly sales per store using .sum(axis=1)
2. Find the best and worst quarters with .max(axis=1) and .min(axis=1)
3. Calculate the average quarterly sales using .mean(axis=1)
4. Find the median per store with .median(axis=1)
5. Count how many quarters each store reported data for using .count(axis=1)
6. Identify which store achieved the highest total sales
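If you do not have a sales dataset to practice on, the sketch below builds a tiny one. All store names and figures are invented for illustration. It works through a few of the exercises as a starting point and leaves the rest to you.

```python
import pandas as pd

# Hypothetical quarterly sales: one row per store, one column per quarter
sales = pd.DataFrame(
    {
        "Q1": [120, 95, 150],
        "Q2": [130, 100, 140],
        "Q3": [110, 105, 160],
        "Q4": [140, 90, 155],
    },
    index=["Store A", "Store B", "Store C"],
)

# Column-wise (axis=0): one result per quarter
quarter_totals = sales.sum(axis=0)
best_quarter = quarter_totals.idxmax()
value_range = sales.max(axis=0) - sales.min(axis=0)

# Row-wise (axis=1): one result per store
store_totals = sales.sum(axis=1)
best_store = store_totals.idxmax()

print(best_quarter, best_store)
print(value_range)
```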


Final Thoughts

Aggregate functions in Pandas are a foundation of most data analysis tasks.
They help condense raw data into useful summaries.

1. Use sum to get totals
2. Use max and min to find extremes
3. Use count to measure how much data you have
4. Use mean, median, and mode to understand trends and patterns

By mastering these operations, you’ll be able to handle most real-world data analysis problems confidently, whether in education, business, or research.
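As a closing sketch, several of the functions covered in this guide can be computed in one call with agg(), which accepts a list of function names. This goes one step beyond the individual calls shown earlier, and the table is rebuilt inline so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva", "Frank", "Grace"],
    "Math": [78, 85, 92, 65, 88, 85, 92],
    "Science": [82, 79, 95, 70, 90, 79, 90],
    "English": [75, 80, 85, 60, 78, 80, 85],
    "History": [68, 74, 89, 55, 82, 74, 82],
})
subjects = ["Math", "Science", "English", "History"]

# One row per aggregate function, one column per subject
summary = df[subjects].agg(["sum", "max", "min", "count", "mean", "median"])
print(summary.round(2))
```

mode is left out of the list because it can return more than one value per column, so it does not fit neatly into a one-row-per-function table.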

