Summarizing multiple columns at once is one of the most powerful features of Pandas — and the .agg() method is the cleanest way to do it. Instead of calling .mean(), .median(), etc. separately on each column, .agg() lets you apply one or more functions across multiple columns in a single, readable call.
As datasets grow and analysis pipelines get more complex, that single call saves time and keeps your code maintainable. This guide walks through the main patterns using one small example DataFrame.
1. Basic Multi-Column Summary
Apply the same set of functions to several columns — perfect for quick overviews.
import pandas as pd
data = {
    'Name': ['John', 'Mary', 'Peter', 'Anna', 'Mike'],
    'Age': [25, 32, 18, 47, 23],
    'Salary': [50000, 80000, 35000, 65000, 45000]
}
df = pd.DataFrame(data)
# Same stats on multiple columns
summary = df[['Age', 'Salary']].agg(['mean', 'median', 'min', 'max', 'std'])
print(summary)
Output:
              Age        Salary
mean    29.000000  55000.000000
median  25.000000  50000.000000
min     18.000000  35000.000000
max     47.000000  80000.000000
std     11.247222  17677.669530
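The std row is the easiest one to misread, because it depends on which formula is used. The sketch below rebuilds the Age column on its own (the variable names are illustrative, not from the original example) and compares the pandas default, which is the sample standard deviation with ddof=1, against the population version and a manual calculation.

```python
import math
import pandas as pd

# Same ages as in the example DataFrame
age = pd.Series([25, 32, 18, 47, 23], name="Age")

sample_std = age.std()            # pandas default: ddof=1 (divide by n - 1)
population_std = age.std(ddof=0)  # divide by n instead

# Recompute the sample value by hand to see what .agg('std') reports
mean = age.mean()
manual = math.sqrt(((age - mean) ** 2).sum() / (len(age) - 1))

print(round(sample_std, 4), round(population_std, 4), round(manual, 4))
```

If you need the population figure inside .agg(), one option is to pass a small function such as lambda x: x.std(ddof=0) instead of the string 'std'.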
2. Different Functions per Column (Most Powerful Use)
Use a dictionary to assign specific aggregations to specific columns — this is where .agg() truly shines.
custom_summary = df.agg({
    'Age': ['min', 'max', 'mean', 'std'],
    'Salary': ['sum', 'mean', 'median']
})
print(custom_summary)
Output:
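              Age    Salary
min     18.000000       NaN
max     47.000000       NaN
mean    29.000000   55000.0
std     11.247222       NaN
sum           NaN  275000.0
median        NaN   50000.0
The rows are the union of all requested functions. A cell is NaN wherever that function was not requested for that column, and the row order can differ between pandas versions.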
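When each column needs exactly one statistic, the NaN padding can be avoided by mapping each column to a single function name instead of a list. The following is a minimal sketch on the same Age and Salary values; the variable name totals is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

# One function name per column (strings, not lists) returns a Series
# indexed by column name rather than a DataFrame padded with NaN.
totals = df.agg({"Age": "mean", "Salary": "sum"})
print(totals)
```

The trade-off is that you get only one value per column, so this form suits compact reports more than exploratory summaries.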
3. With groupby(): Grouped Multi-Column Summaries
Combine .agg() with groupby() for grouped reports — extremely common in real analysis.
# Add a department column for grouping
df['Department'] = ['Sales', 'Engineering', 'Sales', 'HR', 'Engineering']
# Grouped aggregation with different functions per column
dept_summary = df.groupby('Department').agg({
    'Age': ['mean', 'min', 'max'],
    'Salary': ['sum', 'mean', 'count']
})
print(dept_summary)
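A grouped aggregation with a dictionary of lists returns two-level column labels such as ('Age', 'mean'), which can be awkward to export or merge. The sketch below shows one common way to flatten them, assuming the same data as above; the joined names like Age_mean and the variable flat are a naming choice made here, not something pandas produces on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["Sales", "Engineering", "Sales", "HR", "Engineering"],
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

dept_summary = df.groupby("Department").agg({
    "Age": ["mean", "min", "max"],
    "Salary": ["sum", "mean", "count"],
})

# Columns are a two-level MultiIndex; join the levels into single strings
flat = dept_summary.copy()
flat.columns = ["_".join(col) for col in flat.columns]

# Move the group key from the index back into an ordinary column
flat = flat.reset_index()
print(flat)
```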
4. Named Aggregations (Clean, Readable Output)
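Named aggregation lets you choose the labels in the result. Pass keyword arguments of the form name=(column, function); each tuple is shorthand for pd.NamedAgg(column=..., aggfunc=...). This syntax needs pandas 0.25+ after groupby() and pandas 1.1+ on a plain DataFrame. Called directly on a DataFrame, as here, the result has one row per name and one column per source column, with NaN where a name does not apply to that column.
from pandas import NamedAgg  # optional: explicit form of the (column, function) tuples
named_summary = df.agg(
    age_mean=('Age', 'mean'),
    age_range=('Age', lambda x: x.max() - x.min()),
    salary_total=('Salary', 'sum'),
    salary_avg=('Salary', 'mean')
)
print(named_summary)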
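Named aggregation is arguably most useful after groupby(), where each keyword becomes one flat output column. The sketch below assumes the same data with the Department column and mixes the explicit pd.NamedAgg form with the tuple shorthand; the output names (age_mean, salary_total, headcount) are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["Sales", "Engineering", "Sales", "HR", "Engineering"],
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

named = df.groupby("Department").agg(
    # Explicit form
    age_mean=pd.NamedAgg(column="Age", aggfunc="mean"),
    # Tuple shorthand for the same thing
    salary_total=("Salary", "sum"),
    headcount=("Salary", "count"),
)
print(named)
```

Because the columns are already single-level, only reset_index() is needed if the department should become a regular column.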
5. Custom Aggregation Functions
Define your own functions for domain-specific metrics (e.g., coefficient of variation).
def cv(x):
    # Coefficient of variation: sample standard deviation divided by the mean
    return x.std() / x.mean() if x.mean() != 0 else 0
advanced = df[['Age', 'Salary']].agg({
    'Age': ['mean', cv],
    'Salary': ['mean', 'sum', cv]
})
print(advanced)
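Custom functions are labeled in the result by their function name, which matters when you need several variants of the same calculation. The sketch below uses a hypothetical helper, make_quantile, that builds a quantile function and sets its __name__ so that each result row gets a distinct label; neither the helper nor the q25/q75 labels come from pandas itself.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

def make_quantile(p):
    """Hypothetical helper: build an aggregation function for quantile p."""
    def quantile_fn(x):
        return x.quantile(p)
    # pandas labels the result row with the function's __name__,
    # so give each generated function a distinct one
    quantile_fn.__name__ = f"q{int(p * 100)}"
    return quantile_fn

quartiles = df[["Age", "Salary"]].agg([make_quantile(0.25), make_quantile(0.75)])
print(quartiles)
```

Giving each function its own name also sidesteps the duplicate-name problem that arises when several anonymous lambdas are passed in one list.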
6. Modern Alternative in 2026: Polars
For large datasets or speed-critical work, Polars is often faster and more memory-efficient.
import polars as pl
df_pl = pl.DataFrame(data)
summary_pl = df_pl.select([
    pl.col("Age").mean().alias("age_mean"),
    pl.col("Salary").sum().alias("salary_total"),
    pl.col("Salary").std().alias("salary_std")
])
print(summary_pl)
Best Practices & Common Pitfalls
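- Prefer dictionary syntax when columns need different functions; it makes the column-to-function mapping explicit
- Keep one style per call: a list applies every function to every column, while a dict maps each column to its own function or list of functions
- Decide how to treat missing data before aggregating: built-in reductions such as mean and sum skip NaN by default, so call fillna() or dropna() first if that is not what you want
- Use reset_index() after groupby().agg() to turn the group keys back into columns; with a dict of lists the column labels are still two-level, so flatten them or use named aggregation if you want a fully flat DataFrame
- For huge data, consider Polars or chunked processing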
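How missing values are treated changes the numbers, so it helps to compare the options side by side. The sketch below uses a small made-up frame with gaps (df_nan and its values are illustrative only) and aggregates it three ways: leaving NaN in place, filling with 0, and dropping incomplete rows.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value in each column
df_nan = pd.DataFrame({
    "Age": [25, 32, np.nan, 47],
    "Salary": [50000, np.nan, 35000, 65000],
})

# Default behaviour: mean skips NaN and count counts non-missing values
skipped = df_nan.agg(["mean", "count"])

# Filling with 0 keeps every row but pulls the mean down
filled = df_nan.fillna(0).agg(["mean", "count"])

# Dropping rows with any NaN also discards valid values in those rows
dropped = df_nan.dropna().agg(["mean", "count"])

print(skipped, filled, dropped, sep="\n\n")
```

None of the three is correct in general; the right choice depends on what a missing value means in your data.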
Conclusion
The .agg() method turns complex multi-column, multi-function summaries into clean, readable code — especially when paired with groupby(). In 2026, master .agg() with dictionaries, NamedAgg, and custom functions, and you'll write faster, clearer summaries that scale from quick exploration to production reporting.
Next time you need multiple statistics across columns — reach for .agg() first.