Summaries on multiple columns

Summarizing several columns at once is a routine task in pandas, and the .agg() method is a concise way to do it. Instead of calling .mean(), .median(), and so on separately for each column, .agg() applies one or more functions across multiple columns in a single, readable call.

In 2026, when datasets are larger and analysis pipelines are more complex, this method saves time and keeps your code maintainable. Here’s a practical guide with real examples.

1. Basic Multi-Column Summary

Apply the same set of functions to several columns — perfect for quick overviews.


import pandas as pd

data = {
    'Name': ['John', 'Mary', 'Peter', 'Anna', 'Mike'],
    'Age': [25, 32, 18, 47, 23],
    'Salary': [50000, 80000, 35000, 65000, 45000]
}

df = pd.DataFrame(data)

# Same stats on multiple columns
summary = df[['Age', 'Salary']].agg(['mean', 'median', 'min', 'max', 'std'])

print(summary)

Output:


             Age        Salary
mean     29.0000     55000.0000
median   25.0000     50000.0000
min      18.0000     35000.0000
max      47.0000     80000.0000
std      11.2472     17677.6695

2. Different Functions per Column (Most Powerful Use)

Use a dictionary to assign specific aggregations to specific columns — this is where .agg() truly shines.


custom_summary = df.agg({
    'Age': ['min', 'max', 'mean', 'std'],
    'Salary': ['sum', 'mean', 'median']
})

print(custom_summary)

Output:


              Age         Salary
min     18.000000            NaN
max     47.000000            NaN
mean    29.000000   55000.000000
std     11.247222            NaN
sum           NaN  275000.000000
median        NaN   50000.000000
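The NaN cells appear because each function is computed only for the columns that request it. As a minimal sketch (the variable names one_each and padded are illustrative, not from the original example), passing a single function per column returns a Series with no padding, while lists of different functions return a DataFrame padded with NaN:

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

# One function per column: the result is a Series indexed by column name
one_each = df.agg({"Age": "mean", "Salary": "sum"})

# Different function lists per column: the result is a DataFrame,
# and cells for functions a column did not request are NaN
padded = df.agg({"Age": ["min", "max"], "Salary": ["sum"]})

print(one_each)
print(padded)
```

The Series form is often easier to feed into a report when each column needs exactly one statistic.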

3. With groupby(): Grouped Multi-Column Summaries

Combine .agg() with groupby() for grouped reports — extremely common in real analysis.


# Add a department column for grouping
df['Department'] = ['Sales', 'Engineering', 'Sales', 'HR', 'Engineering']

# Grouped aggregation with different functions per column
dept_summary = df.groupby('Department').agg({
    'Age': ['mean', 'min', 'max'],
    'Salary': ['sum', 'mean', 'count']
})

print(dept_summary)
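The grouped result above has two-level columns of the form (column, function), which can be awkward to export or merge. One possible way to flatten them, sketched here with an illustrative "column_function" naming scheme that is an assumption rather than a pandas convention:

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["Sales", "Engineering", "Sales", "HR", "Engineering"],
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

dept_summary = df.groupby("Department").agg({
    "Age": ["mean", "min", "max"],
    "Salary": ["sum", "mean", "count"],
})

# Join each (column, function) pair into a single label, e.g. "Age_mean"
flat = dept_summary.copy()
flat.columns = ["_".join(pair) for pair in flat.columns]

# Move the group key from the index back into a regular column
flat = flat.reset_index()

print(flat)
```

After this step every column has a single string name, so the table behaves like any ordinary DataFrame.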

4. Named Aggregations (Clean, Readable Output)

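Use named aggregation, that is keyword arguments of the form new_name=(column, function), to get meaningful labels in the result. The syntax is available for groupby() since pandas 0.25 and for DataFrame.agg() since pandas 1.1; pd.NamedAgg is an explicit spelling of the same (column, function) tuple. On a plain DataFrame the names become row labels; with groupby() they become column names.


# Plain (column, function) tuples are enough here; no NamedAgg import is needed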

named_summary = df.agg(
    age_mean=('Age', 'mean'),
    age_range=('Age', lambda x: x.max() - x.min()),
    salary_total=('Salary', 'sum'),
    salary_avg=('Salary', 'mean')
)

print(named_summary)
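Named aggregation is arguably most useful together with groupby(), where each keyword becomes a flat column name. A short sketch using pd.NamedAgg explicitly; the output names (age_mean, age_range, salary_total, headcount) are illustrative choices, and the Department values are the ones introduced in the groupby section:

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["Sales", "Engineering", "Sales", "HR", "Engineering"],
    "Age": [25, 32, 18, 47, 23],
    "Salary": [50000, 80000, 35000, 65000, 45000],
})

dept_named = df.groupby("Department").agg(
    age_mean=pd.NamedAgg(column="Age", aggfunc="mean"),
    # A lambda works too: range of ages within each department
    age_range=pd.NamedAgg(column="Age", aggfunc=lambda s: s.max() - s.min()),
    salary_total=pd.NamedAgg(column="Salary", aggfunc="sum"),
    headcount=pd.NamedAgg(column="Salary", aggfunc="count"),
).reset_index()

print(dept_named)
```

Because the columns are already flat, no MultiIndex cleanup is needed afterwards.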

5. Custom Aggregation Functions

Define your own functions for domain-specific metrics (e.g., coefficient of variation).


def cv(x):
    return x.std() / x.mean() if x.mean() != 0 else 0

advanced = df[['Age', 'Salary']].agg({
    'Age': ['mean', cv],
    'Salary': ['mean', 'sum', cv]
})

print(advanced)

6. Modern Alternative in 2026: Polars

For large datasets or speed-critical work, Polars is often faster and more memory-efficient.


import polars as pl

df_pl = pl.DataFrame(data)
summary_pl = df_pl.select([
    pl.col("Age").mean().alias("age_mean"),
    pl.col("Salary").sum().alias("salary_total"),
    pl.col("Salary").std().alias("salary_std")
])
print(summary_pl)

Best Practices & Common Pitfalls

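Prefer dictionary syntax when columns need different functions; it makes the intent explicit
Pick one style per .agg() call: a list applies the same functions to every column, while a dict assigns functions per column (its values may themselves be lists)
Decide how to treat missing data before aggregating: built-in reductions such as mean and sum skip NaN by default, so use fillna() or dropna() when that is not what you want
Use reset_index() after groupby().agg() to turn the group keys back into regular columns; two-level column labels need to be flattened separately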
For huge data, consider Polars or chunked processing
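To make the missing-data point concrete, here is a small sketch comparing three treatments. The df_missing frame and its NaN placement are invented for illustration, and filling with the column median is just one possible choice:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Age and one missing Salary
df_missing = pd.DataFrame({
    "Age": [25, 32, np.nan, 47, 23],
    "Salary": [50000, np.nan, 35000, 65000, 45000],
})

# Default behavior: NaN is skipped, so each column uses its own 4 valid values
skipped = df_missing.agg(["mean", "count"])

# Fill gaps with each column's median before aggregating
filled = df_missing.fillna(df_missing.median()).agg(["mean", "count"])

# Drop any row with a missing value: only 3 complete rows remain
dropped = df_missing.dropna().agg(["mean", "count"])

print(skipped, filled, dropped, sep="\n\n")
```

The three results differ in both mean and count, which is why it helps to choose the treatment deliberately rather than rely on the default.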

Conclusion

The .agg() method turns complex multi-column, multi-function summaries into clean, readable code — especially when paired with groupby(). In 2026, master .agg() with dictionaries, NamedAgg, and custom functions, and you'll write faster, clearer summaries that scale from quick exploration to production reporting.

Next time you need multiple statistics across columns — reach for .agg() first.