Multiple grouped summaries — applying different aggregations to different columns across one or more grouping variables — is one of the most powerful and frequently used patterns in data analysis. In Pandas, you achieve this by chaining groupby() with .agg() and passing a dictionary where keys are column names and values are single functions or lists of functions.
In 2026, this technique powers dashboards, cohort analysis, financial breakdowns, A/B test results, and any multi-dimensional reporting. Here’s a practical guide with real examples you can copy and adapt immediately.
1. Basic Setup & Sample Data
import pandas as pd
data = {
'Group1': ['A', 'B', 'C', 'A', 'B', 'C'],
'Group2': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
'Value1': [1, 2, 3, 4, 5, 6],
'Value2': [10, 20, 30, 40, 50, 60],
'Sales': [100, 200, 150, 300, 250, 400]
}
df = pd.DataFrame(data)
print(df.head())
2. Group by Multiple Columns + Different Aggregations per Column
Use a dictionary inside .agg(): keys = column names, values = function (or list of functions).
multi_grouped = df.groupby(['Group1', 'Group2']).agg({
'Value1': ['mean', 'sum'], # multiple stats on Value1
'Value2': 'mean', # single stat on Value2
'Sales': ['sum', 'mean'] # multiple stats on Sales
})
print(multi_grouped)
Output (multi-level columns — very common in reports):
Value1 Value2 Sales
mean sum mean sum mean
Group1 Group2
A X 1.0 1 10.0 100.0 50.0
Y 4.0 4 40.0 300.0 150.0
B X 2.0 2 20.0 200.0 100.0
Z 5.0 5 50.0 250.0 125.0
C Y 3.0 3 30.0 150.0 75.0
Z 6.0 6 60.0 400.0 200.0
3. Clean Output with Named Aggregations
Use NamedAgg or alias syntax for flat, readable column names (highly recommended for reports).
from pandas import NamedAgg
named_summary = df.groupby(['Group1', 'Group2']).agg(
avg_value1=('Value1', 'mean'),
total_value1=('Value1', 'sum'),
avg_value2=('Value2', 'mean'),
grand_total_sales=('Sales', 'sum')
)
print(named_summary)
Output (clean & flat — perfect for dashboards):
avg_value1 total_value1 avg_value2 grand_total_sales
Group1 Group2
A X 1.0 1 10.0 100
Y 4.0 4 40.0 300
B X 2.0 2 20.0 200
Z 5.0 5 50.0 250
C Y 3.0 3 30.0 150
Z 6.0 6 60.0 400
4. Even More Aggregations (Real-World Reporting Example)
Combine many metrics — count, mean, sum, min/max — across groups.
report = df.groupby(['Group1', 'Group2']).agg({
'Value1': ['count', 'mean', 'sum', 'min', 'max'],
'Value2': 'mean',
'Sales': ['sum', 'mean']
})
# Optional: flatten column names
report.columns = ['_'.join(col).strip() for col in report.columns.values]
print(report)
5. Modern Alternative in 2026: Polars
For large datasets, Polars is often faster and more memory-efficient with similar syntax.
import polars as pl
df_pl = pl.DataFrame(data)
grouped_pl = df_pl.group_by(["Group1", "Group2"]).agg(
avg_value1=pl.col("Value1").mean(),
total_value1=pl.col("Value1").sum(),
avg_value2=pl.col("Value2").mean(),
total_sales=pl.col("Sales").sum()
)
print(grouped_pl)
Best Practices & Common Pitfalls
- Use dictionary syntax for different functions per column — much clearer than lists
- Handle missing data before aggregation (
fillnaordropna) - Reset index after groupby if you want grouping columns as regular:
.reset_index() - For huge data, prefer Polars or chunked processing
- Visualize results:
grouped_multi.plot(kind='bar')for instant insights
Conclusion
Grouping by multiple variables + applying multiple summaries per column with groupby() + .agg() turns raw data into rich, multi-dimensional insights — averages, totals, counts, min/max across subgroups. In 2026, use Pandas for readability on small-to-medium data, and Polars for speed on large datasets. Master dictionary aggregations, NamedAgg, and custom functions, and you'll write concise, powerful summaries that scale from exploration to production reporting.
Next time you need cross-dimensional metrics — reach for multi-column groupby + agg first.