Many groups, many summaries

Multiple grouped summaries — applying different aggregations to different columns across one or more grouping variables — is one of the most powerful and frequently used patterns in data analysis. In Pandas, you achieve this by chaining groupby() with .agg() and passing a dictionary where keys are column names and values are single functions or lists of functions.

In 2026, this technique powers dashboards, cohort analysis, financial breakdowns, A/B test results, and any multi-dimensional reporting. Here’s a practical guide with real examples you can copy and adapt immediately.

1. Basic Setup & Sample Data


import pandas as pd

data = {
    'Group1': ['A', 'B', 'C', 'A', 'B', 'C'],
    'Group2': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'Value1': [1, 2, 3, 4, 5, 6],
    'Value2': [10, 20, 30, 40, 50, 60],
    'Sales': [100, 200, 150, 300, 250, 400]
}

df = pd.DataFrame(data)
print(df.head())

2. Group by Multiple Columns + Different Aggregations per Column

Use a dictionary inside .agg(): keys = column names, values = function (or list of functions).


multi_grouped = df.groupby(['Group1', 'Group2']).agg({
    'Value1': ['mean', 'sum'],    # multiple stats on Value1
    'Value2': 'mean',             # single stat on Value2
    'Sales': ['sum', 'mean']      # multiple stats on Sales
})

print(multi_grouped)

Output (multi-level columns — very common in reports):


               Value1       Value2     Sales      
                 mean  sum     mean     sum   mean
Group1 Group2                                      
A      X         1.0    1     10.0   100.0   50.0
       Y         4.0    4     40.0   300.0  150.0
B      X         2.0    2     20.0   200.0  100.0
       Z         5.0    5     50.0   250.0  125.0
C      Y         3.0    3     30.0   150.0   75.0
       Z         6.0    6     60.0   400.0  200.0

3. Clean Output with Named Aggregations

Use NamedAgg or alias syntax for flat, readable column names (highly recommended for reports).


from pandas import NamedAgg

named_summary = df.groupby(['Group1', 'Group2']).agg(
    avg_value1=('Value1', 'mean'),
    total_value1=('Value1', 'sum'),
    avg_value2=('Value2', 'mean'),
    grand_total_sales=('Sales', 'sum')
)

print(named_summary)

Output (clean & flat — perfect for dashboards):


                 avg_value1  total_value1  avg_value2  grand_total_sales
Group1 Group2                                                           
A      X               1.0             1        10.0                100
       Y               4.0             4        40.0                300
B      X               2.0             2        20.0                200
       Z               5.0             5        50.0                250
C      Y               3.0             3        30.0                150
       Z               6.0             6        60.0                400

4. Even More Aggregations (Real-World Reporting Example)

Combine many metrics — count, mean, sum, min/max — across groups.


report = df.groupby(['Group1', 'Group2']).agg({
    'Value1': ['count', 'mean', 'sum', 'min', 'max'],
    'Value2': 'mean',
    'Sales': ['sum', 'mean']
})

# Optional: flatten column names
report.columns = ['_'.join(col).strip() for col in report.columns.values]

print(report)

5. Modern Alternative in 2026: Polars

For large datasets, Polars is often faster and more memory-efficient with similar syntax.


import polars as pl

df_pl = pl.DataFrame(data)
grouped_pl = df_pl.group_by(["Group1", "Group2"]).agg(
    avg_value1=pl.col("Value1").mean(),
    total_value1=pl.col("Value1").sum(),
    avg_value2=pl.col("Value2").mean(),
    total_sales=pl.col("Sales").sum()
)
print(grouped_pl)

Best Practices & Common Pitfalls

Use dictionary syntax for different functions per column — much clearer than lists
Handle missing data before aggregation (fillna or dropna)
Reset index after groupby if you want grouping columns as regular: .reset_index()
For huge data, prefer Polars or chunked processing
Visualize results: grouped_multi.plot(kind='bar') for instant insights

Conclusion

Grouping by multiple variables + applying multiple summaries per column with groupby() + .agg() turns raw data into rich, multi-dimensional insights — averages, totals, counts, min/max across subgroups. In 2026, use Pandas for readability on small-to-medium data, and Polars for speed on large datasets. Master dictionary aggregations, NamedAgg, and custom functions, and you'll write concise, powerful summaries that scale from exploration to production reporting.

Next time you need cross-dimensional metrics — reach for multi-column groupby + agg first.