Pivot on two variables

Pivoting on two variables (or more) is one of the most powerful ways to reshape and summarize data in Pandas — it lets you create multi-level row and column groupings (e.g., sales by region and month, users by cohort and device) in a clean, cross-tabulated view. This is perfect for time-series breakdowns, cohort analysis, geographic reports, and any multi-dimensional exploration.

In 2026, this pattern remains essential for dashboards and interactive reporting. Here’s a practical guide with real examples you can copy and adapt.

1. Basic Setup & Sample Data


import pandas as pd

data = {
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Month': ['Jan', 'Feb', 'Jan', 'Feb', 'Jan', 'Feb'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125],
    'Profit': [20, 50, 30, 10, 15, 25]
}

df = pd.DataFrame(data)
print(df)

2. Pivot on Two Variables: Region × Month

Use a list for index (rows) to create a multi-level row index, and a single column for columns.


pivot_two_vars = pd.pivot_table(
    df,
    values=['Sales', 'Profit'],
    index=['Region', 'Month'],          # two variables for rows
    columns='Salesperson',
    aggfunc='sum',
    fill_value=0,                       # replace NaN with 0
    margins=True                        # add grand totals
)

print(pivot_two_vars)

Output (multi-level index by Region and Month, clean totals):


                       Profit               Sales             
                        sum                       sum        
Salesperson           Alice  Bob Charlie  Dave  Eve Frank   Alice   Bob Charlie  Dave   Eve Frank   All
Region Month                                                                                          
North  Feb             50.0  0.0     0.0   0.0  0.0   0.0   200.0   0.0     0.0   0.0   0.0   0.0   200
       Jan             20.0  0.0     0.0   0.0  0.0   0.0   100.0   0.0     0.0   0.0   0.0   0.0   100
South  Feb              0.0  0.0    10.0   0.0  0.0   0.0     0.0   0.0    50.0   0.0   0.0   0.0    50
       Jan              0.0  0.0    30.0   0.0  0.0   0.0     0.0   0.0   150.0   0.0   0.0   0.0   150
West   Feb              0.0  0.0     0.0   0.0 25.0  0.0     0.0   0.0     0.0   0.0 125.0   0.0   125
       Jan              0.0  0.0     0.0   0.0 15.0  0.0     0.0   0.0     0.0   0.0  75.0   0.0    75
All                     70.0 50.0    40.0  10.0 40.0  25.0   300.0 200.0   200.0  50.0 200.0 125.0   700

3. Pivot on Two Variables + Different Aggregations

Use a dictionary in aggfunc for different stats per metric (e.g., sum sales, mean profit).


pivot_diff_agg = pd.pivot_table(
    df,
    values=['Sales', 'Profit'],
    index=['Region', 'Month'],
    columns='Salesperson',
    aggfunc={'Sales': 'sum', 'Profit': 'mean'},
    fill_value=0,
    margins=True
)

print(pivot_diff_agg)

4. Real-World Example: Sales & Profit by Region × Month × Salesperson

Full multi-dimensional pivot with totals and different stats.


pivot_full = pd.pivot_table(
    df,
    values=['Sales', 'Profit'],
    index=['Region', 'Month'],
    columns='Salesperson',
    aggfunc=['sum', 'mean'],
    fill_value=0,
    margins=True
)

print(pivot_full)

5. Modern Alternative in 2026: Polars

For large datasets, Polars is often faster and more memory-efficient with a similar pivot API.


import polars as pl

df_pl = pl.DataFrame(data)
pivot_pl = df_pl.pivot(
    index=["Region", "Month"],
    columns="Salesperson",
    values=["Sales", "Profit"],
    aggregate_function="sum"
)
print(pivot_pl)

Best Practices & Common Pitfalls

Use list for index or columns when pivoting on multiple variables — creates multi-level hierarchy
Always specify aggfunc — default is mean, which surprises many
Use fill_value=0 or dropna=False to control missing combinations
Add margins=True for grand totals — great for reports
Reset index after pivot if you need Region/Month as regular columns
For huge data, prefer Polars or chunked processing
Visualize: pivot_two_vars.plot(kind='bar', stacked=True) for instant insights

Conclusion

Pivoting on two (or more) variables with pivot_table() creates powerful multi-dimensional summaries — totals, averages, counts by region and month, cohort and channel, etc. In 2026, use Pandas for readability and flexibility on small-to-medium data, and Polars for speed on large datasets. Master index lists, columns, values, aggfunc, margins, and fill_value, and you'll build insightful cross-tab reports in minutes.

Next time you need breakdowns across two dimensions — reach for pivot_table with multi-index first.