Converting grouped data into a pivot table is a common and powerful workflow in Pandas — you first use groupby() to aggregate raw data, then reshape the result into a clean, cross-tabulated pivot table for reporting, dashboards, or further analysis. This combination gives you the flexibility of groupby with the readability of pivot tables (similar to Excel, but fully programmable).
In 2026, this pattern remains essential for sales breakdowns, cohort reports, A/B test summaries, and multi-dimensional views. Here’s a practical guide with real examples you can copy and adapt.
1. Basic Setup & Sample Data
import pandas as pd
data = {
'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
'Sales': [100, 200, 150, 50, 75, 125]
}
df = pd.DataFrame(data)
print(df)
2. Step 1: Groupby Aggregation
First, group by the dimensions you care about and aggregate the metric(s).
# Group by Region and Salesperson -> sum of Sales
grouped = df.groupby(['Region', 'Salesperson'])['Sales'].sum()
print(grouped)
Output (Series with multi-index):
Region Salesperson
North Alice 100
Bob 200
South Charlie 150
Dave 50
West Eve 75
Frank 125
Name: Sales, dtype: int64
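When more than one metric is needed per group, named aggregation is one option. The sketch below reuses the sample data; the output column names (total_sales, avg_sale, reps) are illustrative choices, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125],
})

# Named aggregation: each keyword is an output column,
# each tuple is (source column, aggregation function).
# The output names below are hypothetical.
summary = (
    df.groupby('Region')
      .agg(
          total_sales=('Sales', 'sum'),
          avg_sale=('Sales', 'mean'),
          reps=('Salesperson', 'nunique'),
      )
      .reset_index()  # turn Region back into a regular column
)
print(summary)
```

The result is a flat, long-format table with one row per Region, which is a convenient starting point for a later reshape.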
3. Step 2: Convert Grouped Result to Pivot Table
Call unstack() on the grouped Series to move the inner index level (Salesperson) into columns. Calling pivot_table() directly on the original df produces the same shape.
# Reshape into pivot table
pivot_from_grouped = grouped.unstack() # or use pivot_table directly
print(pivot_from_grouped)
Output (clean pivot with NaN for missing combinations):
Salesperson Alice Bob Charlie Dave Eve Frank
Region
North 100.0 200.0 NaN NaN NaN NaN
South NaN NaN 150.0 50.0 NaN NaN
West NaN NaN NaN NaN 75.0 125.0
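If the NaN cells are unwanted, unstack() accepts a fill_value argument. A minimal sketch on the same sample data, assuming 0 is a sensible stand-in for "no sales" in your report:

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125],
})

grouped = df.groupby(['Region', 'Salesperson'])['Sales'].sum()

# fill_value replaces Region/Salesperson combinations that
# do not exist in the grouped result, instead of leaving NaN
pivot_filled = grouped.unstack(fill_value=0)
print(pivot_filled)
```

Filling during the reshape avoids a separate fillna() call afterwards.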
4. Preferred Way: Use pivot_table() Directly (Simpler & More Flexible)
Skip the intermediate groupby step — pivot_table() does grouping and aggregation internally.
pivot_direct = pd.pivot_table(
df,
values='Sales',
index='Region',
columns='Salesperson',
aggfunc='sum',
fill_value=0, # replace NaN with 0
margins=True # add grand totals
)
print(pivot_direct)
Output (with totals):
Salesperson Alice Bob Charlie Dave Eve Frank All
Region
North 100 200 0 0 0 0 300
South 0 0 150 50 0 0 200
West 0 0 0 0 75 125 200
All 100 200 150 50 75 125 700
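To see that the two routes agree, the sketch below builds the table both ways (without margins) and compares the cell values. It compares values only, since the integer or float dtype of the result can differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125],
})

# Route 1: explicit groupby, then reshape
via_groupby = (
    df.groupby(['Region', 'Salesperson'])['Sales']
      .sum()
      .unstack(fill_value=0)
)

# Route 2: pivot_table groups and aggregates internally
via_pivot = pd.pivot_table(
    df,
    values='Sales',
    index='Region',
    columns='Salesperson',
    aggfunc='sum',
    fill_value=0,
)

# Element-wise comparison of the two identically labeled tables
same = bool((via_groupby == via_pivot).all().all())
print(same)
```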
5. Advanced: Multiple Aggregations & Multiple Indexes
Index by multiple dimensions and compute several aggregations of the same metric.
# Add 'Year' for multi-index example
df['Year'] = [2025, 2025, 2026, 2026, 2025, 2026]
pivot_advanced = pd.pivot_table(
df,
values='Sales',
index=['Region', 'Year'],
columns='Salesperson',
aggfunc=['sum', 'mean'],
fill_value=0,
margins=True
)
print(pivot_advanced)
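Passing a list to aggfunc produces two-level column labels of the form (aggregation, Salesperson), which can be awkward to export. One possible way to flatten them is sketched below; it leaves out margins for brevity, and the "sum_Alice" style naming is an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125],
    'Year': [2025, 2025, 2026, 2026, 2025, 2026],
})

pivot = pd.pivot_table(
    df,
    values='Sales',
    index=['Region', 'Year'],
    columns='Salesperson',
    aggfunc=['sum', 'mean'],
    fill_value=0,
)

flat = pivot.copy()
# Join the two column levels into one string label per column
flat.columns = [f"{agg}_{person}" for agg, person in flat.columns]
# Move Region and Year from the row index into regular columns
flat = flat.reset_index()
print(flat.columns.tolist())
```

The flattened frame has plain string column names, so it writes cleanly to CSV or a database table.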
6. Modern Alternative in 2026: Polars
For large datasets, Polars is often faster and more memory-efficient with a similar pivot API.
import polars as pl
df_pl = pl.DataFrame(data)
pivot_pl = df_pl.pivot(
index="Region",
on="Salesperson", # this argument was named columns= in Polars releases before 1.0
values="Sales",
aggregate_function="sum",
sort_columns=True
)
print(pivot_pl)
Best Practices & Common Pitfalls
- Prefer pivot_table() directly over groupby + unstack: it's simpler and handles missing combinations through fill_value
- Always specify aggfunc: the default is mean, which surprises many
- Use fill_value=0 or dropna=False to control how missing data is displayed
- Add margins=True for grand totals, which is great for reports
- Reset the index after pivoting if you need Region/Year as regular columns
- For huge data, prefer Polars or chunked processing
- Visualize: pivot_direct.plot(kind='bar', stacked=True) for instant insights (drop the All row and column first so totals are not stacked with their parts)
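The aggfunc pitfall only shows up when a cell covers more than one row. A small sketch with hypothetical data (Alice has two rows in the same region) makes the difference visible:

```python
import pandas as pd

# Hypothetical data: two sales rows for Alice in North
orders = pd.DataFrame({
    'Region': ['North', 'North', 'North'],
    'Salesperson': ['Alice', 'Alice', 'Bob'],
    'Sales': [100, 300, 200],
})

# No aggfunc given: pivot_table averages the rows in each cell
default_pivot = pd.pivot_table(
    orders, values='Sales', index='Region', columns='Salesperson'
)

# Explicit aggfunc='sum': rows in each cell are added up
sum_pivot = pd.pivot_table(
    orders, values='Sales', index='Region', columns='Salesperson',
    aggfunc='sum',
)

print(default_pivot.loc['North', 'Alice'])  # mean of 100 and 300
print(sum_pivot.loc['North', 'Alice'])      # total of 100 and 300
```

In the main sample data every Region/Salesperson pair has exactly one row, so mean and sum happen to coincide there, which is why the default can go unnoticed.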
Conclusion
Grouping by multiple variables and reshaping into pivot tables with pivot_table() turns raw data into clean, multi-dimensional summaries — perfect for cross-tab reports, cohort views, and breakdowns by category. In 2026, use Pandas for readability and flexibility on small-to-medium data, and Polars for speed on large datasets. Master values, index, columns, aggfunc, margins, and fill_value, and you'll build powerful reports in minutes.
Next time you need to break down metrics by multiple dimensions — reach for pivot_table first.