Pivot tables are one of the most useful tools in data analysis: they let you reshape, summarize, and cross-tabulate data quickly, turning long raw datasets into concise, multi-dimensional views (just like Excel pivot tables, but programmatic and reproducible). In Pandas, pivot_table() is the go-to function for this; unlike the basic pivot(), it aggregates duplicate entries instead of raising an error.
In 2026, pivot tables remain essential for dashboards, cohort analysis, sales breakdowns, A/B test results, and any time you need to see metrics across categories. Here’s a practical guide with real examples you can copy and adapt.
1. Basic Setup & Sample Data
import pandas as pd

# One row per salesperson, each assigned to a single region
data = {
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125]
}
df = pd.DataFrame(data)
print(df)
2. Simple Pivot Table: One Index, One Column, One Value
Summarize sales by region and salesperson — classic use case.
pivot_basic = pd.pivot_table(
    df,
    values='Sales',         # what to summarize
    index='Region',         # rows
    columns='Salesperson',  # columns
    aggfunc='sum'           # aggregation function (default is mean)
)
print(pivot_basic)
Output (NaN where no data exists; the NaN entries make every column float):
Salesperson  Alice    Bob  Charlie  Dave   Eve  Frank
Region
North        100.0  200.0      NaN   NaN   NaN    NaN
South          NaN    NaN    150.0  50.0   NaN    NaN
West           NaN    NaN      NaN   NaN  75.0  125.0
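In the sample data every region/salesperson pair occurs only once, so the aggregation step is invisible. The sketch below uses a small hypothetical frame (the name dup and its values are made up for illustration) in which Alice has two rows in the North region, to show where pivot() and pivot_table() differ.

```python
import pandas as pd

# Hypothetical data: Alice appears twice in the North region.
dup = pd.DataFrame({
    'Region': ['North', 'North', 'South'],
    'Salesperson': ['Alice', 'Alice', 'Charlie'],
    'Sales': [100, 40, 150],
})

# pivot() only reshapes, so duplicate (index, columns) pairs are an error.
try:
    dup.pivot(index='Region', columns='Salesperson', values='Sales')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead (here: 100 + 40).
summed = pd.pivot_table(
    dup, values='Sales', index='Region', columns='Salesperson', aggfunc='sum'
)

print(pivot_failed)
print(summed)
```

This is why pivot_table() is usually the safer choice for raw transactional data, where repeated keys are the norm.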
3. Multiple Aggregations & Multiple Values
Calculate sum, mean, count — or use different functions for different metrics.
pivot_multi = pd.pivot_table(
    df,
    values=['Sales'],
    index='Region',
    columns='Salesperson',
    aggfunc=['sum', 'mean', 'count'],
    fill_value=0,   # replace NaN with 0
    margins=True    # add grand totals
)
print(pivot_multi)
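To apply a different function to each metric, aggfunc also accepts a dict that maps column names to functions. The following is a minimal sketch; the Units column and its numbers are hypothetical and are not part of the sample data above.

```python
import pandas as pd

sales = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Sales': [100, 200, 150, 50, 75, 125],
    'Units': [10, 20, 15, 5, 8, 12],  # hypothetical column for illustration
})

# Sum the Sales column but average the Units column, per region.
per_metric = pd.pivot_table(
    sales,
    index='Region',
    values=['Sales', 'Units'],
    aggfunc={'Sales': 'sum', 'Units': 'mean'},
)
print(per_metric)
```

Each key in the dict must be one of the value columns; columns left out of the dict are not aggregated.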
4. Group by Multiple Indexes + Custom Aggregations
Add another dimension (e.g., year or product) and compute several aggregations at once.
# Add a 'Year' column for realism
df['Year'] = [2025, 2025, 2026, 2026, 2025, 2026]

pivot_advanced = pd.pivot_table(
    df,
    values='Sales',
    index=['Region', 'Year'],   # two row levels -> MultiIndex
    columns='Salesperson',
    aggfunc=['sum', 'mean'],    # a list applies each function to Sales
    fill_value=0,
    margins=True
)
print(pivot_advanced)
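One way to read a multi-index pivot is as a groupby on the row and column keys followed by an unstack. The sketch below rebuilds the sample data with the Year column and compares the two approaches using a plain sum. It illustrates the idea only and is not a description of how Pandas implements pivot_table() internally.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve', 'Frank'],
    'Sales': [100, 200, 150, 50, 75, 125],
    'Year': [2025, 2025, 2026, 2026, 2025, 2026],
})

# Pivot table with two row levels and one column level.
via_pivot = pd.pivot_table(
    df, values='Sales', index=['Region', 'Year'],
    columns='Salesperson', aggfunc='sum'
)

# Same shape built by hand: group on all three keys, then move
# Salesperson from the row index into the columns.
via_groupby = (
    df.groupby(['Region', 'Year', 'Salesperson'])['Sales']
      .sum()
      .unstack('Salesperson')
)

same = via_pivot.equals(via_groupby)
print(via_groupby)
print(same)
```

The groupby form is handy when you need intermediate results; pivot_table() is shorter when you only want the final cross-tabulation.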
5. Modern Alternative in 2026: Polars
For large datasets, Polars is often faster and more memory-efficient with similar syntax.
import polars as pl

df_pl = pl.DataFrame(data)
pivot_pl = df_pl.pivot(
    on="Salesperson",          # Polars 1.0+ uses `on`; older releases called this `columns`
    index="Region",
    values="Sales",
    aggregate_function="sum",
    sort_columns=True
)
print(pivot_pl)
Best Practices & Common Pitfalls
- Always specify aggfunc: the default is mean, which surprises many
- Use fill_value=0 to fill missing combinations, or dropna=False to keep columns that are entirely NaN
- Add margins=True for grand totals; great for reports
- Convert index/columns back to regular columns with .reset_index() if needed
- For huge data, prefer Polars or chunked processing
- Visualize: pivot_basic.plot(kind='bar', stacked=True) for instant insights
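Two of these points are easy to check on the sample data. The sketch below (region and sales columns only) shows the default mean aggregation next to an explicit sum, then uses .reset_index() to turn the Region index back into an ordinary column.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'West', 'West'],
    'Sales': [100, 200, 150, 50, 75, 125],
})

# No aggfunc given: pandas averages the values in each group.
default_mean = pd.pivot_table(df, values='Sales', index='Region')

# Explicit sum, then reset_index() so 'Region' becomes a regular column.
totals = pd.pivot_table(
    df, values='Sales', index='Region', aggfunc='sum'
).reset_index()

print(default_mean)
print(totals)
```

For North, the default gives the average of 100 and 200 rather than their total, which is the surprise the first bullet warns about.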
Conclusion
Pivot tables with pivot_table() turn long, raw data into concise, cross-tabulated summaries — perfect for multi-dimensional analysis. In 2026, use Pandas for readability and flexibility on small-to-medium data, and Polars for speed on large datasets. Master index, columns, values, aggfunc, margins, and fill_value, and you'll build powerful reports in minutes.
Next time you need to break down metrics by category and subcategory — reach for pivot_table first.