Cumulative Statistics in Pandas – cumsum, cummax, cummin, expanding() & More in Python 2026
Cumulative statistics are running totals, running maximums, running averages, and similar metrics that update row by row as you move through your data. They are useful for trend analysis, growth tracking, and feature engineering in data manipulation workflows.
TL;DR — Key Cumulative Functions
- .cumsum(): running total
- .cummax(): running maximum
- .cummin(): running minimum
- .cumprod(): running product
- .expanding().mean(): running (expanding window) average
- .expanding().std(): running standard deviation
1. Basic Cumulative Statistics
import pandas as pd
df = pd.read_csv("sales_data.csv", parse_dates=["order_date"])
df = df.sort_values("order_date")
# Running totals
df["cum_sales"] = df["amount"].cumsum()
df["cum_quantity"] = df["quantity"].cumsum()
# Running maximum and minimum
df["running_max_sale"] = df["amount"].cummax()
df["running_min_sale"] = df["amount"].cummin()
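To make the behavior concrete, here is a minimal sketch on a hypothetical four-row series. The values are made up for illustration and are not taken from sales_data.csv.

```python
import pandas as pd

# Hypothetical order amounts, assumed to be in chronological order already
amount = pd.Series([100, 250, 80, 300], name="amount")

demo = pd.DataFrame({
    "amount": amount,
    "cum_sales": amount.cumsum(),         # 100, 350, 430, 730
    "running_max_sale": amount.cummax(),  # 100, 250, 250, 300
    "running_min_sale": amount.cummin(),  # 100, 100, 80, 80
})
print(demo)
```

Each output column has the same length and index as the input, which is why the results can be assigned straight back to the DataFrame.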
2. Running Averages and Statistics with expanding()
# Running (expanding window) statistics
df["running_avg_sale"] = df["amount"].expanding().mean()
df["running_std_sale"] = df["amount"].expanding().std()
df["running_min"] = df["amount"].expanding().min()
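The following sketch, again on made-up values, shows how the expanding mean relates to cumsum(), and two details that are easy to miss: std() is the sample standard deviation, so its first value is NaN, and min_periods can hold back results until enough rows have accumulated.

```python
import pandas as pd

# Hypothetical amounts, for illustration only
amount = pd.Series([100.0, 250.0, 80.0, 300.0])

running_avg = amount.expanding().mean()

# The same running average by hand: running total divided by the
# number of non-missing observations seen so far
manual_avg = amount.cumsum() / amount.expanding().count()

# Sample standard deviation (ddof=1) is undefined for a single observation,
# so the first row is NaN
running_std = amount.expanding().std()

# min_periods=3 leaves the first two rows as NaN
stable_avg = amount.expanding(min_periods=3).mean()

print(pd.DataFrame({
    "running_avg": running_avg,
    "manual_avg": manual_avg,
    "running_std": running_std,
    "stable_avg": stable_avg,
}))
```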
3. Grouped Cumulative Statistics (Most Powerful Pattern)
# Cumulative sales per region
df["cum_sales_by_region"] = df.groupby("region")["amount"].cumsum()
# Running average per customer
# (drop the customer_id index level so the result aligns with df's own index)
df["customer_running_avg"] = (
    df.groupby("customer_id")["amount"]
    .expanding()
    .mean()
    .reset_index(level=0, drop=True)
)
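The reason the second pattern needs reset_index() is easier to see on a small example. This is a sketch with hypothetical customers and amounts; it also shows transform() as one alternative way to get the same result, not necessarily the approach the original code intends.

```python
import pandas as pd

# Hypothetical orders, interleaved across two customers
df = pd.DataFrame({
    "customer_id": ["a", "b", "a", "b", "a"],
    "amount": [10.0, 100.0, 30.0, 300.0, 50.0],
})

# groupby(...).cumsum() keeps the original row index, so it assigns directly
df["cum_by_customer"] = df.groupby("customer_id")["amount"].cumsum()

# groupby(...).expanding() returns a (customer_id, original index) MultiIndex;
# dropping level 0 restores the original index so assignment aligns row by row
expanding_avg = df.groupby("customer_id")["amount"].expanding().mean()
df["customer_running_avg"] = expanding_avg.reset_index(level=0, drop=True)

# Alternative: transform() returns a result already aligned with df
df["running_avg_transform"] = (
    df.groupby("customer_id")["amount"]
    .transform(lambda s: s.expanding().mean())
)

print(df)
```

Note that the reset_index() approach relies on df having a unique index, since the assignment aligns on index labels.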
4. Real-World Example: Monthly Cumulative Growth
monthly = (
    df
    .groupby([df["order_date"].dt.to_period("M"), "region"])
    .agg(total_sales=("amount", "sum"))
    .reset_index()
)
monthly["cumulative_sales"] = monthly.groupby("region")["total_sales"].cumsum()
monthly["cumulative_growth"] = monthly.groupby("region")["cumulative_sales"].pct_change()
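Running the same pipeline on a tiny hypothetical dataset shows what the two new columns contain. The dates, regions, and amounts below are invented for illustration.

```python
import pandas as pd

# Hypothetical orders across two regions and three months
df = pd.DataFrame({
    "order_date": pd.to_datetime([
        "2026-01-05", "2026-01-20", "2026-02-10",
        "2026-03-15", "2026-01-08", "2026-02-12",
    ]),
    "region": ["East", "East", "East", "East", "West", "West"],
    "amount": [100.0, 100.0, 100.0, 200.0, 50.0, 150.0],
})

monthly = (
    df
    .groupby([df["order_date"].dt.to_period("M"), "region"])
    .agg(total_sales=("amount", "sum"))
    .reset_index()
)
monthly["cumulative_sales"] = monthly.groupby("region")["total_sales"].cumsum()
monthly["cumulative_growth"] = monthly.groupby("region")["cumulative_sales"].pct_change()

east = monthly[monthly["region"] == "East"]
west = monthly[monthly["region"] == "West"]
print(monthly)
```

In this sketch East accumulates 200, 300, 500 and West accumulates 50, 200. The first month in each region has no earlier value to compare against, so its cumulative_growth is NaN. Also keep in mind that this column measures growth of the running total, not month-over-month growth of total_sales.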
Best Practices in 2026
- Always sort your data by the time column before applying cumulative functions
- Use groupby() + cumulative methods for segmented running statistics
- Combine cumsum() with expanding() for rich feature engineering
- Handle NaNs deliberately before cumulative calculations: .fillna(0) is a reasonable choice for cumsum(), but it would distort cumprod(), cummin(), or a running average
- The built-in cumulative methods are vectorized, single-pass operations, so they remain fast on large datasets
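The NaN advice is worth seeing in action. The sketch below uses a hypothetical three-value series to compare the default behavior of cumsum(), the stricter skipna=False, and filling with zero first.

```python
import numpy as np
import pandas as pd

# Hypothetical amounts with one missing value
amount = pd.Series([100.0, np.nan, 50.0])

# Default (skipna=True): the NaN row stays NaN, but the total carries past it
default_cumsum = amount.cumsum()             # 100.0, NaN, 150.0

# skipna=False: once a NaN is seen, every later value is NaN too
strict_cumsum = amount.cumsum(skipna=False)  # 100.0, NaN, NaN

# fillna(0) treats the missing amount as zero, which is neutral for a sum
filled_cumsum = amount.fillna(0).cumsum()    # 100.0, 100.0, 150.0

print(pd.DataFrame({
    "default": default_cumsum,
    "strict": strict_cumsum,
    "filled": filled_cumsum,
}))
```

Filling with zero works here because zero does not change a sum. It is not a safe default for a running product or running minimum, where a zero would change the result.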
Conclusion
Cumulative statistics are a powerful addition to any data manipulation toolkit. In 2026, combining cumsum(), cummax(), expanding(), and groupby() allows you to create insightful running metrics, growth trends, and historical comparisons with minimal code. These techniques turn static transactional data into dynamic, time-aware insights.
Next steps:
- Add cumulative sales and running average columns to one of your current datasets and explore the new insights they reveal