Cumulative statistics — running totals, max/min, products — are essential for understanding how data accumulates over time or rows. In Pandas, the family of .cumsum(), .cummax(), .cummin(), and .cumprod() methods makes these calculations simple and efficient, turning raw sequences into powerful insights for tracking growth, balances, peaks, or multiplicative effects.
In 2026, these methods are still core to time-series analysis, financial reporting, inventory tracking, and progressive metrics — especially when combined with groupby() for per-group running totals.
1. The Core Cumulative Methods
- .cumsum() — running sum (most common)
- .cummax() — running maximum (track all-time highs)
- .cummin() — running minimum (track all-time lows)
- .cumprod() — running product (e.g., compound growth rates)
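As a quick side-by-side sketch of all four methods, here they are applied to one small Series (values chosen arbitrarily for illustration):

```python
import pandas as pd

s = pd.Series([3, 1, 4, 2])

print(s.cumsum().tolist())   # [3, 4, 8, 10] — running total
print(s.cummax().tolist())   # [3, 3, 4, 4]  — highest value seen so far
print(s.cummin().tolist())   # [3, 1, 1, 1]  — lowest value seen so far
print(s.cumprod().tolist())  # [3, 3, 12, 24] — running product
```

Each method returns a Series of the same length, where position i summarizes everything from the start through row i.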
2. Basic Example: Cumulative Sum
Track how sales (or any metric) accumulate over time.
import pandas as pd
data = {
'Year': [2010, 2011, 2012, 2013, 2014],
'Sales': [10000, 12000, 15000, 18000, 20000]
}
df = pd.DataFrame(data)
# Cumulative sum
df['Cumulative Sales'] = df['Sales'].cumsum()
print(df)
Output:
Year Sales Cumulative Sales
0 2010 10000 10000
1 2011 12000 22000
2 2012 15000 37000
3 2013 18000 55000
4 2014 20000 75000
3. Running Max & Min (Track Peaks & Floors)
Useful for monitoring highest/lowest values seen so far.
df['Cumulative Max Sales'] = df['Sales'].cummax()
df['Cumulative Min Sales'] = df['Sales'].cummin()
print(df[['Year', 'Sales', 'Cumulative Max Sales', 'Cumulative Min Sales']])
Output:
Year Sales Cumulative Max Sales Cumulative Min Sales
0 2010 10000 10000 10000
1 2011 12000 12000 10000
2 2012 15000 15000 10000
3 2013 18000 18000 10000
4 2014 20000 20000 10000
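One common use of .cummax() beyond this example is computing drawdown: how far a value has fallen from its running peak. A minimal sketch with made-up prices (not from the sales data above):

```python
import pandas as pd

prices = pd.Series([100, 120, 110, 130, 90])

# Drawdown: fractional drop from the running all-time high
drawdown = prices / prices.cummax() - 1
print(drawdown.round(3).tolist())  # [0.0, 0.0, -0.083, 0.0, -0.308]
```

A drawdown of 0 means a new high; the last value shows the price is about 31% below its peak of 130.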
4. Cumulative Product (Compound Growth)
Great for rates of return, multipliers, or exponential growth.
# Example: monthly growth factors
growth = pd.Series([1.05, 1.03, 1.07, 1.02, 1.04]) # 5%, 3%, 7%, 2%, 4%
df['Cumulative Growth'] = growth.cumprod()
print(df['Cumulative Growth'])
# 0    1.050000
# 1    1.081500
# 2    1.157205
# 3    1.180349
# 4    1.227563
# Name: Cumulative Growth, dtype: float64
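If growth is stored as percentage returns rather than growth factors, the same idea still works: add 1 first, then take the running product. A small sketch using the same assumed rates:

```python
import pandas as pd

# Monthly returns as decimals (5%, 3%, 7%, 2%, 4%)
returns = pd.Series([0.05, 0.03, 0.07, 0.02, 0.04])

# Convert to growth factors, then compound with a running product
cumulative = (1 + returns).cumprod()
total_return = cumulative.iloc[-1] - 1
print(round(total_return, 4))  # 0.2276
```

So five months of small gains compound to roughly a 22.8% total return, more than the 21% you would get by simply adding the rates.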
5. Cumulative Stats with Groupby (Per-Group Running Totals)
Very common in real analysis: cumulative sales per customer, per product, etc.
# Add customer column
df['Customer'] = ['A', 'B', 'A', 'C', 'B']
# Cumulative sales per customer
df = df.sort_values(['Customer', 'Year'])
df['Cum Sales per Customer'] = df.groupby('Customer')['Sales'].cumsum()
print(df[['Customer', 'Year', 'Sales', 'Cum Sales per Customer']])
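The same groupby pattern works with the other cumulative methods too, for example tracking each customer's best sale so far with .cummax(). A minimal sketch with hypothetical data:

```python
import pandas as pd

df = pd.DataFrame({
    'Customer': ['A', 'A', 'B', 'B', 'A'],
    'Sales':    [100, 300, 200, 150, 250],
})

# Running max within each customer group, in row order
df['Best So Far'] = df.groupby('Customer')['Sales'].cummax()
print(df['Best So Far'].tolist())  # [100, 300, 200, 200, 300]
```

Note that the running statistic restarts for each group: customer B's 150 does not beat its earlier 200, and A's final 250 does not beat its earlier 300.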
6. Modern Alternative in 2026: Polars
For large datasets, Polars is often faster and more memory-efficient.
import polars as pl
df_pl = pl.DataFrame(data)
df_pl = df_pl.with_columns(
    pl.col("Sales").cum_sum().alias("Cumulative Sales"),
    pl.col("Sales").cum_max().alias("Cumulative Max Sales")
)
print(df_pl)
7. Best Practices & Common Pitfalls
- Sort data first if order matters (e.g., time-series): df = df.sort_values('date')
- Handle NaN early: .cumsum(skipna=True) or .fillna(0)
- Use axis=1 for row-wise cumsum (rare but useful for wide data)
- For huge data, switch to Polars or chunked processing
- Visualize: df['Cumulative Sales'].plot() to see trends instantly
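The axis=1 tip can be sketched on a small wide table with hypothetical quarterly columns, producing a year-to-date running total across each row:

```python
import pandas as pd

wide = pd.DataFrame({
    'Q1': [10, 5],
    'Q2': [20, 5],
    'Q3': [30, 5],
})

# Row-wise running total: each row accumulates left to right
# (row 0 becomes 10, 30, 60; row 1 becomes 5, 10, 15)
ytd = wide.cumsum(axis=1)
print(ytd)
```

With the default axis=0, cumsum would instead accumulate down each column, so the axis choice is worth double-checking on wide data.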
Conclusion
Cumulative statistics — sum, max, min, product — turn sequential data into running stories of growth, peaks, floors, and multiplication effects. In 2026, use Pandas .cumsum() (and family) for quick insights on small-to-medium data, and Polars for speed on large datasets.
Next time you’re tracking totals, balances, highs/lows, or compound growth, reach for these cumulative methods — they’re simple but reveal powerful patterns.