Summary Statistics in Pandas – describe(), agg(), and More in Python 2026
Getting quick and meaningful summary statistics is one of the first steps in any data analysis or manipulation task. In 2026, Pandas provides powerful and flexible ways to compute summary statistics using describe(), agg(), and groupby operations.
TL;DR — Key Methods
- df.describe(): quick statistical summary for numeric columns
- df.agg(): custom aggregations with full control
- df.groupby().agg(): group-wise statistics
- df.value_counts() and df.describe(include="all"): summaries for categorical data
1. Basic Summary with describe()
import pandas as pd
df = pd.read_csv("sales_data.csv", parse_dates=["order_date"])
print(df.describe()) # Numeric columns only
print("\nFull summary:")
print(df.describe(include="all")) # Includes categorical columns
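To make the difference between the two calls concrete, here is a minimal sketch on a small in-memory DataFrame. The data is invented for illustration and stands in for sales_data.csv, whose actual columns are not shown in this article.

```python
import pandas as pd

# Hypothetical stand-in for sales_data.csv; values are invented for illustration.
df = pd.DataFrame({
    "region": ["North", "South", "North", "East"],
    "amount": [120.0, 80.0, 200.0, 40.0],
    "quantity": [2, 1, 4, 1],
})

# Default: only numeric columns (amount, quantity) are summarized.
numeric_summary = df.describe()
print(numeric_summary.loc[["count", "mean", "min", "max"]])

# include="all": adds unique/top/freq rows for the non-numeric "region" column.
# Rows that do not apply to a column (e.g. "unique" for "amount") are NaN.
full_summary = df.describe(include="all")
print(full_summary.loc[["unique", "top", "freq"], "region"])
```

Because the combined summary mixes numeric and categorical statistics in one table, expect NaN wherever a statistic does not apply to a column's type.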
2. Custom Aggregations with agg()
summary = df.agg({
    "amount": ["count", "mean", "median", "min", "max", "std"],
    "quantity": ["sum", "mean"],
    "customer_id": ["nunique"]
}).round(2)
print(summary)
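When each column gets a different list of functions, the result has one row per function name and one column per key, with NaN wherever a function was not requested for that column. The sketch below illustrates this on invented data; the column names mirror the example above, but the values are assumptions.

```python
import pandas as pd

# Hypothetical data; column names follow the example above, values are invented.
df = pd.DataFrame({
    "amount": [120.0, 80.0, 200.0, 40.0],
    "quantity": [2, 1, 4, 1],
    "customer_id": ["c1", "c2", "c1", "c3"],
})

summary = df.agg({
    "amount": ["mean", "max"],
    "quantity": ["sum", "mean"],
    "customer_id": ["nunique"],
})
print(summary)

# Individual statistics can be read back with .loc[function, column].
avg_amount = summary.loc["mean", "amount"]
total_quantity = summary.loc["sum", "quantity"]
print(avg_amount, total_quantity)
```

The NaN cells are expected here: "sum" was never requested for amount, so that cell is simply empty.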
3. Grouped Summary Statistics
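# Assumes df already has a "year" column; if not, one option is
# df["year"] = df["order_date"].dt.year
grouped = (
    df
    .groupby(["region", "year"])
    .agg({
        "amount": ["sum", "mean", "count"],
        "customer_id": "nunique",
        "order_date": ["min", "max"]
    })
    .round(2)
)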
print(grouped)
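The dictionary form above produces two-level column labels such as ("amount", "sum"). If flat column names are easier to work with, named aggregation is one alternative. The following is a sketch on invented data: the output names (total_amount, unique_customers, and so on) are hypothetical, and "year" is derived from order_date on the assumption that the dataset does not already carry it.

```python
import pandas as pd

# Hypothetical data standing in for the sales dataset.
df = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "order_date": pd.to_datetime(
        ["2025-01-05", "2025-03-10", "2025-02-01", "2026-01-15"]
    ),
    "amount": [100.0, 50.0, 80.0, 20.0],
    "customer_id": ["c1", "c2", "c1", "c1"],
})

# Assumption: "year" is not in the raw data, so derive it from order_date.
df["year"] = df["order_date"].dt.year

grouped = (
    df.groupby(["region", "year"])
    .agg(
        total_amount=("amount", "sum"),
        avg_amount=("amount", "mean"),
        orders=("amount", "count"),
        unique_customers=("customer_id", "nunique"),
        first_order=("order_date", "min"),
        last_order=("order_date", "max"),
    )
    .round(2)       # affects numeric columns; datetime columns are left as-is
    .reset_index()  # turn the group keys back into ordinary columns
)
print(grouped)
```

Each keyword becomes a column name in the result, which tends to be simpler to export or merge than a MultiIndex.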
4. Best Practices in 2026
- Use describe(include="all") to get a quick overview of both numeric and categorical data
- Prefer a single .agg() call over multiple separate calls for better performance
- Combine with groupby() for segmented analysis
- Use .round() to keep output clean
- For very large datasets, consider Dask's equivalent methods
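For the categorical side of that overview, value_counts() gives per-category frequencies that describe(include="all") only hints at through its top and freq rows. A minimal sketch on an invented region column:

```python
import pandas as pd

# Hypothetical categorical column; values are invented for illustration.
df = pd.DataFrame({"region": ["North", "South", "North", "East", "North"]})

# Absolute counts per category, most frequent first.
counts = df["region"].value_counts()
print(counts)

# Relative shares instead of counts, rounded to keep the output clean.
shares = df["region"].value_counts(normalize=True).round(2)
print(shares)
```

Calling value_counts() on the whole DataFrame rather than on a single column counts unique combinations of rows, which can be useful for multi-column categories.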
Conclusion
Mastering summary statistics with describe() and agg() is essential for exploratory data analysis and reporting. In 2026, using these methods efficiently helps you quickly understand your data, spot outliers, and make informed decisions before diving deeper into manipulation or modeling.
Next steps:
- Run df.describe(include="all") on your next dataset and explore the insights it reveals