Handling Missing Values in Pandas – Best Practices 2026
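Missing values are very common in real-world datasets. Pandas represents them as NaN, None, NaT, or pd.NA depending on the column's dtype. Knowing how to detect, analyze, and handle them properly is a fundamental skill in data manipulation. In 2026, Pandas provides dedicated methods for each of these steps: isna() for detection, dropna() for removal, and fillna(), ffill(), and bfill() for filling.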
TL;DR — Key Strategies
- df.isna().sum() – Count missing values
- df.dropna() – Remove rows/columns with missing values
- df.fillna() – Fill missing values with specific values
- Smart imputation based on context
1. Detecting Missing Values
import pandas as pd
df = pd.read_csv("sales_data.csv", parse_dates=["order_date"])
# Count missing values per column
missing = df.isna().sum()
print(missing)
# Percentage of missing values
missing_pct = df.isna().mean() * 100
print(missing_pct.round(2))
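The two checks above can be combined into a single overview table. The following is a minimal sketch on a small hypothetical DataFrame that stands in for sales_data.csv; the column values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for sales_data.csv
df = pd.DataFrame({
    "order_date": pd.to_datetime(["2026-01-01", "2026-01-02", None, "2026-01-04"]),
    "region": ["north", None, "south", "south"],
    "amount": [100.0, np.nan, np.nan, 250.0],
})

# One row per column: count and percentage of missing values, worst first
summary = pd.DataFrame({
    "missing": df.isna().sum(),
    "missing_pct": (df.isna().mean() * 100).round(2),
}).sort_values("missing_pct", ascending=False)
print(summary)

# Rows that contain at least one missing value, for manual inspection
rows_with_missing = df[df.isna().any(axis=1)]
print(rows_with_missing)
```

Looking at the affected rows, not just the counts, often hints at why values are missing, for example when gaps cluster in one region or date range.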
2. Removing Missing Values
# Drop rows with any missing values
df_clean = df.dropna()
# Drop rows only if all values are missing
df_clean = df.dropna(how="all")
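# Drop columns with more than 30% missing values
# thresh is the minimum number of non-missing values a column must have to be kept,
# so requiring 70% non-missing values drops columns with more than 30% missing
threshold = len(df) * 0.7
df_clean = df.dropna(thresh=threshold, axis=1)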
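Dropping every row with any gap is often too aggressive. As a hedged sketch on hypothetical data, the snippet below restricts the row check to a key column with the subset parameter, and shows an equivalent way to express the column rule as a share of missing values instead of a count.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "coupon" is mostly empty by design
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "amount": [10.0, np.nan, 30.0, 40.0, 50.0],
    "coupon": [np.nan, np.nan, np.nan, "SAVE10", np.nan],
})

# Drop rows only when a key column is missing; gaps in other columns are tolerated
rows_kept = df.dropna(subset=["amount"])

# Keep columns whose share of missing values is at most 30%
max_missing_share = 0.30
cols_kept = df.loc[:, df.isna().mean() <= max_missing_share]

print(rows_kept.shape)
print(list(cols_kept.columns))
```

Here the row with the missing amount is removed while the sparse coupon column does not cost any rows, and the column filter drops only coupon.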
3. Filling Missing Values (Imputation)
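# The strategies below are alternatives: pick one per column rather than running them in sequence
# Fill with a constant
df["amount"] = df["amount"].fillna(0)
# Fill with the column mean (use .median() instead when the data is skewed)
df["amount"] = df["amount"].fillna(df["amount"].mean())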
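# Forward fill (useful for time series; assumes rows are sorted by date)
# Note: fillna(method="ffill") is deprecated since pandas 2.1, so use ffill() directly
df["amount"] = df["amount"].ffill()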
# Fill with group mean (smart imputation)
df["amount"] = df.groupby("region")["amount"].transform(lambda x: x.fillna(x.mean()))
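One edge case with group-wise imputation is a group in which every value is missing: its mean is itself NaN, so those rows stay empty. The sketch below, on hypothetical data, fills from the group mean first and then falls back to the overall mean. The fallback choice is an assumption for illustration, not the only reasonable option.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the "west" group has no observed amount at all
df = pd.DataFrame({
    "region": ["north", "north", "north", "south", "south", "west"],
    "amount": [100.0, np.nan, 300.0, 50.0, np.nan, np.nan],
})

# Group mean aligned to each row; NaN where the whole group is missing
group_mean = df.groupby("region")["amount"].transform("mean")

# Fill from the group mean first, then fall back to the overall mean
df["amount_filled"] = df["amount"].fillna(group_mean).fillna(df["amount"].mean())

print(df)
```

Writing the result to a new column keeps the original values available for comparison.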
4. Best Practices in 2026
- First, understand why the data is missing: completely at random (MCAR), at random given other observed columns (MAR), or not at random (MNAR)
- Document your imputation strategy
- Use fillna(0) for counts and amounts when missing means "zero"
- Use group-wise imputation (groupby().transform()) when missingness depends on categories
- Consider using a dedicated imputation module such as scikit-learn's sklearn.impute for complex cases
- Always check the impact of your handling strategy on downstream analysis
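Checking the impact of a strategy can be as simple as comparing summary statistics before and after filling. The following is a minimal sketch on hypothetical data with one large outlier; the indicator column name amount_was_missing is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one large outlier
df = pd.DataFrame({"amount": [10.0, 20.0, np.nan, 40.0, np.nan, 1000.0]})

# Record which values were missing before any filling, so the choice stays auditable
df["amount_was_missing"] = df["amount"].isna()

# Two candidate strategies
filled_mean = df["amount"].fillna(df["amount"].mean())
filled_median = df["amount"].fillna(df["amount"].median())

# Compare summary statistics before and after imputation
comparison = pd.DataFrame({
    "original": df["amount"].describe(),
    "mean_fill": filled_mean.describe(),
    "median_fill": filled_median.describe(),
})
print(comparison.round(2))
```

Filling with a single value always shrinks the standard deviation, and with skewed data the mean and median fills lead to noticeably different column averages, which is worth knowing before the column feeds a model or a report.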
Conclusion
Handling missing values correctly is crucial for accurate data analysis. In 2026, the best approach is to first explore the pattern of missingness, then choose the most appropriate method for each column: removing rows or columns, filling with constants, or using group-based imputation. Always document your decisions so others can understand and reproduce your work.
Next steps:
- Analyze the missing values in one of your datasets using isna().sum() and decide on the best handling strategy for each column
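One possible way to record a per-column decision is a plain dictionary passed to fillna(), which accepts a mapping from column names to fill values. The columns and choices below are hypothetical and only illustrate the pattern.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({
    "units": [1.0, np.nan, 3.0],
    "amount": [10.0, np.nan, 30.0],
    "region": ["north", None, "south"],
})

# Hypothetical per-column plan; the dict doubles as documentation of each choice
strategy = {
    "units": 0,                       # missing means no units were sold
    "amount": df["amount"].median(),  # median is robust to outliers
    "region": "unknown",              # explicit placeholder category
}

# fillna returns a new DataFrame; the original df is left unchanged
df_filled = df.fillna(value=strategy)
print(df_filled)
```

Keeping the plan in one place makes it easy to review and to rerun on a fresh copy of the data.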