Removing Missing Values in Pandas – When and How to Use dropna() 2026

Removing missing values using dropna() is one of the simplest and fastest ways to clean your dataset. While not always the best strategy, it is often appropriate when missing values are few or when complete cases are required for analysis.

TL;DR — dropna() Parameters

axis=0 → Drop rows (default)
axis=1 → Drop columns
how="any" → Drop if any value is missing (default)
how="all" → Drop only if all values are missing
thresh=n → Keep rows (or columns, with axis=1) that have at least n non-missing values; cannot be combined with how
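
To make the parameters above concrete, here is a minimal sketch on a made-up four-row frame. The column names a, b and c are illustrative only and are not part of the sales data used later.

```python
import numpy as np
import pandas as pd

# Toy data (made up for illustration): row 2 is entirely missing,
# and column "c" is missing in three of the four rows.
toy = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan, 4.0],
        "b": [10.0, np.nan, np.nan, 40.0],
        "c": [np.nan, np.nan, np.nan, 400.0],
    }
)

rows_any = toy.dropna()                      # axis=0, how="any": only complete rows survive
rows_all = toy.dropna(how="all")             # drops only the row where every value is missing
rows_thresh = toy.dropna(thresh=2)           # keeps rows with at least 2 non-missing values
cols_thresh = toy.dropna(axis=1, thresh=2)   # keeps columns with at least 2 non-missing values

print(rows_any.index.tolist())       # [3]
print(rows_all.index.tolist())       # [0, 1, 3]
print(rows_thresh.index.tolist())    # [0, 3]
print(cols_thresh.columns.tolist())  # ['a', 'b']
```

Each call returns a new DataFrame, so toy itself is left unchanged.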

1. Basic Usage of dropna()

import pandas as pd

df = pd.read_csv("sales_data.csv", parse_dates=["order_date"])

print(f"Original shape: {df.shape}")

# Drop rows with any missing values
df_clean = df.dropna()

print(f"After dropna(): {df_clean.shape}")
print(f"Rows removed: {len(df) - len(df_clean)}")

2. Common dropna() Strategies

# 1. Drop rows only if ALL values are missing
df_clean = df.dropna(how="all")

# 2. Drop columns that have too many missing values
# thresh is documented as an integer count, so convert the 70% cutoff explicitly
df_clean = df.dropna(thresh=int(len(df) * 0.7), axis=1)   # Keep columns with at least ~70% non-missing values

# 3. Drop rows based on specific columns only
df_clean = df.dropna(subset=["amount", "region", "customer_id"])

# 4. Drop rows with missing values in critical columns
critical_cols = ["order_date", "amount", "customer_id"]
df_clean = df.dropna(subset=critical_cols)
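
The column-threshold strategy (number 2 above) is the easiest one to get wrong, so here is a small sketch on a hypothetical five-row frame. The data is invented, and rounding the cutoff up with math.ceil is one possible choice; int() would round down instead.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical toy frame: "notes" is only 20% complete,
# while "amount" and "region" are each 80% complete.
toy = pd.DataFrame(
    {
        "amount": [10.0, 20.0, np.nan, 40.0, 50.0],
        "region": ["N", "S", "E", None, "W"],
        "notes": [None, None, "late", None, None],
    }
)

# 70% of 5 rows is 3.5; rounding up means a column needs 4 non-missing values to stay
min_non_missing = math.ceil(len(toy) * 0.7)

kept = toy.dropna(axis=1, thresh=min_non_missing)
dropped_cols = toy.columns.difference(kept.columns).tolist()

print(min_non_missing)        # 4
print(kept.columns.tolist())  # ['amount', 'region']
print(dropped_cols)           # ['notes']
```

Listing the dropped columns before committing to the result makes it easier to notice when a business-critical column falls below the cutoff.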

3. Real-World Example

# Before cleaning: count missing values once, then show only the affected columns
missing_counts = df.isna().sum()
print("Missing values before:")
print(missing_counts[missing_counts > 0])

# Clean strategy: Keep only complete records for key business columns
df_clean = df.dropna(subset=["order_date", "amount", "region", "customer_id"])

print(f"\nRows before: {len(df)}")
print(f"Rows after: {len(df_clean)}")
print(f"Percentage kept: {(len(df_clean)/len(df)*100):.1f}%")

4. Best Practices in 2026

Always check how many rows/columns will be removed before dropping
Use subset to drop based only on important business columns
Use thresh to keep columns that are mostly complete
Consider imputation instead of dropping when a large share of values is missing (as a rough guide, more than 10-20%)
Document your dropping strategy and the percentage of data lost
Never drop rows blindly without understanding the business impact
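
The first and fifth points can be combined into a small pre-check. The sketch below is one way to do it: dropna_report is a hypothetical helper (not part of pandas), and the sample frame is invented to stand in for the sales data.

```python
import numpy as np
import pandas as pd


def dropna_report(df, subset=None):
    """Summarise what df.dropna(subset=subset) would remove, without modifying df.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    cols = list(subset) if subset is not None else list(df.columns)
    missing_per_col = df[cols].isna().sum()
    # A row is lost if any of the checked columns is missing
    rows_lost = int(df[cols].isna().any(axis=1).sum())
    pct_lost = 100 * rows_lost / len(df) if len(df) else 0.0
    return {
        "rows_before": len(df),
        "rows_lost": rows_lost,
        "pct_lost": round(pct_lost, 1),
        "missing_per_column": missing_per_col[missing_per_col > 0].to_dict(),
    }


# Made-up sample standing in for the sales data
sample = pd.DataFrame(
    {
        "order_date": pd.to_datetime(["2026-01-01", None, "2026-01-03", "2026-01-04"]),
        "amount": [100.0, 250.0, np.nan, 80.0],
        "customer_id": ["c1", "c2", "c3", "c4"],
        "coupon": [None, None, None, "SAVE10"],
    }
)

report = dropna_report(sample, subset=["order_date", "amount", "customer_id"])
print(report)
```

Here two of the four rows would be dropped (50%), which is the kind of figure worth recording alongside the cleaning step. The mostly empty coupon column does not count against any row because it is not in the subset.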

Conclusion

Removing missing values with dropna() is fast and simple, but it should be used thoughtfully. In 2026, the best practice is to first understand the pattern of missingness, then decide whether to drop rows, drop columns, or use imputation. Use the subset and thresh parameters to limit dropping to the columns and completeness levels that matter for your analysis.

Next steps:

Analyze the missing values in your current dataset and decide which strategy (drop rows, drop columns, or impute) is most appropriate
