Adding and Extending Python Dictionaries: Flexible Data Manipulation for Data Science 2026
Adding and extending dictionaries is one of the most common and powerful operations in data science. Whether you are merging model configurations, adding new features to a metadata dict, combining summary statistics, or dynamically building hyperparameter grids, Python gives you several clean and efficient ways to extend dictionaries without writing verbose code.
TL;DR — Modern Ways to Add/Extend Dictionaries
- .update() → in-place extension
- {**dict1, **dict2} → create a new merged dict (very Pythonic)
- dict1 | dict2 → union operator (Python 3.9+)
- .setdefault() → lazy initialization
1. Basic Adding and Extending
# Starting dictionary
config = {"n_estimators": 100, "max_depth": 10}
# 1. .update() - modifies in place
config.update({"learning_rate": 0.1, "random_state": 42})
# 2. Unpacking (most Pythonic way)
default_config = {"n_estimators": 100, "max_depth": 10}
user_config = {"n_estimators": 300, "learning_rate": 0.05}
final_config = {**default_config, **user_config} # user wins on conflicts
# 3. Union operator | (clean and modern)
merged = default_config | user_config
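The practical difference between these three forms is whether the original dict is mutated. The sketch below uses illustrative names (`defaults`, `overrides`) that are not part of the examples above, and also shows `|=`, the in-place counterpart of `|` that was introduced alongside it in Python 3.9.

```python
defaults = {"n_estimators": 100, "max_depth": 10}
overrides = {"n_estimators": 300, "learning_rate": 0.05}

# Unpacking and | both build a new dict; the inputs are left untouched.
merged_unpack = {**defaults, **overrides}
merged_union = defaults | overrides

# .update() and |= both mutate the left-hand dict in place.
in_place = dict(defaults)  # shallow copy so `defaults` itself survives
in_place |= overrides

print(merged_unpack == merged_union == in_place)  # True
print(defaults)            # {'n_estimators': 100, 'max_depth': 10}
print(list(merged_union))  # ['n_estimators', 'max_depth', 'learning_rate']
```

On a conflicting key the right-hand value wins, but the key keeps the position it had in the left-hand dict.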
2. Real-World Data Science Examples
import pandas as pd
df = pd.read_csv("sales_data.csv")
# Example 1: Building feature metadata dictionary
feature_metadata = {}
for col in df.columns:
    feature_metadata.setdefault(col, {})
    feature_metadata[col].update({
        "dtype": str(df[col].dtype),
        "unique_count": df[col].nunique(),
        "has_missing": df[col].isna().any()
    })
# Example 2: Merging model configurations safely
base_params = {"model_type": "random_forest", "n_estimators": 100}
user_params = {"n_estimators": 300, "max_depth": 15}
final_params = {**base_params, **user_params}
# Example 3: Extending summary statistics
summary = {"total_sales": 0, "count": 0}
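for row in df.itertuples():
    summary["total_sales"] += row.amount
    summary["count"] += 1
# Extend the summary with a derived value once the totals are known
# (the guard avoids dividing by zero on an empty file)
summary["avg_amount"] = summary["total_sales"] / summary["count"] if summary["count"] else 0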
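One caveat with the merge in Example 2: unpacking, `|`, and `.update()` are all shallow, so a nested dict in the second operand replaces the nested dict in the first instead of being combined with it. The following is a minimal sketch of a recursive merge. `deep_merge` is a hypothetical helper written for this illustration, not a standard-library function, and the nested `"params"` layout is an assumption.

```python
def deep_merge(base, override):
    """Return a new dict: nested dicts are merged recursively, other values replaced."""
    result = dict(base)  # shallow copy; untouched nested values are shared with `base`
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result


base_params = {
    "model_type": "random_forest",
    "params": {"n_estimators": 100, "max_depth": 10},
}
user_params = {"params": {"n_estimators": 300}}

shallow = {**base_params, **user_params}
deep = deep_merge(base_params, user_params)

print(shallow["params"])  # {'n_estimators': 300} (max_depth was dropped)
print(deep["params"])     # {'n_estimators': 300, 'max_depth': 10}
```

Whether a deep merge is the right behavior depends on the configuration schema. For flat parameter dicts the shallow forms are sufficient.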
3. Advanced Extension Techniques
# Using .setdefault() for nested structures
nested = {}
nested.setdefault("model", {}).setdefault("params", {})["n_estimators"] = 200
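The chained call works because `.setdefault()` returns the value stored under the key, inserting the default only when the key is missing. A short sketch with made-up rows (the region names and amounts are purely illustrative) shows both behaviors:

```python
rows = [("north", 120.0), ("south", 80.0), ("north", 45.5)]  # made-up data

# Group amounts by region: the empty list is inserted only on first sight of a key,
# and the stored list is returned every time, so .append() always hits the right one.
sales_by_region = {}
for region, amount in rows:
    sales_by_region.setdefault(region, []).append(amount)

print(sales_by_region)  # {'north': [120.0, 45.5], 'south': [80.0]}

# An existing key is never overwritten; the current value is returned instead.
config = {"max_depth": 10}
returned = config.setdefault("max_depth", 99)
print(returned, config)  # 10 {'max_depth': 10}
```

Note that the default argument is evaluated on every call even when it is not used, so an expensive default is better handled with an explicit `if key not in d` check.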
# ChainMap for layered configuration (advanced)
from collections import ChainMap
defaults = {"n_estimators": 100}
user = {"n_estimators": 300}
final = ChainMap(user, defaults)  # lookups check user first, then fall back to defaults
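Unlike a merge, a `ChainMap` does not copy anything: it is a live view over the underlying dicts. The sketch below repeats the setup with an extra `max_depth` default (added here only for illustration) to show lookup order, where writes land, and how to flatten the view into a plain dict.

```python
from collections import ChainMap

defaults = {"n_estimators": 100, "max_depth": 10}
user = {"n_estimators": 300}
final = ChainMap(user, defaults)

print(final["n_estimators"])  # 300, found in `user` first
print(final["max_depth"])     # 10, falls through to `defaults`

# Changes to an underlying dict are visible immediately through the view.
defaults["max_depth"] = 12
print(final["max_depth"])     # 12

# Writes through the ChainMap only ever touch the first mapping.
final["random_state"] = 42
print(user)                   # {'n_estimators': 300, 'random_state': 42}

# Flatten into an independent plain dict when a snapshot is needed.
snapshot = dict(final)
```

This makes `ChainMap` a reasonable fit for layered settings that may change at runtime, while `{**defaults, **user}` is simpler when a one-off snapshot is all that is needed.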
4. Best Practices in 2026
- Use {**dict1, **dict2} or | for clean, readable merging
- Use .update() when you want to modify a dict in place
- Use .setdefault() for lazy initialization of nested structures
- Prefer unpacking over manual loops when combining multiple dictionaries
- Keep configuration and metadata dictionaries immutable when possible (use frozendict or MappingProxyType)
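As a rough illustration of the last point, `types.MappingProxyType` from the standard library wraps a dict in a read-only view. The names `_config` and `CONFIG` below are illustrative, and the wrapper only blocks writes made through the proxy; the underlying dict stays mutable.

```python
from types import MappingProxyType

_config = {"n_estimators": 100, "max_depth": 10}
CONFIG = MappingProxyType(_config)  # read-only view over _config

write_blocked = False
try:
    CONFIG["n_estimators"] = 300
except TypeError as exc:
    write_blocked = True
    print(f"blocked: {exc}")

# Deriving a modified copy still works, because unpacking builds a new dict.
tuned = {**CONFIG, "n_estimators": 300}
print(tuned)            # {'n_estimators': 300, 'max_depth': 10}
print(dict(CONFIG))     # {'n_estimators': 100, 'max_depth': 10}
```

Because the proxy is a view and not a copy, anyone holding a reference to `_config` can still change what `CONFIG` shows, so keeping that reference private is part of the pattern.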
Conclusion
Adding and extending dictionaries is a daily task in data science. In 2026, the most Pythonic approaches are dictionary unpacking (**), the union operator (|), and .setdefault(). These techniques make your model configuration, feature metadata, and summary statistics code clean, concise, and maintainable while avoiding verbose boilerplate.
Next steps:
- Review your current code and replace manual dictionary merging loops with modern unpacking or union operations