Working with Nested Dictionaries in Python: Exploring Hierarchical Structures for Data Science 2026
Nested dictionaries (dictionaries inside dictionaries) are one of the most powerful and commonly used data structures in modern data science. They naturally represent hierarchical data such as model configurations, JSON API responses, grouped summary statistics, feature metadata, and tree-like structures. Mastering nested dicts lets you handle complex, real-world data with clean, readable, and efficient code.
TL;DR — Essential Techniques
- Create nested dicts with literals, comprehensions, or defaultdict
- Access safely with a helper function or .get() chaining
- Modify and loop through hierarchies with recursion or .items()
- Use frozendict or types.MappingProxyType for immutable nested configs
1. Creating Nested Dictionaries
# 1. Literal nested dict
model_config = {
    "model_type": "random_forest",
    "params": {
        "n_estimators": 200,
        "max_depth": 10,
        "random_state": 42
    },
    "metadata": {
        "version": "2026.1",
        "author": "Data Team"
    }
}
# 2. Dict comprehension for hierarchical data
# (assumes df is a pandas DataFrame with "region" and "amount" columns,
# loaded as in section 3)
summary = {
    region: {
        "total_sales": group["amount"].sum(),
        "avg_amount": round(group["amount"].mean(), 2),
        "count": len(group)
    }
    for region, group in df.groupby("region")
}
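The TL;DR also lists defaultdict as a way to create nested dicts. A minimal sketch of that approach follows. The tree() and to_plain_dict() helpers and the experiment names are hypothetical, chosen only for illustration. Each missing key creates a new level on first access, so no intermediate dicts need to be set up by hand.

```python
from collections import defaultdict


def tree():
    """Hypothetical helper: a defaultdict whose missing keys become new trees."""
    return defaultdict(tree)


def to_plain_dict(node):
    """Recursively convert nested defaultdicts back into regular dicts."""
    if isinstance(node, defaultdict):
        return {key: to_plain_dict(value) for key, value in node.items()}
    return node


# Illustrative keys and values only; intermediate levels are created on demand.
experiments = tree()
experiments["random_forest"]["run_1"]["accuracy"] = 0.91
experiments["random_forest"]["run_2"]["accuracy"] = 0.93
experiments["xgboost"]["run_1"]["accuracy"] = 0.94

plain = to_plain_dict(experiments)
print(plain["random_forest"]["run_2"])
```

Converting back to plain dicts before serializing or handing the structure to other code is a reasonable precaution, because merely reading a missing key on a defaultdict silently creates it.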
2. Safe Access to Nested Data
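def safe_get(d, *keys, default=None):
    """Safely navigate nested dictionaries."""
    for key in keys:
        # Stop as soon as the path breaks, so a dict-valued default
        # is never searched by the remaining keys.
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return d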
# Usage
n_est = safe_get(model_config, "params", "n_estimators", default=100)
author = safe_get(model_config, "metadata", "author", default="Unknown")
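For shallow lookups, chained .get() calls are an alternative to the safe_get helper above. The sketch below uses a small made-up config (the "tuning" and "cv_folds" keys are assumptions for illustration) and also shows the main pitfall: a key that exists but holds None.

```python
# Small illustrative config; "tuning" is deliberately set to None.
config = {
    "params": {"n_estimators": 200},
    "tuning": None,
}

# Chained .get(): each intermediate call falls back to an empty dict.
n_est = config.get("params", {}).get("n_estimators", 100)
max_depth = config.get("params", {}).get("max_depth", 10)
missing = config.get("metadata", {}).get("author", "Unknown")

# Pitfall: a key that exists but holds None bypasses the {} fallback,
# so the next .get() would raise AttributeError. `or {}` guards against it.
cv_folds = (config.get("tuning") or {}).get("cv_folds", 5)

print(n_est, max_depth, missing, cv_folds)
```

Chaining reads well for two levels. Beyond that, a helper such as safe_get tends to be easier to scan and keeps the fallback logic in one place.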
3. Real-World Data Science Examples
import pandas as pd
from collections import defaultdict

df = pd.read_csv("sales_data.csv")

# Example 1: Hierarchical summary by region and category
hierarchical_summary = defaultdict(lambda: defaultdict(dict))
for row in df.itertuples(index=False):
    # Look up the leaf dict once, then update its running totals in place
    stats = hierarchical_summary[row.region][row.category]
    stats["total"] = stats.get("total", 0) + row.amount
    stats["count"] = stats.get("count", 0) + 1

# Example 2: Model hyperparameter grid as nested dict
grid = {
    "random_forest": {
        "n_estimators": [100, 200, 300],
        "max_depth": [None, 10, 20]
    },
    "xgboost": {
        "n_estimators": [100, 200],
        "learning_rate": [0.01, 0.1, 0.3]
    }
}
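A grid like the one above is usually expanded into individual parameter combinations before training. The following is a rough sketch using itertools.product. The expand_grid() helper is hypothetical, and the grid is cut down to one model to keep the output short.

```python
from itertools import product


def expand_grid(param_grid):
    """Yield one dict per combination of the listed parameter values."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


# Same shape as the grid above, reduced to one model to keep output short.
grid = {
    "xgboost": {
        "n_estimators": [100, 200],
        "learning_rate": [0.01, 0.1, 0.3],
    }
}

candidates = {
    model_name: list(expand_grid(params)) for model_name, params in grid.items()
}
print(len(candidates["xgboost"]))
print(candidates["xgboost"][0])
```

If scikit-learn is already a dependency, its ParameterGrid class covers the same need, so a hand-written helper like this is mainly useful in lightweight scripts.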
4. Looping Through Nested Structures
# Nested iteration over a two-level hierarchy (region -> category)
for region, categories in hierarchical_summary.items():
    for category, stats in categories.items():
        print(f"{region} - {category}: total=${stats['total']:,.2f}, count={stats['count']}")
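The loop above handles exactly two levels. When the depth is unknown or varies, recursion is the usual alternative. Below is a minimal sketch, assuming a hypothetical flatten() helper and made-up sample data, that walks any depth and yields dotted paths to each leaf value.

```python
def flatten(nested, parent_key="", sep="."):
    """Yield (dotted_path, value) pairs for every leaf in a nested dict."""
    for key, value in nested.items():
        path = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict) and value:
            # Non-empty dict: descend one level and keep building the path.
            yield from flatten(value, path, sep)
        else:
            # Leaf value (or an empty dict, which is reported as-is).
            yield path, value


# Made-up data in the same region -> category -> stats shape.
summary = {
    "north": {"books": {"total": 120.5, "count": 3}},
    "south": {"toys": {"total": 80.0, "count": 2}},
}

flat = dict(flatten(summary))
for path, value in flat.items():
    print(f"{path} = {value}")
```

A flat mapping of this kind is convenient for logging or for building a single-row table, at the cost of assuming that keys never contain the separator.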
5. Best Practices in 2026
- Use a safe_get helper for all deep nested access
- Prefer defaultdict for building hierarchical data dynamically
- Keep configuration and metadata as nested dicts for clarity
- Use frozendict or MappingProxyType for immutable nested configs
- Consider Pydantic or dataclasses for complex nested structures in production
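One caveat on the immutability advice: types.MappingProxyType is a shallow, read-only view, so wrapping only the top level leaves inner dicts writable. The sketch below shows one possible way to cover every level. The freeze() helper is hypothetical, and the config values are illustrative.

```python
from types import MappingProxyType


def freeze(obj):
    """Recursively wrap dicts in read-only views and turn lists into tuples."""
    if isinstance(obj, dict):
        # Build a fresh dict so no outside reference to the backing data remains.
        return MappingProxyType({key: freeze(value) for key, value in obj.items()})
    if isinstance(obj, list):
        return tuple(freeze(item) for item in obj)
    return obj


config = freeze({
    "model_type": "random_forest",
    "params": {"n_estimators": 200, "max_depth": 10},
})

print(config["params"]["n_estimators"])

try:
    config["params"]["n_estimators"] = 500
except TypeError as exc:
    print(f"blocked: {exc}")
```

Reads work exactly as with a normal dict, while writes at any level raise TypeError. Other mutable values, such as sets or custom objects, are not handled by this sketch.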
Conclusion
Nested dictionaries are a natural way to represent hierarchical data in Python data science work. Creating them, reading them safely, and iterating over them are core skills. Combine literal syntax, dict comprehensions, safe_get, and defaultdict to build pipelines that stay readable and robust as the data grows more complex.
Next steps:
- Review any nested configuration or summary code in your projects and apply the safe access and defaultdict patterns shown above