Returning Functions from Functions – Closures & Factories in Data Science 2026
In Python, functions are first-class objects, so a function can build and return another function. The outer function is called a **function factory**; the inner function it returns is a **closure** when it keeps access to variables from the factory's scope. This pattern is useful in data science for creating customized transformers, filters, scorers, and reusable data processing pipelines.
TL;DR — Why Return Functions?
- To create reusable, configurable functions
- To implement the "factory" pattern
- To build closures that remember specific settings
- To create decorators and higher-order functions
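The last point can be illustrated with a minimal decorator sketch. A decorator is a function that takes a function and returns a new one wrapping it. The names `log_calls` and `double` are hypothetical and chosen only for illustration.

```python
import functools
from typing import Callable


def log_calls(func: Callable) -> Callable:
    """Hypothetical decorator: return a wrapper that counts calls to func."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1  # state kept on the returned function object
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@log_calls
def double(x: float) -> float:
    """Return twice the input."""
    return 2 * x


result = double(21)
print(result, double.calls)  # 42 1
```

The `@log_calls` line is shorthand for `double = log_calls(double)`, so the name `double` now refers to the returned `wrapper`, which still reaches the original function through its closure.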
1. Basic Function Factory Example
def create_price_filter(threshold: float):
    """Return a function that checks whether a price exceeds threshold."""
    def is_expensive(price: float) -> bool:
        return price > threshold  # 'threshold' is remembered (closure)
    return is_expensive

# Usage (assumes df is a pandas DataFrame with a numeric "amount" column)
expensive_filter = create_price_filter(1500)
above_500_filter = create_price_filter(500)  # True when amount > 500
df["is_expensive"] = df["amount"].apply(expensive_filter)
df["is_cheap"] = ~df["amount"].apply(above_500_filter)  # cheap = not above 500
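To see what "remembered" means in practice, the sketch below inspects the functions returned by the factory. It repeats `create_price_filter` so that it runs on its own and uses plain numbers instead of a DataFrame. The attributes `__closure__` and `__code__.co_freevars` are standard CPython function attributes; the variable names are illustrative.

```python
def create_price_filter(threshold: float):
    """Same factory as above, repeated so the snippet is self-contained."""
    def is_expensive(price: float) -> bool:
        return price > threshold
    return is_expensive


expensive_filter = create_price_filter(1500)
above_500_filter = create_price_filter(500)

# Names the inner function captured from the factory's scope.
print(expensive_filter.__code__.co_freevars)          # ('threshold',)

# Each returned function holds its own captured value.
print(expensive_filter.__closure__[0].cell_contents)  # 1500
print(above_500_filter.__closure__[0].cell_contents)  # 500

# Same input, different answers, because the thresholds differ.
print(expensive_filter(1000), above_500_filter(1000))  # False True
```

Each call to the factory creates a fresh scope, which is why the two filters do not interfere with each other.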
2. Real-World Data Science Example
import numpy as np
import pandas as pd

def create_feature_engineer(scaling_method: str = "standard"):
    """Factory that returns a customized feature engineering function."""
    if scaling_method not in ("standard", "minmax"):
        raise ValueError(f"Unknown scaling_method: {scaling_method!r}")

    def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
        df = df.copy()  # leave the caller's DataFrame untouched
        amount = df["amount"]
        if scaling_method == "standard":
            df["amount_scaled"] = (amount - amount.mean()) / amount.std()
        else:  # "minmax"
            df["amount_scaled"] = (amount - amount.min()) / (amount.max() - amount.min())
        # Non-positive amounts map to 0; positive amounts get log1p
        df["log_amount"] = np.log1p(amount.clip(lower=0))
        return df
    return engineer_features

# Create different feature engineering pipelines
standard_engineer = create_feature_engineer("standard")
minmax_engineer = create_feature_engineer("minmax")

# Use them (assumes df is a DataFrame with a numeric "amount" column)
df_standard = standard_engineer(df)
df_minmax = minmax_engineer(df)
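Note that `engineer_features` recomputes the mean and standard deviation on whichever DataFrame it receives, so training and test data would be scaled with different statistics. One possible variation, sketched here with plain lists and the standard library to keep it dependency-free, is a factory that computes the statistics once and returns a closure that remembers them. The name `fit_standard_scaler` and the sample values are hypothetical.

```python
import statistics
from typing import Callable, List, Sequence


def fit_standard_scaler(train_values: Sequence[float]) -> Callable[[Sequence[float]], List[float]]:
    """Hypothetical factory: compute mean/std once, return a scaler that remembers them."""
    mean = statistics.mean(train_values)
    std = statistics.stdev(train_values)  # sample standard deviation, as in pandas' default
    if std == 0:
        raise ValueError("cannot standardize a constant column")

    def scale(values: Sequence[float]) -> List[float]:
        # Uses the training statistics captured above, not those of `values`.
        return [(v - mean) / std for v in values]

    return scale


train_amounts = [100.0, 200.0, 300.0]
test_amounts = [200.0, 400.0]

scale_amount = fit_standard_scaler(train_amounts)
print(scale_amount(train_amounts))  # [-1.0, 0.0, 1.0]
print(scale_amount(test_amounts))   # [0.0, 2.0]
```

The test values are scaled with the training mean (200) and standard deviation (100), which is the behavior usually wanted when a transformer is fitted on one dataset and applied to another.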
3. Best Practices in 2026
- Use function factories when you need to create multiple similar functions with different configurations
- Document what the returned function does and what it "remembers"
- Keep the inner function focused and small
- Use closures for creating reusable data transformers and filters
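- Remember that closures capture variables, not snapshots of their values: the inner function reads a captured variable when it is called, so functions created in a loop can all end up seeing the loop variable's final value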
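The point about captured variables is easiest to see with functions created in a loop. The sketch below, using illustrative thresholds, contrasts lambdas that share one loop variable with closures produced by a factory, where each call gets its own variable.

```python
thresholds = [100, 500, 1500]

# Pitfall: every lambda looks up `t` when it is called, after the loop has
# finished, so all three compare against the final value, 1500.
late_bound = [lambda price: price > t for t in thresholds]
late_results = [f(1000) for f in late_bound]
print(late_results)  # [False, False, False]


def create_price_filter(threshold: float):
    """Each call creates a new scope, so each closure gets its own threshold."""
    def is_expensive(price: float) -> bool:
        return price > threshold
    return is_expensive


# Fix: let a factory bind the value for each function separately.
bound_per_call = [create_price_filter(t) for t in thresholds]
factory_results = [f(1000) for f in bound_per_call]
print(factory_results)  # [True, True, False]
```

A default argument (`lambda price, t=t: price > t`) is another common way to freeze the value at definition time, though a named factory tends to be easier to document.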
Conclusion
Returning functions from functions (function factories and closures) is a sophisticated but very useful technique in data science. In 2026, this pattern is commonly used to create configurable data preprocessors, custom filters, feature engineers, and model wrappers. When used correctly, it leads to cleaner, more reusable, and more maintainable code.
Next steps:
- Identify repetitive data processing logic in your code and refactor it into a function factory that returns customized functions