Replacing substrings

Replacing substrings is one of the most practical and frequently used string operations in Python. The replace() method finds all (or a limited number of) occurrences of a target substring and substitutes a new string for each, returning a modified copy because strings are immutable. It is fast, always case-sensitive (there is no flag to change that), and well suited to cleaning text, normalizing data, redacting sensitive information, simple templating, standardizing formats, and preparing strings for parsing or analysis. It remains a cornerstone of data pipelines, log processing, and user input sanitization, and its vectorized counterparts in pandas and Polars apply the same idea to whole columns with millions of rows.

Here’s a complete, practical guide to replacing substrings in Python: basic replace() usage, count-limited replacement, chaining, real-world patterns, performance considerations, and modern best practices with type hints, regex alternatives, pandas/Polars vectorization, and safety.

The core method replace(old, new) replaces every occurrence — simple, case-sensitive, and returns a new string.


text = "This is a sentence with the word 'apple' in it."

updated = text.replace("apple", "orange")
print(updated)
# This is a sentence with the word 'orange' in it.
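
As a minimal sketch of the properties just described (the variable names are illustrative only), the snippet below checks that the original string is left untouched, that matching is case-sensitive, and that a pattern with no match simply returns the text unchanged:

```python
text = "Apple and apple"

# Only the lowercase "apple" matches; "Apple" is left alone.
updated = text.replace("apple", "orange")
print(updated)
# Apple and orange

# The original string is not modified; replace() returned a new one.
print(text)
# Apple and apple

# No match: the result is equal to the input.
unchanged = text.replace("kiwi", "orange")
print(unchanged == text)
# True
```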

Handle multiple occurrences — replace() swaps all by default; optional count limits how many replacements occur (from left to right).


text = "apple apple apple banana apple"

# Replace all
print(text.replace("apple", "orange"))
# orange orange orange banana orange

# Replace only first 2
print(text.replace("apple", "orange", 2))
# orange orange apple banana apple
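
Because count always works from the left, replacing only the last occurrences needs a different approach. One possible sketch uses rsplit() and join(); replace_last is a hypothetical helper name, not a built-in, and it assumes old is a non-empty string (rsplit() raises ValueError on an empty separator):

```python
def replace_last(text: str, old: str, new: str, count: int = 1) -> str:
    """Replace the last `count` occurrences of `old` (hypothetical helper)."""
    # rsplit() cuts at most `count` times, starting from the right;
    # joining the pieces with `new` puts the replacement where `old` was.
    return new.join(text.rsplit(old, count))


text = "apple apple apple banana apple"

print(replace_last(text, "apple", "orange"))
# apple apple apple banana orange

print(replace_last(text, "apple", "orange", 2))
# apple apple orange banana orange

# For comparison, a count of 0 in replace() changes nothing.
print(text.replace("apple", "orange", 0))
# apple apple apple banana apple
```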

Real-world pattern: cleaning and normalizing text columns in pandas — vectorized .str.replace() replaces patterns across entire Series efficiently.


import pandas as pd

df = pd.DataFrame({
    'text': [
        "Error: connection failed",
        "apple and orange are fruits",
        "Buy apple 3 times"
    ]
})

# Vectorized replace
df['clean'] = df['text'].str.replace("apple", "orange", regex=False)
df['no_error'] = df['text'].str.replace(r"^Error: ", "", regex=True)

print(df)
#                           text                         clean                     no_error
# 0     Error: connection failed      Error: connection failed            connection failed
# 1  apple and orange are fruits  orange and orange are fruits  apple and orange are fruits
# 2            Buy apple 3 times            Buy orange 3 times            Buy apple 3 times
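
The difference between literal and regex matching in .str.replace() can be sketched as follows. The sample Series is made up for illustration, and the example assumes a reasonably recent pandas in which regex must be requested explicitly:

```python
import pandas as pd

s = pd.Series(["Apple pie", "pineapple juice", "apple APPLE", None])

# Literal match: regex=False treats the pattern as plain, case-sensitive text.
literal = s.str.replace("apple", "orange", regex=False)

# Whole-word, case-insensitive match through a regular expression.
whole_word = s.str.replace(r"\bapple\b", "orange", case=False, regex=True)

print(literal.tolist()[:3])
# ['Apple pie', 'pineorange juice', 'orange APPLE']

print(whole_word.tolist()[:3])
# ['orange pie', 'pineapple juice', 'orange orange']
```

Missing values stay missing in both results, so no separate null check is needed before replacing.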

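Best practices make replacing substrings fast, safe, and readable.

Use count when only the first few occurrences should change; it avoids over-replacing in long text. In pandas and Polars, prefer the vectorized string methods over apply(lambda x: x.replace(...)), which calls Python once per row. For large text columns Polars is often faster than pandas, although the gain depends on the data and the operation. Note that in Polars, str.replace() changes only the first match in each value and treats the pattern as a regular expression unless literal=True is passed; str.replace_all() is the counterpart of Python's replace(). Add type hints (str for plain strings, pd.Series for columns) so static analysis can catch mistakes.

For complex patterns, use re.sub(): re.sub(r'\bapple\b', 'orange', text) matches whole words only. For a case-insensitive replace, pass flags=re.IGNORECASE to re.sub(); text.lower().replace("apple", "orange") also works, but it lowercases the entire string. Methods chain naturally: text.strip().replace("old", "new").lower() trims, replaces, and normalizes in one expression.

Watch the edge cases. Calling replace() on an empty string returns an empty string as long as old is non-empty; an empty old matches at every position, so "abc".replace("", "-") gives "-a-b-c-". For several single-character replacements, str.maketrans() with translate() does the work in one pass instead of repeated replace() calls. Combining with split() and join(), as in ' '.join(word.replace("old", "new") for word in text.split()), replaces within each word, but it also collapses every run of whitespace into a single space.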
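
Several of these tips can be sketched with the standard library alone. The sample strings and variable names below are illustrative assumptions, not part of any particular codebase:

```python
import re

text = "apple pie, pineapple juice, and an APPLE tart"

# Whole-word replacement: \b keeps "pineapple" intact; matching is case-sensitive.
whole_word = re.sub(r"\bapple\b", "orange", text)
print(whole_word)
# orange pie, pineapple juice, and an APPLE tart

# Case-insensitive replacement that leaves the rest of the text as it was.
any_case = re.sub(r"\bapple\b", "orange", text, flags=re.IGNORECASE)
print(any_case)
# orange pie, pineapple juice, and an orange tart

# Several single-character replacements in one pass.
# Mapping a character to None deletes it.
table = str.maketrans({"-": " ", "_": " ", "!": None})
cleaned = "snake_case-and-kebab!".translate(table)
print(cleaned)
# snake case and kebab

# Chaining: trim, replace, then lowercase.
chained = "  Old Value  ".strip().replace("Old", "New").lower()
print(chained)
# new value
```

If the substring to replace comes from user input, wrapping it in re.escape() before building the pattern keeps characters such as "." or "*" from being read as regex syntax.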

Replacing substrings with replace() cleans, normalizes, and transforms text efficiently, whether you replace every occurrence or only the first few, and the same idea is available in vectorized form in pandas and Polars. Use count for control, regex for patterns, vectorized methods for scale, and type hints for safety. Master replace(), and you’ll sanitize, redact, standardize, and prepare text data quickly and correctly.

Next time you need to swap one substring for another — reach for replace(). It’s Python’s cleanest way to say: “Find this and change it to that.”
