Generator functions are one of Python's most powerful tools for producing sequences lazily: they use yield to return values one at a time, on demand, instead of computing and storing everything upfront. When called, a generator function returns a generator object (an iterator) that can be iterated with for, advanced with next(), or consumed by functions like sum(), list(), or max(). Each yield pauses execution and sends a value back; the function resumes where it left off on the next request. This lazy, memory-efficient behavior makes generators ideal for large or infinite sequences, streaming data, file processing, and pipelines where you don't need (or can't afford) to store the entire result.
In 2026, generator functions are a cornerstone of scalable Python code — used constantly for data streams, custom iterators, coroutines, and memory-safe processing in data science, APIs, logs, and big data tools like Polars. Here’s a complete, practical guide to building and using generator functions: basic yield, stateful generators, real-world patterns, and modern best practices with type hints and safety.
The simplest generator yields a sequence of values — execution pauses at each yield and resumes when the next value is requested.
from collections.abc import Iterator

def squares_up_to(n: int) -> Iterator[int]:
    """Yield squares of numbers from 0 to n-1."""
    for i in range(n):
        yield i ** 2

# Use the generator
for sq in squares_up_to(5):
    print(sq)  # 0 1 4 9 16

# Or consume partially
gen = squares_up_to(10)
print(next(gen))  # 0
print(next(gen))  # 1
# ... continues on demand
Classic example: Fibonacci sequence — generates numbers lazily, no need to store the entire series in memory.
def fibonacci(n: int) -> Iterator[int]:
    """Yield the first n Fibonacci numbers."""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

# Print first 10
print(list(fibonacci(10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Or sum the first 20 without storing a list
total = sum(fibonacci(20))
print(total)  # 10945
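Because generators are lazy, the same idea extends to an unbounded stream: yield forever and let the caller decide when to stop. A minimal sketch using itertools.islice (the name fibonacci_stream is illustrative, not part of the example above):

```python
from collections.abc import Iterator
from itertools import islice

def fibonacci_stream() -> Iterator[int]:
    """Yield Fibonacci numbers forever; callers decide when to stop."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take just the first 10 from the infinite stream
print(list(islice(fibonacci_stream(), 10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Nothing is computed beyond the ten values requested, so the infinite loop is perfectly safe.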
Real-world pattern: streaming large files or API results — yield processed lines or records one at a time for low-memory processing.
def valid_lines(file_path: str) -> Iterator[str]:
    """Yield only non-empty, valid lines from a huge file."""
    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:
            clean = line.strip()
            if clean and clean.isdigit():  # Example validation
                yield clean

# Process only valid lines
for line in valid_lines("huge_numbers.txt"):
    num = int(line)
    # Do something (e.g., sum, validate, insert into a DB)
    print(f"Valid number: {num}")
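Generators like this compose naturally into pipelines, where each stage lazily consumes the one before it and only one item is in flight at a time. A minimal sketch (the stage names parse_ints and keep_even are hypothetical):

```python
from collections.abc import Iterable, Iterator

def parse_ints(lines: Iterable[str]) -> Iterator[int]:
    """Convert numeric strings to ints, skipping anything invalid."""
    for line in lines:
        clean = line.strip()
        if clean.isdigit():
            yield int(clean)

def keep_even(numbers: Iterable[int]) -> Iterator[int]:
    """Pass through only even numbers."""
    for n in numbers:
        if n % 2 == 0:
            yield n

# Each stage pulls one item at a time; the full dataset is never stored
raw = ["10", "7", "oops", "4", "23", "8"]
print(sum(keep_even(parse_ints(raw))))  # 22
```

Swap the small list for a file object or an API cursor and the same pipeline processes gigabytes in constant memory.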
Another powerful pattern: stateful generators that remember values across yields — great for accumulators, running totals, or custom sequences.
from collections.abc import Generator

def running_average() -> Generator[float, float, None]:
    """Yield the running average of numbers sent via send()."""
    total = 0.0
    count = 0
    while True:
        value = yield (total / count if count > 0 else 0.0)
        total += value
        count += 1

avg_gen = running_average()
next(avg_gen)  # Prime the generator (required before send())
print(avg_gen.send(10))  # 10.0
print(avg_gen.send(20))  # 15.0
print(avg_gen.send(30))  # 20.0
Best practices make generator functions safe, readable, and performant:

- Use type hints: annotate as Iterator[T] for plain generators, or Generator[YieldType, SendType, ReturnType] when the generator uses send() or returns a value; this improves IDE support and mypy checks.
- Write clear docstrings describing what is yielded and any send() behavior.
- Use yield from to delegate to sub-generators; it avoids nested loops and forwards send() and throw() correctly.
- Avoid side effects in simple generators; keep them pure when possible, and use regular functions for heavy I/O.
- Modern tip: use @contextmanager for resource management in generators, and combine them with itertools (islice, takewhile, chain) for advanced streaming.
- In production, wrap generators over external data (files, APIs) in try/except so errors at each yield are handled gracefully.
- Prefer generators over lists for large or one-pass data; convert with list(gen) only when you actually need the full result.
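As a concrete illustration of delegation, yield from hands control to a sub-iterable instead of looping over it by hand. A small sketch (merged_batches is a hypothetical name):

```python
from collections.abc import Iterable, Iterator

def merged_batches(batches: Iterable[Iterable[int]]) -> Iterator[int]:
    """Flatten batches lazily; yield from emits each sub-iterable's items."""
    for batch in batches:
        yield from batch  # equivalent to: for item in batch: yield item

print(list(merged_batches([[1, 2], [3], [4, 5]])))  # [1, 2, 3, 4, 5]
```

With full sub-generators (rather than plain lists), yield from also routes send() and throw() through to the inner generator, which a manual loop does not.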
Generator functions with yield are Python’s way to produce sequences lazily — memory-safe, scalable, and elegant for big data, streams, and custom iterators. In 2026, build them with type hints, keep them simple, and use yield from for delegation. Master generators, and you’ll process massive datasets, files, and streams with confidence and low memory footprint.
Next time you need to produce a sequence without storing it all — write a generator function with yield. It’s Python’s cleanest way to say: “Here’s the next value, whenever you’re ready.”