%mprun output from the memory_profiler package is one of the most precise tools for understanding memory usage line by line in Python functions. While cProfile shows runtime hotspots and %timeit measures execution speed, %mprun tracks incremental memory allocation (in MiB) at each line — revealing exactly where your code consumes RAM, whether from lists, DataFrames, copies, or temporary objects. In 2026, line-level memory profiling is essential for avoiding OOM crashes, optimizing large data processing, reducing peak memory in production pipelines, and writing memory-efficient code in pandas, Polars, machine learning, or streaming applications.
Here’s a complete, practical guide to interpreting and using %mprun output: what each column means, identifying leaks and hotspots, real-world examples, and modern best practices for reducing memory footprint in notebooks and scripts.
Typical %mprun output looks like this (from profiling a function that creates and sorts a large list):
Filename: example.py
Line #    Mem usage    Increment   Line Contents
================================================
     1     54.4 MiB     54.4 MiB   @profile
     2     54.4 MiB      0.0 MiB   def my_function():
     3    130.7 MiB     76.3 MiB       nums = [random.randint(0, 100) for _ in range(1_000_000)]
     4    207.0 MiB     76.3 MiB       sorted_nums = sorted(nums)
     5    207.0 MiB      0.0 MiB       return sorted_nums
Breakdown of key columns:
- Line # — line number in the function
- Mem usage — total memory used up to and including that line (cumulative)
- Increment — memory added by that line alone (key for spotting allocations)
- Line Contents — the actual code on that line
Hotspots are obvious: here, line 3 (list creation) and line 4 (sorting) each add ~76 MiB, a classic opportunity to use generators, chunking, or in-place operations to cut peak memory roughly in half.
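As a minimal sketch of the in-place fix: list.sort() reorders nums without allocating the second ~76 MiB list that sorted() creates, roughly halving this function's peak memory.

```python
import random

def my_function_inplace():
    # Same million-element list as before (~76 MiB on CPython)
    nums = [random.randint(0, 100) for _ in range(1_000_000)]
    nums.sort()  # sorts in place: no second list is allocated
    return nums

result = my_function_inplace()
```

Profiling this variant with %mprun should show a near-zero Increment on the sort line.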
Real-world pattern: profiling pandas data loading and processing — %mprun shows where memory spikes (full DataFrame vs. chunked reading) and guides optimization.
# Install once: !pip install memory_profiler
%load_ext memory_profiler
import pandas as pd
from memory_profiler import profile

@profile
def load_and_process():
    df = pd.read_csv("large.csv")                    # Big spike: whole file in memory
    df["new_col"] = df["value"] ** 2                 # Another spike: new column
    return df.groupby("category")["new_col"].sum()

load_and_process()
# Fix: chunked reading keeps memory flat
@profile
def load_chunked():
    total = pd.Series(dtype=float)
    for chunk in pd.read_csv("large.csv", chunksize=100_000):
        chunk["new_col"] = chunk["value"] ** 2
        # fill_value=0 merges per-chunk group sums without introducing NaN
        total = total.add(chunk.groupby("category")["new_col"].sum(), fill_value=0)
    return total
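Because large.csv is a placeholder, here is a self-contained sketch of the same chunked-aggregation pattern using an in-memory CSV via io.StringIO (the file contents are invented for illustration):

```python
import io
import pandas as pd

# Stand-in for the hypothetical "large.csv" from the example above
csv_data = io.StringIO(
    "category,value\n"
    "a,1\n"
    "b,2\n"
    "a,3\n"
    "b,4\n"
)

total = pd.Series(dtype=float)
for chunk in pd.read_csv(csv_data, chunksize=2):
    chunk["new_col"] = chunk["value"] ** 2
    # .add with fill_value=0 merges partial group sums across chunks
    total = total.add(chunk.groupby("category")["new_col"].sum(), fill_value=0)

print(total.to_dict())  # {'a': 10.0, 'b': 20.0}
```

The fill_value=0 is what makes cross-chunk accumulation safe: a plain + would produce NaN for any category missing from one of the chunks.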
Best practices make %mprun output actionable and reliable:
- Install memory_profiler and load the extension (%load_ext memory_profiler); decorate functions with @profile and run %mprun -f function_name function_call().
- Focus on large Increments: they pinpoint the allocations (lists, DataFrames, copies) to target first.
- Prefer generators (yield) or chunking over full lists; %mprun often shows a 2–10× reduction in peak memory.
- Modern tip: use the built-in tracemalloc for quick snapshots (tracemalloc.start(); ...; snapshot = tracemalloc.take_snapshot()). No decorator is needed, which makes it great for scripts and non-notebook code.
- Combine with pandas/Polars: pd.read_csv(chunksize=...) or pl.scan_csv(...).collect(streaming=True) keep memory flat, and profiling confirms it.
- In production, profile on representative data; small inputs hide leaks, and debug instrumentation can distort the numbers.
- Visualize with scalene or memray for flame graphs and memory timelines.
- Avoid over-profiling: find runtime hotspots first (cProfile, %timeit), then drill down with memory_profiler.
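The tracemalloc tip can be sketched as follows; the list comprehension and sizes are illustrative, not a recommended workload:

```python
import tracemalloc

tracemalloc.start()

data = [x * x for x in range(500_000)]  # deliberate allocation to measure

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")[0]  # largest allocation site, by line
current, peak = tracemalloc.get_traced_memory()

print(f"current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
print(top)  # file, line number, and size of the biggest allocator

tracemalloc.stop()
```

Unlike %mprun, tracemalloc reports only memory allocated by Python objects (not the interpreter's total RSS), so its numbers are typically smaller but more precise about what your code itself allocated.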
%mprun output turns “my code uses too much memory” into “here’s exactly which line spiked — and how to fix it.” In 2026, profile early and often, target large Increments, use generators/chunking, and track peak memory over time. Master line-level memory profiling, and you’ll write code that scales to massive data without crashing — because memory is a resource, not an unlimited gift.
Next time your code eats too much RAM — don’t guess. Run %mprun. It’s Python’s cleanest way to ask: “Where is my memory going?” — and get an exact answer.