Iterating with .iloc lets you access and process pandas DataFrame rows (or columns) by integer position — using zero-based indexing — which can be useful for row-by-row operations when vectorized methods aren’t straightforward or when you need positional logic (e.g., comparing consecutive rows). While .iloc is powerful for slicing and single-cell access, using it in explicit loops (especially with iterrows()-style patterns) is usually slower than vectorized alternatives. In 2026, .iloc iteration is best reserved for complex, non-vectorizable logic or small DataFrames — for performance on large data, prefer vectorization, itertuples(), apply, or Polars equivalents.
Here’s a complete, practical guide to iterating with .iloc: basic positional access, row-by-row calculation patterns, performance considerations, real-world use cases, and modern best practices for when to use it (and when to avoid it).
.iloc accesses by integer location — df.iloc[i] returns the i-th row as a Series, df.iloc[i, j] returns a single cell, and slices like df.iloc[0:5] return sub-DataFrames. It’s purely position-based, ignoring index labels.
import pandas as pd
df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})
# Access row 0
print(df.iloc[0])
# Team A
# Wins 20
# Games 30
# Name: 0, dtype: object
# Access cell at row 1, column 2 (Games)
print(df.iloc[1, 2]) # 25
# Slice first two rows
print(df.iloc[0:2])
# Team Wins Games
# 0 A 20 30
# 1 B 15 25
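Beyond single rows and row slices, .iloc also accepts negative positions, column slices, and lists of arbitrary positions. A minimal sketch, reusing the same DataFrame (rebuilt here so the snippet is self-contained):

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# Last row by negative position
last_row = df.iloc[-1]            # Series for team 'C'

# All rows, first two columns (positions 0 and 1)
subset = df.iloc[:, 0:2]          # columns 'Team' and 'Wins'

# Arbitrary positions: rows 0 and 2, columns 0 ('Team') and 2 ('Games')
picked = df.iloc[[0, 2], [0, 2]]

print(last_row['Team'])           # C
print(list(subset.columns))       # ['Team', 'Wins']
print(picked.shape)               # (2, 2)
```

As with row slicing, the end position of a slice is excluded, and lists of positions return a sub-DataFrame in the order given.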
Row-by-row iteration with .iloc — calculate per-row values (e.g., win percentage) and assign them using .at or .iat — works but is slow for large DataFrames due to repeated indexing overhead.
# Inefficient: loop with .iloc
for i in range(len(df)):
    wins = df.iloc[i]['Wins']
    games = df.iloc[i]['Games']
    win_pct = wins / games if games > 0 else 0
    df.at[i, 'Win Percentage'] = win_pct
print(df)
# Team Wins Games Win Percentage
# 0 A 20 30 0.666667
# 1 B 15 25 0.600000
# 2 C 10 20 0.500000
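For comparison, the same per-row calculation collapses to one vectorized expression. A minimal sketch (DataFrame rebuilt for self-containedness), using where() to mirror the loop's division-by-zero guard:

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# where() turns non-positive Games into NaN so the division never raises;
# fillna(0) then reproduces the loop's "else 0" branch
df['Win Percentage'] = (df['Wins'] / df['Games'].where(df['Games'] > 0)).fillna(0)

print(df['Win Percentage'].round(6).tolist())  # [0.666667, 0.6, 0.5]
```

On large DataFrames this runs in a single pass over contiguous arrays instead of paying .iloc's indexing overhead once per row.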
Real-world pattern: sequential row processing with state (e.g., running totals, cumulative stats, or comparing consecutive rows) — .iloc is handy when vectorization is hard.
# Running win percentage (cumulative)
df['Cumulative Wins'] = 0
df['Cumulative Games'] = 0
for i in range(len(df)):
    if i == 0:
        df.at[i, 'Cumulative Wins'] = df.at[i, 'Wins']
        df.at[i, 'Cumulative Games'] = df.at[i, 'Games']
    else:
        df.at[i, 'Cumulative Wins'] = df.at[i - 1, 'Cumulative Wins'] + df.at[i, 'Wins']
        df.at[i, 'Cumulative Games'] = df.at[i - 1, 'Cumulative Games'] + df.at[i, 'Games']
    df.at[i, 'Running Win %'] = df.at[i, 'Cumulative Wins'] / df.at[i, 'Cumulative Games']
print(df)
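Note that this particular stateful pattern does vectorize: running totals are exactly what cumsum() computes. A minimal sketch of the loop-free equivalent (DataFrame rebuilt for self-containedness):

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# cumsum() produces the running totals in one vectorized pass
df['Cumulative Wins'] = df['Wins'].cumsum()
df['Cumulative Games'] = df['Games'].cumsum()
df['Running Win %'] = df['Cumulative Wins'] / df['Cumulative Games']

print(df['Cumulative Wins'].tolist())          # [20, 35, 45]
print(df['Running Win %'].round(4).tolist())   # [0.6667, 0.6364, 0.6]
```

The .iloc loop remains the right tool only when each row's result depends on the previous row in a way no built-in cumulative or windowed function covers.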
Best practices make .iloc iteration safe and efficient:
- Avoid .iloc loops on large DataFrames: prefer vectorized operations (df['Win %'] = df['Wins'] / df['Games']) or itertuples(), which yields namedtuples and is typically 10–50× faster than iterrows().
- Use .iat for single-cell access by integer position: it is the positional counterpart of the label-based .at, and faster than chaining through .iloc[i][col].
- Never change the DataFrame's size inside the loop: it can cause fragmentation or errors; pre-allocate columns instead.
- Modern tip: switch to Polars for large data: an expression like df.with_columns(pl.col("Wins") / pl.col("Games")) is commonly 10–100× faster than pandas iteration.
- Add type hints (pd.DataFrame annotations): they improve static analysis.
- In production, profile with timeit or cProfile: iteration is often the bottleneck.
- Use chunking for huge files: pd.read_csv(chunksize=...) or Polars streaming keeps memory flat.
- Combine with shift() for lag/lead comparisons: df['Prev Wins'] = df['Wins'].shift(1) is vectorized and fast.
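Two of those tips can be sketched concretely: itertuples() as the faster loop when you truly must iterate, and shift() for vectorized consecutive-row comparisons. A minimal, self-contained sketch on the same example data:

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# itertuples() yields lightweight namedtuples: much faster than an .iloc loop
pcts = [row.Wins / row.Games for row in df.itertuples(index=False)]
df['Win Percentage'] = pcts

# shift() exposes the previous row as a column, so consecutive-row
# comparisons stay fully vectorized
df['Prev Wins'] = df['Wins'].shift(1)
df['Wins Delta'] = df['Wins'] - df['Prev Wins']

print(df['Wins Delta'].tolist())  # [nan, -5.0, -5.0]
```

The first row's delta is NaN because there is no previous row to compare against; handle it with fillna() or dropna() as your logic requires.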
Iterating with .iloc gives positional control when vectorization isn’t possible — but use it sparingly. In 2026, prefer vectorized ops, itertuples(), Polars, and chunking for speed and memory safety. Master when to iterate vs. vectorize, and you’ll process tabular data efficiently — fast, clean, and at scale.
Next time you need row-by-row positional access — reach for .iloc carefully. It’s pandas’ way to say: “I’ll give you the row by number — but consider vectorizing first.”