Iterating with .iloc lets you access and process pandas DataFrame rows (or columns) by integer position — using zero-based indexing — which can be useful for row-by-row operations when vectorized methods aren’t straightforward or when you need positional logic (e.g., comparing consecutive rows). While .iloc is powerful for slicing and single-cell access, using it in explicit loops (especially with iterrows()-style patterns) is usually slower than vectorized alternatives. In 2026, .iloc iteration is best reserved for complex, non-vectorizable logic or small DataFrames — for performance on large data, prefer vectorization, itertuples(), apply, or Polars equivalents.
Here’s a complete, practical guide to iterating with .iloc: basic positional access, row-by-row calculation patterns, performance considerations, real-world use cases, and modern best practices for when to use it (and when to avoid it).
.iloc accesses by integer location — df.iloc[i] returns the i-th row as a Series, df.iloc[i, j] returns a single cell, and slices like df.iloc[0:5] return sub-DataFrames. It’s purely position-based, ignoring index labels.
import pandas as pd
df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})
# Access row 0
print(df.iloc[0])
# Team A
# Wins 20
# Games 30
# Name: 0, dtype: object
# Access cell at row 1, column 2 (Games)
print(df.iloc[1, 2]) # 25
# Slice first two rows
print(df.iloc[0:2])
# Team Wins Games
# 0 A 20 30
# 1 B 15 25
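Beyond single rows and row slices, .iloc also accepts negative positions, column slices, and lists of arbitrary positions. A minimal sketch, reusing the same DataFrame (rebuilt here so the snippet is self-contained):

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# Last row by negative position
last_row = df.iloc[-1]            # Series for team 'C'

# All rows, first two columns (positions 0 and 1)
subset = df.iloc[:, 0:2]          # columns 'Team' and 'Wins'

# Arbitrary positions: rows 0 and 2, columns 0 ('Team') and 2 ('Games')
picked = df.iloc[[0, 2], [0, 2]]

print(last_row['Team'])           # C
print(list(subset.columns))       # ['Team', 'Wins']
print(picked.shape)               # (2, 2)
```

As with row slicing, the end position of a slice is excluded, and lists of positions return a sub-DataFrame in the order given.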
Row-by-row iteration with .iloc — calculate per-row values (e.g., win percentage) and assign them using .at or .iat — works but is slow for large DataFrames due to repeated indexing overhead.
# Inefficient: loop with .iloc
for i in range(len(df)):
    wins = df.iloc[i]['Wins']
    games = df.iloc[i]['Games']
    win_pct = wins / games if games > 0 else 0
    df.at[i, 'Win Percentage'] = win_pct
print(df)
# Team Wins Games Win Percentage
# 0 A 20 30 0.666667
# 1 B 15 25 0.600000
# 2 C 10 20 0.500000
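For comparison, the same per-row calculation collapses to one vectorized expression. A minimal sketch (DataFrame rebuilt for self-containedness), using where() to mirror the loop's division-by-zero guard:

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# where() turns non-positive Games into NaN so the division never raises;
# fillna(0) then reproduces the loop's "else 0" branch
df['Win Percentage'] = (df['Wins'] / df['Games'].where(df['Games'] > 0)).fillna(0)

print(df['Win Percentage'].round(6).tolist())  # [0.666667, 0.6, 0.5]
```

On large DataFrames this runs in a single pass over contiguous arrays instead of paying .iloc's indexing overhead once per row.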
Real-world pattern: sequential row processing with state (e.g., running totals, cumulative stats, or comparing consecutive rows) — .iloc is handy when vectorization is hard.
# Running win percentage (cumulative)
df['Cumulative Wins'] = 0
df['Cumulative Games'] = 0
for i in range(len(df)):
    if i == 0:
        df.at[i, 'Cumulative Wins'] = df.at[i, 'Wins']
        df.at[i, 'Cumulative Games'] = df.at[i, 'Games']
    else:
        df.at[i, 'Cumulative Wins'] = df.at[i - 1, 'Cumulative Wins'] + df.at[i, 'Wins']
        df.at[i, 'Cumulative Games'] = df.at[i - 1, 'Cumulative Games'] + df.at[i, 'Games']
    df.at[i, 'Running Win %'] = df.at[i, 'Cumulative Wins'] / df.at[i, 'Cumulative Games']
print(df)
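Note that this particular stateful pattern does vectorize: running totals are exactly what cumsum() computes. A minimal sketch of the loop-free equivalent (DataFrame rebuilt for self-containedness):

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# cumsum() produces the running totals in one vectorized pass
df['Cumulative Wins'] = df['Wins'].cumsum()
df['Cumulative Games'] = df['Games'].cumsum()
df['Running Win %'] = df['Cumulative Wins'] / df['Cumulative Games']

print(df['Cumulative Wins'].tolist())          # [20, 35, 45]
print(df['Running Win %'].round(4).tolist())   # [0.6667, 0.6364, 0.6]
```

The .iloc loop remains the right tool only when each row's result depends on the previous row in a way no built-in cumulative or windowed function covers.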
Best practices make .iloc iteration safe and efficient:
- Avoid .iloc loops on large DataFrames: prefer vectorized operations (df['Win %'] = df['Wins'] / df['Games']) or itertuples(), which yields namedtuples and is typically 10–50× faster than iterrows().
- Use .iat for single-cell access by integer position: it is the positional counterpart of the label-based .at, and faster than chaining through .iloc[i][col].
- Never change the DataFrame's size inside the loop: it can cause fragmentation or errors; pre-allocate columns instead.
- Modern tip: switch to Polars for large data: an expression like df.with_columns(pl.col("Wins") / pl.col("Games")) is commonly 10–100× faster than pandas iteration.
- Add type hints (pd.DataFrame annotations): they improve static analysis.
- In production, profile with timeit or cProfile: iteration is often the bottleneck.
- Use chunking for huge files: pd.read_csv(chunksize=...) or Polars streaming keeps memory flat.
- Combine with shift() for lag/lead comparisons: df['Prev Wins'] = df['Wins'].shift(1) is vectorized and fast.
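Two of those tips can be sketched concretely: itertuples() as the faster loop when you truly must iterate, and shift() for vectorized consecutive-row comparisons. A minimal, self-contained sketch on the same example data:

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'C'],
    'Wins': [20, 15, 10],
    'Games': [30, 25, 20]
})

# itertuples() yields lightweight namedtuples: much faster than an .iloc loop
pcts = [row.Wins / row.Games for row in df.itertuples(index=False)]
df['Win Percentage'] = pcts

# shift() exposes the previous row as a column, so consecutive-row
# comparisons stay fully vectorized
df['Prev Wins'] = df['Wins'].shift(1)
df['Wins Delta'] = df['Wins'] - df['Prev Wins']

print(df['Wins Delta'].tolist())  # [nan, -5.0, -5.0]
```

The first row's delta is NaN because there is no previous row to compare against; handle it with fillna() or dropna() as your logic requires.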
Iterating with .iloc gives positional control when vectorization isn’t possible — but use it sparingly. In 2026, prefer vectorized ops, itertuples(), Polars, and chunking for speed and memory safety. Master when to iterate vs. vectorize, and you’ll process tabular data efficiently — fast, clean, and at scale.
Next time you need row-by-row positional access — reach for .iloc carefully. It’s pandas’ way to say: “I’ll give you the row by number — but consider vectorizing first.”