Subsetting by row and column number is a core pandas skill — it lets you select specific positions in a DataFrame using integer-based indexing, ignoring labels. This is especially useful when column or row order is known but names are not, or when working with positional data (e.g., first 100 rows, columns 3–7).
In 2026, use .iloc[] for pure position-based subsetting — it's fast, predictable, and avoids label confusion. Here’s a practical guide with real examples you can copy and adapt.
1. Basic Setup & Sample Data
import pandas as pd
data = {
'col1': ['A', 'B', 'C', 'D', 'E'],
'col2': [1, 2, 3, 4, 5],
'col3': ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
}
df = pd.DataFrame(data)
print(df)
Output:
col1 col2 col3
0 A 1 alpha
1 B 2 beta
2 C 3 gamma
3 D 4 delta
4 E 5 epsilon
2. Subsetting Rows and Columns by Position with .iloc[]
.iloc[row_selector, column_selector] accepts integers, lists of integers, or slices. Slice stops are exclusive, and steps and negative positions are supported.
# Rows 1 to 3 (stop 4 is exclusive), columns 0 and 1 (col1 and col2)
subset = df.iloc[1:4, 0:2]
print(subset)
Output:
col1 col2
1 B 2
2 C 3
3 D 4
# Every other row from 0 to end, columns 1 to end
every_other = df.iloc[::2, 1:]
print(every_other)
Output:
col2 col3
0 1 alpha
2 3 gamma
4 5 epsilon
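Because .iloc[] ignores labels, its difference from .loc[] shows up as soon as the index is no longer 0, 1, 2, and so on. The following is a minimal sketch, assuming a filtered copy of the sample data; the variable names are illustrative and not part of the original examples.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "B", "C", "D", "E"],
    "col2": [1, 2, 3, 4, 5],
})

# Filtering keeps the original labels, so the index is now 2, 3, 4.
filtered = df[df["col2"] > 2]

first_by_position = filtered.iloc[0]    # first remaining row (label 2)
rows_by_position = filtered.iloc[0:2]   # positions 0 and 1, stop exclusive
rows_by_label = filtered.loc[2:3]       # labels 2 through 3, stop inclusive

# Label 0 was filtered out, so a label lookup fails where position 0 works.
try:
    filtered.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(rows_by_position.index.tolist())
print(rows_by_label.index.tolist())
print(label_zero_found)
```

Note that the two slices select the same rows here only because .loc[] includes its stop label while .iloc[] excludes its stop position.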
3. More Flexible Position-Based Slicing
# Last 3 rows, first 2 columns
last_three_first_two = df.iloc[-3:, :2]
# Columns 0 and 2 only (non-contiguous)
non_contiguous = df.iloc[:, [0, 2]]
# First 4 rows, last column
first_four_last = df.iloc[:4, -1]
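The type of the result depends on the kind of selector used. A short sketch, assuming the same sample data as above (the variable names are illustrative): a single integer for the column yields a Series, a one-element list keeps a DataFrame, and two integers return a single value.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "B", "C", "D", "E"],
    "col2": [1, 2, 3, 4, 5],
    "col3": ["alpha", "beta", "gamma", "delta", "epsilon"],
})

as_series = df.iloc[:4, -1]     # integer column position -> Series
as_frame = df.iloc[:4, [-1]]    # list of positions -> DataFrame with one column
single_value = df.iloc[0, 1]    # integer row and column -> one cell

print(type(as_series).__name__)
print(type(as_frame).__name__, as_frame.shape)
print(single_value)
```

Using the list form is a reasonable choice when downstream code expects a two-dimensional object.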
4. Real-World Use Cases (2026 Examples)
# Selecting first N rows and specific columns for modeling
features = df.iloc[:1000, 1:5] # first 1000 rows, columns 1–4
# Quick sampling (every 10th row, all columns)
sample = df.iloc[::10, :]
# Removing header/metadata rows
clean_df = df.iloc[5:, :] # skip first 5 rows
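When leading rows are skipped by position, the remaining rows keep their original labels. The sketch below assumes a small hypothetical frame with two metadata rows (the data and names are made up for illustration) and shows one way to renumber the index afterwards with reset_index(drop=True).

```python
import pandas as pd

# Hypothetical raw data: the first two rows are metadata, not measurements.
raw = pd.DataFrame({"value": ["meta", "meta", "10", "20", "30"]})

clean = raw.iloc[2:, :]                      # skip the first 2 rows by position
print(clean.index.tolist())                  # labels are still 2, 3, 4

renumbered = clean.reset_index(drop=True)    # relabel rows starting from 0
print(renumbered.index.tolist())
```

Whether to reset the index is a design choice: keeping the original labels preserves a link back to the raw rows, while resetting makes later label-based lookups line up with positions.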
5. Modern Alternative in 2026: Polars
For large datasets, Polars is often faster and more memory-efficient — positional slicing is very similar.
import polars as pl
df_pl = pl.DataFrame(data)
# Rows 1 to 3 (0-based), columns 0 and 1 (stop is exclusive)
subset_pl = df_pl[1:4, 0:2]
print(subset_pl)
Best Practices & Common Pitfalls
- Use .iloc[] for position; .loc[] is for labels (can be confusing with integers)
- Stop is exclusive: df.iloc[0:3, 0:2] gives rows 0–2, columns 0–1
- Negative indices count from the end: df.iloc[-5:] = last 5 rows
- Non-contiguous columns need lists: df.iloc[:, [0, 2, 4]]
- Check shape first with df.shape: an out-of-range integer or list position raises IndexError, while an out-of-range slice is silently clipped
- For huge data, consider Polars slicing; it is often faster and more memory-efficient
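The bounds behaviour mentioned in the list can be checked directly. This is a minimal sketch using the sample data from section 1; the guard at the end is just one possible approach, and the names wanted, valid, and safe are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "B", "C", "D", "E"],
    "col2": [1, 2, 3, 4, 5],
    "col3": ["alpha", "beta", "gamma", "delta", "epsilon"],
})
n_rows, n_cols = df.shape  # 5 rows, 3 columns

# Slices that run past the end are clipped rather than rejected.
clipped = df.iloc[3:100, 0:10]  # rows 3-4, all three columns

# A single out-of-range position raises IndexError.
try:
    df.iloc[10]
    scalar_raised = False
except IndexError:
    scalar_raised = True

# So does a list containing an out-of-range position.
try:
    df.iloc[:, [0, 2, 4]]
    list_raised = False
except IndexError:
    list_raised = True

# One way to guard a list of positions: keep only those that fit the shape.
wanted = [0, 2, 4]
valid = [i for i in wanted if -n_cols <= i < n_cols]
safe = df.iloc[:, valid]

print(clipped.shape, scalar_raised, list_raised, safe.columns.tolist())
```

Silently dropping invalid positions may hide mistakes, so raising a clear error instead can be the better choice in pipelines where the column layout is expected to be fixed.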
Conclusion
Subsetting by row and column number with .iloc[] gives you precise, position-based control over DataFrames — perfect when labels are unreliable or unknown. In 2026, prefer .iloc[] for positional work, use .loc[] when labels matter, and reach for Polars when scale is critical. Master start/stop/step, negative indexing, and bounds checking, and you'll slice data with speed and accuracy every time.
Next time you need rows 100–200 and columns 3–7 — use .iloc first.