Indexing

Indexing is the foundation of string manipulation in Python — it lets you access individual characters or extract substrings using square brackets [] with integer positions. Positive indices start from 0 (first character) and go up to len(string) - 1 (last character); negative indices count backward from -1 (last character) to -len(string) (first character). Slicing with [start:end:step] creates substrings efficiently and returns new strings (strings are immutable). In 2026, mastering indexing and slicing remains essential — it's used constantly in text processing, data cleaning, parsing logs/APIs, regex patterns, string formatting, and pandas/Polars string column operations for fast, vectorized extraction.

Here’s a complete, practical guide to indexing strings in Python: basic character access, negative indexing, slicing patterns, real-world examples, performance notes, and modern best practices with type hints, pandas/Polars integration, and error handling.

Basic indexing accesses single characters — out-of-range access raises IndexError.


text = "Hello, world!"

print(text[0])     # 'H' (first character)
print(text[4])     # 'o'
print(text[-1])    # '!' (last character)
print(text[-3])    # 'l' (third from end)
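
The IndexError behavior mentioned above can be seen in a short sketch. The helper name char_at is hypothetical and exists only for this illustration; it is one possible way to fall back to a default value, not a standard-library function.

```python
def char_at(text: str, index: int, default: str = "") -> str:
    """Return text[index], or default when index is out of range (hypothetical helper)."""
    try:
        return text[index]
    except IndexError:
        return default


text = "Hello, world!"

print(len(text))                 # 13, so valid indices run from -13 to 12
print(char_at(text, 12))         # '!'
print(char_at(text, 13))         # '' (text[13] itself would raise IndexError)
print(char_at(text, -13))        # 'H'
print(char_at(text, -14, "?"))   # '?'
```

Catching the exception keeps the normal path fast; an explicit length check before indexing would work just as well.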

Slicing extracts substrings — [start:end] includes from start up to (but not including) end; [:end] from beginning, [start:] to end, [::step] for skipping.


print(text[7:12])   # 'world' (positions 7 to 11)
print(text[:5])     # 'Hello' (first 5 characters)
print(text[7:])     # 'world!' (from position 7 to end)
print(text[::2])    # 'Hlo ol!' (every second character)
print(text[::-1])   # '!dlrow ,olleH' (reversed string)
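
A few more slicing behaviors are worth seeing next to the examples above: bounds that fall outside the string are clamped rather than raising, negative bounds work like negative indices, and a slice can be stored under a name. The variable name first_word is an assumption made for illustration.

```python
text = "Hello, world!"

# Slice bounds are clamped to the string's length instead of raising.
print(text[7:100])     # 'world!'
print(text[50:60])     # '' (empty string, no IndexError)

# Negative bounds count from the end, as with single indices.
print(text[-6:-1])     # 'world'
print(text[1:-1])      # 'ello, world'

# With a negative step the slice walks backward, so start sits to the right of end.
print(text[11:6:-1])   # 'dlrow'

# slice() builds a reusable slice object that can be given a descriptive name.
first_word = slice(0, 5)
print(text[first_word])  # 'Hello'
```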

Real-world pattern: parsing and cleaning text data — indexing/slicing extracts parts of strings in logs, filenames, URLs, CSV rows, or pandas columns.


# Extract domain from URL
url = "https://www.example.com/path/to/page.html"
domain = url.split("//")[1].split("/")[0]
print(domain)   # www.example.com

# Clean phone number: remove non-digits
phone = "(123) 456-7890"
clean_phone = ''.join(c for c in phone if c.isdigit())
print(clean_phone)   # 1234567890

# pandas vectorized slicing
import pandas as pd
df = pd.DataFrame({'code': ['ABC-123', 'XYZ-456', 'DEF-789']})
df['prefix'] = df['code'].str[:3]      # 'ABC', 'XYZ', 'DEF'
df['number'] = df['code'].str[4:]      # '123', '456', '789'
print(df)
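
For logs with fixed column positions, named slice objects can keep the offsets readable. The sketch below assumes a hypothetical log layout (10-character date, 8-character time, level padded to 5 characters, then the message); the layout, the constant names, and parse_line are all invented for this example.

```python
# Hypothetical fixed-width layout:
#   date (columns 0-9), time (11-18), level padded to 5 chars (20-24), message (26 onward)
DATE = slice(0, 10)
TIME = slice(11, 19)
LEVEL = slice(20, 25)
MESSAGE = slice(26, None)


def parse_line(line: str) -> dict[str, str]:
    """Split one fixed-width log line into named fields."""
    return {
        "date": line[DATE],
        "time": line[TIME],
        "level": line[LEVEL].strip(),   # strip the padding on short levels like INFO
        "message": line[MESSAGE],
    }


line = "2026-01-15 08:30:12 ERROR Disk quota exceeded"
record = parse_line(line)
print(record["level"])     # ERROR
print(record["message"])   # Disk quota exceeded
```

Because slices never raise on short input, a truncated line yields empty fields rather than an exception, so a real parser would likely validate the line length first.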

Best practices make indexing and slicing safe, readable, and performant. Use negative indices for end-relative access: text[-1] is clearer than text[len(text)-1]. Prefer a single slice over character-by-character indexing in a loop: text[1:-1] drops the first and last characters in one step. Modern tip: for large text columns, consider Polars, where pl.col("text").str.slice(0, 5) and .str.replace(...) are often substantially faster than the pandas .str equivalents, although the gain depends on the data size and the operation. Add type hints such as str (or pd.Series[str], where your type checker and pandas stubs support it) to improve static analysis. Handle out-of-range access with a check such as if len(text) > 5: text[5], or use text[5:6], which returns an empty string instead of raising. Avoid slicing huge strings repeatedly, because every slice copies its characters into a new string; track start and end offsets instead, or work on encoded bytes through memoryview, which accepts bytes-like objects but not str. Combine with split()/join(): text.split('-')[1] handles delimited extraction. Use regex for complex patterns, for example re.search(r'\d+', text), but slicing is faster for fixed positions.
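
Two of the points above can be sketched briefly: the difference between text[5] and text[5:6] on a short string, and memoryview as a copy-free window. Note that memoryview only accepts bytes-like objects, so the text is encoded first, and byte offsets match character offsets only for ASCII content. This is an illustration under those assumptions, not a recommendation to use memoryview for everyday string work.

```python
text = "Hi"

# Indexing past the end raises, while the equivalent one-character slice does not.
try:
    text[5]
except IndexError as exc:
    print("IndexError:", exc)   # IndexError: string index out of range

print(repr(text[5:6]))          # ''

# memoryview requires a bytes-like object, so encode the text first.
data = "Hello, world!".encode("utf-8")
view = memoryview(data)

# Slicing a memoryview creates another view; the bytes are not copied here.
chunk = view[7:12]

# Copying happens only when the bytes are materialized and decoded.
print(chunk.tobytes().decode("utf-8"))  # world
```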

Indexing and slicing give you precise control over strings — access characters, extract substrings, reverse, skip, or clean text efficiently. In 2026, use negative indices, vectorize with .str in pandas/Polars, add type hints, and handle bounds safely. Master indexing, and you’ll parse, clean, and transform text data quickly and correctly.

Next time you need a character or substring — reach for indexing and slicing. It’s Python’s cleanest way to say: “Give me exactly this part of the string.”
