len()

len() is one of Python’s most frequently used built-in functions. It returns the number of items in any object that implements the length protocol (__len__()), such as strings, lists, tuples, dictionaries, sets, ranges, NumPy arrays, pandas Series/DataFrames, Polars DataFrames, and Dask collections. In 2026, len() remains a cornerstone of data science (checking DataFrame row counts and array sizes), software engineering (input validation, loop bounds), and performance-critical code: it is O(1) for the built-in containers, readable, and supported across Python’s data ecosystem. Lazy collections such as Dask DataFrames are the exception, because len() has to run a computation to count their rows.
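To make the length protocol concrete, here is a minimal sketch of a user-defined class that supports len(). The class name EventCatalog and its attribute are hypothetical and exist only for illustration.

```python
class EventCatalog:
    """Hypothetical container used only to illustrate the length protocol."""

    def __init__(self, magnitudes):
        self._magnitudes = list(magnitudes)

    def __len__(self):
        # len(catalog) calls this method; it must return a non-negative int
        return len(self._magnitudes)


catalog = EventCatalog([7.2, 6.8, 5.9])
print(len(catalog))                 # 3

# With no __bool__ defined, truthiness falls back to __len__
print(bool(EventCatalog([])))       # False

# Objects without __len__ raise TypeError
try:
    len(42)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Because truthiness falls back to __len__, an empty instance of such a class is falsy, which is why `if not catalog:` works as an emptiness check.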

Here’s a practical guide to using len() in Python: basic length checks, common types and their behaviors, real-world patterns (earthquake DataFrame inspection, chunk sizing, validation), and best practices covering type hints, performance, edge cases, and integration with pandas, Polars, Dask, NumPy, and xarray.

Basic len() usage — length of strings, lists, tuples, dicts, sets.


print(len("Hello, World!"))      # 13 (characters)
print(len([1, 2, 3, 4]))         # 4 (items)
print(len((10, 20)))             # 2
print(len({"a": 1, "b": 2}))     # 2 (key count)
print(len({1, 2, 3, 3}))         # 3 (unique items)
print(len(range(100)))           # 100
print(len(""))                   # 0 (empty string)
print(len([]))                   # 0 (empty list)
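One edge case worth knowing for strings: len() counts Unicode code points, not bytes and not necessarily what a reader sees as one character. The short sketch below uses only the standard library, and the sample word is an arbitrary choice.

```python
import unicodedata

text = "caf\u00e9"                        # "café" with a precomposed é (U+00E9)
print(len(text))                          # 4 code points
print(len(text.encode("utf-8")))          # 5 bytes: é takes two bytes in UTF-8

# The same word in decomposed form: "e" followed by a combining accent
decomposed = unicodedata.normalize("NFD", text)
print(len(decomposed))                    # 5 code points
print(decomposed == text)                 # False, although both render as "café"
```

When lengths are used for validation (for example a maximum field size), it helps to decide up front whether the limit is in code points or in encoded bytes.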

len() with data science objects — pandas, Polars, Dask, NumPy, xarray.


import pandas as pd
import polars as pl
import dask.dataframe as dd
import numpy as np
import xarray as xr

df_pd = pd.DataFrame({"mag": [7.2, 6.8, 5.9]})
print(len(df_pd))                # 3 (rows)
print(len(df_pd.columns))        # 1 (columns)

df_pl = pl.DataFrame({"mag": [7.2, 6.8, 5.9]})
print(len(df_pl))                # 3 (rows)
print(len(df_pl.columns))        # 1 (columns)

ddf = dd.from_pandas(df_pd, npartitions=2)
print(len(ddf))                  # 3 (rows; triggers a computation across partitions)

arr = np.array([[1, 2], [3, 4], [5, 6]])
print(len(arr))                  # 3 (first axis length)

ds = xr.Dataset({"mag": (("time",), [7.2, 6.8])})
print(len(ds["mag"]))            # 2 (length along 'time')
print(len(ds.dims))              # 1 (number of dimensions)
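Since len() only reports the first axis of an array, it can help to see it next to the related NumPy attributes. This is a small sketch that assumes NumPy is installed and reuses the same 3x2 array.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

print(len(arr))      # 3: length of the first axis only
print(arr.shape)     # (3, 2): length of every axis
print(arr.ndim)      # 2: number of dimensions, same as len(arr.shape)
print(arr.size)      # 6: total number of elements

# A zero-dimensional array has no first axis, so len() fails
scalar = np.array(5.0)
try:
    len(scalar)
except TypeError as exc:
    print(f"TypeError: {exc}")
```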

Real-world pattern: earthquake data inspection & chunking — use len() for validation & sizing.


import dask.dataframe as dd

ddf = dd.read_csv('earthquakes/*.csv', blocksize='64MB')

# Basic inspection
print(f"Total events: {len(ddf)}")                    # computes row count
print(f"Columns: {len(ddf.columns)}")                 # column count (fast)
print(f"Partitions: {ddf.npartitions}")               # Dask-specific

# Validate required columns
required = ['time', 'mag', 'latitude', 'longitude', 'depth']
missing = [col for col in required if col not in ddf.columns]
if missing:
    print(f"Missing columns: {missing}")
else:
    print("All required columns present")

# Chunk-aware processing
for i, chunk in enumerate(ddf.to_delayed()):
    df_chunk = chunk.compute()
    print(f"Chunk {i+1}: {len(df_chunk)} rows")
    strong = df_chunk[df_chunk['mag'] >= 7.0]
    if len(strong) > 0:
        print(f"  Strong events in chunk {i+1}: {len(strong)}")
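The same chunk-sizing idea can be sketched without Dask, using len() to work out how many chunks a plain list will produce. The helper name iter_chunks and the sample magnitudes are made up for illustration and are not part of any library.

```python
import math


def iter_chunks(items, chunk_size):
    """Yield consecutive slices of `items` with at most `chunk_size` elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


magnitudes = [7.2, 6.8, 5.9, 7.5, 4.1, 6.0, 7.0]   # made-up sample values
chunk_size = 3

# Ceiling division gives the number of chunks, including a short final one
n_chunks = math.ceil(len(magnitudes) / chunk_size)
print(f"{len(magnitudes)} values -> {n_chunks} chunks")

for i, chunk in enumerate(iter_chunks(magnitudes, chunk_size), start=1):
    strong = [m for m in chunk if m >= 7.0]
    print(f"Chunk {i}: {len(chunk)} rows, {len(strong)} strong")
```

The last chunk is shorter than the others whenever len(items) is not a multiple of chunk_size, so downstream code should not assume a fixed chunk length.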

Best practices for len() in Python and data workflows.

General Python: Prefer len(obj) over obj.__len__(); it is cleaner and safer. Type-hint arguments that must support len() as Sized rather than Iterable, since an Iterable is not guaranteed to have a length: def check_length(seq: Sized) -> int: return len(seq). Do not call len() on a generator: generators have no length, so it raises TypeError. Count with sum(1 for _ in gen) instead, keeping in mind that this consumes the generator. Use len() in assertions (assert len(df) > 0), with enumerate() to detect the last item (for i, item in enumerate(seq): ... if i == len(seq) - 1), and as len(set(seq)) for a count of unique hashable items.

pandas: len(df) and df.shape[0] both return the row count; pick one and use it consistently (df.shape[0] makes the axis explicit). Use len(df.columns) for the column count, len(df.dropna()) for the number of rows without missing values, and len(df.query('mag >= 7.0')) for a filtered count.

Polars: df.shape[0] or df.height returns the row count, and len(df.columns) or df.width the column count; len(df) also works.

Dask: len(ddf) is not lazy. It triggers a computation over every partition, so avoid calling it repeatedly on large inputs. ddf.shape[0].compute() returns the same row count. Avoid len(ddf.compute()) on large data, because it materializes the whole DataFrame in memory just to count rows.

NumPy and xarray: len(arr) and arr.shape[0] return the length of the first axis, np.size(arr) the total number of elements, and arr.ndim (equivalent to len(arr.shape)) the number of dimensions. For an xarray Dataset, len(ds.dims) returns the number of dimensions.
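The type-hint and generator points can be sketched as follows. The function names check_length and count_items are illustrative; Sized and Iterable come from the standard library's collections.abc.

```python
from collections.abc import Iterable, Sized


def check_length(obj: Sized) -> int:
    """Sized is the abstract base class for objects that implement __len__."""
    return len(obj)


def count_items(items: Iterable[object]) -> int:
    """Count by iterating; this consumes one-shot iterators such as generators."""
    return sum(1 for _ in items)


print(check_length([7.2, 6.8, 5.9]))       # 3

gen = (m for m in [7.2, 6.8, 5.9] if m >= 6.0)

# Generators have no __len__, so len() raises TypeError
try:
    len(gen)
except TypeError as exc:
    print(f"TypeError: {exc}")

print(count_items(gen))                    # 2
print(count_items(gen))                    # 0: the generator is now exhausted
```

If both the count and the items are needed, materializing the generator once with list(gen) and calling len() on the result avoids iterating twice.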

len(obj) returns the number of items in an object: characters for strings, elements for lists and tuples, keys for dicts, items for sets, rows for DataFrames, and the first-axis length for arrays. In 2026, use it for validation, sizing, and chunking, and combine it with pandas, Polars, Dask, NumPy, and xarray for data inspection. Master len(), and you’ll write concise, efficient code for any collection or data structure.

Next time you need to know “how many?” — use len(). It’s Python’s cleanest way to say: “Tell me the size of this thing — fast and reliable.”
