A list of dictionaries — one dictionary per row — is one of the most natural and flexible ways to represent tabular data in Python before it becomes a DataFrame. Each dictionary is a row, with keys as column names and values as cell data. This format appears everywhere: JSON API responses, database cursors, manual data collection, and incremental data building.
In 2026, you still see this structure constantly — especially when working with semi-structured data, streaming inputs, or before feeding into pandas/Polars. Here’s a complete, practical guide: creating, modifying, converting, and using list-of-dicts efficiently.
1. Creating a List of Dictionaries (By Row)
# Classic by-row style — each dict is one complete row
data = [
    {'name': 'John', 'age': 30, 'gender': 'M', 'city': 'New York'},
    {'name': 'Jane', 'age': 25, 'gender': 'F', 'city': 'Chicago'},
    {'name': 'Mike', 'age': 35, 'gender': 'M', 'city': 'Los Angeles'},
    {'name': 'Susan', 'age': 40, 'gender': 'F', 'city': 'Seattle'},
    {'name': 'Tom', 'age': 45, 'gender': 'M', 'city': 'Boston'}
]
# Quick preview
for row in data[:3]:
    print(row)
**Typical output:**
{'name': 'John', 'age': 30, 'gender': 'M', 'city': 'New York'}
{'name': 'Jane', 'age': 25, 'gender': 'F', 'city': 'Chicago'}
{'name': 'Mike', 'age': 35, 'gender': 'M', 'city': 'Los Angeles'}
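The same table often arrives column-oriented instead (one list per column). A minimal sketch of converting between the two layouts, using a small hypothetical `columns` sample:

```python
# Column-oriented source: one list per column (hypothetical sample data)
columns = {
    'name': ['John', 'Jane', 'Mike'],
    'age': [30, 25, 35],
}

# Columns -> rows: zip the value lists and rebuild one dict per row
rows = [dict(zip(columns, values)) for values in zip(*columns.values())]

# Rows -> columns: collect each key's values back into a list
back = {key: [row[key] for row in rows] for key in rows[0]}

print(rows[0])  # {'name': 'John', 'age': 30}
```

This relies on all column lists having equal length; `zip` silently truncates to the shortest one, so check lengths first if the data is untrusted.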
2. Adding & Modifying Rows Dynamically
# Add a new row (append dict)
new_row = {'name': 'Emma', 'age': 29, 'gender': 'F', 'city': 'Austin'}
data.append(new_row)
# Modify an existing row by index and key
data[1]['age'] = 26 # Jane's age updated
data[3]['city'] = 'Denver' # Susan moved
# Or safer: use setdefault() to add a key only if it's missing
data[2].setdefault('salary', 75000)  # adds salary for Mike if absent
print("Updated data (last 2 rows):")
for row in data[-2:]:
    print(row)
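Beyond single-row edits, whole-list changes are usually done with comprehensions. A minimal sketch on a small sample list (the `senior` flag is a hypothetical derived column):

```python
people = [
    {'name': 'John', 'age': 30, 'city': 'New York'},
    {'name': 'Jane', 'age': 26, 'city': 'Chicago'},
    {'name': 'Tom', 'age': 45, 'city': 'Boston'},
]

# Filter: keep only rows matching a condition
under_40 = [row for row in people if row['age'] < 40]

# Bulk modify: build new dicts ({**row, ...}) instead of mutating in place,
# so the original list stays untouched
with_flag = [{**row, 'senior': row['age'] >= 40} for row in people]

print(len(under_40))            # 2
print(with_flag[-1]['senior'])  # True
```

Building new dicts rather than mutating keeps the original rows reusable, which matters when the same list feeds several conversions.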
3. Converting to DataFrame (Most Common Next Step)
import pandas as pd
# Easiest & fastest way
df = pd.DataFrame(data)
print(df.head())
# Or with Polars (2026 speed favorite for large lists)
import polars as pl
df_pl = pl.DataFrame(data)
print(df_pl.head())
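The conversion also runs in reverse: pandas' `DataFrame.to_dict('records')` (Polars has `to_dicts()`) returns the same list-of-dicts shape, which is handy for handing results back to an API layer. A minimal round-trip sketch:

```python
import pandas as pd

rows = [
    {'name': 'John', 'age': 30},
    {'name': 'Jane', 'age': 25},
]

df = pd.DataFrame(rows)

# Back to list-of-dicts: one dict per row, keys are column names
records = df.to_dict('records')

print(records == rows)  # True
```

Note that a lossless round trip assumes no missing values: a NaN introduced by pandas for an absent key would survive into `records`.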
4. From List of Dicts to Other Formats (JSON, CSV, etc.)
import json
# To JSON string (common for APIs)
json_str = json.dumps(data, indent=2)
print("JSON output:\n", json_str[:200], "...")
# To CSV string (no file needed)
csv_str = pd.DataFrame(data).to_csv(index=False)
print("\nCSV preview:\n", "\n".join(csv_str.splitlines()[:4]))
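pandas is not required for CSV, either: the stdlib `csv` module speaks list-of-dicts natively via `DictWriter`/`DictReader`. A sketch with a small sample list (note `DictReader` returns every value as a string, so ages are kept as strings here to make the round trip exact):

```python
import csv
import io

rows = [
    {'name': 'John', 'age': '30', 'city': 'New York'},
    {'name': 'Jane', 'age': '25', 'city': 'Chicago'},
]

# Write: DictWriter takes the column names up front
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)

# Read back: DictReader yields one dict per row
buf.seek(0)
restored = list(csv.DictReader(buf))

print(restored == rows)  # True
```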
5. Common Gotchas & Best Practices (2026 Edition)
- Missing keys → dictionaries can have different keys → pd.DataFrame(data) fills missing cells with NaN; pl.DataFrame(data) fills them with null
- Order of keys → Python 3.7+ preserves insertion order — safe for columns
- Performance → when building large lists, prefer append over repeated insert(0, …)
- Validation → before converting, check all(d.keys() == data[0].keys() for d in data) or use schema tools (pydantic, pandera)
- Nested data → if dicts contain lists/dicts, flatten first or use pd.json_normalize()
- Production tip → log row count before and after conversion to catch silent data loss
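For the nested-data gotcha above, `pd.json_normalize()` flattens inner dicts into dotted column names. A minimal sketch with hypothetical nested rows:

```python
import pandas as pd

nested = [
    {'name': 'John', 'address': {'city': 'New York', 'zip': '10001'}},
    {'name': 'Jane', 'address': {'city': 'Chicago', 'zip': '60601'}},
]

# Inner dict keys become dotted columns: address.city, address.zip
flat = pd.json_normalize(nested)

print(list(flat.columns))  # ['name', 'address.city', 'address.zip']
```

The separator is configurable via the `sep` parameter; lists of dicts inside a row need the `record_path` argument instead, since they expand to multiple rows rather than extra columns.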
Conclusion
A list of dictionaries — by row — is Python’s most intuitive way to represent records before they become a DataFrame. It’s perfect for incremental building, API responses, JSON parsing, or manual data entry. In 2026, create them naturally, append/modify safely, convert to pandas/Polars quickly, and watch for missing keys or performance on large lists. Master this structure, and you’ll handle semi-structured data with confidence — from raw input to clean analysis in seconds.
Next time you get JSON rows, API results, or build data step-by-step — start with a list of dictionaries. It’s the bridge between raw Python and powerful DataFrames.