Reading date and time data in Pandas

Reading date and time data is one of the most common tasks in data analysis with pandas. Once timestamps, dates, or time columns are parsed correctly, you can resample, shift, group by period, calculate durations, and plot trends. Pandas handles this through pd.read_csv(), pd.to_datetime(), and the .dt accessor. In 2026, datetime parsing in pandas remains essential, especially with mixed formats, time zones, large files, or streaming data, and Polars offers a faster alternative for very large datasets.

Here’s a practical guide to reading and working with date/time data in pandas: parsing during import, converting strings to datetime, handling time zones, extracting components, vectorized operations, real-world patterns, and modern best practices, including a comparison with Polars and notes on scaling.

The easiest way to parse dates/times is during import with parse_dates — pandas automatically converts specified columns to datetime64.


import pandas as pd

# Read CSV and parse date/time columns automatically
df = pd.read_csv("data.csv", parse_dates=["Date", "Timestamp"])

print(df.dtypes)
# Date         datetime64[ns]
# Timestamp    datetime64[ns]
# ...
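
To try this without a file on disk, here is a minimal sketch that reads a small in-memory CSV. The sample rows and the df_sample name are made up for illustration; only the column names mirror the snippet above. It also checks the resulting dtypes, since read_csv can leave a column as plain strings when parsing fails.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for data.csv
csv_text = (
    "Date,Timestamp,Value\n"
    "2024-01-15,2024-01-15 08:30:00,10\n"
    "2024-01-16,2024-01-16 17:45:00,12\n"
)

df_sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date", "Timestamp"])

# Confirm the conversion instead of assuming it: if parsing fails,
# read_csv can leave the column as object (string) dtype.
for col in ["Date", "Timestamp"]:
    assert pd.api.types.is_datetime64_any_dtype(df_sample[col]), col

print(df_sample.dtypes)
```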

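If a column uses a non-standard format, pass it explicitly to pd.to_datetime() with format=...; this is faster and safer than letting pandas infer it. For columns that genuinely mix formats, add errors="coerce" so values that cannot be parsed become NaT instead of raising.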


# Custom format parsing
df["Event Time"] = pd.to_datetime(df["Event Time"], format="%Y-%m-%d %H:%M:%S")

# Mixed formats: format="mixed" (pandas 2.0+) parses each value individually;
# errors="coerce" turns unparseable values into NaT
df["Mixed Dates"] = pd.to_datetime(df["Mixed Dates"], format="mixed", errors="coerce")
df["Mixed Dates"] = df["Mixed Dates"].dt.tz_localize("UTC")  # attach timezone to naive values
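
As a small self-contained illustration of how an explicit format and errors="coerce" behave together, the sketch below parses three made-up strings: one valid, one that is not a date, and one impossible calendar date. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical raw values: valid, garbage, and an impossible date (Feb 30)
raw = pd.Series(["2024-01-15 08:30:00", "not a date", "2024-02-30 10:00:00"])

# Explicit format plus errors="coerce": anything that does not match becomes NaT
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S", errors="coerce")

# Count the failures so bad rows are not dropped silently
n_bad = int(parsed.isna().sum())
print(f"{n_bad} of {len(raw)} values could not be parsed")

# Attach a timezone to the naive timestamps; NaT values stay NaT
localized = parsed.dt.tz_localize("UTC")
print(localized)
```

Counting the NaT values right after parsing is a cheap way to notice when a format string does not match the data as well as expected.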

Extract date/time components using the .dt accessor — vectorized, no loops needed.


# Add year, month, day, weekday, hour, etc.
df["Year"] = df["Timestamp"].dt.year
df["Month"] = df["Timestamp"].dt.month
df["Day"] = df["Timestamp"].dt.day
df["Weekday"] = df["Timestamp"].dt.weekday   # 0=Monday, 6=Sunday
df["Hour"] = df["Timestamp"].dt.hour
df["Is Weekend"] = df["Timestamp"].dt.weekday >= 5

# Format as string
df["Formatted"] = df["Timestamp"].dt.strftime("%Y-%m-%d %H:%M")
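
The same accessors can be checked on two known dates. This is a minimal sketch with made-up timestamps (a Saturday and a Monday); the features frame and its column names are assumptions for illustration.

```python
import pandas as pd

# 2024-01-06 was a Saturday, 2024-01-08 a Monday
ts = pd.Series(pd.to_datetime(["2024-01-06 14:30:00", "2024-01-08 09:05:00"]))

features = pd.DataFrame({
    "year": ts.dt.year,
    "weekday": ts.dt.weekday,          # 0=Monday, 6=Sunday
    "day_name": ts.dt.day_name(),
    "is_weekend": ts.dt.weekday >= 5,  # Saturday or Sunday
    "label": ts.dt.strftime("%Y-%m-%d %H:%M"),
})

print(features)
```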

Real-world pattern: reading and cleaning timestamp data from logs, sensors, or APIs — parse on import, handle errors, attach timezone, and extract components for analysis.


# Large log file with mixed timestamps
df = pd.read_csv("huge_logs.csv", parse_dates=["log_time"], date_format="mixed")

# Handle parsing errors and timezone
df["log_time"] = pd.to_datetime(df["log_time"], errors="coerce", utc=True)
df["log_time"] = df["log_time"].dt.tz_convert("America/New_York")  # user local

# Extract useful features
df["date"] = df["log_time"].dt.date
df["hour"] = df["log_time"].dt.hour
# between() is inclusive on both ends, so this covers 09:00 through 17:59
df["is_business_hours"] = df["hour"].between(9, 17)

print(df.head())
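
To see what the UTC parse and the conversion to local time do, here is a minimal sketch on made-up values that straddle the start of US daylight saving time on 2024-03-10. It assumes time zone data is available on the system, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical log timestamps recorded in UTC, plus one corrupt entry
raw = pd.Series(["2024-03-10 06:30:00", "2024-03-10 07:30:00", "garbage"])

# Parse as UTC; the corrupt entry becomes NaT instead of raising
utc_times = pd.to_datetime(
    raw, format="%Y-%m-%d %H:%M:%S", errors="coerce", utc=True
)

# Convert to local time; the UTC offset changes when DST starts
local_times = utc_times.dt.tz_convert("America/New_York")

# Local hour of day; the NaT row yields a missing value
hours = local_times.dt.hour

print(local_times)
print(hours)
```

The two valid rows are one hour apart in UTC but two hours apart on the local clock, which is why it is safer to store timestamps in UTC and convert only for display or local-time features.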

Best practices for reading and working with datetime data in pandas. Give pandas an explicit format whenever you can, either pd.read_csv(..., parse_dates=[...], date_format=...) in pandas 2.0+ or pd.to_datetime(format=...), because inference is slower and more error-prone. Pass utc=True to to_datetime() for timezone-aware parsing, and avoid naive datetimes in production. Handle parsing errors with errors="coerce", which converts invalid values to NaT (Not a Time). Use the .dt accessor for vectorized extraction rather than apply(lambda x: x.year). For time zones, use dt.tz_localize() to attach a zone to naive values and dt.tz_convert() to move between zones, and prefer zoneinfo.ZoneInfo over pytz. Chunk large files with pd.read_csv(chunksize=...), or use Polars streaming, to keep memory flat. Combine parsed columns with resample() or groupby(pd.Grouper(freq="D")) for vectorized aggregation by time period. Type hints such as pd.Series[pd.Timestamp], checked statically via the pandas-stubs package, help static analysis. Modern tip: for large files, consider Polars, where pl.read_csv("data.csv", try_parse_dates=True) or pl.col("ts").str.to_datetime("%Y-%m-%d %H:%M:%S") is often much faster and more memory-efficient than pandas.
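
The chunking and pd.Grouper advice can be combined. The following is a minimal sketch, not a tuned pipeline: it reads a small in-memory CSV in chunks, aggregates each chunk by day, and merges the partial results. The data, the column names ts and value, and the tiny chunksize are all assumptions chosen so that one day is split across two chunks.

```python
import io
import pandas as pd

# Hypothetical data; 2024-01-01 deliberately spans two chunks
csv_text = (
    "ts,value\n"
    "2024-01-01 00:10:00,1\n"
    "2024-01-01 12:00:00,2\n"
    "2024-01-01 18:00:00,3\n"
    "2024-01-02 08:00:00,4\n"
    "2024-01-03 23:59:00,5\n"
)

partials = []
for chunk in pd.read_csv(io.StringIO(csv_text), parse_dates=["ts"], chunksize=2):
    # Daily sum within this chunk only
    daily = chunk.groupby(pd.Grouper(key="ts", freq="D"))["value"].sum()
    partials.append(daily)

# A day can appear in several chunks, so sum the partial results per day
daily_totals = pd.concat(partials).groupby(level=0).sum()

print(daily_totals)
```

Merging partial sums works because addition is associative; for statistics such as a mean, each chunk would need to carry both a sum and a count so the final value can be computed correctly.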

Reading date and time data in pandas sets the foundation for accurate time-series analysis — parse on import, vectorize extractions, handle time zones, and prefer Polars for scale. In 2026, avoid inference, use explicit formats, add type hints, and chunk/stream large files. Master datetime reading in pandas, and you’ll ingest, clean, and analyze time-based data reliably and efficiently.

Next time you load a CSV with dates or timestamps — use parse_dates or to_datetime. It’s pandas’ cleanest way to say: “Turn these strings into real datetimes from the start.”
