Counting Occurrences in Python – Complete Guide for Data Science 2026
Counting how many times a substring or pattern appears in text is one of the most common and powerful operations in data science. Whether you are analyzing word frequencies, counting error codes in logs, measuring keyword density, or preparing features for machine learning models, efficient counting techniques save time and improve accuracy. In 2026, Python offers several elegant methods — from simple string methods to Counter and regex-based counting — that make this task fast and readable.
TL;DR — Best Counting Methods
- str.count(sub) → simple substring count
- collections.Counter → count multiple items efficiently
- re.findall(pattern) + len() → count regex matches
- re.finditer(pattern) → count with positions
- pandas .str.count() → vectorized counting on DataFrames
1. Basic Counting with str.count()
text = "Data science is great for data analysis and data visualization"
print(text.count("data"))          # 2 (case-sensitive: the leading "Data" is not matched)
print(text.lower().count("data"))  # 3 (case-insensitive)
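Two behaviors of str.count() are easy to overlook: it counts non-overlapping occurrences, and it matches the substring anywhere, including inside longer words. The short sketch below illustrates both, along with the optional start and end arguments. The sample strings are invented for illustration and are not from any real dataset.

```python
# Hypothetical sample strings, chosen only to illustrate the behavior.
overlap = "aaaa"
print(overlap.count("aa"))  # 2: matches do not overlap ("aa" + "aa")

sentence = "The database stores data about metadata"
print(sentence.count("data"))  # 3: also matches inside "database" and "metadata"

# Optional start/end arguments restrict the search to a slice of the string.
print(sentence.count("data", 0, 12))  # 1: only "database" lies in the first 12 characters
```

If only whole words should be counted, a regex with word boundaries is usually the better fit than str.count().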
2. Powerful Counting with collections.Counter
from collections import Counter
words = text.lower().split()
word_count = Counter(words)
print(word_count.most_common(5)) # Top 5 most frequent words
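Splitting on whitespace leaves punctuation attached to words, so "data," and "data" would be counted as different items. One possible workaround, sketched here under the assumption that simple ASCII tokens are enough, is to extract tokens with a regex before passing them to Counter. The helper name word_frequencies and the sample sentence are hypothetical.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count lowercase word tokens, ignoring surrounding punctuation."""
    # Assumption: a token is a run of ASCII letters, digits, or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)

sample = "Data, data everywhere. Is the data clean? Clean data matters!"
counts = word_frequencies(sample)
print(counts["data"])         # 4
print(counts.most_common(2))  # [('data', 4), ('clean', 2)]
```

For non-English text or more careful tokenization, a dedicated tokenizer would likely be a better choice than this pattern.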
3. Regex-Based Counting
import re
# Count specific patterns
pattern = r"data"
matches = re.findall(pattern, text.lower())
print(len(matches))  # 3
# Count with positions using finditer
for match in re.finditer(r"data", text.lower()):
    print(f"Found at position {match.start()}")
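The plain pattern above behaves like a substring search. Regex counting becomes more useful when the pattern carries extra constraints, such as word boundaries or case-insensitive matching. The following is a minimal sketch of that idea; the helper count_whole_word and the sample text are hypothetical names introduced for illustration.

```python
import re

def count_whole_word(text, word):
    """Count case-insensitive occurrences of `word` as a whole word."""
    # re.escape guards against regex metacharacters in the search term.
    pattern = re.compile(rf"\b{re.escape(word)}\b", flags=re.IGNORECASE)
    # finditer yields matches lazily, so no intermediate list is built.
    return sum(1 for _ in pattern.finditer(text))

sample = "Data in the database: metadata is data about data."
print(count_whole_word(sample, "data"))         # 3 (whole words only)
print(len(re.findall("data", sample.lower())))  # 5 (includes "database" and "metadata")
```

Using re.IGNORECASE avoids lowercasing the whole string first, which also keeps match positions aligned with the original text.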
4. Real-World Data Science Examples with Pandas
import pandas as pd
df = pd.read_csv("customer_data.csv")
# Count keyword occurrences in descriptions
df["data_count"] = df["description"].str.lower().str.count("data")
# Count multiple patterns at once
df["keyword_count"] = df["description"].str.lower().str.count(r"data|science|machine learning")
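Because customer_data.csv is not available here, the sketch below uses a small in-memory DataFrame with made-up rows to show three details of pandas .str.count(): the pattern is interpreted as a regular expression, a flags argument can replace the separate lowercasing step, and missing values come back as missing rather than zero. The column values are invented for illustration.

```python
import re
import pandas as pd

# Hypothetical stand-in for the "description" column of a real dataset.
df = pd.DataFrame({
    "description": [
        "Data science with machine learning",
        "Pricing is $5 or $10",
        None,
    ]
})

# flags=re.IGNORECASE makes a separate .str.lower() pass unnecessary.
df["keyword_count"] = df["description"].str.count(
    r"data|science|machine learning", flags=re.IGNORECASE
)

# The pattern is a regex, so literal metacharacters such as "$" need escaping.
df["dollar_count"] = df["description"].str.count(re.escape("$"))

# Missing descriptions yield missing counts; fill them if integers are needed.
df["keyword_count"] = df["keyword_count"].fillna(0).astype(int)

print(df)
```

Whether a missing description should count as zero depends on the analysis, so the fillna step is a choice rather than a requirement.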
5. Best Practices in 2026
- Use str.count() for simple substring counting
- Use Counter when counting multiple different items
- Use regex counting for complex patterns
- Always normalize case before counting unless case-sensitivity is required
- Use pandas .str.count() for vectorized counting on large datasets
Conclusion
Counting occurrences is a foundational skill that bridges basic string operations and regular expressions. In 2026 data science projects, the combination of str.count(), Counter, re.findall(), and pandas .str.count() gives you fast, flexible, and scalable tools for text analysis. These techniques make your data cleaning, feature engineering, and pattern detection pipelines cleaner and more efficient.
Next steps:
- Review your current text columns and apply counting techniques to extract frequency features