collections.Counter() is Python’s standard-library, high-performance way to count occurrences of hashable elements in any iterable — lists, strings, tuples, files, API results, or even other Counters. It produces a dict subclass where keys are the elements and values are their counts, with a rich API for arithmetic, most-common items, and more. In 2026, Counter remains essential — faster and more readable than manual dict counting or loops, and perfect for frequency analysis, data cleaning, statistics, text processing, and production pipelines handling large datasets.
Here’s a complete, practical guide to using collections.Counter: basic counting, accessing results, advanced features, real-world patterns, and modern best practices with type hints, performance, and scalability.
Creating a Counter is simple — pass any iterable; it counts each element automatically.
from collections import Counter
words = ["apple", "banana", "orange", "apple", "kiwi", "banana"]
word_counts = Counter(words)
print(word_counts)
# Counter({'apple': 2, 'banana': 2, 'orange': 1, 'kiwi': 1})
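Counter accepts more than lists: a string (counted character by character), a mapping of existing counts, or keyword arguments all work. A quick sketch:

```python
from collections import Counter

# A string is an iterable of characters, so each character is tallied.
letter_counts = Counter("mississippi")
print(letter_counts["s"])  # 4

# A mapping or keyword arguments seed the counts directly.
from_mapping = Counter({"red": 4, "blue": 2})
from_kwargs = Counter(cats=3, dogs=1)
print(from_mapping["red"], from_kwargs["cats"])  # 4 3
```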
Access counts like a dict — missing keys return 0 (no KeyError), making it safe and convenient.
print(word_counts["apple"]) # 2
print(word_counts["mango"]) # 0 (no error)
print(word_counts.get("pear", 0)) # 0 (same as dict)
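Because missing keys default to 0, you can increment a key that does not exist yet, which is what makes Counter so convenient inside loops. A minimal sketch:

```python
from collections import Counter

tally = Counter()
tally["first"] += 1   # no KeyError: a missing key starts at 0
tally["first"] += 1
tally["second"] += 1
print(tally)  # Counter({'first': 2, 'second': 1})
```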
Real-world pattern: frequency analysis in text, logs, or data streams — Counter excels at tallying words, IDs, categories, or events efficiently.
# Count errors in a large log file — memory-safe, line by line
error_counts = Counter()
with open("huge_log.txt", "r", encoding="utf-8") as f:
    for line in f:
        if "ERROR:" in line:  # match the exact marker we split on below
            # Extract the error code (assumes lines like "ERROR: E42 ...")
            parts = line.split("ERROR:", 1)[1].split()
            if parts:  # guard against "ERROR:" at the end of a line
                error_counts[parts[0]] += 1
print("Top errors:", error_counts.most_common(5))
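When data arrives in batches rather than as one long iterable (paginated API results, file chunks), update() adds to the existing counts instead of replacing them. A sketch with made-up event batches:

```python
from collections import Counter

event_counts = Counter()
# Hypothetical batches, e.g. pages of results from an API
batches = [["login", "error", "login"], ["error", "timeout"], ["login"]]
for batch in batches:
    event_counts.update(batch)  # adds to existing counts in place
print(event_counts.most_common(2))  # [('login', 3), ('error', 2)]
```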
Advanced features: arithmetic (+, -, & for intersection, | for union), most_common(n) for top items, elements() to reconstruct the multiset, and subtract() for diffs.
c1 = Counter("abracadabra") # Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
c2 = Counter("banana") # Counter({'a': 3, 'n': 2, 'b': 1})
print(c1 + c2) # Add counts: {'a': 8, 'b': 3, 'r': 2, 'n': 2, ...}
print(c1 - c2) # Subtract (non-negative): {'a': 2, 'r': 2, ...}
print(c1 & c2) # Intersection (min counts): {'a': 3, 'b': 1}
print(c1 | c2) # Union (max counts): {'a': 5, 'b': 2, 'r': 2, 'n': 2, ...}
print(c1.most_common(3)) # [('a', 5), ('b', 2), ('r', 2)]
print(list(c1.elements())) # ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'r', 'r', 'c', 'd']
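The subtract() method differs from the - operator: it works in place and allows counts to go negative, which is handy for inventory-style diffs. A small sketch with hypothetical stock data:

```python
from collections import Counter

inventory = Counter(apples=5, pears=2)
sold = Counter(apples=3, pears=4)
inventory.subtract(sold)  # in place; counts may go negative
print(inventory)          # Counter({'apples': 2, 'pears': -2})
print(+inventory)         # unary + drops non-positive counts: Counter({'apples': 2})
```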
Best practices make Counter usage fast, safe, and scalable:
- Prefer Counter over manual dict counting — it’s optimized, handles missing keys gracefully, and has a rich API.
- Use Counter with generators for large data — Counter(x for x in huge_iterable if condition) — constant memory.
- Add type hints for clarity — Counter[str] or Counter[int] — improves readability and mypy checks.
- Modern tip: use Polars for large tabular data — df.select(pl.col("col").value_counts()) or df.group_by("col").agg(pl.count()) is 10–100× faster than Counter on millions of rows.
- In production, wrap counting over external data (files, APIs) in try/except — handle bad items gracefully.
- Combine with zip() for paired counting — Counter(zip(ids, values)).
- Use most_common() for top-N — avoids sorting everything manually.
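A few of these practices combined in one sketch (the log lines and the ID/status pairs below are made up for illustration; Counter[str] subscripting requires Python 3.9+):

```python
from collections import Counter

# Type-hinted Counter fed by a generator: memory stays constant
# regardless of how large the input is.
lines = ["GET /home", "POST /login", "GET /about", "GET /home"]
methods: Counter[str] = Counter(line.split()[0] for line in lines if line)
print(methods.most_common(1))  # [('GET', 3)]

# Paired counting with zip(): tally (id, status) combinations.
ids = [1, 1, 2]
statuses = ["ok", "ok", "fail"]
pairs: Counter[tuple[int, str]] = Counter(zip(ids, statuses))
print(pairs[(1, "ok")])  # 2
```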
collections.Counter() turns counting into a fast, readable, and feature-rich operation — no manual loops, no KeyError worries, just clean frequency analysis. In 2026, use it with generators for scale, type hints for safety, and Polars for big data. Master Counter, and you’ll tally occurrences efficiently — whether small lists or massive streams.
Next time you need to count anything — reach for Counter. It’s Python’s cleanest way to say: “How many of each?” — with speed and simplicity.