Databases | December 10, 2014 | Updated: June 22, 2026

InfluxDB: TSM Engine and the Cardinality Trap (2014)

Aunimeda

InfluxDB started with LevelDB, then tried BoltDB, and finally settled on its own storage engine, the Time-Structured Merge Tree (TSM). TSM is designed specifically for the write-heavy, scan-heavy nature of time-series data.

The TSM Layout

A TSM file is essentially a sorted collection of compressed series data. It uses adaptive compression, choosing an encoding to fit the data: delta-based schemes (such as delta or delta-of-delta) shrink timestamps, and Gorilla-style XOR encoding shrinks float values.

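// Conceptual delta-of-delta encoding; assumes a non-empty slice and import "encoding/binary".
func compressTimestamps(timestamps []int64) []byte {
    out := binary.AppendVarint(nil, timestamps[0]) // first timestamp, stored as-is
    for i, prevDelta := 1, int64(0); i < len(timestamps); i++ {
        delta := timestamps[i] - timestamps[i-1]
        out = binary.AppendVarint(out, delta-prevDelta) // store only the change in the delta
        prevDelta = delta
    }
    return out
}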
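
To see why this helps, here is a minimal sketch that runs the idea on timestamps sampled roughly every 10 seconds. It is not InfluxDB's actual encoder: `deltaOfDelta` and `restore` are hypothetical helper names, and the sample timestamps are made up for illustration.

```go
package main

import "fmt"

// deltaOfDelta keeps the first timestamp as-is, then stores how much each
// delta differs from the previous delta (the first delta is compared to 0).
func deltaOfDelta(ts []int64) []int64 {
	if len(ts) == 0 {
		return nil
	}
	out := []int64{ts[0]}
	var prevDelta int64
	for i := 1; i < len(ts); i++ {
		delta := ts[i] - ts[i-1]
		out = append(out, delta-prevDelta)
		prevDelta = delta
	}
	return out
}

// restore reverses deltaOfDelta by accumulating the deltas again.
func restore(enc []int64) []int64 {
	if len(enc) == 0 {
		return nil
	}
	out := []int64{enc[0]}
	var delta int64
	for i := 1; i < len(enc); i++ {
		delta += enc[i]
		out = append(out, out[i-1]+delta)
	}
	return out
}

func main() {
	// Samples every 10 seconds, with one point arriving a second late.
	ts := []int64{1000, 1010, 1020, 1030, 1041, 1050}
	enc := deltaOfDelta(ts)
	fmt.Println(enc)          // [1000 10 0 0 1 -2]
	fmt.Println(restore(enc)) // [1000 1010 1020 1030 1041 1050]
}
```

Regularly spaced samples collapse into runs of zeros and small integers, which is exactly the kind of input that a variable-length or run-length encoding stores in very few bits.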

The Cardinality Problem

The biggest pitfall in InfluxDB is series cardinality. A "series" is defined by the combination of your measurement name and your tag set, and cardinality is the number of distinct series you have created.

# Low Cardinality
cpu,host=server1 value=0.5

# High Cardinality (The Trap!)
cpu,host=server1,user_id=827364 value=0.5

If you put a unique ID (like a session ID or user ID) in a tag, you create a new series for every distinct value of that ID. InfluxDB keeps an index of all series in memory. If your cardinality explodes, the index will consume all your RAM and the OOM killer will pay you a visit.
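
The effect is easy to reproduce outside the database. The sketch below counts distinct series keys in a batch of line-protocol points. `seriesKey` and `countSeries` are hypothetical helpers, and the parsing is deliberately simplified: it assumes no escaped spaces or commas, which real line protocol allows.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey returns "measurement,tag1=v1,..." for one line-protocol point.
// Tags are sorted so that tag order does not change the key.
func seriesKey(line string) string {
	head, _, _ := strings.Cut(line, " ") // everything before the field set
	parts := strings.Split(head, ",")
	sort.Strings(parts[1:]) // parts[0] is the measurement name
	return strings.Join(parts, ",")
}

// countSeries reports how many distinct series a batch of points creates.
func countSeries(lines []string) int {
	seen := make(map[string]struct{})
	for _, l := range lines {
		seen[seriesKey(l)] = struct{}{}
	}
	return len(seen)
}

func main() {
	var low, high []string
	for user := 0; user < 1000; user++ {
		low = append(low, "cpu,host=server1 value=0.5")
		high = append(high, fmt.Sprintf("cpu,host=server1,user_id=%d value=0.5", user))
	}
	fmt.Println(countSeries(low), countSeries(high)) // 1 1000
}
```

The same thousand writes produce one series in the first case and a thousand in the second, and every one of those series needs its own index entry.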

Reserve tags for metadata with a bounded set of values (continents, hostnames, app versions). For high-cardinality data, use fields, which are not indexed.
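
As a rough illustration, the high-cardinality point from earlier can be rewritten so the user ID travels as a field. `point` is a hypothetical formatter, not part of any InfluxDB client library; it assumes the host name needs no escaping.

```go
package main

import "fmt"

// point formats one line-protocol entry: host stays a tag (bounded set of
// values), while user_id becomes an integer field (note the trailing "i").
func point(host string, userID int64, value float64) string {
	return fmt.Sprintf("cpu,host=%s value=%g,user_id=%di", host, value, userID)
}

func main() {
	fmt.Println(point("server1", 827364, 0.5))
	// cpu,host=server1 value=0.5,user_id=827364i
}
```

The trade-off is that filtering on a field has to scan the data rather than use the index, so this fits IDs you mostly read back alongside other values, not ones you constantly filter by.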

