Databases · December 10, 2014 · 2 min read

InfluxDB: TSM Engine and the Cardinality Trap (2014)

Aunimeda

InfluxDB started with LevelDB, then tried BoltDB, and finally settled on its own Time-Structured Merge Tree (TSM) engine. TSM is designed specifically for the write-heavy, scan-heavy nature of time-series data.

The TSM Layout

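A TSM file is essentially a sorted collection of compressed series data. It uses adaptive compression, choosing an encoding to suit the data: delta-based schemes (such as delta-of-delta) to shrink timestamps, and Gorilla-style XOR encoding to shrink floating-point values.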

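// Conceptual delta-of-delta encoding (needs import "encoding/binary", Go 1.19+)
func compressTimestamps(timestamps []int64) (out []byte) {
    var prev, prevDelta int64
    for _, ts := range timestamps {
        delta := ts - prev
        out = binary.AppendVarint(out, delta-prevDelta) // store only the change in the delta
        prev, prevDelta = ts, delta
    }
    return out // evenly spaced timestamps collapse into a run of zero bytes
}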
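
The snippet above covers timestamps. For values, Gorilla-style compression relies on the observation that consecutive floats in a series often share most of their bits, so XOR-ing each value with its predecessor leaves mostly zeros. The following is a minimal sketch of that idea only, not InfluxDB's actual encoder; the names `xorWithPrevious` and `meaningfulBits` are made up for illustration, and the bit-packing step that would store the result is left out.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// xorWithPrevious returns, for each value after the first, the XOR of its
// IEEE 754 bit pattern with the previous value's bit pattern.
// Hypothetical helper for illustration; not an InfluxDB API.
func xorWithPrevious(values []float64) []uint64 {
	if len(values) < 2 {
		return nil
	}
	out := make([]uint64, 0, len(values)-1)
	prev := math.Float64bits(values[0])
	for _, v := range values[1:] {
		cur := math.Float64bits(v)
		out = append(out, prev^cur)
		prev = cur
	}
	return out
}

// meaningfulBits counts the bits left between the leading and trailing
// zero runs of an XOR result. A real encoder would store only these bits.
func meaningfulBits(x uint64) int {
	if x == 0 {
		return 0 // identical values: nothing to store beyond a marker bit
	}
	return 64 - bits.LeadingZeros64(x) - bits.TrailingZeros64(x)
}

func main() {
	values := []float64{0.5, 0.5, 0.5, 0.75}
	for i, x := range xorWithPrevious(values) {
		fmt.Printf("pair %d: xor=%016x meaningful bits=%d\n", i, x, meaningfulBits(x))
	}
}
```

Repeated values XOR to zero, and a small change such as 0.5 to 0.75 differs in a single bit, which is why slowly changing metrics compress so well under this scheme.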

The Cardinality Problem

The biggest pitfall in InfluxDB is Series Cardinality. A "series" is defined by the combination of your measurement name and your tag set.

# Low Cardinality
cpu,host=server1 value=0.5

# High Cardinality (The Trap!)
cpu,host=server1,user_id=827364 value=0.5

If you put a unique ID (like a session ID or user ID) in a tag, you create a new series for every single user. InfluxDB keeps an index of all series in memory. If your cardinality explodes, the index will consume all your RAM and the OOM killer will pay you a visit.
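
To see the effect in numbers, here is a rough sketch that counts distinct series in a batch of points. The functions `seriesKey` and `countSeries` are hypothetical helpers, and the parsing is deliberately simplified: it assumes no escaped spaces or commas, which a real line-protocol parser must handle.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey extracts the measurement and tag set from one point and returns
// them in a canonical form with the tags sorted.
// Simplification: assumes no escaped spaces or commas in names or values.
func seriesKey(line string) string {
	// Everything before the first space is "measurement,tag=value,...";
	// fields (and an optional timestamp) come after it.
	head, _, _ := strings.Cut(line, " ")
	parts := strings.Split(head, ",")
	tags := parts[1:]
	sort.Strings(tags)
	return strings.Join(append([]string{parts[0]}, tags...), ",")
}

// countSeries returns the number of distinct series in a batch of points.
func countSeries(lines []string) int {
	seen := make(map[string]struct{})
	for _, l := range lines {
		seen[seriesKey(l)] = struct{}{}
	}
	return len(seen)
}

func main() {
	var asTag, asField []string
	for id := 0; id < 1000; id++ {
		// Same data, written once with user_id as a tag and once as an integer field.
		asTag = append(asTag, fmt.Sprintf("cpu,host=server1,user_id=%d value=0.5", id))
		asField = append(asField, fmt.Sprintf("cpu,host=server1 value=0.5,user_id=%di", id))
	}
	fmt.Println("user_id as tag:", countSeries(asTag))     // 1000 series
	fmt.Println("user_id as field:", countSeries(asField)) // 1 series
}
```

A thousand users produce a thousand series when the ID is a tag, but only one when the same ID is written as a field.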

Reserve tags for metadata with a small, bounded set of values (continents, hostnames, app versions). For high-cardinality data, use Fields, which are not indexed. The trade-off is that filtering on a field means scanning values instead of using the index.
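
A quick way to sanity-check a schema before writing any data is to multiply the number of distinct values each tag can take. The sketch below does that; `maxSeries` is a hypothetical helper and the tag counts are invented for illustration. The result is an upper bound, since only tag combinations that actually occur create series.

```go
package main

import "fmt"

// maxSeries returns an upper bound on series cardinality for one measurement:
// the product of the number of distinct values each tag can take.
// Hypothetical helper for illustration; not part of any InfluxDB client.
func maxSeries(distinctValuesPerTag map[string]uint64) uint64 {
	total := uint64(1)
	for _, n := range distinctValuesPerTag {
		total *= n
	}
	return total
}

func main() {
	// Invented tag cardinalities, chosen only to show the arithmetic.
	bounded := map[string]uint64{
		"continent":   7,
		"host":        200,
		"app_version": 10,
	}
	fmt.Println(maxSeries(bounded)) // 14000

	withUserID := map[string]uint64{
		"continent":   7,
		"host":        200,
		"app_version": 10,
		"user_id":     1_000_000,
	}
	fmt.Println(maxSeries(withUserID)) // 14000000000
}
```

Three bounded tags stay in the thousands, while a single unbounded tag multiplies the bound by its own cardinality.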
