Backend Engineering · April 17, 2026 · 10 min read

Redis Data Structures in Production: Beyond SET and GET

Aunimeda

We once had a leaderboard feature backed by a PostgreSQL ORDER BY score DESC LIMIT 100 query that full-table-scanned 40 million rows. It worked fine at 100 concurrent users and collapsed at 10,000. The fix was a Redis sorted set: O(log N) inserts, O(log N + M) range queries, and updates that are atomic by default. The migration took an afternoon and the query went from 800ms to 0.3ms.

Redis ships with eight core data structures and most engineers use one: strings. This guide covers the ones that actually solve production problems, with real code and the operational gotchas.

Sorted Sets: Leaderboards and Priority Queues

A sorted set stores members with an associated floating-point score. Members are unique; scores can repeat. All range operations are O(log N + M) where M is the result size.

import { Redis } from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);

// Add or update a player's score
// ZADD key [NX|XX] [GT|LT] [CH] score member
await redis.zadd('leaderboard:2026', 'GT', 15420, 'user:1234');
// GT: only update if new score is greater (prevents score regression on re-submit)
// NX: only add new members (never update existing)
// CH: return count of changed members instead of added

// Get a player's rank with ZREVRANK (0-indexed, highest score first)
// — ZRANK would rank lowest score first instead
const rank = await redis.zrevrank('leaderboard:2026', 'user:1234');
// Returns: 0 for #1, 1 for #2, null if not in set

// Get top 10 with scores
const top10 = await redis.zrevrangebyscore(
  'leaderboard:2026',
  '+inf',   // max score
  '-inf',   // min score
  'WITHSCORES',
  'LIMIT', 0, 10
);
// Returns: ['user:5678', '98432', 'user:1234', '85420', ...]
// Alternating member/score — parse into pairs

// Paginated leaderboard page
async function getLeaderboardPage(page: number, pageSize = 25) {
  const start = page * pageSize;
  const stop = start + pageSize - 1;

  const results = await redis.zrevrange(
    'leaderboard:2026', start, stop, 'WITHSCORES'
  );

  // Parse flat array into [{member, score, rank}]
  const entries = [];
  for (let i = 0; i < results.length; i += 2) {
    entries.push({
      member: results[i],
      score: parseFloat(results[i + 1]),
      rank: start + Math.floor(i / 2) + 1,
    });
  }
  return entries;
}

// Get a player's score and rank in one pipeline (no round trip per call)
async function getPlayerStats(userId: string) {
  const pipeline = redis.pipeline();
  pipeline.zscore('leaderboard:2026', `user:${userId}`);
  pipeline.zrevrank('leaderboard:2026', `user:${userId}`);
  const [[, score], [, rank]] = await pipeline.exec() as any;
  return { score: score ? parseFloat(score) : null, rank };
}

Score encoding trick: Sorted sets use a single float64 score. To sort by multiple criteria (e.g., primary: points, tiebreak: submission time), encode both into one number. Mind the precision limit: a float64 represents integers exactly only up to 2^53 ≈ 9.007e15, so with a 1e10 multiplier the points value must stay below ~900,000 or the tiebreak bits get silently truncated:

// Encode: score * 1e10 + (maxTimestamp - submissionTimestamp)
// This makes higher scores rank higher, and within same score,
// earlier submissions rank higher (lower timestamp = higher tiebreak value)
function encodeScore(points: number, submittedAt: Date): number {
  const MAX_TS = 9999999999; // year 2286 — safe
  const tiebreak = MAX_TS - Math.floor(submittedAt.getTime() / 1000);
  return points * 1e10 + tiebreak;
}
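For debugging, the same arithmetic runs in reverse. This decoder is an illustrative sketch (not part of the production code above) showing the round trip:

```typescript
// Reverse the encoding: recover points and submission time from a composite
// score. MAX_TS must match the encoder's constant.
const MAX_TS = 9999999999;

function decodeScore(score: number): { points: number; submittedAt: Date } {
  const points = Math.floor(score / 1e10);
  const tiebreak = score % 1e10;
  const submittedAt = new Date((MAX_TS - tiebreak) * 1000);
  return { points, submittedAt };
}

// Round trip: encode 15420 points at a fixed timestamp, decode it back
const ts = new Date('2026-04-17T00:00:00Z');
const encoded = 15420 * 1e10 + (MAX_TS - Math.floor(ts.getTime() / 1000));
const decoded = decodeScore(encoded);
// decoded.points === 15420, decoded.submittedAt.getTime() === ts.getTime()
```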

Priority Job Queue with Sorted Sets

Sorted sets make excellent priority queues: use the priority as the score, with the enqueue timestamp as the tiebreak within each priority.

// Enqueue a job with priority (lower number = higher priority)
async function enqueue(queue: string, jobId: string, priority: number) {
  const score = priority * 1e13 + Date.now();
  await redis.zadd(queue, score, jobId);
}

// Dequeue the highest-priority job atomically using ZPOPMIN
async function dequeue(queue: string): Promise<string | null> {
  const result = await redis.zpopmin(queue, 1);
  if (result.length === 0) return null;
  return result[0]; // member (jobId)
}

// For workers that need to claim a job without removing it
// (so other workers don't steal it), use a Lua script — covered below
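One caveat before shipping this encoding: the composite score must remain an exact integer in a float64, which bounds how large a priority value is safe. A quick sanity check (illustrative, not from the original code):

```typescript
// priority * 1e13 + Date.now() must stay below Number.MAX_SAFE_INTEGER
// (2^53 - 1 ≈ 9.007e15), or the millisecond tiebreak silently loses precision.
const maxSafePriority = Math.floor(Number.MAX_SAFE_INTEGER / 1e13); // 900
const ok = 900 * 1e13 + Date.now() <= Number.MAX_SAFE_INTEGER;     // true
```

With priorities above ~900, shrink the multiplier or coarsen the timestamp (e.g., seconds instead of milliseconds).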

Rate Limiter: INCR + EXPIRE

The canonical Redis rate limiter. Simple, fast, slightly wrong in one edge case (explained below).

async function checkRateLimit(
  identifier: string,
  maxRequests: number,
  windowSeconds: number
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  const key = `rate:${identifier}:${Math.floor(Date.now() / 1000 / windowSeconds)}`;

  const pipeline = redis.pipeline();
  pipeline.incr(key);
  pipeline.expire(key, windowSeconds * 2); // 2x window to handle boundary
  const [[, count]] = await pipeline.exec() as any;

  const remaining = Math.max(0, maxRequests - count);
  const resetAt = (Math.floor(Date.now() / 1000 / windowSeconds) + 1) * windowSeconds;

  return {
    allowed: count <= maxRequests,
    remaining,
    resetAt,
  };
}

The edge case: At a window boundary, a user can make maxRequests requests just before the boundary and maxRequests again just after it — effectively double the limit within a fraction of a second. For most use cases this is fine. For strict rate limiting, use a sliding window with a sorted set:

async function slidingWindowRateLimit(
  identifier: string,
  maxRequests: number,
  windowMs: number
): Promise<boolean> {
  const key = `ratelimit:sliding:${identifier}`;
  const now = Date.now();
  const windowStart = now - windowMs;

  const pipeline = redis.pipeline();
  // Remove requests older than window
  pipeline.zremrangebyscore(key, '-inf', windowStart);
  // Count requests in window
  pipeline.zcard(key);
  // Add current request (note: rejected requests are still recorded and
  // consume window slots — usually acceptable for abuse limiting)
  pipeline.zadd(key, now, `${now}-${Math.random()}`);
  // Set TTL
  pipeline.pexpire(key, windowMs);

  const results = await pipeline.exec() as any;
  const requestCount = results[1][1];

  return requestCount < maxRequests;
}

The sorted set sliding window is correct but uses more memory — O(N) where N is requests per window. At 1000 req/min with 1M users, that's potentially 1B entries. Use the fixed window for high-scale; sliding window for sensitive endpoints like login or payment.

HyperLogLog: UV Counting Without Storing Users

Storing SET user_visited:${pageId} with one member per user ID works, but at scale the memory adds up: 1 million unique visitors × 8 bytes per ID is 8MB of raw ID data per page per day — and Redis's per-member set overhead multiplies that several times over. With thousands of pages and 90-day retention, this becomes a serious memory problem.

HyperLogLog counts distinct elements with ~0.81% error using at most 12KB, regardless of cardinality.
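Both figures fall out of the register count. A back-of-envelope check using the standard HLL formulas (not Redis-specific code):

```typescript
// Redis's dense HLL encoding uses m = 16384 six-bit registers.
const m = 16384;
const stdError = 1.04 / Math.sqrt(m); // ≈ 0.008125 → the ~0.81% figure
const denseBytes = (m * 6) / 8;       // 12288 bytes ≈ 12KB ceiling
```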

// Track a UV (unique visitor)
async function trackPageView(pageId: string, userId: string) {
  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  const key = `uv:${pageId}:${date}`;

  await redis.pfadd(key, userId);
  await redis.expire(key, 90 * 24 * 60 * 60); // 90-day retention
}

// Get UV count for a page
async function getUniqueVisitors(pageId: string, date: string): Promise<number> {
  return redis.pfcount(`uv:${pageId}:${date}`);
}

// Weekly UV: PFMERGE multiple daily HLLs into one for counting
async function getWeeklyUniqueVisitors(pageId: string, startDate: string): Promise<number> {
  const keys = Array.from({ length: 7 }, (_, i) => {
    const d = new Date(startDate);
    d.setDate(d.getDate() + i);
    return `uv:${pageId}:${d.toISOString().slice(0, 10)}`;
  });

  const mergedKey = `uv:${pageId}:weekly:${startDate}`;
  await redis.pfmerge(mergedKey, ...keys);
  await redis.expire(mergedKey, 3600); // cache the merged result

  return redis.pfcount(mergedKey);
}
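For one-off reads there is a lighter path than PFMERGE: PFCOUNT accepts multiple keys and merges them on the fly without writing a temporary key. A sketch (the helper names are mine, not from the article):

```typescript
// Build the seven daily HLL key names for a week starting at startDate.
// Uses UTC date math to avoid local-timezone drift.
function dailyKeys(pageId: string, startDate: string, days: number): string[] {
  return Array.from({ length: days }, (_, i) => {
    const d = new Date(startDate);
    d.setUTCDate(d.getUTCDate() + i);
    return `uv:${pageId}:${d.toISOString().slice(0, 10)}`;
  });
}

// Merged count without materializing a key — good for ad-hoc queries.
// `redis` is any ioredis-like client.
async function weeklyUVNoWrite(redis: any, pageId: string, startDate: string) {
  return redis.pfcount(...dailyKeys(pageId, startDate, 7));
}
```

The PFMERGE + cached key approach above is still better when the same merged result is read repeatedly, since the on-the-fly merge is recomputed per call.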

Real data: a high-traffic news site we worked with counted UVs across 50,000 articles × 365 days. With sets: ~2.5TB Redis memory. With HyperLogLog: ~220GB. The 0.81% error was acceptable for analytics dashboards; for billing they used an exact counter for paying users only.

Redis Streams: Event Sourcing and Durable Queues

Streams are the most underused Redis data structure. Unlike Pub/Sub (fire and forget), streams persist messages and support consumer groups with acknowledgment — essentially Kafka-lite built into Redis.

// Producer: append event to stream
async function publishEvent(streamKey: string, event: Record<string, string>) {
  // XADD returns the message ID (timestamp-sequence)
  const messageId = await redis.xadd(
    streamKey,
    '*',  // auto-generate ID
    ...Object.entries(event).flat()
  );
  return messageId; // e.g., "1713355200000-0"
}

// Example: publish an order event
await publishEvent('events:orders', {
  type: 'order.created',
  orderId: '8f4a2b',
  userId: '1234',
  amount: '9999',
  timestamp: Date.now().toString(),
});

// Consumer group setup (run once at startup)
async function setupConsumerGroup(stream: string, group: string) {
  try {
    // '$' means start consuming from now (new messages only)
    // '0' means start from the beginning of the stream
    await redis.xgroup('CREATE', stream, group, '$', 'MKSTREAM');
  } catch (err: any) {
    if (!err.message.includes('BUSYGROUP')) throw err; // Group already exists — fine
  }
}

// Consumer: read and process messages
async function runConsumer(stream: string, group: string, consumerId: string) {
  await setupConsumerGroup(stream, group);

  while (true) {
    // Read up to 10 messages, block for 2 seconds if empty
    const messages = await redis.xreadgroup(
      'GROUP', group, consumerId,
      'COUNT', 10,
      'BLOCK', 2000,
      'STREAMS', stream, '>'  // '>' means undelivered messages
    ) as any;

    if (!messages) continue; // timeout, loop again

    for (const [, entries] of messages) {
      for (const [messageId, fields] of entries) {
        const event = Object.fromEntries(
          Array.from({ length: fields.length / 2 }, (_, i) => [fields[i * 2], fields[i * 2 + 1]])
        );

        try {
          await processEvent(event);
          // ACK: remove from pending entries list (PEL)
          await redis.xack(stream, group, messageId);
        } catch (err) {
          console.error('Failed to process', messageId, err);
          // Don't ACK — message stays in PEL for retry or dead-letter handling
        }
      }
    }
  }
}

// Reclaim stuck messages (not ACKed in > 5 minutes by a crashed consumer)
async function reclaimStuckMessages(stream: string, group: string, consumerId: string) {
  const pending = await redis.xpending(stream, group, '-', '+', 100) as any[];

  for (const [messageId, , idleMs] of pending) {
    if (idleMs > 5 * 60 * 1000) { // 5 minutes
      await redis.xclaim(stream, group, consumerId, 5 * 60 * 1000, messageId);
    }
  }
}
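The non-ACK path above needs a terminal state, or a poison message is redelivered forever. A common pattern — a sketch where the `:dlq` stream name, threshold, and helper names are my assumptions, not part of the article's system — is to park a message after N failed deliveries, using the delivery count that XPENDING reports:

```typescript
// Park a message after too many delivery attempts instead of reclaiming it.
const MAX_DELIVERIES = 5;

function shouldDeadLetter(deliveryCount: number): boolean {
  return deliveryCount >= MAX_DELIVERIES;
}

// Extends the reclaim loop: XPENDING entries are
// [id, consumer, idleMs, deliveryCount]. `redis` is any ioredis-like client.
async function reclaimOrDeadLetter(
  redis: any, stream: string, group: string, consumerId: string
) {
  const pending = (await redis.xpending(stream, group, '-', '+', 100)) as any[];
  for (const [messageId, , idleMs, deliveryCount] of pending) {
    if (idleMs < 5 * 60 * 1000) continue;
    if (shouldDeadLetter(deliveryCount)) {
      // Copy the original fields to a dead-letter stream, then ACK so the
      // message leaves the pending entries list for good
      const [entry] = await redis.xrange(stream, messageId, messageId);
      if (entry) await redis.xadd(`${stream}:dlq`, '*', ...entry[1]);
      await redis.xack(stream, group, messageId);
    } else {
      await redis.xclaim(stream, group, consumerId, 5 * 60 * 1000, messageId);
    }
  }
}
```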

Streams vs Pub/Sub decision: Use Pub/Sub for real-time notifications where dropped messages are acceptable (presence indicators, live cursors). Use Streams for anything where message delivery guarantees matter: order events, payment notifications, audit logs.

Lua Scripting for Atomic Operations

Redis executes each individual command atomically, but a check-then-act sequence spread across multiple commands is not atomic on its own, and the optimistic WATCH/MULTI/EXEC pattern adds retry loops and gets awkward when keys span cluster slots. Lua scripts run atomically on a single node and are the correct tool for "check-then-act" patterns (in Redis Cluster, all keys a script touches must hash to the same slot).

// Atomic job claim: only one worker can claim a job at a time
// Without Lua, two workers could both read the job ID before either removes it
const claimJobScript = `
local job = redis.call('ZPOPMIN', KEYS[1], 1)
if #job == 0 then
  return nil
end
local jobId = job[1]
local score = job[2]
-- Store in processing set with expiry timestamp as score
redis.call('ZADD', KEYS[2], ARGV[1], jobId)
return jobId
`;

async function claimJob(queue: string, processingSet: string): Promise<string | null> {
  const expiresAt = Date.now() + 5 * 60 * 1000; // 5 min processing timeout
  const result = await redis.eval(claimJobScript, 2, queue, processingSet, expiresAt);
  return result as string | null;
}

// Requeue expired jobs (run periodically — the read-then-move below is not
// atomic, so run it from a single scheduler or move it into a Lua script)
async function requeueExpiredJobs(processingSet: string, queue: string) {
  const now = Date.now();
  const expired = await redis.zrangebyscore(processingSet, '-inf', now);

  for (const jobId of expired) {
    await redis.pipeline()
      .zrem(processingSet, jobId)
      .zadd(queue, now, jobId) // re-enqueue with current timestamp as score
      .exec();
  }
}

Memory Optimization: Encoding Thresholds

Redis automatically uses compact encodings for small collections. When you exceed the thresholds, it silently converts to the full data structures (hash table for hashes, skiplist for sorted sets) — using 10-20x more memory. Know the defaults:

  • Hash: hash-max-listpack-entries 128, hash-max-listpack-value 64
  • Sorted Set: zset-max-listpack-entries 128, zset-max-listpack-value 64
  • List: list-max-listpack-size -2 (8KB max per node)

If your sorted sets have < 128 members and all member strings < 64 bytes, Redis stores them as a packed memory-efficient listpack. The moment you hit 129 members, it converts to a skiplist — 10x the memory. For a leaderboard with millions of keys where each key holds only the top 100 for a sub-category, keeping entries below 128 is worth it.
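The thresholds are also tunable at runtime if your workload sits just over a default. The values below are illustrative — raising them trades CPU (listpack operations are O(N)) for memory, so benchmark before changing:

```shell
# Example: allow up to 256 entries / 96-byte members before conversion
redis-cli CONFIG SET zset-max-listpack-entries 256
redis-cli CONFIG SET zset-max-listpack-value 96
```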

Check what encoding a key is using:

redis-cli OBJECT ENCODING leaderboard:2026
# "listpack" — compact encoding, good
# "skiplist" — full skiplist encoding, expected for genuinely large sets
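To actually stay under the 128-entry threshold, trim on write. A sketch (the helper names are mine, not from the article) using ZREMRANGEBYRANK, where rank 0 is the lowest score and rank -1 the highest:

```typescript
// Stop index for "keep only the `cap` highest-scored members":
// ZREMRANGEBYRANK key 0 -(cap+1) removes everything except the top `cap`.
// If the set has fewer than `cap` members, the range is empty — a no-op.
function trimStopIndex(cap: number): number {
  return -(cap + 1);
}

// Add a member, then trim — pipelined so it's a single round trip.
// `redis` is any ioredis-like client.
async function zaddCapped(
  redis: any, key: string, score: number, member: string, cap = 100
) {
  await redis.pipeline()
    .zadd(key, score, member)
    .zremrangebyrank(key, 0, trimStopIndex(cap))
    .exec();
}
```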

Need help designing a Redis data layer that handles real production traffic — rate limiting, leaderboards, event queues, and caching that doesn't collapse under load? Aunimeda builds backend systems with Redis as a first-class infrastructure component. See our custom software development services or talk to us about your architecture.
