Kafka: Zero-Copy and Why It's Fast (2015)



Most people think of Kafka as a "message queue," but it's really a distributed commit log. Its performance doesn't come from complex in-memory structures; it comes from treating the filesystem as a high-performance buffer.

The Standard I/O Path

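Normally, to send a file over a socket with read() and write():

1. The kernel reads the data from disk into a kernel buffer (the page cache).
2. The app copies the data from the kernel buffer into a user-space buffer.
3. The app copies the data from the user-space buffer back into a kernel socket buffer.
4. The kernel sends the data from the socket buffer to the NIC.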

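That's four copies, two of which exist only to move the data through user space and back, plus four switches between user mode and kernel mode (into and out of read(), then into and out of write()).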
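To make the baseline concrete, here is a minimal sketch of the standard path in Java NIO. The class and method names (BufferedCopyDemo, copyThroughUserSpace) are made up for illustration, and a temporary file stands in for the socket so the example runs on its own.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: copies a file to a channel the "standard" way.
final class BufferedCopyDemo {

    // Every byte is pulled into a user-space buffer and then pushed back
    // into the kernel. Returns the number of bytes written to the target.
    static long copyThroughUserSpace(Path file, WritableByteChannel target) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192); // lives in user space (JVM heap)
        long total = 0;
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            while (source.read(buffer) != -1) {        // kernel buffer -> user space
                buffer.flip();
                while (buffer.hasRemaining()) {
                    total += target.write(buffer);     // user space -> kernel buffer
                }
                buffer.clear();
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("segment", ".log");
        Path out = Files.createTempFile("socket-stand-in", ".bin"); // assumption: file instead of a socket
        Files.write(in, new byte[100_000]);
        try (FileChannel target = FileChannel.open(out, StandardOpenOption.WRITE)) {
            System.out.println("bytes copied: " + copyThroughUserSpace(in, target));
        }
    }
}
```

Each pass through the loop costs at least one read and one write system call, and the payload crosses the user/kernel boundary twice even though the application never looks at it.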

The Zero-Copy Path

On Linux, Kafka uses the sendfile() system call (via Java's FileChannel.transferTo()).

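// Java NIO zero-copy (imports: java.io.IOException, java.nio.channels.*)
// Returns the bytes actually sent, which may be fewer than count, so callers must loop.
public long transferTo(FileChannel source, long position, long count,
                       WritableByteChannel target) throws IOException {
    return source.transferTo(position, count, target);
}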

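This tells the kernel: "Take these bytes from this file descriptor and shove them directly into that socket descriptor." The data stays in kernel space. It never passes through a user-space buffer, so the two copies into and out of the application disappear, and a single system call replaces the read()/write() pair, cutting the mode switches from four to two.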
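A fuller sketch of the same idea, as a self-contained program. The names ZeroCopyDemo and sendRange are hypothetical, and a temporary file again stands in for the socket. Whether the JVM actually issues sendfile() depends on the platform and on the target channel type, so treat this as an illustration of the API, not of Kafka's own code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: sends a byte range of a file without a user-space buffer.
final class ZeroCopyDemo {

    // Transfers up to count bytes starting at position. transferTo() may move
    // fewer bytes than requested, so it is called in a loop.
    static long sendRange(Path file, long position, long count,
                          WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long sent = 0;
            while (sent < count) {
                long n = source.transferTo(position + sent, count - sent, target);
                if (n <= 0) {
                    break; // end of file, or the target cannot accept more right now
                }
                sent += n;
            }
            return sent;
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("segment", ".log");
        Path out = Files.createTempFile("socket-stand-in", ".bin"); // assumption: file instead of a socket
        Files.write(in, new byte[100_000]);
        try (FileChannel target = FileChannel.open(out, StandardOpenOption.WRITE)) {
            System.out.println("bytes sent: " + sendRange(in, 0, 100_000, target));
        }
    }
}
```

Note that the application code contains no ByteBuffer at all: it only names a source, a range, and a destination, and leaves the data movement to the kernel.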

Sequential I/O

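Kafka also relies on the fact that sequential I/O on modern disks is surprisingly fast, in some cases comparable to random access in RAM. By only appending to the end of its logs and avoiding random seeks, Kafka lets the OS disk cache (page cache) do the heavy lifting: recent writes are served to consumers straight from memory without Kafka maintaining a cache of its own. This is why Kafka works better with more RAM, even if the JVM heap is small. Whatever memory the heap does not claim is available to the page cache.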
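The append-only access pattern can be sketched in a few lines. This is not Kafka's on-disk format: the class name AppendOnlyLogDemo and the 4-byte length prefix are assumptions chosen to keep the example small. The point is only that writes always land at the end of the file and reads are addressed by byte offset.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only log: [4-byte length][payload] records, addressed by byte offset.
final class AppendOnlyLogDemo implements AutoCloseable {
    private final FileChannel channel;
    private long end; // next write position; it only ever grows

    AppendOnlyLogDemo(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        end = channel.size();
    }

    // Writes one record at the end of the file and returns its starting offset.
    long append(byte[] payload) throws IOException {
        long offset = end;
        ByteBuffer record = ByteBuffer.allocate(4 + payload.length);
        record.putInt(payload.length).put(payload).flip();
        while (record.hasRemaining()) {
            end += channel.write(record, end);
        }
        return offset;
    }

    // Reads the record that starts at the given offset.
    byte[] read(long offset) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(4);
        readFully(header, offset);
        ByteBuffer payload = ByteBuffer.allocate(header.getInt(0));
        readFully(payload, offset + 4);
        return payload.array();
    }

    private void readFully(ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            if (channel.read(buf, pos + buf.position()) < 0) {
                throw new EOFException("offset past end of log");
            }
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (AppendOnlyLogDemo log = new AppendOnlyLogDemo(file)) {
            long first = log.append("hello".getBytes(StandardCharsets.UTF_8));
            long second = log.append("kafka".getBytes(StandardCharsets.UTF_8));
            System.out.println(first + " -> " + new String(log.read(first), StandardCharsets.UTF_8));
            System.out.println(second + " -> " + new String(log.read(second), StandardCharsets.UTF_8));
        }
    }
}
```

Nothing here caches data in the JVM. A read that follows shortly after an append is typically answered from the page cache, which is the behavior the section describes.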

