The “black technology” behind Kafka, the hottest technology for big data development

Kafka is an open-source stream-processing platform developed by the Apache Software Foundation. It is widely used for data buffering, asynchronous communication, log collection, and system decoupling. Compared with other common messaging systems, Kafka stands out for its high throughput and low latency while still providing most of the expected functionality. Unlike most articles about Kafka, this one does not cover usage or implementation details; instead, it looks at the “black technology” that gives Kafka such outstanding performance.


Messages are written to disk sequentially

Most disks are still mechanical (SSDs are outside the scope of this discussion). If messages are written to disk randomly, each write must first seek by cylinder, head, and sector. This addressing is a “mechanical action” and by far the most time-consuming part of a disk operation. To speed up disk reads and writes, Kafka uses sequential I/O.

Figure 1 Kafka sequential I/O

In the figure above, each partition is a file, and every message is appended to its partition, so the disk only ever sees sequential writes, which is why the approach is so efficient. It has one drawback: data cannot be deleted in place, so Kafka does not delete data but retains all of it, and each consumer keeps an offset per topic to indicate which messages it has already read.
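To make the append-only pattern concrete, here is a minimal C sketch (not Kafka’s actual storage code; the file name and record format are made up for illustration). Opening the log with O_APPEND guarantees every write lands at the current end of the file, which is exactly the sequential pattern described above:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    /* Hypothetical partition file; Kafka's real segment files differ. */
    int fd = open("partition-0.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const char *messages[] = {"msg-1\n", "msg-2\n", "msg-3\n"};
    for (int i = 0; i < 3; i++) {
        /* O_APPEND makes each write go to the current end of the file,
           so the disk sees a purely sequential write pattern. */
        if (write(fd, messages[i], strlen(messages[i])) < 0) {
            perror("write");
            close(fd);
            return 1;
        }
    }
    close(fd);
    return 0;
}
```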

Regarding the performance of sequential versus random disk reads and writes, here is a set of test figures quoted from the official Kafka documentation (RAID-5, 7200 rpm):

Sequential I/O: 600 MB/s

Random I/O: 100 KB/s

Therefore, simply by sticking to sequential I/O, Kafka gains a huge performance improvement.

Zero Copy

Consider a scenario in which a web application reads the contents of a file and sends them over the network. The core code looks like this:
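(The original code screenshot is lost; the following is a minimal C sketch of the classic read()/write() loop it described, where filefd and sockfd are hypothetical, already-opened descriptors for the file and the connected socket.)

```c
#include <unistd.h>

/* Traditional copy loop: read() pulls data disk -> kernel page cache
   (DMA copy) -> user buffer (CPU copy); write() pushes user buffer ->
   socket buffer (CPU copy) -> NIC (DMA copy): 4 copies in total. */
ssize_t copy_file_to_socket(int filefd, int sockfd) {
    char buf[4096];
    ssize_t n, total = 0;
    while ((n = read(filefd, buf, sizeof(buf))) > 0) {
        if (write(sockfd, buf, n) != n)
            return -1;  /* short write treated as an error for brevity */
        total += n;
    }
    return n < 0 ? -1 : total;
}
```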

Figure 2 Ordinary read method

Although only two calls are involved, the data undergoes four copies (two of them CPU copies), plus multiple context switches between user mode and kernel mode, all of which burden the CPU. Zero copy exists to eliminate this inefficiency.

# Mmap:

One way to reduce the number of copies is to call mmap() instead of read():
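A minimal C sketch of this mmap()-plus-write() path (again with hypothetical, already-opened filefd and sockfd, and the file length len assumed known):

```c
#include <sys/mman.h>
#include <unistd.h>

/* mmap() shares the kernel page cache with the process, removing the
   kernel -> user CPU copy; write() still costs one CPU copy into the
   socket buffer. */
int mmap_file_to_socket(int filefd, int sockfd, size_t len) {
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, filefd, 0);
    if (p == MAP_FAILED)
        return -1;
    ssize_t sent = write(sockfd, p, len);  /* kernel buffer -> socket buffer */
    munmap(p, len);
    return sent == (ssize_t)len ? 0 : -1;
}
```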


The application calls mmap(); the data on disk is copied into a kernel buffer via DMA, and the operating system then shares that kernel buffer with the application, so there is no need to copy the contents of the kernel buffer into user space. The application then calls write(), the operating system copies the contents of the kernel buffer into the socket buffer, and finally the data is sent to the network card.

Figure 3 mmap method

Using mmap saves one CPU copy, but it comes with pitfalls. If your program has a file mapped and another process truncates that file, the write system call will be terminated by a SIGBUS signal for accessing an illegal address. This is usually handled by installing a signal handler for SIGBUS or by using file leases, which will not be elaborated here.

# Sendfile:

Starting with kernel version 2.1, Linux introduced the sendfile() system call to simplify this operation:
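A minimal C sketch of the sendfile(2) call on Linux (same hypothetical descriptors as before):

```c
#include <sys/types.h>
#include <sys/sendfile.h>

/* One system call, no user-space buffer: the kernel moves the data
   from the page cache to the socket entirely on its own. */
ssize_t sendfile_to_socket(int filefd, int sockfd, size_t len) {
    off_t offset = 0;
    ssize_t sent, total = 0;
    while ((size_t)total < len) {
        sent = sendfile(sockfd, filefd, &offset, len - total);
        if (sent <= 0)
            return -1;
        total += sent;
    }
    return total;
}
```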

Figure 4 sendfile method

With sendfile(), the DMA engine copies the file contents into a kernel read buffer (DMA copy), the kernel then copies that data into the socket buffer (CPU copy), and finally the data goes to the network card (DMA copy). Using sendfile not only reduces the number of data copies but also reduces context switches: the data transfer happens entirely in kernel space.

Up to this point, sendfile still requires at least one CPU copy, so can that step be eliminated as well? To remove all data copying performed by the kernel’s CPU, a network interface that supports gather operations is required. In kernel version 2.4, the socket buffer descriptor was also modified to satisfy this zero-copy requirement. This approach not only reduces multiple context switches but also eliminates the CPU copy completely.

Figure 5 sendfile method (DMA gather)

In this mode, the sendfile system call still uses the DMA engine to copy the file contents into the kernel buffer, but it then appends only a descriptor carrying the file’s location and length to the socket buffer; no data is copied into the socket buffer itself. Using gather, the DMA engine reads directly from the kernel buffer and transfers the data to the protocol engine, avoiding the last CPU copy. Note that the user-space call is unchanged; the optimization lives entirely in the kernel and the NIC.

Zero-copy technology is very common. Java’s FileChannel.transferTo() and transferFrom() methods are zero-copy APIs, and on Linux transferTo() is typically backed by sendfile().
