What is Kafka? Architecture Deep Dive
A deep dive into Apache Kafka architecture and its inner workings
Hello guys, in the last few years, Apache Kafka has emerged as a robust and scalable messaging platform for event-driven architectures and microservices.
With its ability to handle high volumes of data in real-time, Kafka has become a popular choice for building data pipelines, stream processing applications, and event-driven architectures.
Kafka is also quite important from an interview point of view, which is why I shared the differences between Apache Kafka and RabbitMQ in my last article. In this article, we will deep dive into the Kafka architecture and the inner workings of Apache Kafka, exploring how it stores data, manages partitions and transactions, and maintains data integrity.
What is Apache Kafka?
If you have worked on distributed systems, then you must have heard about Kafka, an open-source distributed event streaming platform originally developed at LinkedIn and later open-sourced to the Apache Software Foundation.
It is designed to handle high-throughput, real-time data feeds, and it is often used for building real-time data pipelines and streaming applications. In fact, LinkedIn was one of the first companies to use Kafka at scale, to implement its high-volume messaging platform.
Here are some key concepts associated with Apache Kafka:
Topics: Data is organized and stored in Kafka topics, which are similar to a feed or a category for messages.
Producers: Producers are responsible for publishing data to Kafka topics.
Consumers: Consumers subscribe to Kafka topics and process the data published to those topics.
Brokers: Kafka runs as a cluster of servers, called brokers, which store the data and distribute topic partitions across the cluster.
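To make these concepts concrete, here is a minimal, self-contained Java sketch that models a topic with partitions, a producer that routes records by key, and a consumer that reads from a partition at an offset. This is an illustrative in-memory toy, not the real Kafka client API (which lives in the kafka-clients library); the key-to-partition routing only mimics Kafka's default behavior of hashing the record key, and uses Java's hashCode where real Kafka uses a murmur2 hash.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniKafka {
    // A record: the key decides the partition, the value is the payload
    record Record(String key, String value) {}

    // A "topic" is just a fixed set of partitions, each an append-only log
    static class Topic {
        final List<List<Record>> partitions = new ArrayList<>();

        Topic(int numPartitions) {
            for (int i = 0; i < numPartitions; i++) {
                partitions.add(new ArrayList<>());
            }
        }

        // Producer side: hash the key to pick a partition, then append.
        // (Real Kafka uses murmur2 on the serialized key; hashCode is a stand-in.)
        int send(Record r) {
            int p = Math.floorMod(r.key().hashCode(), partitions.size());
            partitions.get(p).add(r);
            return p;
        }

        // Consumer side: read all records in one partition from a given offset
        List<Record> poll(int partition, int offset) {
            List<Record> log = partitions.get(partition);
            return log.subList(offset, log.size());
        }
    }

    public static void main(String[] args) {
        Topic orders = new Topic(3);
        int p1 = orders.send(new Record("user-42", "order created"));
        int p2 = orders.send(new Record("user-42", "order shipped"));
        // Records with the same key land on the same partition,
        // so Kafka preserves ordering per key
        System.out.println(p1 == p2);                  // true
        System.out.println(orders.poll(p1, 0).size()); // 2
    }
}
```

The important idea this toy illustrates is that ordering in Kafka is guaranteed only within a partition: because both records share the key "user-42", they are appended to the same log and a consumer will always see "order created" before "order shipped".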