Introduction
VIDIZMO offers multiple broker service options to manage real-time data feeds and enable smooth communication between microservices. Kafka is one of these broker services. Apache Kafka is a distributed event streaming platform designed for handling real-time data feeds, and it is commonly used as a message broker.
VIDIZMO, with its complex event processing needs, benefits from the seamless integration and communication offered by Event-Driven Architecture (EDA). Incorporating Apache Kafka, a distributed event streaming platform, enhances data processing and communication within VIDIZMO's ecosystem. By utilizing Kafka as a broker service, VIDIZMO makes it the central component for inter-communication among its various services. This approach supports real-time, event-driven data streaming and ensures reliable communication between different system components.
To learn more about VIDIZMO architecture, refer to "Design and Architecture Overview."
Kafka in VIDIZMO: A Broker's Role
VIDIZMO leverages Kafka's message brokering capabilities to enable its event-driven architecture. The application employs a publish/subscribe model over Kafka for seamless communication between its microservices, with Kafka acting as the central message-routing component. When an API endpoint generates an event, it is transmitted to Kafka, where one or more subscribers pick up and process the incoming messages. Kafka serves as the core mechanism for message exchange, handling service requests, responses, and exception notifications between client and server.
Requests are structured as messages within the Kafka ecosystem, using its API for communication. Kafka also manages error handling in response to reported exceptions, ensuring robust reliability. Supporting Kafka as the messaging system gives VIDIZMO flexibility in handling real-time data feeds and facilitates effective communication among its microservices.
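The publish/subscribe flow described above can be sketched in miniature with plain Python. This is a toy illustration of the pattern only, not the actual VIDIZMO implementation; the topic name and payload are hypothetical, and in production a real Kafka client would take the broker's place.

```python
from collections import defaultdict

class MiniBroker:
    """Toy in-memory broker illustrating the publish/subscribe pattern."""
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber of the topic receives the published event.
        for callback in self.subscribers[topic]:
            callback(message)

broker = MiniBroker()
received = []
broker.subscribe("media.encoded", received.append)   # a service subscribes to a topic
broker.publish("media.encoded", {"mediaId": 42, "status": "done"})  # an API endpoint publishes
```

In the real deployment, the publisher would be a Kafka producer client and each subscribing microservice would run a Kafka consumer.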
Message Delivery
- Producers and Topics
The VIDIZMO web application acts as a Producer, responsible for publishing events and generating messages, such as processing tasks. These messages are categorized into logical channels known as topics. Each topic is divided into multiple partitions, which allows for horizontal scaling and fault tolerance. This partitioning mechanism ensures that messages are distributed evenly across the system, allowing for efficient processing and optimal resource utilization.
Messages without a key are sent to partitions in a round-robin fashion, which balances the load across partitions. Kafka ensures reliability through the Partition Leader concept: each partition has a single leader that receives messages, while replicas stay synchronized with it. This ensures data is preserved even if a broker fails.
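The round-robin distribution can be simulated in a few lines of plain Python. The partition count and message names are arbitrary; this only demonstrates how keyless messages spread evenly across partitions.

```python
from itertools import cycle

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}
next_partition = cycle(range(NUM_PARTITIONS))

# Keyless messages are assigned to partitions in turn, round-robin style.
for i in range(9):
    partitions[next(next_partition)].append(f"msg-{i}")

# Each partition ends up with an even share of the load.
sizes = [len(msgs) for msgs in partitions.values()]
```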
- Acknowledgment
In Kafka, acknowledgments (acks) confirm that messages have been successfully delivered. VIDIZMO uses the acknowledgment setting acks=all, which is the most reliable configuration: the partition leader waits until all in-sync replicas have received the message before returning the acknowledgment. This approach guarantees the highest level of data reliability and consistency.
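A producer's reliability settings might look like the following sketch. The option names follow the kafka-python client's conventions and the broker address is a placeholder; this is an illustrative configuration, not VIDIZMO's actual settings.

```python
# Illustrative producer reliability settings (kafka-python-style option names;
# the broker address "kafka:9092" is a hypothetical placeholder).
producer_config = {
    "bootstrap_servers": "kafka:9092",
    "acks": "all",   # leader waits for all in-sync replicas before acknowledging
    "retries": 5,    # retry transient send failures instead of dropping the message
}
```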
- Batching
The producer gathers messages into batches before sending them. This improves throughput at the cost of a small added latency.
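Batching can be sketched with a toy buffer that flushes once it is full. This mirrors the idea behind the producer's batch accumulation only; real Kafka producers also flush on a time threshold (linger), which is omitted here for brevity.

```python
class Batcher:
    """Toy producer-side batcher: buffer messages, flush when the batch is full."""
    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
        self.sent_batches = []   # stands in for batches handed to the network layer

    def send(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sent_batches.append(self.buffer)
            self.buffer = []

b = Batcher(batch_size=4)
for i in range(10):
    b.send(i)
b.flush()  # flush the final partial batch
```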
- Message ordering
In the VIDIZMO Kafka broker service, message ordering within a single partition is maintained by assigning each message a distinct offset as it is appended to that partition. Each consumer group functions as an independent subscriber: when consumers belong to different consumer groups, every group receives all messages on the topic.
However, if two consumers belong to the same consumer group and subscribe to a topic with multiple partitions in the VIDIZMO Kafka broker service, Kafka ensures that each consumer reads from a unique set of partitions. This allocation enables concurrent processing of messages, optimizing system performance. Moreover, within a consumer group, no two consumers read the same message, so each message is processed only once per group.
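The two ideas above, sequential offsets within a partition and independent consumer groups, can be shown with a toy append-only log in plain Python. The message names are hypothetical.

```python
# A partition is an append-only log; each message gets the next sequential offset.
partition_log = []

def append(msg):
    offset = len(partition_log)
    partition_log.append((offset, msg))
    return offset

for m in ["upload", "encode", "publish"]:
    append(m)

# Two independent consumer groups each track their own position in the log,
# so each group reads every message, in offset order.
group_a = [msg for _, msg in partition_log]
group_b = [msg for _, msg in partition_log]
```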
- Consumer Groups and Subscription
Consumer groups play a crucial role in ensuring efficient and reliable data processing when using Kafka as a broker service. The architecture ensures that the system components, acting as Consumers, subscribe to relevant topics and promptly receive messages for further processing. This subscription model facilitates real-time data processing and keeps system components updated with the latest events.
Consumers in VIDIZMO are organized into consumer groups. Their main purpose is to prevent the same event from being processed more than once when multiple instances of a service run in a distributed deployment. Within a consumer group, the partitions of each topic are divided among the consumers, so no partition is assigned to more than one consumer in the same group, allowing for balanced and efficient processing.
- Partition Assignment and Rebalancing
As new consumers join or existing consumers leave a consumer group, the partitions are dynamically reassigned to maintain balanced processing. This process, known as group rebalancing, ensures that each consumer is assigned a unique set of topic partitions, optimizing resource utilization and processing efficiency.
Consumers in VIDIZMO periodically send Fetch Requests to the Kafka broker to retrieve data. These requests are performed in parallel, allowing consumers to accumulate and process data efficiently. This mechanism ensures that consumers can handle large volumes of data and maintain real-time processing capabilities.
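A simplified sketch of assignment and rebalancing: when the group membership changes, all partitions are redistributed across the remaining consumers. Kafka's actual assignors (range, round-robin, sticky) are more sophisticated; this only illustrates that each partition always lands on exactly one group member.

```python
def rebalance(partitions, consumers):
    """Toy round-robin assignment: spread all partitions across current group members."""
    return {
        c: [p for i, p in enumerate(partitions) if i % len(consumers) == consumers.index(c)]
        for c in consumers
    }

# With two consumers, the four partitions are split between them.
before = rebalance([0, 1, 2, 3], ["consumer-a", "consumer-b"])

# "consumer-b" leaves the group; a rebalance hands all partitions to "consumer-a".
after = rebalance([0, 1, 2, 3], ["consumer-a"])
```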
- Offset Management
When a consumer reads a message from a partition, it commits the offset of that message. If a consumer fails, its partition is reassigned to another member, which then continues reading from the last committed offset.
By default, the consumer auto-commits offsets. With auto-commit enabled, messages are delivered with at-least-once semantics: no messages are lost, although duplicates may occur.
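At-least-once behavior can be demonstrated with a toy consumer that processes a message before committing its offset. If it crashes between the two steps, the message is reprocessed on restart, a duplicate, but never lost. The crash point and message names are contrived for illustration.

```python
# At-least-once consumption: process the message first, commit the offset after.
log = ["msg-0", "msg-1", "msg-2"]
processed = []
committed_offset = 0

def consume(crash_before_commit_at=None):
    global committed_offset
    for offset in range(committed_offset, len(log)):
        processed.append(log[offset])       # process first...
        if offset == crash_before_commit_at:
            return                          # simulated crash before the commit
        committed_offset = offset + 1       # ...then commit the offset

consume(crash_before_commit_at=1)  # crash after processing msg-1, before committing it
consume()                          # restart: resumes from the last committed offset
```

After the restart, msg-1 appears twice in `processed` while msg-2 is still delivered, showing the duplication-without-loss trade-off.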
Benefits of Kafka as a Broker Service
- Real-Time Processing: Enables efficient workflows and immediate reactions to events.
- High Throughput: Kafka handles high-velocity, high-volume data efficiently, supporting thousands of messages per second.
- Flexibility: Easily integrates new components and workflows into the architecture.
- Fault Tolerance: Kafka handles broker failures by electing a new leader from the in-sync replicas, minimizing downtime and data loss.
- Cost-Efficiency: Provides a scalable and efficient infrastructure solution.
Shortcomings of Kafka
- Complex Deployment: Setting up and maintaining Kafka can be challenging.
- Dependency on ZooKeeper: Kafka has traditionally relied on ZooKeeper for cluster coordination (newer Kafka versions can run without it in KRaft mode).
Considerations
When configuring Apache Kafka as a broker service in VIDIZMO, there are several considerations to ensure optimal performance, reliability, and scalability. Here are some key aspects to keep in mind:
- Ensure that the version of Apache Kafka you are using is compatible with the version of VIDIZMO.
- Configure storage settings, including the location and size of Kafka logs. Ensure that there is sufficient disk space to handle the expected message volume.
- Set up monitoring tools to track the performance and health of the Kafka cluster.
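For the storage consideration above, the relevant broker settings live in the broker configuration file. The values below are illustrative placeholders to be tuned per deployment, not VIDIZMO's recommended values.

```properties
# Illustrative Kafka broker storage settings (values are placeholders)
log.dirs=/var/lib/kafka/data        # directory where partition logs are stored
log.retention.hours=168             # keep messages for 7 days before deletion
log.segment.bytes=1073741824        # roll log segment files at 1 GiB
```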
Troubleshooting
Here are some troubleshooting tips for issues you may encounter when deploying Kafka.
- Ensure ZooKeeper is running and accessible to Kafka brokers. Verify connection details in Kafka configurations.
- Leverage Kafka monitoring tools to check broker health, message throughput, and resource utilization to identify bottlenecks or anomalies.
To implement Kafka in VIDIZMO as a broker service, refer to our article "Configuring Kafka as a Broker Service in VIDIZMO."