Apache Kafka for large scale message ingestion
November 07, 2019

Score 10 out of 10
Vetted Review
Verified User
Overall Satisfaction with Apache Kafka
Apache Kafka is our stream and message ingestion engine for all customer-facing apps and for some internal streams company-wide. It ingests roughly 2-5 million small messages (a few bytes each) per second. Those messages drive real-time internal analytics and decision making, and they feed our analytics backend (TIBCO Spotfire).
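To put those rates in perspective, here is a rough back-of-envelope sketch. The 10-byte payload size is an assumption chosen only for illustration (the review says "a few bytes"), and the figures cover raw payload only, not Kafka's per-record and per-batch overhead or replication traffic.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption for illustration: "a few bytes" is taken as 10 bytes of payload.
ASSUMED_PAYLOAD_BYTES = 10


def daily_volume(messages_per_second, payload_bytes):
    """Return (messages per day, raw payload throughput in MB/s)."""
    messages_per_day = messages_per_second * SECONDS_PER_DAY
    payload_mb_per_second = messages_per_second * payload_bytes / 1_000_000
    return messages_per_day, payload_mb_per_second


# The low and high ends of the ingestion range quoted in the review.
for rate in (2_000_000, 5_000_000):
    per_day, mb_per_s = daily_volume(rate, ASSUMED_PAYLOAD_BYTES)
    print(f"{rate:,} msg/s -> {per_day:,} msg/day, {mb_per_s:.0f} MB/s payload")
```

Under these assumptions the raw payload bandwidth is modest; the challenge at this scale is the sheer number of individual writes.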
Pros
- Apache Kafka can handle a large volume of I/O (writes) on 3-4 inexpensive servers.
- It scales very well over large workloads and can handle extreme-scale deployments (e.g., LinkedIn, with 300 billion user events each day).
- The same Kafka setup can serve as a messaging bus, storage system, or log aggregator, so a single system feeds multiple applications and is easy to maintain.
Cons
- Apache Kafka takes some initial setup and deployment time, especially if you haven't bought support from Confluent.
- It is not a full solution, so for an analytics use case you will still need something like TIBCO.
- It does not have a SQL-based query engine out of the box, so building or using analytics on top can be a lot of work. It would be great to have this built into Kafka.
Return on Investment
- Positive impact on ROI, since we can now run one large Apache Kafka deployment that serves multiple scenarios (storage system, log aggregation, message queue).
- It is open source, so there are no license or subscription fees, which reduces the cost of deployment.
- Data can now be ingested and analyzed in real time, making it easy to fine-tune the customer experience and decision making for internal IT.
Confluent Cloud is still based on Apache Kafka, but it has a subscription fee, so from a long-term perspective it is wiser to deploy your own Kafka instance that spans public and private cloud. Amazon Kinesis and Google Cloud Pub/Sub do not do well with a very large number of messages and do not provide the same ordering guarantees as Apache Kafka or Confluent Cloud. Apache Kafka does better in scaling and availability than IBM MQ and RabbitMQ.
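For context on the ordering point: Kafka preserves order within a partition, not across a whole topic, and records that share a key land in the same partition. The sketch below simulates that idea in plain Python with no broker involved. The partition count, the event data, and the helper names are hypothetical, and CRC32 is only a stand-in for the hash used by Kafka's default partitioner.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for this sketch


def pick_partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key) % num_partitions


def produce(events):
    """Append each (key, value) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[pick_partition(key.encode())].append((key, value))
    return partitions


def values_for(key, partitions):
    """Read back one key's values in the order its partition stored them."""
    partition = partitions[pick_partition(key.encode())]
    return [value for k, value in partition if k == key]


# Interleaved events from two hypothetical users.
events = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]
partitions = produce(events)

print(values_for("user-a", partitions))
print(values_for("user-b", partitions))
```

Each key's events come back in the order they were produced, because a given key always maps to the same append-only partition. Ordering between different keys is not guaranteed once they sit in different partitions.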
Do you think Apache Kafka delivers good value for the price?
Yes
Are you happy with Apache Kafka's feature set?
Yes
Did Apache Kafka live up to sales and marketing promises?
Yes
Did implementation of Apache Kafka go as expected?
Yes
Would you buy Apache Kafka again?
Yes