Apache Kafka is the most powerful and scalable streaming framework on the market. We have used Apache Kafka as a part of many real-time …
Kafka is an event streaming platform and this is exactly the purpose we use it for in our company. Application data-in-transit goes into …
We use Kafka as the queuing mechanism for records in an indexing pipeline. Previous to using Kafka we were working with tables in SQL …
My application depended on other applications to generate data, and that data needed to be processed immediately. And, processed …
Kafka is being used for our IoT data flows as the middle layer to transport data and make it available for consumption. We are …
It is being used for the product mainly. We have huge data pipelines running which depend on Apache Kafka. It is being used for more than …
Kafka is being used for sending log information in real time, so we can monitor apps and send these events to feed other apps. …
Apache Kafka is used by our company as the "next generation" of messaging/data-streaming pipeline solutions, to replace our old legacy …
We used it for event logging and application log collection. It was used for exception tracking and with core microservices of …
Apache Kafka is used as a stream/message ingestion engine for all the customer-facing apps including some internal streams company-wide. …
We use Kafka for two key features: (1) keeping a buffer of all the incoming records that need to be stored in our data infrastructure, and …
We are using Kafka as an ingress and egress queue for data being saved into a big data system. Kafka is also being used as a queue for …
Apache Kafka is becoming the new standard for messaging at our organization. Originally we limited the use to big data environments and …
Entry-level set up fee?
- No setup fee
- Free Trial
- Free/Freemium Version
- Premium Consulting / Integration Services
Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
- Real time streaming
- Management tools
- The pub/sub model
- Quick data transfer - regardless of volume (if you have enough resources)
- Ability to transfer large amounts of data consistently (non-binary)
- The Kafka Tool is a community-made Java application that looks and feels like it's from the past century.
- Logging can be confusing. This certainly shows when we have to do troubleshooting.
- Hybrid scenarios: pub/sub where some services run inside a Kubernetes cluster and some outside. There are roughly three options, but only two (the harder ones) are production-safe.
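Several reviews above lean on the pub/sub model: every consumer group sees the full stream, while repeated reads by the same group do not re-deliver. A minimal in-memory sketch of that semantic (this is a conceptual toy, not the real Kafka client API; `MiniBroker` and its methods are invented for illustration):

```python
from collections import defaultdict

class MiniBroker:
    """Toy broker illustrating Kafka-style pub/sub: each consumer
    group gets every message once; offsets are tracked per group."""

    def __init__(self):
        self.topics = defaultdict(list)   # topic -> append-only log
        self.offsets = defaultdict(int)   # (topic, group) -> next offset to read

    def publish(self, topic, message):
        self.topics[topic].append(message)

    def poll(self, topic, group):
        """Return all messages this group has not yet consumed."""
        log = self.topics[topic]
        start = self.offsets[(topic, group)]
        self.offsets[(topic, group)] = len(log)
        return log[start:]

broker = MiniBroker()
broker.publish("logs", "event-1")
broker.publish("logs", "event-2")

# Two independent groups each see the full stream (fan-out) ...
print(broker.poll("logs", "monitoring"))  # ['event-1', 'event-2']
print(broker.poll("logs", "archiver"))    # ['event-1', 'event-2']
# ... but a repeated poll by the same group delivers nothing new.
print(broker.poll("logs", "monitoring"))  # []
```

Real Kafka adds durability, partitioning, and committed offsets on top of this basic fan-out model.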
- Queuing of records
- Easy expansion of topic partitions
- An abundance of options for managing and maintaining queues
- Easy expansion of cluster for growth
- A management interface would be nice
- Built in logging tools
- Every setting is configurable.
- Works seamlessly under high data load.
- Partition mechanism.
- Easily configurable.
- Zookeeper configuration.
- Front-end can be developed to configure properties.
- UI for administrative configuration.
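The partition mechanism praised above works by routing each keyed record to a fixed partition. A simplified sketch of the idea (Kafka's default partitioner actually hashes key bytes with murmur2; here MD5 is used purely as a stable, illustrative hash, and `choose_partition` is an invented name):

```python
import hashlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition: hash the key, mod partition count."""
    digest = hashlib.md5(key).digest()
    h = int.from_bytes(digest[:4], "big")
    return h % num_partitions

# Records with the same key always land on the same partition,
# which is what preserves per-key ordering.
assert choose_partition(b"user-42", 6) == choose_partition(b"user-42", 6)
assert 0 <= choose_partition(b"user-42", 6) < 6
```

This is also why adding partitions to an existing topic can break per-key ordering: the modulus changes, so keys may map to different partitions afterward.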
- Message queue
- Capture data
- Make data available
- Integration between systems
- More out of the box connectors for various other system integration
- Data Pipeline
- Asynchronous processing
- Data retention for reprocessing
- Dashboards to monitor the performance
- ZooKeeper free
- Connectors for more languages
- Open source
- Performance security
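The "data retention for reprocessing" point above is controlled by broker- and topic-level settings. A hedged sketch of the relevant configuration (the values shown are illustrative, not recommendations):

```properties
# server.properties -- broker-wide defaults
log.retention.hours=168      # keep log segments for 7 days
log.retention.bytes=-1       # no size-based limit

# Per-topic override (set via kafka-configs.sh or at topic creation):
# retention.ms=604800000     # 7 days, in milliseconds
```

Because consumed messages are not deleted until retention expires, a consumer group can rewind its offsets and reprocess the same data.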
- Undoubtedly, Kafka's high throughput and low latency are the highlights.
- Kafka can scale horizontally very well.
- The CLI and configuration details need to be worked out in more depth. The configuration naming convention is not good and causes a lot of confusion. Sometimes there are too many configuration parameters to tune, which requires the adopter to understand a lot of tricks (like NFS entrapment, for example).
- Lack of a good monitoring solution so far
- It handles large amounts of data simultaneously and makes applications scalable.
- It is able to handle real-time data pipelines.
- Resistant to node failure within the cluster.
- Does not have a complete set of monitoring tools.
- It does not support wildcard topic selection.
- The broker and consumer pattern can reduce performance.
- Apache Kafka is able to handle a large number of I/Os (writes) using 3-4 cheap servers.
- It scales very well over large workloads and can handle extreme-scale deployments (e.g., LinkedIn with 300 billion user events per day).
- The same Kafka setup can be used as a messaging bus, storage system or a log aggregator making it easy to maintain as one system feeding multiple applications.
- Apache Kafka does take some initial setup and deployment time especially if you haven't bought support from Confluent.
- It is not a full solution so for an analytics use case, you will still need something like Tibco.
- It does not have a SQL based query engine out-of-the-box so building/using analytics on top can be a lot of work. It would be great to have something already baked into Kafka out-of-the-box.
- Really easy to configure. I've used other message brokers such as RabbitMQ and compared to them, Kafka's configurations are very easy to understand and tweak.
- Very scalable: easily configured to run on multiple nodes allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered)
- Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many).
- Doesn't work well with many small topics (on the order of thousands). There is a physical limit due to file handler usage on the number of topics Kafka can have before it grinds to a halt. This is not an issue for most people but it became an issue for us, as we need to have many, many topics and so we weren't able to fully migrate to Kafka except for a few of our big queues.
- Lack of tenant isolation: if a partition on one node starts to lag on consume or publish, then all the partitions on that node will start to lag. That's what we've noticed and it's really frustrating to our customers that another customer's bad data affects them as well.
- I don't have too much experience here, but I hear from other engineers on my team that the CLI admin tool is a real pain to use. For example, they say the arguments have no clear naming convention, so they are hard to memorize, and sometimes you have to pass in undocumented properties.
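The scalability point above (running consumers on multiple nodes for parallelism) comes from dividing a topic's partitions among the members of a consumer group. A simplified sketch of that division, mimicking Kafka's "range" assignment strategy (the real client also offers round-robin, sticky, and cooperative assignors; `range_assign` is an invented name for illustration):

```python
def range_assign(partitions: int, consumers: list) -> dict:
    """Split partition IDs 0..partitions-1 contiguously across sorted members."""
    members = sorted(consumers)
    per = partitions // len(members)
    extra = partitions % len(members)
    assignment, start = {}, 0
    for i, member in enumerate(members):
        count = per + (1 if i < extra else 0)   # first `extra` members get one more
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment

print(range_assign(6, ["c1", "c2", "c3"]))
# {'c1': [0, 1], 'c2': [2, 3], 'c3': [4, 5]}
```

This also shows why parallelism is capped by the partition count: with more consumers than partitions, some members of the group receive no partitions and sit idle.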
- Fast queuing
- Easy to set up and configure
- Easy to add and remove queues
- User interface for configuration could be a little better
- Could be a little more defined when configuring files
- Logging is a little hard to follow
- High volume/performance throughput environments
- Low latency projects
- Multiple consumers for the same data, reprocessing, long-lasting information
- Still a bit immature; some clients have required recoding in the last few versions
- New features come very fast; several upgrades a year may be required
- Not many commercial companies provide support