Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
N/A
Jenkins
Score 8.3 out of 10
N/A
Jenkins is an open source automation server. Jenkins provides hundreds of plugins to support building, deploying and automating any project. As an extensible automation server, Jenkins can be used as a simple CI server or turned into a continuous delivery hub for any project.
Apache Kafka is well-suited for most data-streaming use cases. Amazon Kinesis and Azure EventHubs, unless you have a specific use case where using those cloud PaAS for your data lakes, once set up well, Apache Kafka will take care of everything else in the background. Azure EventHubs, is good for cross-cloud use cases, and Amazon Kinesis - I have no real-world experience. But I believe it is the same.
Jenkins is a highly customizable CI/CD tool with excellent community support. One can use Jenkins to build and deploy monolith services to microservices with ease. It can handle multiple "builds" per agent simultaneously, but the process can be resource hungry, and you need some impressive specs server for that. With Jenkins, you can automate almost any task. Also, as it is an open source, we can save a load of money by not spending on enterprise CI/CD tools.
Really easy to configure. I've used other message brokers such as RabbitMQ and compared to them, Kafka's configurations are very easy to understand and tweak.
Very scalable: easily configured to run on multiple nodes allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered)
Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many).
Automated Builds: Jenkins is configured to monitor the version control system for new pull requests. Once a pull request is created, Jenkins automatically triggers a build process. It checks out the code, compiles it, and performs any necessary build steps specified in the configuration.
Unit Testing: Jenkins runs the suite of unit tests defined for the project. These tests verify the functionality of individual components and catch any regressions or errors. If any unit tests fail, Jenkins marks the build as unsuccessful, and the developer is notified to fix the issues.
Code Analysis: Jenkins integrates with code analysis tools like SonarQube or Checkstyle. It analyzes the code for quality, adherence to coding standards, and potential bugs or vulnerabilities. The results are reported back to the developer and the product review team for further inspection.
Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great.
Simplify the process for local deployment of Kafka and provide a user interface to get visibility into the different topics and the messages being processed.
Learning curve around creation of broker and topics could be simplified
We have a certain buy-in as we have made a lot of integrations and useful tools around jenkins, so it would cost us quite some time to change to another tool. Besides that, it is very versatile, and once you have things set up, it feels unnecessary to change tool. It is also a plus that it is open source.
Apache Kafka is highly recommended to develop loosely coupled, real-time processing applications. Also, Apache Kafka provides property based configuration. Producer, Consumer and broker contain their own separate property file
Jenkins streamlines development and provides end to end automated integration and deployment. It even supports Docker and Kubernetes using which container instances can be managed effectively. It is easy to add documentation and apply role based access to files and services using Jenkins giving full control to the users. Any deviation can be easily tracked using the audit logs.
No, when we integrated this with GitHub, it becomes more easy and smart to manage and control our workforce. Our distributed workforce is now streamlined to a single bucket. All of our codes and production outputs are now automatically synced with all the workers. There are many cases when our in-house team makes changes in the release, our remote workers make another release with other environment variables. So it is better to get all of the work in control.
Support for Apache Kafka (if willing to pay) is available from Confluent that includes the same time that created Kafka at Linkedin so they know this software in and out. Moreover, Apache Kafka is well known and best practices documents and deployment scenarios are easily available for download. For example, from eBay, Linkedin, Uber, and NYTimes.
As with all open source solutions, the support can be minimal and the information that you can find online can at times be misleading. Support may be one of the only real downsides to the overall software package. The user community can be helpful and is needed as the product is not the most user-friendly thing we have used.
It is worth well the time to setup Jenkins in a docker container. It is also well worth to take the time to move any "Jenkins configuration" into Jenkinsfiles and not take shortcuts.
I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer in the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer way fewer functionalities and respond to other needs.
Overall, Jenkins is the easiest platform for someone who has no experience to come in and use effectively. We can get a junior engineer into Jenkins, give them access, and point them in the right direction with minimal hand-holding. The competing products I have used (TravisCI/GitLab/Azure) provide other options but can obfuscate the process due to the lack of straightforward simplicity. In other areas (capability, power, customization), Jenkins keeps up with the competition and, in some areas, like customization, exceeds others.
Positive: Get a quick and reliable pub/sub model implemented - data across components flows easily.
Positive: it's scalable so we can develop small and scale for real-world scenarios
Negative: it's easy to get into a confusing situation if you are not experienced yet or something strange has happened (rare, but it does). Troubleshooting such situations can take time and effort.