Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
N/A
GitHub
Score 9.1 out of 10
N/A
GitHub is a platform that hosts public and private code and provides software development and collaboration tools. Features include version control, issue tracking, code review, team management, syntax highlighting, etc. Personal plans ($0-50), Organizational plans ($0-200), and Enterprise plans are available.
$4
per month per user
Kubernetes
Score 9.0 out of 10
N/A
Kubernetes is an open-source container cluster manager.
Apache Kafka is built for scale. From high throughput and real-time data streaming, it has a strong advantage over RabbitMQ with its low latency. This put Apache Kafka at the forefront as the platform of choice for large datasets messaging and ensuring scalability when data …
Verified User
Consultant
Chose Apache Kafka
I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer in the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer way fewer functionalities and respond …
We really needed to get away from using a SQL database to act as a queue for processing records, so a new solution was needed. Kafka is a leading software application initially designed for queuing messages which is essentially what we were looking for. It has a great user …
GitHub comes handy in terms of usage and capabilities, it is easy to use and quite a user friendly tools when it comes to user experience, with limited UI/UX and it has vast exposure when it comes to third party integration and being quite mature and yet evolving and popular …
I have used TortoiseSVN before. With SVN we don't have local source control if not connected to the repository. Git is decentralized and helps us in working remotely and committing to a repository when not local.This also adds the complexity of which commands work locally and …
We already had an enterprise Kubernetes 8 set up, so once we got our namespace it took me about 2 weeks to go from not knowing anything to having a self-contained jar in a container, running on Kubernetes 8. In comparison, it took me two weeks to install Java on a blank server …
Apache Kafka is well-suited for most data-streaming use cases. Amazon Kinesis and Azure EventHubs, unless you have a specific use case where using those cloud PaAS for your data lakes, once set up well, Apache Kafka will take care of everything else in the background. Azure EventHubs, is good for cross-cloud use cases, and Amazon Kinesis - I have no real-world experience. But I believe it is the same.
GitHub is an easy to go tool when it comes to Version Controlling, CI/CD workflows, Integration with third party softwares. It's effective for any level of CI/CD implementation you would like to. Also the the cost of product is also very competitive and affordable. As of now GitHub lacks capabilities when it comes to detailed project management in comparison to tools like Jira, but overall its value for money.
K8s should be avoided - If your application works well without being converted into microservices-based architecture & fits correctly in a VM, needs less scaling, have a fixed traffic pattern then it is better to keep away from Kubernetes. Otherwise, the operational challenges & technical expertise will add a lot to the OPEX. Also, if you're the one who thinks that containers consume fewer resources as compared to VMs then this is not true. As soon as you convert your application to a microservice-based architecture, a lot of components will add up, shooting your resource consumption even higher than VMs so, please beware. Kubernetes is a good choice - When the application needs quick scaling, is already in microservice-based architecture, has no fixed traffic pattern, most of the employees already have desired skills.
Really easy to configure. I've used other message brokers such as RabbitMQ and compared to them, Kafka's configurations are very easy to understand and tweak.
Very scalable: easily configured to run on multiple nodes allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered)
Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many).
Version control: GitHub provides a powerful and flexible Git-based version control system that allows teams to track changes to their code over time, collaborate on code with others, and maintain a history of their work.
Code review: GitHub's pull request system enables teams to review code changes, discuss suggestions and merge changes in a central location. This makes it easier to catch bugs and ensure that code quality remains high.
Collaboration: GitHub provides a variety of collaboration tools to help teams work together effectively, including issue tracking, project management, and wikis.
Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great.
Simplify the process for local deployment of Kafka and provide a user interface to get visibility into the different topics and the messages being processed.
Learning curve around creation of broker and topics could be simplified
Not an easy tool for beginners. Prior command-line experience is expected to get started with GitHub efficiently.
Unlike other source control platforms GitHub is a little confusing. With no proper GUI tool its hard to understand the source code version/history.
Working with larger files can be tricky. For file sizes above 100MB, GitHub expects the developer to use different commands (lfs).
While using the web version of GitHub, it has some restrictions on the number of files that can be uploaded at once. Recommended action is to use the command-line utility to add and push files into the repository.
Local development, Kubernetes does tend to be a bit complicated and unnecessary in environments where all development is done locally.
The need for add-ons, Helm is almost required when running Kubernetes. This brings a whole new tool to manage and learn before a developer can really start to use Kubernetes effectively.
Finicy configmap schemes. Kubernetes configmaps often have environment breaking hangups. The fail safes surrounding configmaps are sadly lacking.
GitHub's ease of use and continued investment into the Developer Experience have made it the de facto tool for our engineers to manage software changes. With new features that continue to come out, we have been able to consolidate several other SaaS solutions and reduce the number of tools required for each engineer to perform their job responsibilities.
The Kubernetes is going to be highly likely renewed as the technologies that will be placed on top of it are long term as of planning. There shouldn't be any last minute changes in the adoption and I do not anticipate sudden change of the core underlying technology. It is just that the slow process of technology adoption that makes it hard to switch to something else.
Apache Kafka is highly recommended to develop loosely coupled, real-time processing applications. Also, Apache Kafka provides property based configuration. Producer, Consumer and broker contain their own separate property file
GitHub is a clean and modern interface. The underlying integrations make it smooth to couple tasks, projects, pull requests and other business functions together. The insights and reporting is really strong and is getting better with every release. GitHub's PR tooling is strong for being web based, i do believe a better code editor would rival having to pull merge conflicts into local IDE.
It is an eminently usable platform. However, its popularity is overshadowed by its complexity. To properly leverage the capabilities and possibilities of Kubernetes as a platform, you need to have excellent understanding of your use case, even better understanding of whether you even need Kubernetes, and if yes - be ready to invest in good engineering support for the platform itself
Support for Apache Kafka (if willing to pay) is available from Confluent that includes the same time that created Kafka at Linkedin so they know this software in and out. Moreover, Apache Kafka is well known and best practices documents and deployment scenarios are easily available for download. For example, from eBay, Linkedin, Uber, and NYTimes.
There are a ton of resources and tutorials for GitHub online. The sheer number of people who use GitHub ensures that someone has the exact answer you are looking for. The docs on GitHub itself are very thorough as well. You will often find an official doc along with the hundreds of independent tutorials that answers your question, which is unusual for most online services.
I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer in the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer way fewer functionalities and respond to other needs.
While I don't have very much experience with these 2 solutions, they're two of the most popular alternatives to GitHub. Bitbucket is from Atlassian, which may make sense for a team that is already using other Atlassian tools like Jira, Confluence, and Trello, as their integration will likely be much tighter. Gitlab on the other hand has a reputation as a very capable GitHub replacement with some features that are not available on GitHub like firewall tools.
Most of the required features for any orchestration tool or framework, which is provided by Kubernetes. After understanding all modules and features of the K8S, it is the best fit for us as compared with others out there.
Positive: Get a quick and reliable pub/sub model implemented - data across components flows easily.
Positive: it's scalable so we can develop small and scale for real-world scenarios
Negative: it's easy to get into a confusing situation if you are not experienced yet or something strange has happened (rare, but it does). Troubleshooting such situations can take time and effort.
Team collaboration significantly improved as everything is clearly logged and maintained.
Maintaining a good overview of items will be delivered wrt the roadmap for example.
Knowledge management and tracking. Over time a lot of tickets, issues and comments are logged. GitHub is a great asset to go back and review why x was y.