Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
SAS Data Management
Score 8.0 out of 10
A suite of solutions for data connectivity, enhanced transformations and robust governance. Solutions provide a unified view of data with access to data across databases, data warehouses and data lakes. Connects with cloud platforms, on-premises systems and multicloud data sources.
Pricing

Item | Apache Kafka | SAS Data Management
Editions & Modules | No answers on this topic | No answers on this topic
Free Trial | No | No
Free/Freemium Version | No | No
Premium Consulting/Integration Services | No | No
Entry-level Setup Fee | No setup fee | No setup fee
Additional Details | - | -
Features

Data Source Connection | Apache Kafka | SAS Data Management
Overall | - (no ratings) | 8.3 (10 Ratings), 1% below category average
Connect to traditional data sources | 0 Ratings | 8.6 (10 Ratings)
Connect to Big Data and NoSQL | 0 Ratings | 8.1 (9 Ratings)

Data Transformations | Apache Kafka | SAS Data Management
Overall | - (no ratings) | 6.7 (8 Ratings), 20% below category average
Simple transformations | 0 Ratings | 6.1 (8 Ratings)
Complex transformations | 0 Ratings | 7.4 (8 Ratings)

Data Modeling | Apache Kafka | SAS Data Management
Overall | - (no ratings) | 6.7 (8 Ratings), 17% below category average
Data model creation | 0 Ratings | 5.5 (6 Ratings)
Metadata management | 0 Ratings | 7.4 (7 Ratings)
Business rules and workflow | 0 Ratings | 6.6 (7 Ratings)
Collaboration | 0 Ratings | 7.0 (7 Ratings)
Testing and debugging | 0 Ratings | 6.1 (7 Ratings)
Data Governance
Apache Kafka is well suited to most data-streaming use cases. Unless you have a specific use case that calls for a cloud PaaS such as Amazon Kinesis or Azure Event Hubs for your data lakes, Apache Kafka, once set up well, will take care of everything else in the background. Azure Event Hubs is good for cross-cloud use cases. I have no real-world experience with Amazon Kinesis, but I believe it is the same.
It is well suited when data sits in a system and needs a complex transformation to be usable by an average user, or when data resides in systems that have very different connection speeds. After passing through SAS Data Integration Studio, the data can be integrated and used together, which removes timing issues from the users' worries. It is perhaps less appropriate to have users who are not familiar with the source data set up the load processes.
Really easy to configure. I've used other message brokers such as RabbitMQ and compared to them, Kafka's configurations are very easy to understand and tweak.
Very scalable: easily configured to run on multiple nodes allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered)
Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many).
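The ordering caveat mentioned above (parallelism comes easily as long as topics do not have to be consumed in exact delivery order) can be illustrated with a toy model. This is a minimal sketch, not Kafka code: `assign_partition` and `route` are hypothetical helpers, and the CRC32 hash is only a stand-in, since Kafka's default partitioner for keyed records uses a different hash (murmur2). The point it shows is that messages sharing a key land in the same partition and keep their relative order there, while no order is guaranteed across partitions.

```python
from collections import defaultdict
from zlib import crc32


def assign_partition(key: str, num_partitions: int) -> int:
    # Hypothetical stand-in for a key-hash partitioner (assumption:
    # CRC32 is used here only for illustration, not what Kafka uses).
    return crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    # Append each (key, value) message to the partition chosen by its key.
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[assign_partition(key, num_partitions)].append((key, value))
    return partitions


NUM_PARTITIONS = 3
messages = [
    ("user-a", 1), ("user-b", 1), ("user-a", 2),
    ("user-b", 2), ("user-a", 3),
]
partitions = route(messages, NUM_PARTITIONS)


def values_for(key: str):
    # Read back one key's values from the single partition it maps to.
    partition = partitions[assign_partition(key, NUM_PARTITIONS)]
    return [value for k, value in partition if k == key]


print(values_for("user-a"))
print(values_for("user-b"))
```

Each key's values come back in the order they were sent, because a key never spans more than one partition. Separate consumers reading different partitions can therefore run in parallel without breaking per-key order.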
SAS/Access is great for manipulating large and complex databases.
SAS/Access makes it easy to format reports and graphics from your data.
Data management and data storage using the Hadoop environment in SAS/Access allow for rapid analysis, with a simple programming language for all your data needs.
Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great.
Simplify the process for local deployment of Kafka and provide a user interface to get visibility into the different topics and the messages being processed.
The learning curve around creating brokers and topics could be simplified.
Apache Kafka is highly recommended for developing loosely coupled, real-time processing applications. Apache Kafka also provides property-based configuration: the producer, consumer, and broker each have their own separate properties file.
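To make the property-based configuration concrete, here is a minimal sketch of three separate property sets, one each for a producer, a consumer, and a broker. The keys shown (`bootstrap.servers`, `acks`, `group.id`, `auto.offset.reset`, `log.dirs`, `num.partitions`) are real Kafka configuration names, but the values, the host `localhost:9092`, the group name, and the path are assumptions chosen for illustration. `parse_properties` is a hypothetical helper that handles only simple `key=value` lines and `#` comments, not the full Java properties format.

```python
def parse_properties(text: str) -> dict:
    # Parse simple key=value lines; skip blank lines and # comments.
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Illustrative values only (assumptions, not recommended settings).
PRODUCER_PROPERTIES = """
# producer settings
bootstrap.servers=localhost:9092
acks=all
"""

CONSUMER_PROPERTIES = """
# consumer settings
bootstrap.servers=localhost:9092
group.id=example-group
auto.offset.reset=earliest
"""

BROKER_PROPERTIES = """
# broker settings
log.dirs=/tmp/kafka-logs
num.partitions=3
"""

producer = parse_properties(PRODUCER_PROPERTIES)
consumer = parse_properties(CONSUMER_PROPERTIES)
broker = parse_properties(BROKER_PROPERTIES)

print(producer)
print(consumer)
print(broker)
```

Keeping the three roles in separate files means a producer's delivery settings can be tuned without touching consumer group or broker storage settings.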
The main negative point is the use of a non-standard language for customizations, as well as the poor integration with non-SAS systems. However, there is no doubt that it is a high-performance and powerful product capable of responding optimally to certain requirements.
Support for Apache Kafka (if you are willing to pay) is available from Confluent, which includes the same team that created Kafka at LinkedIn, so they know this software inside and out. Moreover, Apache Kafka is well known, and best-practice documents and deployment scenarios are easily available for download, for example from eBay, LinkedIn, Uber, and NYTimes.
With SAS, you pay a license fee annually to use this product. Support is incredible. You get what you pay for, whether it's SAS forums on the SAS support site, technical support tickets via email or phone calls, or example documentation. It's not open source. It's documented thoroughly, and it works.
I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer in the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer way fewer functionalities and respond to other needs.
We chose it because of the ease of using SAS DI and its data processing speed. There were many issues with AWS Redshift in the cloud environment when making connections to the data sources, and fetching the data required writing complex queries.
Positive: Get a quick and reliable pub/sub model implemented - data across components flows easily.
Positive: it's scalable, so we can develop small and scale up for real-world scenarios.
Negative: it's easy to get into a confusing situation if you are not experienced yet or if something strange has happened (rare, but it does happen). Troubleshooting such situations can take time and effort.