Apache Flume vs. Cloudera Distribution Hadoop (CDH)

Overview
Apache Flume
Rating: Score 7.1 out of 10
Most Used By: N/A
Product Summary: Apache Flume is a product enabling the flow of logs and other data into a Hadoop environment.
Starting Price: N/A

Cloudera Distribution Hadoop (CDH)
Rating: Score 3.9 out of 10
Most Used By: N/A
Product Summary: CDH is Cloudera’s 100% open source platform distribution, including Apache Hadoop and built specifically to meet enterprise demands. CDH delivers everything needed for enterprise use right out of the box. By integrating Hadoop with more than a dozen other critical open source projects, Cloudera has created a functionally advanced system that helps you perform end-to-end Big Data workflows.
Starting Price: N/A

Pricing
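Editions & Modules
Apache Flume: No answers on this topic
Cloudera Distribution Hadoop (CDH): No answers on this topic

Pricing Offerings
Free Trial
Apache Flume: No
Cloudera Distribution Hadoop (CDH): No
Free/Freemium Version
Apache Flume: No
Cloudera Distribution Hadoop (CDH): No
Premium Consulting/Integration Services
Apache Flume: No
Cloudera Distribution Hadoop (CDH): No
Entry-level Setup Fee
Apache Flume: No setup fee
Cloudera Distribution Hadoop (CDH): No setup fee
Additional Details
Apache Flume: -
Cloudera Distribution Hadoop (CDH): -
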
Best Alternatives
Small Businesses
Apache Flume: No answers on this topic
Cloudera Distribution Hadoop (CDH): No answers on this topic
Medium-sized Companies
Apache Flume: Cloudera Manager (Score 9.9 out of 10)
Cloudera Distribution Hadoop (CDH): Cloudera Manager (Score 9.9 out of 10)
Enterprises
Apache Flume: IBM Analytics Engine (Score 8.0 out of 10)
Cloudera Distribution Hadoop (CDH): IBM Analytics Engine (Score 8.0 out of 10)

User Ratings
Likelihood to Recommend
Apache Flume: 8.0 (2 ratings)
Cloudera Distribution Hadoop (CDH): 7.0 (1 rating)
Support Rating
Apache Flume: 5.0 (1 rating)
Cloudera Distribution Hadoop (CDH): - (0 ratings)

User Testimonials
Likelihood to Recommend
Apache
Apache Flume is well suited when the use case is limited to log data ingestion and aggregation, for example for configuration management compliance. It is not well suited where you need a general-purpose real-time data ingestion pipeline that can receive log data and other forms of data streams (e.g., IoT, messages).
Cloudera
Cloudera Distribution Hadoop (CDH) does a lot of things really well, especially on the analytical front. That being said, the product is quite expensive. There seem to be numerous applications that do the same thing at the functional level and are much more cost-efficient for enterprise teams. If I were recommending this to a colleague, I would let them know the product will absolutely be able to get the job done for their use case, but there are more efficient options.
Pros
Apache
  • Multiple data sources and destinations (sinks) that allow you to move data from and to any relevant data storage
  • It is very easy to set up and run
  • Very open to customization: you can create filters, enrichment, new sources, and new destinations
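
To make the source, filter, enrichment, and sink vocabulary above concrete, here is a minimal conceptual sketch in Python. It is not Flume's API (Flume itself is configured through properties files and extended in Java); the event shape, the function names, and the "web-01" host value are all hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, List, Optional

# Hypothetical event shape: a "body" key plus any enrichment keys.
Event = Dict[str, str]
# An interceptor returns the (possibly modified) event, or None to drop it.
Interceptor = Callable[[Event], Optional[Event]]


def drop_debug(event: Event) -> Optional[Event]:
    """Filter: discard lines that start with DEBUG."""
    return None if event["body"].startswith("DEBUG") else event


def add_host(event: Event) -> Optional[Event]:
    """Enrichment: tag the event with a made-up host name."""
    return {**event, "host": "web-01"}


def run_pipeline(lines: Iterable[str], interceptors: List[Interceptor]) -> List[Event]:
    channel: Deque[Event] = deque()  # buffer between the source and the sink

    # Source side: wrap each raw line as an event and apply interceptors in order.
    for line in lines:
        event: Optional[Event] = {"body": line}
        for intercept in interceptors:
            event = intercept(event)
            if event is None:  # a filter dropped the event
                break
        if event is not None:
            channel.append(event)

    # Sink side: drain the channel into the destination (a plain list here).
    delivered: List[Event] = []
    while channel:
        delivered.append(channel.popleft())
    return delivered


if __name__ == "__main__":
    logs = ["INFO service started", "DEBUG cache miss", "ERROR disk full"]
    for delivered_event in run_pipeline(logs, [drop_debug, add_host]):
        print(delivered_event)
```

In this sketch, adding a new filter or enrichment step only means writing another small function and appending it to the interceptor list, which is roughly the kind of extensibility the reviewer describes.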
Cloudera
  • Solid and robust set of integrations
  • Easy to use and easy to deploy across the enterprise
  • Reliability - never lost any info
  • Simple and clean interface
Cons
Apache
  • It is very specific to log data ingestion, so it is pretty hard to use for anything other than log data
  • Data replication is not built in and needs to be added on top of Apache Flume (not a hard job to do though)
Cloudera
  • The price is quite high compared with competitors
  • Hard to learn more robust functions and custom options without experience
Support Rating
Apache
Apache Flume is open-source, so support is limited. Nevertheless, it has great documentation and best-practice documents from its end users, so it is not hard to use, set up, and configure.
Cloudera
No answers on this topic
Alternatives Considered
Apache
Apache Flume is a very good solution when your project is not very complex in terms of transformation and enrichment, and it is a good fit if you have an external management suite like Cloudera, Hortonworks, etc. But it is not a real EAI or ETL tool like Ab Initio or Attunity, so you need to know exactly what you want. On the other hand, being an open-source project gives Apache Flume a lot of room for customization thanks to its pluggable architecture, and it performs very well with a very low CPU and memory footprint: a single server can do the job on many occasions, as opposed to the multi-server architecture of paid products.
Cloudera
In terms of functionality there's not much difference; both get the job done. Amazon was more cost-efficient for our team, but this could vary depending on the size of the business. One thing I did notice was that Cloudera seemed to manage and spit out our deployments faster than AWS.
Return on Investment
Apache
  • Flume has greatly simplified many of our ingest procedures; it is easier to deploy and integrate than a classical EAI, which reduces the time to market
  • But, as opposed to EAIs, if the project starts to grow in complexity, Apache Flume may not be as suitable
Cloudera
  • Saves time by automating typically manual processes (data management, lifecycle AI, etc.)
  • Quick deployments and analytics allow for faster time-to-value