Apache Flume vs. Apache HBase

Apache Flume

9 Reviews and Ratings

Apache HBase

32 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Flume	Score 7.1 out of 10	N/A	Apache Flume is a product enabling the flow of logs and other data into a Hadoop environment.	N/A
HBase	Score 7.3 out of 10	N/A	The Apache HBase project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable.	N/A

Pricing

	Apache Flume	Apache HBase
Editions & Modules	No answers on this topic	No answers on this topic

Offerings

Pricing Offerings	Apache Flume	Apache HBase
Free Trial	No	No
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	—	—

Community Pulse
Considered Both Products

Apache Flume

Verified User, Anonymous (chose Apache Flume; incentivized review): Apache Flume is on par with Scribe with similar functions. Apache Kafka is general-purpose, while Apache Flume is specific to log aggregation. Google Pub/Sub and IBM MQ are costlier than Apache Flume (open source) and have a lot more cost associated with them. Apama …

Juan Francisco Tavira, Global Technology Centre - Middleware (chose Apache Flume; incentivized review): Apache Flume is a very good solution when your project is not very complex in transformation and enrichment, and good if you have an external management suite like Cloudera, Hortonworks, etc. But it is not a real EAI or ETL tool like Ab Initio or Attunity, so you need to know exactly …

Apache HBase

RAVI MISHRA, Data Engineer (chose HBase; incentivized review): HBase is more secure and easily scalable. HBase is a wide-column store, while MongoDB is a document store. Triggers are available in HBase, while in MongoDB they are not.

Bharadwaj (Brad) Chivukula, Director Of Engineering/Head of Reliability Engineering (chose HBase; incentivized review): HBase is more robust and scalable than other DBs around.

Anson Abraham, Data Lord (chose HBase; incentivized review): Cassandra is great for writes. But with large datasets it is, depending on the case, not as great as HBase. Cassandra does support Parquet now. HBase still has performance issues. Cassandra has use cases as a time-series store; HBase fails miserably there. Geospatial data, HBase does work …

Vinaybabu Raghunandha Naidu, Software Engineer - Big Data Platform (chose HBase; incentivized review): Compared NoSQL databases with traditional databases for faster retrieval and consistency. MongoDB, as a NoSQL database, supports dynamic fields; however, its query performance is bad for aggregations and it adds maintenance. When compared with MySQL and Teradata, it could not scale up as …

Timothy Spann, Senior Solutions Engineer (chose HBase; incentivized review): HBase is what you should use if you want a production-ready, scalable, JSON-friendly, key-value, NoSQL, enterprise storage option. It excels over MongoDB due to integration with the extensive Hadoop stack and all the tools, frameworks and benefits there. HBase has superior …

Verified User, Anonymous (chose HBase; incentivized review): Typically, Cassandra is faster on reads and HBase is faster on writes. You use Cassandra when you want to use a website, HBase is just an overall good general use database engine. Cassandra has its own storage engine and HBase uses HDFS and all its benefits. MongoDB is …

Zack Riesland, Data Engineer / Software Engineer (Big Data - Cloud) (chose HBase; incentivized review): HBase is less robust but faster.

Rekha Joshi, Staff Software Engineer (chose HBase; incentivized review): These days I use Apache Cassandra more for even more scalability, good performance under different kinds of workloads, and for providing highly available systems. Apache Cassandra also has connectors for Hadoop, Spark, and Solr.

Best Alternatives
	Apache Flume	Apache HBase
Small Businesses	No answers on this topic	No answers on this topic
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Azure Cosmos DB Score 9.0 out of 10
Enterprises	IBM Analytics Engine Score 7.1 out of 10	Azure Cosmos DB Score 9.0 out of 10

User Ratings
	Apache Flume	Apache HBase
Likelihood to Recommend	8.0 (0 ratings)	7.7 (0 ratings)
Likelihood to Renew	- (0 ratings)	7.9 (0 ratings)
Support Rating	5.0 (0 ratings)	- (0 ratings)

User Testimonials
	Apache Flume	Apache HBase
Likelihood to Recommend	Apache Flume is well suited to small-batch and near-real-time processing projects that move data from one point to another with local processing (that is, no external enrichment). Filtering, transforming, and pushing to multiple destinations are common ground for Flume. It is less pleasant to use if your data needs external enrichment (pulling data from external databases or web services), because transactions and (micro)batches may lead to reprocessing, and Flume relies on the application to avoid duplicates. (Incentivized review by Juan Francisco Tavira, Global Technology Centre - Middleware)	HBase is well suited for streaming ingest, fast lookups, massive datasets, data warehouse lookup tables, RDBMS replacement, MongoDB replacement, key-value store, data scans, logs, JSON storage and some binary storage. My preferred use case is for storing data points like time series or data produced by sensors. I often use HBase when I need data available immediately and I am not looking for transactions. This is a great store for really wide tables with tons of columns. It is also great if you are not sure what type of data you are going to have. It really excels at sparse data. (Incentivized review by Timothy Spann, Senior Solutions Engineer)
Pros	Multiple data sources and destinations (sinks) that allow you to move data from and to any relevant data storage; it is very easy to set up and run; very open to personalization: you can create filters, enrichment, new sources and destinations. (Incentivized review by Juan Francisco Tavira, Global Technology Centre - Middleware)	Scalable and truly non-relational data; HBase operations run in real time on its database rather than as MapReduce jobs; scales linearly to support billions of rows with millions of columns. (Incentivized review by Bharadwaj (Brad) Chivukula, Director Of Engineering/Head of Reliability Engineering)
Cons	It is very specific to log data ingestion, so it is pretty hard to use for anything else besides log data; data replication is not built in and needs to be added on top of Apache Flume (not a hard job to do, though). (Incentivized review by Verified User, Anonymous)	Write performance; Parquet file format: supported, but performance-wise still not there; API/library availability for Spark, rather than creating a new library for it. (Incentivized review by Anson Abraham, Data Lord)
Likelihood to Renew	No answers on this topic	There's really not anything else out there that I've seen comparable for my use cases. HBase has never proven me wrong. Some companies align their whole business on HBase and are moving all of their infrastructure from other database engines to HBase. It's also open source and has a very collaborative community. (Incentivized review by Verified User, Anonymous)
Support Rating	Apache Flume is open source, so support is limited. Nevertheless, it has great documentation and best-practices documents from its end users, so it is not hard to use, set up, and configure. (Incentivized review by Verified User, Anonymous)	No answers on this topic
Alternatives Considered	Apache Flume is on par with Scribe with similar functions. Apache Kafka is general-purpose, while Apache Flume is specific to log aggregation. Google Pub/Sub and IBM MQ are costlier than Apache Flume (open source) and have a lot more cost associated with them. Apama Streaming Analytics and TIBCO Streaming are more comprehensive streaming solutions than Apache Flume so for deeper performance guarantees, it is easier to use Apache Flume. (Incentivized review by Verified User, Anonymous)	Compared NoSQL databases with traditional databases for faster retrieval and consistency. MongoDB, as a NoSQL database, supports dynamic fields; however, its query performance is bad for aggregations and it adds maintenance. MySQL and Teradata could not scale up as fast as HBase and involved added cost. HBase scales easily to a huge volume of records, has faster lookups, and provides consistency. (Incentivized review by Vinaybabu Raghunandha Naidu, Software Engineer - Big Data Platform)
Return on Investment	Positive impact on ROI due to a reduction in manual labor to generate and maintain compliance reports based on logs. Positive impact on the business objective by reducing the need to provision compute for the log aggregation IT stack in advance, instead adding it on an as-needed basis. (Incentivized review by Verified User, Anonymous)	Positive: open source, easy to use, good for storing big data. Negative: SQL functionalities are not available; more memory utilization; more troubleshooting. (Incentivized review by RAVI MISHRA, Data Engineer)