Item: Apache HBase
Rating: 8
Author: Anson Abraham

Overall Satisfaction with HBase

Use Cases and Deployment Scope

HBase is part of the company's main revenue-generating platform. We use it to store data that is processed with MapReduce to generate location information for our advertising business and location analytics. Storage-wise, it made sense to use HBase over Cassandra, and it also gave better read performance for Avro data containing geospatial information.
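For readers wondering what keying geospatial records in a store like HBase can look like, here is a minimal sketch. It is an illustration only, not the setup described in this review: the use of geohash prefixes, the `|` separator, and the names `geohash_encode` and `row_key` are all assumptions. The idea is that HBase sorts rows by key, so a key that starts with a geohash keeps nearby locations close together for range scans.

```python
# Hypothetical sketch: geohash-prefixed row keys for location records.
# Nothing here comes from the reviewer's deployment; names and key layout are assumed.

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a latitude/longitude pair as a geohash string of `precision` characters."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude/longitude out of range")
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def row_key(lat: float, lon: float, record_id: str, precision: int = 8) -> bytes:
    """Build a byte row key of the assumed form b'<geohash>|<record_id>'."""
    return f"{geohash_encode(lat, lon, precision)}|{record_id}".encode("utf-8")
```

For example, `row_key(57.64911, 10.40744, "evt-1", precision=3)` gives `b"u4p|evt-1"`. A longer geohash prefix narrows the area each key prefix covers, at the cost of more distinct prefixes to scan when querying a larger region.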

Pros and Cons

Pros

Excellent for read performance
Great for storing data in Avro format
Easy integration with MapReduce
Replication ability

Cons

Write performance
Performance with the Parquet file format: it is supported, but performance-wise it is still not there
API / library availability for Spark, rather than having to create a new library for it

Return on Investment

The negative ROI has been on hardware usage. Under frequent use, we have had constant disk failures, which require HDD replacements.
Even with disk failures, HA is available, but only to a certain extent.
Large datasets helped mitigate causality issues.

Alternatives Considered

Cassandra

Cassandra is great for writes, but with large datasets, depending on the case, it is not as good as HBase. Cassandra does support Parquet now, while HBase still has performance issues with it. Cassandra has use cases as a time-series store; HBase fails miserably at that. For geospatial data, HBase does work to an extent. HA is almost the same between the two.

Other Software Used

Cassandra, Apache Solr, Elasticsearch, Apache Spark, PostgreSQL, MariaDB, Amazon DynamoDB, Azure Cosmos DB

Likelihood to Renew

HBase is open source, so we will be using it in any case. If it were made into a commercial product, there is a strong possibility we would stop using HBase and use something else at that point, most likely Cassandra. HBase does scale if set up correctly, and it will perform if used correctly. Would recommend it.

Likelihood to Recommend

It depends on the use case. HBase works really well if your schema doesn't need relational features. If you want to run transactional workloads, it is not a good idea. It is also not a good fit for relational analytics or for edge network data. If you're working with petabytes of data, HBase is best suited for that as well.

HBase Feature Ratings

Comments

HBASE!!!