Apache HBase: Through the Looking Glass!
Updated November 24, 2015
Overall Satisfaction with HBase
Apache HBase was used for data mastering solutions: creating master data sets and reconciling conflicting data arriving in Apache Hadoop systems.
- Apache HBase is a widely used, Java-based distributed NoSQL database that runs on Apache Hadoop.
- While interest and effort in in-memory computing have been growing, organizations across domains have invested in Apache Hadoop (or vendor distributions of Hadoop), so that remains a large market.
- I worked with HBase on applications that needed to provide strong consistency and interact with Apache Hadoop.
- You could encounter issues such as "region is not online" errors, NotServingRegionException, region servers going down, and out-of-memory errors.
- Because HBase depends on ZooKeeper, care needs to be taken that ZooKeeper is set up correctly. Most issues come down to environment setup, configuration, shared load on the system, or maintenance.
- When evaluated against other NoSQL stores, performance across workloads was not best in class. Most of the time this is acceptable, but it can be improved.
- If you use Apache HBase and want to upgrade it to get certain features, you may need to check compatibility between your Apache Hadoop and Apache HBase versions. There are dependencies to think about.
- In HBase's master-slave architecture, the master becomes a single point of failure, which may not be a preferred design. It is not a highly available system.
- When I last checked, it did not have well-tested, easy integrations with Spark; having them would help.
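Transient errors like the "region is not online" case listed above are commonly handled on the client side by retrying with a growing delay. The following is only a minimal sketch of that general pattern, not HBase's own mechanism (the HBase client has its own retry settings). The names `RegionUnavailableError` and `call_with_backoff` are hypothetical and introduced here purely for illustration.

```python
import time


class RegionUnavailableError(Exception):
    """Hypothetical stand-in for a transient 'region is not online' error."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RegionUnavailableError:
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap a single read or write in a function and pass it as `operation`. Errors that are not transient (for example a bad configuration) should not be retried this way, which is why only the one exception type is caught.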
If you are already in an Apache Hadoop environment and your application needs to interact heavily with the Hadoop system with strong consistency, Apache HBase is a natural fit.
- Faster data insights
- Better customer service