The Apache HBase project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable.
N/A
MongoDB
Score 8.9 out of 10
N/A
MongoDB is an open source document-oriented database system. It is part of the NoSQL family of database systems. Instead of storing data in tables as is done in a "classical" relational database, MongoDB stores structured data as JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.
$0.10
per million reads
Oracle Data Integrator (ODI)
Score 7.6 out of 10
N/A
Oracle Data Integrator is an ELT data integrator designed for interoperability with other Oracle programs. The program focuses on high-performance capacity to support Big Data use within Oracle.
N/A
Pricing
Apache HBase
MongoDB
Oracle Data Integrator (ODI)
Editions & Modules
No answers on this topic
Shared
$0
per month
Serverless
$0.10
per million reads
Dedicated
$57
per month
No answers on this topic
Offerings
Pricing Offerings | HBase | MongoDB | Oracle Data Integrator (ODI)
Free Trial | No | Yes | No
Free/Freemium Version | No | Yes | No
Premium Consulting/Integration Services | No | No | No
Entry-level Setup Fee | No setup fee | No setup fee | No setup fee
Additional Details | - | Fully managed, global cloud database on AWS, Azure, and GCP | -
Considered Multiple Products
HBase
Verified User
Engineer
Chose Apache HBase
Typically, Cassandra is faster on reads and HBase is faster on writes. You use Cassandra when you want to use a website, HBase is just an overall good general use database engine. Cassandra has its own storage engine and HBase uses HDFS and all its benefits. MongoDB is …
HBase is what you should use if you want a production ready scalable, JSON friendly, key-value, NoSQL, enterprise storage option. It excels over MongoDB due to integration with the extensive Hadoop stack and all the tools, frameworks and benefits there.
Compared NoSQL databases with traditional databases for faster retrieval and consistency. As a NoSQL database, MongoDB supports dynamic fields; however, query performance is poor for aggregations and it adds maintenance overhead. When compared with MySQL and Teradata, it could not scale up as …
These days I use Apache Cassandra more for even greater scalability, good performance under different kinds of workloads, and for providing highly available systems. Apache Cassandra also has connectors for Hadoop, Spark, and Solr.
I use Cassandra more often these days for best in class performance, tunable consistency, linear scalability. In similar cases, I have used Apache HBase. But if there is a need for document store, MongoDB is the top choice.
I have used the Pentaho Data Integrator ETL tools in different projects with the SQL Server Integration Services product from the Microsoft product family. Oracle Data Integrator ETL product is efficient in projects where Oracle databases are heavily used. The end-user …
HBase is well suited for large organizations with millions of operations performed on tables, real-time lookup of records in a table, range queries, random reads and writes, and online analytics operations. HBase cannot replace traditional databases, as it does not support all of their features and is CPU- and memory-intensive. We observed increased latency when using it with MapReduce job joins.
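The real-time lookups and range queries mentioned above rely on HBase keeping rows sorted by row key. The following is only a toy sketch in plain Python, not HBase client code: the `SortedTable` class and the `user#date` key format are hypothetical, chosen to illustrate why a scan over a key range is cheap when keys are stored in sorted order.

```python
import bisect


class SortedTable:
    """Toy in-memory table kept sorted by row key (not the HBase client API)."""

    def __init__(self):
        self._keys = []   # row keys in lexicographic order
        self._rows = {}   # row key -> value

    def put(self, row_key, value):
        # Insert the key at its sorted position the first time it is seen.
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = value

    def get(self, row_key):
        # Point lookup by exact row key.
        return self._rows.get(row_key)

    def scan(self, start_key, stop_key):
        """Return rows with start_key <= key < stop_key, in key order."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


table = SortedTable()
table.put("user42#2024-01-03", "login")
table.put("user07#2024-01-01", "login")
table.put("user42#2024-01-01", "signup")
table.put("user42#2024-02-01", "purchase")

# All January events for user42, found without touching other users' rows.
january = table.scan("user42#2024-01", "user42#2024-02")
```

Putting the entity first and the timestamp second in the key is one common design choice; it keeps all rows for one entity adjacent, so a range scan reads a single contiguous slice.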
If asked by a colleague I would highly recommend MongoDB. MongoDB provides incredible flexibility and is quick and easy to set up. It also provides extensive documentation, which is very useful for someone new to the tool; I've used it for years and still reference the docs often. From my experience and the use cases I've worked on, I'd suggest using it anywhere that needs fast, efficient storage for non-relational data. If a relational database is needed then another tool would be more apt.
Oracle Data Integrator is well suited in all the situations where you need to integrate data from and to different systems/technologies/environments or to schedule some tasks. I've used it on Oracle Database (Data Warehouses or Data Marts), with great loading and transforming performances to accomplish any kind of relational task. This is true for all Oracle applications (like Hyperion Planning, Hyperion Essbase, Hyperion Financial Management, and so on). I've also used it to manage files on different operating systems, to execute procedures in various languages and to read and write data from and to non-Oracle technologies, and I can confirm that its performances have always been very good. It can become less appropriate depending on the expenses that can be afforded by the customer since its license costs are quite high.
Because queries are written in a JSON-like language, you can build query logic directly from the same service, which helps query response time.
You can install a local database environment, which non-relational real-time databases such as Firebase do not allow; the local environment is paramount since you can work without relying on the internet.
Forming collections in MongoDB is relatively simple. You do not need to know a query language to work with it, since it has a simple graphical environment that lets those who are not experts in console management administer databases.
Oracle Data Integrator nearly addresses every data issue that one can expect. Oracle Data Integrator is tightly integrated to the Oracle Suite of products. This is one of the major strengths of Oracle Data Integrator. Oracle Data Integrator is part of the Oracle Business Intelligence Applications Suite - which is highly used by various industries. This tool replaced Informatica ETL in Oracle Business Intelligence Applications Suite.
Oracle Data Integrator comes with many pre-written data packages. If one has to load data from Excel to Oracle Database, there is a package readily available for that, cutting down a lot of effort on writing the code. Similarly, there are packages for Oracle to SQL, SQL to Oracle and all other possible combinations. Developers love this feature.
Oracle Data Integrator relies highly on the database for processing. This is actually an ELT tool rather than an ETL tool. It first loads all the data into target instance and then transforms it at the expense of database resources. This light footprint makes this tool very special.
The other major advantage of Oracle Data Integrator, like any other Oracle products, is a readily available developer pool. As all Oracle products are free to download for demo environments, many organizations prefer to play around with a product before purchasing it. Also, Oracle support and community is a big advantage compared to other vendors.
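The ELT approach described in the list above (load the data into the target first, then transform it using the database's own resources) can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module as a stand-in target database; it is not ODI code, and the table names `staging_orders` and `customer_totals` are hypothetical.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows first, then transform them with SQL inside the database."""
    conn = sqlite3.connect(":memory:")

    # Load step: raw data lands in a staging table untouched.
    conn.execute("CREATE TABLE staging_orders (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging_orders VALUES (?, ?)", rows)

    # Transform step: the database engine does the cleanup and aggregation.
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT UPPER(TRIM(customer)) AS customer,
               SUM(CAST(amount AS REAL)) AS total
        FROM staging_orders
        GROUP BY UPPER(TRIM(customer))
        """
    )

    result = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


raw = [(" acme ", "10.5"), ("ACME", "4.5"), ("globex", "7")]
totals = run_elt(raw)
```

The integration process itself only moves rows and issues SQL; the cleanup and aggregation run inside the target database, which is the trade-off the reviewer describes.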
Stored procedures functionality is not available so it should be implemented.
HBase is CPU- and memory-intensive with large sequential input or output access, whereas MapReduce jobs are primarily I/O-bound with fixed memory. Integrating HBase with MapReduce jobs results in unpredictable latencies.
The aggregation pipeline can be a bit overwhelming for a newcomer.
There's still no real concept of joins with references/foreign keys, although the aggregation framework has a feature that is close.
Database management/dev ops can still be time-consuming if rolling your own deployments. (Thankfully there are plenty of providers like Compose or even MongoDB's own Atlas that help take care of the nitty-gritty.)
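The join-like feature the reviewer alludes to is most likely the `$lookup` aggregation stage. Below is a hedged sketch of what such a pipeline might look like, written as plain Python data so it runs without a server. The collection and field names (`orders`, `customers`, `customer_id`, `total`, `name`) are assumptions made up for illustration.

```python
def orders_with_customer(min_total):
    """Build an aggregation pipeline joining orders to customers."""
    return [
        # Keep only orders at or above the threshold.
        {"$match": {"total": {"$gte": min_total}}},
        # Join-like step: pull the matching customer document into an array.
        {"$lookup": {
            "from": "customers",
            "localField": "customer_id",
            "foreignField": "_id",
            "as": "customer",
        }},
        # Flatten the one-element array produced by $lookup.
        {"$unwind": "$customer"},
        # Shape the output document.
        {"$project": {"_id": 0, "total": 1, "customer_name": "$customer.name"}},
    ]


pipeline = orders_with_customer(100)
# With a driver such as PyMongo this would be passed to
# db.orders.aggregate(pipeline); no server is contacted here.
```

Each stage is an ordinary dictionary, so building a pipeline in small, separately named steps is one way to make it less overwhelming for newcomers.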
ODI does not have an intuitive user interface. It is powerful, but difficult to figure out at first. There is a significant learning curve between usability, proficiency, and mastery of the tool.
ODI contains some frustrating bugs. It is Java based and has some caching issues, often requiring you to restart the program before you see your code changes stick.
ODI does not have a strong versioning process. It is not intuitive to keep an up to date repository of versioned code packages. This can create versioning issues between environments if you do not have a strong external code versioning process.
There's really not anything else out there that I've seen comparable for my use cases. HBase has never proven me wrong. Some companies align their whole business on HBase and are moving all of their infrastructure from other database engines to HBase. It's also open source and has a very collaborative community.
I am looking forward to increasing our SaaS subscriptions such that I get to experience global replica sets, working in reads from secondaries, and what not. Can't wait to be able to exploit some of the power that the "Big Boys" use MongoDB for.
It is maturing and over time will have a good pool of resources. Each new version has addressed the issues of the previous ones. It's getting better and bigger.
NoSQL database systems such as MongoDB lack graphical interfaces by default and therefore to improve usability it is necessary to install third-party applications to see more visually the schemas and stored documents. In addition, these tools also allow us to visualize the commands to be executed for each operation.
Oracle Data Integrator (ODI) is a reliable ELT tool, supporting data loads from various heterogeneous sources. It is effective for both structured and unstructured data. It works well for creating translations and transformations and also aids in data quality checks when combined with an MDM solution. Troubleshooting issues can be a challenge if it is not configured properly.
Finding support from local companies can be difficult. There were times when the local company could not find a solution and we reached a solution by getting support globally. If a good local company is found, it will overcome all your problems with its global support.
While the setup and configuration of MongoDB is pretty straightforward, having a vendor that performs automatic backups and scales the cluster automatically is very convenient. If you do not have a system administrator or DBA familiar with MongoDB on hand, it's a very good idea to use a third-party vendor that specializes in MongoDB hosting. The value is well worth it over hosting it yourself since the cost is often reasonable among providers.
Cassandra is great for writes, but with large datasets, depending on the case, not as great as HBase. Cassandra does support Parquet now. HBase still has performance issues. Cassandra has use cases as a time-series store; for that, HBase fails miserably. For geospatial data, HBase does work to an extent. HA between the two is almost the same.
We have [measured] the speed of read/write operations under high load and finally selected the winner: MongoDB. We have [not] too much data, but in case there will be 10 [times] more we will need Cassandra. Cassandra's storage engine provides constant-time writes no matter how big your data set grows. For analytics, MongoDB provides a custom map/reduce implementation; Cassandra provides native Hadoop support.
I have used Trifacta Google Data Prep quite a bit. We use Google Cloud Platform across our organization. The tools are very comparable in what they offer. I would say Data Prep has a slight edge in usability and a cleaner UI, but both of the tools have comparable toolsets.
As HBase is a NoSQL database, we don't have transaction support and we cannot do many operations on the data.
The lack of primary or composite primary keys is an issue, as the architecture cannot follow the same legacy design. The transaction concept also does not apply here.
The way data is printed on the console is not very user-friendly, so we had to use an abstraction over HBase (e.g., Apache Phoenix), which means there is one more component to handle.
Open Source w/ reasonable support costs have a direct, positive impact on the ROI (we moved away from large, monolithic, locked in licensing models)
You do have to balance the necessary level of HA & DR with the number of servers required to scale up and scale out. Servers cost money, so HA & DR don't come for free (even though they're built into the architecture of MongoDB).