The Apache HBase project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable.
N/A
SSIS
Score 7.6 out of 10
N/A
Microsoft's SQL Server Integration Services (SSIS) is a data integration solution.
N/A
Pricing
Apache HBase
SQL Server Integration Services (SSIS)
Editions & Modules
No answers on this topic
No answers on this topic
Offerings
Pricing Offerings (Apache HBase / SSIS)
Free Trial: No / No
Free/Freemium Version: No / No
Premium Consulting/Integration Services: No / No
Entry-level Setup Fee: No setup fee / No setup fee
Additional Details: none / none
Features
Apache HBase
SQL Server Integration Services (SSIS)
NoSQL Databases
Apache HBase: 7.7 (5 ratings), 14% below category average
SQL Server Integration Services (SSIS): not rated
Performance: Apache HBase 7.1 (5 ratings); SSIS not rated (0 ratings)
Availability: Apache HBase 7.8 (5 ratings); SSIS not rated (0 ratings)
Concurrency: Apache HBase 7.0 (5 ratings); SSIS not rated (0 ratings)
Security: Apache HBase 7.8 (5 ratings); SSIS not rated (0 ratings)
Scalability: Apache HBase 8.6 (5 ratings); SSIS not rated (0 ratings)
Data model flexibility: Apache HBase 7.1 (5 ratings); SSIS not rated (0 ratings)
Deployment model flexibility: Apache HBase 8.2 (5 ratings); SSIS not rated (0 ratings)
Data Source Connection
Apache HBase: not rated
SQL Server Integration Services (SSIS): 7.0 (56 ratings), 17% below category average
Connect to traditional data sources: Apache HBase not rated (0 ratings); SSIS 9.0 (56 ratings)
Connect to Big Data and NoSQL: Apache HBase not rated (0 ratings); SSIS 5.0 (43 ratings)
Data Transformations
Apache HBase: not rated
SQL Server Integration Services (SSIS): 6.8 (56 ratings), 18% below category average
Simple transformations: Apache HBase not rated (0 ratings); SSIS 9.0 (56 ratings)
Complex transformations: Apache HBase not rated (0 ratings); SSIS 4.7 (55 ratings)
Data Modeling
Apache HBase: not rated
SQL Server Integration Services (SSIS): 7.5 (54 ratings), 5% below category average
Data model creation: Apache HBase not rated (0 ratings); SSIS 9.0 (28 ratings)
Metadata management: Apache HBase not rated (0 ratings); SSIS 6.0 (35 ratings)
Business rules and workflow: Apache HBase not rated (0 ratings); SSIS 7.0 (45 ratings)
Collaboration: Apache HBase not rated (0 ratings); SSIS 9.0 (40 ratings)
Testing and debugging: Apache HBase not rated (0 ratings); SSIS 6.3 (51 ratings)
Data Governance
HBase is well suited to large organizations that run millions of operations against tables and need real-time record lookups, range queries, random reads and writes, and online analytics. HBase cannot replace traditional databases: it does not support all of their features, and it is CPU and memory intensive. We observed increased latency when using it with MapReduce job joins.
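To make the access patterns in that review concrete, here is a minimal in-memory sketch of a table kept sorted by row key, which is what makes point lookups and range scans cheap. It is a toy stand-in, not the HBase client API: the class `SortedRowStore`, its method names, and the `info:name` column label are all hypothetical, and it uses Python strings where HBase compares raw bytes.

```python
import bisect


class SortedRowStore:
    """Toy stand-in for a table whose rows are kept sorted by row key.

    Hypothetical illustration only; this is not the HBase client API.
    """

    def __init__(self):
        self._keys = []   # row keys, always kept in sorted order
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, column, value):
        """Random write: insert or update one cell."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        """Point lookup by row key; returns None if the row is absent."""
        return self._rows.get(row_key)

    def scan(self, start_key, stop_key):
        """Range query: yield rows with start_key <= row_key < stop_key."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


store = SortedRowStore()
for key in ["user#0003", "user#0001", "order#0007", "user#0002"]:
    store.put(key, "info:name", key.upper())

# Only the keys inside the half-open range are visited, in sorted order.
user_rows = [key for key, _ in store.scan("user#0001", "user#0003")]
```

The scan uses an inclusive start and an exclusive stop, mirroring the default behavior of an HBase scan. Anything that is not expressible as a key lookup or key range, such as a join, falls outside this model, which is consistent with the limitations the reviewer describes.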
As I mentioned earlier, SQL Server Integration Services is suitable if you want to manage data from different applications. It really helps with fetching data and generating reports. Its automation makes this very easy and time efficient. It works well with large databases as well. But it doesn't work well with real-time data; gathering that data takes some time. I would not recommend using it in a real-time or fast-paced environment.
Stored procedure functionality is not available and should be implemented.
HBase is CPU and memory intensive, with large sequential input or output access, whereas MapReduce jobs are primarily input or output bound with fixed memory. Integrating HBase with MapReduce jobs results in unpredictable latencies.
Connection managers for online data sources can be tricky to configure.
Performance tuning is an art form, and trialing different data flow task options can be cumbersome. SSIS could do a better job of providing performance data, including historical data, for monitoring.
Mapping the destination with the OLE DB Command is difficult because the destination columns are unnamed.
Excel or flat file connections are limited by version and type.
There's really not anything else out there that I've seen comparable for my use cases. HBase has never proven me wrong. Some companies align their whole business on HBase and are moving all of their infrastructure from other database engines to HBase. It's also open source and has a very collaborative community.
Some features should be revised or improved, and some of the toolbox tools (when used with Visual Studio) should be less schematic and somewhat more flexible. For example, CSV data import is still very old-fashioned, and if the data format changes it takes a bit of manual work to accept the new data structure.
SSIS is a great tool for most ETL needs. It covers 90% (or more) of use cases, and even in many of the cases where it is not ideal, SSIS can be extended via a .NET language to do the job well, in a supportable way, for almost any performance workload.
SQL Server Integration Services performance depends directly on the resources provided to the system. In our environment, we allocated 6 nodes of 4 CPUs and 64 GB each, running in parallel. Unfortunately, we had to ramp up to such a robust environment to get the performance to where we needed it. Most of the reports are completed in a reasonable timeframe. However, with slow-running reports, it is often difficult if not impossible to cancel the report without killing the report instance or stopping the service.
The support, when necessary, is excellent. But beyond that, it is very rarely necessary, because the user community is so large, vibrant, and knowledgeable that a simple Google query or forum question can answer almost everything you want to know. You can also get prewritten script tasks with a variety of functionality, which saves a lot of time.
The implementation may be different in each case. It is important to analyze all the existing infrastructure properly to understand the kind of work needed, the type of software used and how compatible the pieces are, and the features you want to exploit, so that you know what is possible and what requires integration with third-party tools.
Cassandra is great for writes, but with large datasets, depending on the case, it is not as good as HBase. Cassandra does support Parquet now. HBase still has performance issues. Cassandra has use cases as a time-series store; HBase fails miserably there. For geospatial data, HBase does work to an extent. High availability is almost the same between the two.
I think SQL Server Integration Services is better suited for on-premises data movement and ADF is more suited for the cloud. Though ADF has more connectors, SQL Server Integration Services is more robust and has better functionality simply because it has been around much longer.
Because HBase is a NoSQL database, we don't have transaction support and we cannot perform many operations on the data.
The lack of a primary key or composite primary key feature is an issue, because the architecture cannot follow the same design as a legacy system. The transaction concept also does not apply here.
The way data is printed on the console is not very user-friendly, so we had to use an abstraction over HBase (e.g., Apache Phoenix), which means one more component to manage.
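On the composite-key complaint above: a common workaround in stores with a single row key is to concatenate the fields into that key. The sketch below shows the idea under stated assumptions; the helper names `make_row_key` and `split_row_key`, the `#` separator, and the 13-digit padding are all hypothetical choices, not part of HBase or Phoenix.

```python
def make_row_key(customer_id: str, order_ts: int, width: int = 13) -> bytes:
    """Build a composite row key of the form b'<customer_id>#<padded timestamp>'.

    Zero-padding the timestamp keeps numeric order and byte-wise
    (lexicographic) order in agreement, which matters when rows are
    sorted by raw key bytes. Assumes a non-negative timestamp.
    """
    return f"{customer_id}#{order_ts:0{width}d}".encode("utf-8")


def split_row_key(row_key: bytes) -> tuple[str, int]:
    """Recover (customer_id, order_ts) from a key built by make_row_key."""
    customer_id, ts = row_key.decode("utf-8").rsplit("#", 1)
    return customer_id, int(ts)


keys = sorted([
    make_row_key("c42", 100),
    make_row_key("c42", 9),
    make_row_key("c07", 50),
])
# Rows for one customer end up adjacent and ordered by timestamp,
# so "all orders for c42" becomes a single key-range read.
```

The trade-off is that the leading field decides which lookups are cheap: queries by customer are a key range, while queries by timestamp alone would need a full pass or a second table keyed the other way.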
Without this, we would have to manually update a spreadsheet of our SQL Server inventory.
We would also have poor alerting; if an instance was down, we wouldn't know until a user reported it.
We only have one other person who uses SQL Server Integration Services; he's the expert. Without him it would fall to me, and I would not enjoy being responsible for it.