Hadoop-Related Software

Best Hadoop-Related Software

TrustMaps are two-dimensional charts that compare products based on satisfaction ratings and research frequency by prospective buyers. Products must have 10 or more ratings to appear on this TrustMap, and those above the median line are considered Top Rated.

Hadoop-Related Software Overview

What is Hadoop Software?

Hadoop is an unusual kind of open-source data store from the Apache Software Foundation. An entire ecosystem of products has evolved around the Hadoop data store, to the point where it has become its own technology category.

The central idea of Hadoop is that data is spread across many inexpensive commodity servers. Hadoop itself is open source, although vendors such as Cloudera and Hortonworks offer commercial distributions that wrap services around the technology.

Unlike a traditional database, Hadoop can handle huge volumes of both structured and unstructured data including log files, streaming data, images, audio and video files. All of this data can be put into the Hadoop cluster and accessed, modified and processed in place, eliminating the need to duplicate and structure data in a traditional warehouse.

Once this huge volume of structured and unstructured data has been stored, how do you extract any value from it? Because Hadoop is not a structured database, structured query languages like SQL do not work against it directly. Instead, Hadoop has its own data processing and query framework called MapReduce, which developers can use to write programs that retrieve whatever data is needed. However, MapReduce has several constraints that affect performance, and newer products such as Apache Spark provide an alternative distributed computing framework that is significantly more efficient. Similarly, products like Hive and Cloudera Impala provide a SQL-like query language, which is much easier for data analysts to learn and use.
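
To make the MapReduce idea more concrete, here is a minimal sketch of its three stages (map, shuffle, reduce) as a word count in plain Python. It runs in a single process on an in-memory list and does not use any Hadoop API. The function names `map_phase`, `shuffle`, `reduce_phase` and `word_count`, and the sample log lines, are hypothetical and chosen only for illustration. On a real cluster the framework would run the map and reduce steps in parallel across many servers.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical log lines standing in for files stored in a cluster.
    sample = ["error disk full", "warning disk slow", "error network down"]
    print(word_count(sample))
```

Because each call to `map_phase` looks at only one line and each call to `reduce_phase` looks at only one key, the work could in principle be split across machines, which is the property that frameworks such as MapReduce and Spark rely on.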

Hadoop-Related Products

Listings (1-25 of 39)

Apache Hive

62 Ratings

Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.

Datameer Analytics Solutions

12 Ratings

Datameer Analytics Solutions provides analytics that make it easy for businesses to aggregate big data, leveraging the power and scale of Hadoop.

Hortonworks Data Platform

30 Ratings

Hortonworks Data Platform (HDP) is a data analytics platform that leverages Hadoop.

IBM Analytics Engine

16 Ratings

IBM BigInsights is an analytics and data visualization tool that leverages Hadoop.

Amazon Elastic MapReduce

24 Ratings

Amazon Elastic MapReduce (EMR) is a web service for processing big data with Hadoop.

Apache Pig

17 Ratings

Apache Pig is a programming tool for creating MapReduce programs that run on Hadoop.

Databricks

8 Ratings

Databricks in San Francisco offers a big data management cloud platform and cluster manager.

Presto

7 Ratings

Presto is an open source SQL query engine, supported by Teradata, that is designed to run queries on data stored in Hadoop or in traditional databases. Teradata's development of Presto followed its acquisition of Hadapt and Revelytix.

Cloudera Manager

8 Ratings

Cloudera Manager is a management application for Apache Hadoop and the enterprise data hub, from Cloudera.

Apache Drill

3 Ratings

Apache Drill is a schema-free query engine for use with NoSQL or Hadoop data or file storage systems and databases.

Apache Sqoop

4 Ratings

Apache Sqoop is a tool for transferring data between Apache Hadoop and structured data stores such as relational databases.

Microsoft Azure HDInsight

23 Ratings

HDInsight is an implementation of the Apache Hadoop technology stack on the Microsoft Azure cloud platform. It is based on the Hortonworks Hadoop distribution. Microsoft Azure HDInsight includes implementations of Apache Spark, HBase, Storm, Pig, Hive, Sqoop, Oozie, Ambari, etc. It also...

Cloudera Data Science Workbench

8 Ratings

Cloudera Data Science Workbench enables secure self-service data science for the enterprise. It is a collaborative environment where developers can work with a variety of libraries and frameworks.

Apache Flume

5 Ratings

Apache Flume is a product enabling the flow of logs and other data into a Hadoop environment.

Lily Enterprise

We don't have enough ratings and reviews to provide an overall score.

Lily Enterprise is a Hadoop-based platform supported by Belgian company NGDATA, which acquired the original developer, Outerthought.

EMC Greenplum HD

We don't have enough ratings and reviews to provide an overall score.

EMC Greenplum HD is a Hadoop distribution from EMC.

Argyle Data

We don't have enough ratings and reviews to provide an overall score.


We don't have enough ratings and reviews to provide an overall score.

IBM PureData

We don't have enough ratings and reviews to provide an overall score.

IBM's Netezza Analytics is an analytics platform for warehousing and processing data (Hadoop).

Tachyon Nexus

We don't have enough ratings and reviews to provide an overall score.


We don't have enough ratings and reviews to provide an overall score.