Hadoop

Score 8.5 out of 10
Apache Hadoop

Overview

What is Hadoop?

Hadoop is open source software from Apache that supports distributed processing and data storage. It is popular for its scalability, reliability, and ability to run on commodity hardware.

Recent Reviews

Hadoop Review

7 out of 10
May 16, 2018
It is used extensively in our organization for data storage, data backup, and machine learning analytics. Managing vast amounts of …

Product Details

Hadoop Technical Details

Operating Systems: Unspecified
Mobile Application: No

Frequently Asked Questions

Reviewers rate Data Sources highest, with a score of 8.7.

The most common users of Hadoop are from Enterprises (1,001+ employees).

Reviews

Tom Thomas | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User
The company I worked at used Hadoop clusters for processing huge datasets. They had several nodes for both production and pre-production. It allowed distributed processing of data across several clusters with an easy-to-use software model. It is used by the Systems and IT department at my company.
  • HDFS provides a very robust and fast data storage system.
  • Hadoop works well with generic "commodity" hardware negating the need for expensive enterprise grade hardware.
  • It is mostly unaffected by system and hardware failures of nodes and is self-sustained.
  • While its open source nature provides a lot of benefits, there are multiple stability issues that arise due to it.
  • Limited support for interactive analytics.
Hadoop is a very powerful tool that can be used in almost any environment where huge scale processing of data across clusters is required. It provides multiple modules such as HDFS and MapReduce that will make managing and analyzing said data reliable and efficient. Hadoop is a new and constantly evolving tool, and hence it needs users to be on top of it all the time.
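To make the MapReduce model mentioned above a little more concrete, here is a minimal sketch of its three stages (map, shuffle, reduce) as a word count. It is plain Python that runs in a single process and does not use any Hadoop API; the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical and chosen only for illustration. On a real cluster, the framework would distribute these stages across nodes and read the input from HDFS.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence (single process, illustrative only)."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read its input from HDFS.
    print(word_count(["big data big clusters", "data nodes"]))
```

Because each map call looks at a single line and each reduce call looks at a single key, both stages can in principle be spread over many machines, which is the property the reviews describe as distributed processing across nodes.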
  • Reduced costs of hardware due to support for generic hardware
  • Improved time and cost of data analysis
No
  • Price
  • Product Features
  • Product Usability
  • Product Reputation
  • Vendor Reputation
  • Analyst Reports
  • Third-party Reviews
Tushar Kulkarni | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User
I have been working with Hadoop since last year. It is very user friendly. Hadoop was used by the data center management team. It allows distributed processing of huge data sets across clusters of computers using simple programming models.
  • It is robust in the sense that any big data applications will continue to run even when individual servers fail.
  • Enormous data can be easily sorted.
  • It can be improved in terms of security.
  • Since it is open source, stability issues must be improved.
Hadoop is really very useful when dealing with big data.
Apache Spark has an in-memory processing model, making it powerful for lightning-fast data processing. Apache Spark also exposes APIs in Scala and Python, which are among the most commonly used programming languages in the data analytics and data processing domains.
No
  • Product Features
  • Product Usability
I used Hadoop and found it really useful while working with bigger data sets. I used Hadoop for my project to gain insight into different patterns in a given data set. It was easy and user friendly.
I'll be looking at scalability and reliability. At the same time, it would be good to have a small learning curve.
  • Processing huge data sets with good performance
  • Distributed data handling with multiple nodes
  • Small learning curve
  • Using Hadoop is a heavyweight process
  • Installation is a little tricky for newbies
  • Not suitable for dynamic data sets
Yes, but I don't use it
I found it really useful during my academic projects. Data handling for large data sets was easy with Hadoop. It used to work really fast for bigger data sets. I found it reliable.
Bhushan Lakhe | TrustRadius Reviewer
Score 7 out of 10
Vetted Review
Verified User
Hadoop is used for storing and analyzing log data (logs from warehouse loads or other data processing) as well as storing and retrieving financial data from JD Edwards. It's also planned to be used for archival. Hadoop is used by several departments within our organization. Currently, we are paying a lot of money for hosting historical data, and we plan to move that to Hadoop to reduce our storage costs. Also, we got much better performance out of our Hadoop cluster for processing a large amount of financial data. So, in that sense, Hadoop addressed multiple business problems for us.
  • Hadoop stores and processes unstructured data such as web access logs or logs of data processing very well
  • Hadoop can be effectively used for archiving, providing a very economical, fast, flexible, scalable and reliable way to store data
  • Hadoop can be used to store and process a very large amount of data very fast
  • Security is a piece that's missing from Hadoop - you have to supplement security using Kerberos etc.
  • Hadoop is not easy to learn - there are various modules with little or no documentation
  • Because Hadoop is open source, testing, quality control and version control are very difficult
Hadoop is best suited for warehouse or OLAP processing. It's not suitable for OLTP or small transaction processing
  • We had a large ROI due to improved performance and expedited reporting - our clients were happier and business improved
  • Our storage costs reduced
  • Our infrastructure costs reduced - we used old hardware for our Hadoop cluster
not applicable - I have not evaluated any other products
50
Various - IT, business users, vendors
3
Hadoop Administrator, Java Developer, Hive Developer
  • Use of HDFS / Hive for storage / analysis of data processing logs
  • Use of HDFS / Hive for storage / analysis of historical financial data
  • Use of HDFS for Archival
  • Archival
  • Reporting
  • ETL
  • Data transfer
  • Staging area
  • Historical reporting
Hadoop is organization-independent and can be used for various purposes ranging from archiving to reporting, and it can make use of economical commodity hardware. There are also large savings in licensing costs, since most of the Hadoop ecosystem is available as open source and is free.
Yes
We replaced 5 Windows-based servers with a 10-node cluster of CentOS-based desktops. We saved a lot on hardware and Windows Server licenses.
  • Price
  • Product Features
  • Product Usability
Price. We saved a lot of money
I will evaluate the ROI more closely
Hadoop is a complex topic and best suited for classroom training. Online training is a waste of time and money.