Hadoop is pretty Badass
January 04, 2018

Anonymous | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Modules Used

  • Hadoop Distributed File System
  • Hadoop MapReduce

Overall Satisfaction with Hadoop

Apache Hadoop is a cost-effective solution for storing and managing vast amounts of data efficiently. It is dependable and keeps working even when nodes in the cluster fail. The Hadoop Distributed File System (HDFS) also goes a long way toward storing that data. MapReduce and Tez, with the help of Hive of course, process large amounts of data in less time than expected. This lets us update our data warehouse with fewer resources than reading, processing, and updating the data in a relational database would take.
  • It is cost-effective.
  • It is highly scalable.
  • It is fault tolerant.
  • Hadoop does not fit all needs.
  • Converting data into a single format takes time.
  • Additional measures are needed to secure data.
  • It has made us respect the sheer volume of big data.
  • It helps us update our dashboards.
  • It's easier to think big with Hadoop.
When we have data coming in from various sources, using Hadoop is a good call. It's a good central station for taking a good look at your data and seeing what needs to be done.
Hadoop should not be used directly for real-time analytics. HDFS should be used to store the data, and Hive can be used to query the files.
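To make the "store in HDFS, query with Hive" idea concrete, here is a minimal sketch in Python that builds the HiveQL for an external table over delimited files already sitting in an HDFS directory. The table name, column names, and path are hypothetical, picked only for illustration, and the helper is not part of any Hadoop or Hive library. It only prints the statement; you would still run it through whatever Hive client you use.

```python
def external_table_ddl(table, columns, hdfs_dir, delimiter=","):
    """Build a HiveQL statement that exposes delimited files in HDFS as a table.

    Hypothetical helper for illustration only; it returns a string and does
    not connect to Hive.
    """
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_dir}'"
    )


# Assumed example values: a directory of comma-separated event files in HDFS.
ddl = external_table_ddl(
    "web_events",
    [("event_time", "STRING"), ("user_id", "BIGINT"), ("url", "STRING")],
    "/data/raw/web_events",
)
print(ddl)
```

Because the table is external, dropping it in Hive removes only the table definition and leaves the files in HDFS untouched, which suits the "central station" use described above.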
Hadoop needs to be understood thoroughly before you attempt to use it for data warehousing. Take stock of what Hadoop provides, and read up on its accompanying tools to see what fits your needs.