February 16, 2016

Apache Hadoop is the best open source product I have used.

Piyush Routray | TrustRadius Reviewer
Score 9 out of 10
Modules Used

  • Hadoop Distributed File System
  • Hadoop MapReduce

Overall Satisfaction with Hadoop

My present company uses Hadoop and associated technologies to build a data pipeline from open source tools. We also consult on projects that could potentially use Hadoop, and I work as a consultant for HDP. We actively help with the installation and setup of Hadoop clusters.
  • Hadoop is open source and has a large community, which makes it easy to adopt for individuals, startups, and MNCs alike.
  • Hadoop works well on commodity hardware, which makes it easier to avoid pricey clusters.
  • Hadoop takes parallel programming to the next level and makes processing multiple terabytes (even petabytes) of data easier.
  • While Hadoop MR parallelizes jobs involving Big Data, it is slow for smaller data sets.
  • OLAP (analytics) is easier; however, OLTP (transactions) is a problem in most cases.
  • People using Hadoop have to keep in mind that small proofs of concept may not scale as expected.
Being open source, Hadoop is cheaper to use and to run POCs with for clients. Cloudera, Hortonworks, and MapR also compete to contribute to open source Hadoop and keep their products conceptually similar to it.
Hadoop is well suited only if you have large data sets to work with. Jumping to Hadoop with small data sets won't be as useful.