Overall Satisfaction with Hadoop
The company I worked at used Hadoop clusters for processing huge datasets. They had several nodes for both production and per-production nodes. It allowed distributed processing of data across several clusters with an easy to use software model. It is used by the Systems and IT department at my company.
- HDFS provides a very robust and fast data storage system.
- Hadoop works well with generic "commodity" hardware negating the need for expensive enterprise grade hardware.
- It is mostly unaffected by system and hardware failures of nodes and is self-sustained.
- While its open source nature provides a lot of benefits, there are multiple stability issues that arise due to it.
- Limited support for interactive analytics.
- Reduced costs of hardware due to support for generic hardware
- Improved time and cost of data analysis
Hadoop is a very powerful tool that can be used in almost any environment where huge scale processing of data across clusters is required. It provides multiple modules such as HDFS and MapReduce that will make managing and analyzing said data reliable and efficient. Hadoop is a new and constantly evolving tool, and hence it needs users to be on top of it all the time.