User Review of Hadoop
Overall Satisfaction with Hadoop
Hadoop is part of the overall data strategy and is mainly used as a large-volume ETL platform and crunching engine for proprietary analytical and statistical models. The biggest challenge for developers and users is moving from an RDBMS query approach to data access to a schema-on-read, list-processing framework. The learning curve is steep up front, but Hive and end-user tools like Datameer can help bridge the gap. Data governance and stewardship are of key importance given the fluid nature of how data is stored and accessed.
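For readers coming from the RDBMS side, here is a rough illustration of the shift in mindset described above. It is a plain-Python sketch, not Hadoop code: the sample data, field layout, and function names (parse, map_phase, reduce_phase) are all made up for illustration. The raw lines are stored as-is, the schema is applied only when the data is read, and the aggregation is expressed as a map step followed by a reduce step rather than as a query.

```python
from collections import defaultdict

# Hypothetical raw records, stored as-is; no schema is enforced at write time.
RAW_LINES = [
    "2024-01-03,store_a,20.00",
    "2024-01-03,store_b,5.00",
    "malformed line",
    "2024-01-04,store_a,7.50",
]


def parse(line):
    """Apply the schema at read time; return None for lines that do not fit."""
    parts = line.split(",")
    if len(parts) != 3:
        return None
    date, store, amount = parts
    try:
        return {"date": date, "store": store, "amount": float(amount)}
    except ValueError:
        return None


def map_phase(lines):
    """Emit (key, value) pairs, skipping records that fail the schema."""
    for line in lines:
        record = parse(line)
        if record is not None:
            yield record["store"], record["amount"]


def reduce_phase(pairs):
    """Group values by key and sum them, roughly what a reducer does."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


totals = reduce_phase(map_phase(RAW_LINES))
print(totals)
```

In an RDBMS the same result would come from something like SELECT store, SUM(amount) ... GROUP BY store against a table whose schema was fixed when the data was loaded. Here the malformed line is only discovered, and skipped, at read time, which is part of why governance and stewardship matter so much.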
Pros
- Gives developers and data analysts flexibility for sourcing, storing and handling large volumes of data.
- Data redundancy and tunable MapReduce parameters help ensure jobs complete in the event of hardware failure.
- Adding capacity is seamless.
Cons
- Logs are difficult to read; more readable logs would help.
Return on Investment
- Hadoop has made it possible to implement projects that require large amounts of data from a diverse set of source systems.
- Hadoop has also taken load off the Enterprise Data Warehouse space by absorbing some of the analytics and model building work.