Item: Apache Hadoop
Rating: 10
Author: Pierre LaFromboise

Overall Satisfaction with Hadoop

Use Cases and Deployment Scope

We use Hadoop primarily as a large staging area for disparate corporate data. Selected data is aggregated and moved downstream to a more formal data warehouse. Some analytics also run directly against the data stored in Hadoop, primarily with Apache Spark using Scala and Python.

Pros and Cons

Pros

No requirement for schema on write.
Ability to scale to massive amounts of data.
Open platform provides multiple options and customizations to fit your exact needs.
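
To illustrate the first point, here is a minimal sketch of schema-on-read in plain Python rather than on Hadoop itself. Raw records are stored as-is, and a schema is applied only when the data is read. The sample records, the field names, and the `read_with_schema` helper are hypothetical and are not taken from our deployment.

```python
import json

# Hypothetical raw records staged as-is. No schema was enforced on write,
# so fields can be missing, extra, or carry different types.
RAW_LINES = [
    '{"id": 1, "amount": "19.99", "region": "east"}',
    '{"id": 2, "amount": 5}',
    '{"id": "3", "amount": "7.50", "region": "west", "note": "extra field"}',
]

# Schema chosen at read time: field name -> (converter, default value).
SCHEMA = {
    "id": (int, None),
    "amount": (float, 0.0),
    "region": (str, "unknown"),
}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and project each record onto the schema."""
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            # Convert present values; fall back to the default otherwise.
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


rows = read_with_schema(RAW_LINES, SCHEMA)

# A simple downstream aggregation over the typed rows.
total_by_region = {}
for row in rows:
    region = row["region"]
    total_by_region[region] = total_by_region.get(region, 0.0) + row["amount"]
```

A different consumer could read the same raw lines with a different schema, which is what makes this approach convenient for a staging area.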

Cons

The platform is still maturing and can be confusing to research and use. Basic tasks can still require manual work and are not always user-friendly.

Likelihood to Recommend

A big data problem doesn't always mean huge volumes of data. The other V's of big data (velocity and variety) are also important factors that may lead to selecting Hadoop as a platform.

Comments


A very generic Hadoop review

Modules Used