A very generic Hadoop review
Overall Satisfaction with Hadoop
We utilize Hadoop primarily as a large data staging area for disparate corporate data. Select data is aggregated and moved downstream to a more formal data warehouse. Some data analytics is also performed directly against the Hadoop stored data. The direct analytics is done primarily with Apache Spark utilizing Scala and Python.
Pros
- No requirement for schema on write.
- Ability to scale to massive amounts of data.
- Open platform provides multiple options and customizations to fit your exact needs.
Cons
- The platform is still maturing and can be confusing to research and use. Basic tasks can still be manual and are not always user friendly.
Comments
Please log in to join the conversation