Overall Satisfaction with Hadoop
We use Hadoop within our department to process large data sets that can't be processed in a timely fashion on a single computer or node. The various modules provided with Hadoop make it easy for us to implement map-reduce and perform parallel processing on that data. We have approximately 40 TB of data that we run various algorithms against as we try to solve business problems and prevent fraudulent transactions.
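For readers unfamiliar with the map-reduce pattern mentioned above, here is a minimal in-memory sketch in Python. It is not our production code and does not call any Hadoop API. The transaction records, field names, and function names are all hypothetical, chosen only to show the map, shuffle, and reduce steps that Hadoop distributes across nodes.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sample records: (account_id, amount). Illustrative only.
TRANSACTIONS = [
    ("acct-1", 120.0),
    ("acct-2", 35.5),
    ("acct-1", 980.0),
    ("acct-3", 12.0),
    ("acct-2", 64.5),
]

def map_phase(record):
    """Emit (key, value) pairs for one input record."""
    account_id, amount = record
    yield account_id, amount

def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle, and reduce sequentially in one process."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

totals = run_job(TRANSACTIONS)
print(totals)
```

On a real cluster the map and reduce steps run in parallel on many nodes and the framework handles the shuffle; this sketch runs everything in one process to keep the idea visible.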
- Map-reduce
- Parallel processing
- Handles node failures
- HDFS: distributed file system
- More connectors
- Query optimization
- Job scheduling
- Positive: easy implementation
- Positive: ease of scalability
- Positive: ease of distributing data and workloads
- Positive: low cost
- Positive: low learning curve
Hands down, Hadoop is less expensive than the other platforms we considered. Cloudera was easier to set up, but the expense ruled it out. MS-SQL didn't deliver the performance we saw with the Hadoop clusters and was more expensive. We considered MS-SQL mainly for its ability to support SQL queries, in hopes we could leverage our existing codebase. Azure was also easier to set up but simply more expensive. In the end, cost won out: the competition was easier to set up, but Hadoop wasn't that much harder.
Do you think Apache Hadoop delivers good value for the price?
Yes
Are you happy with Apache Hadoop's feature set?
Yes
Did Apache Hadoop live up to sales and marketing promises?
Yes
Did implementation of Apache Hadoop go as expected?
Yes
Would you buy Apache Hadoop again?
Yes