Hadoop: A Robust Big Data Platform
Updated April 11, 2022

Kunal Sonalkar | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Modules Used

  • Hadoop Distributed File System
  • Hadoop MapReduce

Overall Satisfaction with Hadoop

Our firm uses Hadoop to solve big data modeling problems. The corporate analytics team uses it for data manipulation, information retrieval, data mapping, and statistical modeling. The business problem it solves is the inability of CSV/Excel files to handle more than a million rows. Hadoop lets you process big data and also connects to platforms like R Studio, where you can deploy mathematical models.
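To give a rough idea of the kind of data manipulation this involves, here is a minimal sketch of the map, shuffle, and reduce pattern that Hadoop MapReduce is built around, written in plain Python so it runs on a single machine. It is only an illustration of the idea, not how our team's jobs are actually written: the column names `region` and `amount`, the sample rows, and the function names are all made up for the example, and a real Hadoop job would spread these steps across a cluster.

```python
import csv
import io
from collections import defaultdict


def map_row(row):
    """Map step: turn one input record into a (key, value) pair.

    The 'region' and 'amount' columns are hypothetical.
    """
    return row["region"], float(row["amount"])


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce over CSV lines, one record at a time."""
    reader = csv.DictReader(lines)
    pairs = (map_row(row) for row in reader)  # rows are streamed, not loaded at once
    groups = shuffle(pairs)
    return dict(reduce_group(key, values) for key, values in groups.items())


# Tiny made-up sample standing in for a file too large for a spreadsheet.
sample = io.StringIO("region,amount\nEast,10.5\nWest,4.0\nEast,2.5\n")
totals = run_job(sample)
print(totals)  # {'East': 13.0, 'West': 4.0}
```

Because the map step handles one row at a time and the reduce step only needs the values for one key, the same logic can be split across many machines, which is what lets this approach go past spreadsheet row limits.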
  • Integration with R Studio, so most statistical algorithms can be deployed.
  • Handling Big Data issues like storage, information retrieval, data manipulation, etc.
  • Repetitive tasks like data wrangling, data processing, and cleaning are more efficient in Hadoop because processing times are faster.
  • Hadoop needs capable hardware, such as a minimum of 8GB of memory and an i5 processor. Sometimes the hardware does become a hindrance.
  • If Hadoop could connect to Salesforce, it would be a tremendous feature, as most CRM data comes from that channel.
  • It would be good to have some geocoding features for anyone who wants to do spatial data analysis using latitudes and longitudes.
  • Positive: it is powerful, and it allows you to manage your data on a very big scale.
  • Negative: since it's computationally expensive, the laptops had to be upgraded, and that was pretty heavy on financials.
  • Positive: it has also given us the power to make data-driven decisions anytime and anywhere.
Apache Spark can be considered an alternative because of its similar capabilities around processing and storing big data. We went with Hadoop because of the literature available online and its integration with platforms like R Studio. Hadoop's popularity has helped us debug issues and solve problems faster.
Hadoop is very well suited for big data modeling problems in various industries like finance, insurance, healthcare, automobiles, CRM, etc. In every industry where you need data analysis in real time, Hadoop is a perfect fit in terms of storage, analysis, retrieval, and processing. It isn't a very good tool for ETL (Extract, Transform, Load) work, though.