What users are saying about

Apache Spark

98 Ratings
Score 8.6 out of 10

Hadoop

211 Ratings
Score 8 out of 10

Likelihood to Recommend

Apache Spark

Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to perform data transformations on big data as offline jobs; use MongoDB-like distributed database systems for more real-time queries.
Nitin Pasumarthy

Hadoop

  • Less appropriate for small data sets
  • Works well for scenarios with large volumes of data; offline applications are a good fit for the Hadoop file system
  • It's not instant querying software like SQL, so use it only if your application can wait while the data is crunched
  • Not for real-time applications
Bharadwaj (Brad) Chivukula

Pros

Apache Spark

  • Machine learning
  • Data analysis
  • Workflow processing (faster than MapReduce)
  • SQL connector to multiple data sources
Anson Abraham

Hadoop

  • Hadoop stores and processes unstructured data, such as web access logs or data-processing logs, very well
  • Hadoop can be used effectively for archiving, providing a very economical, fast, flexible, scalable and reliable way to store data
  • Hadoop can be used to store and process very large amounts of data very quickly
Bhushan Lakhe

Cons

Apache Spark

  • More documentation and training should come with the application, especially for debugging, since the process is difficult to understand.
  • It should be more attentive to users and provide tutorials to reduce the learning curve.
  • There should be more grouping algorithms.
Carla Borges

Hadoop

  • Security is a piece that's missing from Hadoop; you have to supplement it with Kerberos, etc.
  • Hadoop is not easy to learn: there are various modules with little or no documentation
  • Because Hadoop is open source, testing, quality control and version control are very difficult
Bhushan Lakhe

Likelihood to Renew

Apache Spark: no score (no answers on this topic yet)
Hadoop: 9.6
Based on 8 answers
Hadoop is organization-independent and can be used for purposes ranging from archiving to reporting, and it can run on economical commodity hardware. There are also substantial savings in licensing costs, since most of the Hadoop ecosystem is open source and free.
Bhushan Lakhe

Usability

Apache Spark: no score (no answers on this topic yet)
Hadoop: 9.0
Based on 3 answers
I found it really useful during my academic projects. Data handling for large data sets was easy with Hadoop. It used to work really fast for bigger data sets. I found it reliable.
Tushar Kulkarni

Online Training

Apache Spark: no score (no answers on this topic yet)
Hadoop: 6.1
Based on 2 answers
Hadoop is a complex topic and best suited for classroom training. Online training is a waste of time and money.
Bhushan Lakhe

Alternatives Considered

Apache Spark

I prefer Apache Spark to Hadoop because, in my experience, Spark is more usable: it comes with simple APIs for Scala, Python, Java and Spark SQL, and it gives feedback on commands in a REPL. Apache Spark also seems to have the best performance when processing large data sets in memory, so more processes can be handled on Spark than on Hadoop, although Hadoop is also a very useful tool.
Carla Borges

Hadoop

As I am new to the Hadoop ecosystem, I have not used or evaluated any other similar products at this time. This was handed to me from a previous, much older installation that was very underutilized. Our new platform will be working the new cluster much harder, with jobs that run indefinitely. I'm not sure that any of the other "big data" technologies out there have as many certified components or work with such a diverse collection, but as I said, I am pretty new to this and so have only tertiary knowledge of competing products.
Mark Gargiulo

Collaboration and Sharing

Apache Spark: no score (no answers on this topic yet)
Hadoop: 7.7
Based on 10 answers
This is an area where Hadoop still needs to mature when compared to other data platforms. It is getting better but it still needs work.
Pierre LaFromboise

Data Integration

Apache Spark: no score (no answers on this topic yet)
Hadoop: 8.7
Based on 10 answers
Hadoop's file system makes it relatively easy to write data. You can pull data into Hadoop from most traditional data sources.
Pierre LaFromboise

Return on Investment

Apache Spark

  • Apache Spark has faster performance compared to MapReduce.
  • The combination of Python and Spark is the best: shorter code and faster, more efficient performance.
  • Can replace an RDBMS
Kartik Chavan

Hadoop

  • With our current platform (and budget), Hadoop is really the only option at this time to gain access to the capacity and technologies we require.
  • So far the only real investment has been hardware and man hours, especially in the initial learning and deployment phase.
Mark Gargiulo

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details

Hadoop

General
Free Trial
Free/Freemium Version: Yes
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details