What users are saying about
109 Ratings
<a href='https://www.trustradius.com/static/about-trustradius-scoring' target='_blank' rel='nofollow'>trScore algorithm: Learn more.</a>
Score 8.4 out of 10
230 Ratings
Score 8.2 out of 10

Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and making sense of very large data sets. It offers many advanced machine learning and econometrics tools, although these are used only partially because they take too long to run on very large data sets. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young

Hadoop

  • Less appropriate for small data sets
  • Works well for scenarios with bulk amounts of data; teams with offline applications can safely choose the Hadoop file system
  • It is not instant-query software like a SQL database, so use it if your application can wait while the data is crunched
  • Not for real-time applications
Bharadwaj (Brad) Chivukula

Pros

Apache Spark

  • Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between SQL and Scala / Python style of munging data
Nitin Pasumarthy

Hadoop

  • HDFS is reliable and solid; in my experience, there are very few problems using it
  • Enterprise support from different vendors makes it easier to 'sell' inside an enterprise
  • It provides high scalability and redundancy
  • Horizontal scaling and distributed architecture
Bharadwaj (Brad) Chivukula

Cons

Apache Spark

  • Documentation could be better; I usually end up going to other sites and blogs to understand the concepts
  • More algorithms need to be ported to MLlib, as very few are available, at least for clustering
Nitin Pasumarthy

Hadoop

  • It is not suitable for real-time processing.
  • Data stored in Hadoop should follow a consistent format so that MapReduce can process it.
  • Community and support are quite limited.
Hung Vu

Likelihood to Renew

Apache Spark

No score (no answers on this topic yet)

Hadoop

Hadoop 9.6
Based on 8 answers
Hadoop is organization-independent and can be used for various purposes ranging from archiving to reporting, and it can make use of economical commodity hardware. There are also significant savings on licensing costs, since most of the Hadoop ecosystem is open source and free.
Bhushan Lakhe

Usability

Apache Spark

No score (no answers on this topic yet)

Hadoop

Hadoop 9.0
Based on 3 answers
I found it really useful during my academic projects. Data handling for large data sets was easy with Hadoop. It used to work really fast for bigger data sets. I found it reliable.
Tushar Kulkarni

Online Training

Apache Spark

No score (no answers on this topic yet)

Hadoop

Hadoop 6.1
Based on 2 answers
Hadoop is a complex topic and best suited to classroom training. Online training is a waste of time and money.
Bhushan Lakhe

Alternatives Considered

Apache Spark

Compared with MapReduce, it was faster and easier to manage, especially for machine learning, where MapReduce is lacking. Apache Storm was also slower and didn't scale as well as Spark does. Spark's elasticity was easier to apply than Storm's or MapReduce's, and managing resources for Spark was easier than for Storm as well.
Anson Abraham

Hadoop

We considered using a relational database (Oracle Database) with Java applications to process our data but ended up with Hadoop despite it being fairly new. It proved to be the correct solution: we needed only a little time to get started, and it saves on license and EC2 costs because we can configure DataNodes to run on on-demand or spot instances. It also provides high performance and is easy to implement, as the Map-Reduce function is quite simple.
Hung Vu
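
The remark that a Map-Reduce function is quite simple can be illustrated with a minimal sketch in plain Python. This is not Hadoop's API: the names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical, and everything runs in a single process, whereas Hadoop would distribute these phases across nodes. It only shows the shape of the map, group-by-key, and reduce steps on a word count.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence, in one process, for illustration only."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data nodes store data"]
    print(word_count(sample))
```

The user-written parts are only the mapper and the reducer; the grouping step in between is what a framework such as Hadoop would take care of, which is likely why the reviewer found the programming model easy to pick up.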

Return on Investment

Apache Spark

  • Apache Spark has faster performance compared to MapReduce.
  • The combination of Python and Spark is the best: shorter code and faster, more efficient performance.
  • Can replace an RDBMS
Kartik Chavan

Hadoop

  • Positive impact as this is the future. Abundance of tools
  • Return on Investment is high, as Big Data helps make better decisions
  • Hadoop has made it possible to implement projects that require large amounts of data from a diverse set of source systems.
Kartik Chavan

Pricing Details

Apache Spark

General

Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details

Hadoop

General

Free Trial
Free/Freemium Version
Yes
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details
