What users are saying about
103 Ratings
Score 8.5 out of 10
219 Ratings
Score 8 out of 10

Likelihood to Recommend

Apache Spark

The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and making sense of very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because they require too much time when the data sets get too large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Thomas Young

Hadoop

  • Less appropriate for small data sets
  • Works well for scenarios with bulk amounts of data; teams with offline applications can confidently choose the Hadoop file system
  • It's not instant-querying software like SQL, so if your application can wait on the crunching of data, then use it
  • Not for real-time applications
Bharadwaj (Brad) Chivukula

Pros

  • Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between SQL and Scala / Python style of munging data
Nitin Pasumarthy
  • Hadoop stores and processes unstructured data such as web access logs or logs of data processing very well
  • Hadoop can be used effectively for archiving, providing a very economical, fast, flexible, scalable and reliable way to store data
  • Hadoop can be used to store and process a very large amount of data very fast
Bhushan Lakhe

Cons

  • Documentation could be better as I usually end up going to other sites / blogs to understand the concepts better
  • More algorithms need to be ported to MLlib, as only a few are available, at least for clustering
Nitin Pasumarthy
  • Security is a piece that's missing from Hadoop - you have to supplement security using Kerberos etc.
  • Hadoop is not easy to learn - there are various modules with little or no documentation
  • Because Hadoop is open source, testing, quality control and version control are very difficult
Bhushan Lakhe

Likelihood to Renew

No score
No answers yet
No answers on this topic
Hadoop 9.6
Based on 8 answers
Only a small portion of Hadoop's capabilities has been explored within our organization. Scalability is not a labor- or cost-intensive exercise, and the new workload management features of YARN are very attractive.
Andrea Krause

Usability

No score
No answers yet
No answers on this topic
Hadoop 9.0
Based on 3 answers
I found it really useful during my academic projects. Data handling for large data sets was easy with Hadoop. It used to work really fast for bigger data sets. I found it reliable.
Tushar Kulkarni

Online Training

No score
No answers yet
No answers on this topic
Hadoop 6.1
Based on 2 answers
Hadoop is a complex topic and best suited for classroom training. Online training is a waste of time and money.
Bhushan Lakhe

Alternatives Considered

Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
No photo available
Not applicable; I have not evaluated any other products.
Bhushan Lakhe

Return on Investment

  • Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark.
  • Easy adoption: having multiple departments use the same underlying technology, even when the use cases are very different, allows for more commonality among applications, which definitely makes the operations team happy.
  • Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs.
No photo available
  • With our current platform (and budget), Hadoop is really the only option at this time to gain access to the capacity and technologies we require.
  • So far the only real investment has been hardware and man hours, especially in the initial learning and deployment phase.
Mark Gargiulo

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details

Hadoop

General
Free Trial
Free/Freemium Version: Yes
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details