Apache Spark Reviews

114 Ratings
Score 8.4 out of 10

Reviews (1-15 of 15)

Yogesh Mhasde
Score 8 out of 10
Vetted Review
Verified User

Alternatives Considered

1. Apache Spark is almost 100% faster than Hadoop.
2. Apache Spark is more stable than Amazon EMR.
3. The end-to-end distributed machine learning library is more robust in Apache Spark.
4. For very large data sets, Apache Spark is more trustworthy than the other two.
5. For data transformations, Apache Spark provides a very rich set of APIs.
6. The interface provided for SQL in Apache Spark is easy to understand as compared to others.
Score 9 out of 10
Vetted Review
Verified User

Alternatives Considered

Databricks uses Spark as a foundation, and is also a great platform. It does bring several add-ons, which we did not feel we needed at the time of our evaluation, and haven't needed since. One interesting plus in our opinion was the engineering support, which is valuable depending on the criticality of your platform.
Thomas Young
Score 7 out of 10
Vetted Review
Verified User

Alternatives Considered

How does Apache Spark perform against competing tools? I think Apache Spark does well in processing large volumes of data. The machine learning models also seem to be easier to program and interpret. With that said, on the programming side it seems more difficult to implement good models in Apache Spark than in Kinesis or other tools. You really have to have lots of data and very valuable questions to answer to justify the investment in Apache Spark.
March 16, 2019

Apache Spark Review

Score 7 out of 10
Vetted Review
Verified User

Alternatives Considered

It is easy to learn, read and to maintain. It brings the best of the Ruby on Rails framework from Java that helps to create a web service so easily. Communication is one of the most distinctive features of Apache Spark compared to alternative products. You are able to communicate with your colleague in your team who also uses Spark while you are on the phone.
Shiv Shivakumar
Score 9 out of 10
Vetted Review
Verified User

Alternatives Considered

We evaluated SAS alongside Apache Spark, but during the proof of concept we found that Apache Spark supported the Hadoop ecosystem and the Hadoop file system much better. It was much faster at that time, processed data quickly enough for our business analytics needs, and also scaled up well.
Nitin Pasumarthy
Score 10 out of 10
Vetted Review
Verified User

Alternatives Considered

All the above systems work quite well on big data transformations, whereas Spark really shines with its broader API support and its ability to read from and write to multiple data sources. Using Spark, one can easily switch between declarative, imperative, and functional programming styles based on the situation. Also, it doesn't need special data ingestion or indexing pre-processing like Presto. By combining it with Jupyter Notebooks (https://github.com/jupyter-incubator/sparkmagic), one can develop Spark code interactively in Scala or Python.
Carla Borges
Score 10 out of 10
Vetted Review
Verified User

Alternatives Considered

I prefer Apache Spark compared to Hadoop, since in my experience Spark has more usability and comes equipped with simple APIs for Scala, Python, Java and Spark SQL, as well as provides feedback in REPL format on the commands. At the same time, Apache Spark seems to have the best performance in the processing of large data that works in memory and, therefore, more processes can be downloaded on Spark than on Hadoop, despite the fact that Hadoop is also a very useful tool.
Anson Abraham
Score 9 out of 10
Vetted Review
Verified User

Alternatives Considered

  • MapReduce and Apache Storm
Versus MapReduce, it was faster and easier to manage, especially for machine learning, where MapReduce is lacking. Apache Storm was also slower and didn't scale as well as Spark does. Spark's elasticity was easier to apply compared to Storm and MapReduce.
Managing resources for Spark was easier compared to Storm as well. MapReduce is slower than Spark.
Kartik Chavan
Score 9 out of 10
Vetted Review
Verified User

Alternatives Considered

Even with Python, MapReduce requires lengthy code. Combining Python with Apache Spark not only shortens the code but also speeds up the algorithms. Occasionally I still use MapReduce, but Apache Spark will replace MapReduce very soon. It has many built-in and faster features.
Kamesh Emani
Score 10 out of 10
Vetted Review
Verified User

Alternatives Considered

Apache Pig and Apache Hive provide most of what Spark provides, but Apache Spark has more features, such as actions and transformations, which are easy to code. Spark applies optimization techniques, as we can select the driver program and manipulate the DAG (Directed Acyclic Graph).
Python can also be used for data transformations, but it requires a lot more code than Spark and is also slow.
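To illustrate the distinction between transformations and actions mentioned in this review, here is a minimal plain-Python sketch. It is not Spark code and does not use the Spark API: `LazyDataset`, `lineage`, and `square` are hypothetical names invented for the example, and the recorded plan is a simple linear chain rather than a full DAG. The idea it models is that transformations only record what should happen, while an action triggers the actual work.

```python
class LazyDataset:
    """Toy single-machine model of lazy evaluation. Hypothetical; not the Spark API."""

    def __init__(self, source, steps=()):
        self._source = list(source)
        self._steps = tuple(steps)  # recorded plan: (kind, function) pairs

    # Transformations only record a step and return a new dataset.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def lineage(self):
        """Names of the recorded steps, in order (a linear chain, not a full DAG)."""
        return [kind for kind, _ in self._steps]

    def _run(self):
        # Replay the recorded steps as a chain of lazy built-in iterators.
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return data

    # Actions execute the recorded plan and return a concrete result.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every time the mapped function actually runs


def square(x):
    calls.append(x)
    return x * x


ds = LazyDataset(range(1, 6)).map(square).filter(lambda v: v % 2 == 1)
calls_before_action = list(calls)  # still empty: nothing has executed yet
result = ds.collect()              # the action triggers the whole chain

print(calls_before_action)
print(ds.lineage())
print(result)
```

In this sketch, building `ds` runs no user code at all; only `collect()` or `count()` does. Each action replays the whole chain because the toy model does no caching.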
Jordan Moore
Score 8 out of 10
Vetted Review
Verified User

Alternatives Considered

Spark has largely replaced my use of pure Hadoop MapReduce or Apache Pig jobs for processing data. I like the fact that I can alternate between the main programming languages that I know, Java and Python, and use those to learn the Scala API. Spark can also be installed standalone on any computer, and one can quickly get started writing applications using just the Spark Shell. I also like that you can easily add community-built packages to a Spark application, such as connectors to different database sources or data processing libraries that aren't included in the programming language being used.
Score 9 out of 10
Vetted Review
Verified User

Alternatives Considered

Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
Score 10 out of 10
Vetted Review
Verified User

Alternatives Considered

There are a few newer frameworks for general processing like Flink and Beam, frameworks for streaming like Samza and Storm, and traditional MapReduce. I think Spark is at a sweet spot where it's clearly better than MapReduce for many workflows, yet has gained enough community support that there is little risk in deploying it. It also integrates batch and streaming workflows and APIs, providing an all-in-one package for multiple use cases.

About Apache Spark

Categories: Hadoop-Related

Apache Spark Technical Details

Operating Systems: Unspecified
Mobile Application: No