Apache Spark vs. SAS Enterprise Guide

Apache Spark

165 Reviews and Ratings

SAS Enterprise Guide

30 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Spark	Score 8.9 out of 10	N/A	Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.	N/A
SAS Enterprise Guide	Score 9.3 out of 10	N/A	SAS Enterprise Guide is a menu-driven, Windows GUI tool for SAS.	N/A

Pricing
	Apache Spark	SAS Enterprise Guide
Editions & Modules	No answers on this topic	No answers on this topic
Free Trial	No	No
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	N/A	N/A

Community Pulse
Considered Both Products

Apache Spark

Ananth Gouri, Assistant Professor (chose Apache Spark): We used Surprise Kit for one of the other research works. It is more fine-tuned to recommendation systems and their algorithms. Apache Spark has MLlib for the majority of ML problems, whereas software like Surprise Kit is suitable for the specific task of recommendations only.

Riyaz Khan, Staff Engineer (chose Apache Spark): Apache Spark is a fast-processing in-memory computing framework. It is 10 times faster than Apache Hadoop. Earlier we were using Apache Hadoop for processing data on disk, but now we have shifted to Apache Spark because of its in-memory computation capability. Also in SAP …

Steven Li, Senior Software Developer (Consultant) (chose Apache Spark): Other teams used to work on Apache Hadoop, but our team started with Apache Spark directly.

Verified User, Anonymous (chose Apache Spark): There are a few alternatives that can do the same transformations and aggregations that Apache Spark can, but most of them are not able to perform parallel computation. For example, pandas is a really good tool for that but is not parallelized. However, there are some tools that …

Surendranatha Reddy Chappidi, Senior Data Engineer (chose Apache Spark): Apache Spark works in distributed mode using a cluster. Informatica and Datastage cannot scale horizontally. We can write custom code in Spark, whereas in Datastage and Informatica we can only choose from the features already provided.

Verified User, Anonymous (chose Apache Spark): Apache Spark has much better performance and features compared with Hive or MapReduce-style solutions. Spark has many other features for machine learning and streaming.

Chetan Munegowda, Software Engineer (chose Apache Spark): Spark is simply awesome to work on with any data sets and also has an in-memory database, which makes it very flexible.

Yogesh Mhasde, Technical Manager (chose Apache Spark): 1. Apache Spark is almost 100% faster than Hadoop. 2. Apache Spark is more stable than Amazon EMR. 3. The end-to-end distributed machine learning library is more robust in Apache Spark.

Verified User, Anonymous (chose Apache Spark): Databricks uses Spark as a foundation and is also a great platform. It does bring several add-ons, which we did not feel we needed at the time we evaluated it and haven't needed since. One interesting plus in our opinion was the engineering support, which is great depending …

Verified User, Anonymous (chose Apache Spark): It is easy to learn, read, and maintain. It brings the best of the Ruby on Rails framework to Java, which helps to create a web service easily. Communication is one of the most distinctive features of Apache Spark compared to alternative products. You are able to …

Shiv Shivakumar, Acquisitions Leader (chose Apache Spark): We evaluated SAS alongside Apache Spark, but during the course of the proof of concept found that Apache Spark was able to support the Hadoop ecosystem and the Hadoop file system much better. It was much faster at that time while having the ability to process data quickly for the …

Carla Borges, Consultor Tecnico - Java Developer and Php Developer (chose Apache Spark): I prefer Apache Spark to Hadoop, since in my experience Spark is more usable and comes equipped with simple APIs for Scala, Python, Java, and Spark SQL, as well as providing feedback in REPL format on the commands. At the same time, Apache Spark seems to have the …

Nitin Pasumarthy, Software Engineer (chose Apache Spark): All the above systems work quite well on big data transformations, whereas Spark really shines with its bigger API support and its ability to read from and write to multiple data sources. Using Spark one can easily switch between declarative versus imperative versus functional …

Kartik Chavan, Data Analyst (chose Apache Spark): Even with Python, MapReduce requires lengthy coding. Combining Python with Apache Spark will not only shorten the code but also effectively increase the speed of algorithms. Occasionally I use MapReduce, but Apache Spark will replace MapReduce very soon. It has many …

Anson Abraham, Data Czar (chose Apache Spark): Versus MapReduce, it was faster and easier to manage, especially for machine learning, where MapReduce is lacking. Also, Apache Storm was slower and didn't scale as much as Spark does. Spark elasticity was easier to apply compared to Storm and MapReduce. Managing resources for …

Verified User, Anonymous (chose Apache Spark): We specifically chose Spark over MapReduce to make the cluster processing faster.

Verified User, Anonymous (chose Apache Spark): Spark in comparison to similar technologies ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and …

Kamesh Emani, Software Developer Intern (chose Apache Spark): Apache Pig and Apache Hive provide most of the things Spark provides, but Apache Spark has more features, like actions and transformations, which are easy to code. Spark uses optimization techniques, as we can select the driver program and manipulate the DAG (Directed Acyclic Graph). Python …

Verified User, Anonymous (chose Apache Spark): There are a few newer frameworks for general processing like Flink and Beam, frameworks for streaming like Samza and Storm, and traditional MapReduce. I think Spark is at a sweet spot where it's clearly better than MapReduce for many workflows yet has gotten a good amount of …

Jordan Moore, Staff Consultant (chose Apache Spark): Spark has primarily replaced my use of writing pure Hadoop MapReduce or Apache Pig jobs for processing data. I like the fact that I can alternate between the main programming languages that I know, Java and Python, and use those to learn the Scala API. Spark also can be …

SAS Enterprise Guide

Ben Holmes, Senior Data Scientist (chose SAS Enterprise Guide): Python-based platforms like Pandas or Spark are also very good at displaying data and doing exploratory analysis. I definitely prefer them to SAS EG. It's just too slow, and doesn't let you peek into the data very easily. Lots of clicking, and I'd rather just write some code, …

Verified User, Anonymous (chose SAS Enterprise Guide): This was used by the unit before I joined. It was compared to SPSS, but I was not included in that discussion.

Verified User, Anonymous (chose SAS Enterprise Guide): Although not used in the enterprise, I have used Anaconda Python to shape and cleanse data from Excel reports that was too difficult for SAS to complete. The object-oriented nature and the Pandas package made ingestion and reshaping of the data more useful in this use case. …

Verified User, Anonymous (chose SAS Enterprise Guide): SAS EG has a better graphical user interface for building project trees and helping users create data queries and calculations. SAS EG can handle bigger data sets compared to other programs. You can easily clean the data sets and manipulate the data. It is easier to send the project tree …

Akshaya Bhardwaj, Consultant (chose SAS Enterprise Guide): Why I prefer SAS EG: data processing speed is much faster than that of R Studio. It can load any amount and any type of data, whether structured, unstructured, or semi-structured. Its output delivery system, which produces the output as a PDF file, makes it very comfortable to …

Verified User, Anonymous (chose SAS Enterprise Guide): It gives more flexibility in terms of writing code, and you're able to see the expected output and then go on to modify it.

Rohit Narang, Assistant Vice President (chose SAS Enterprise Guide): Tableau: a good tool for visualisations, but SAS is better for running production scripts and ad hoc analysis.

Mathieu Gaouette, Consultant BI senior - spécialité SAS (chose SAS Enterprise Guide): I haven't used SPSS myself, but from what I was told, integration of data was much more limited and not easy to use. Also, the number of people with SPSS knowledge is less than the number of SAS users, so finding workforce can be an issue. The whole SAS solution just made much …

Best Alternatives
	Apache Spark	SAS Enterprise Guide
Small Businesses	No answers on this topic	IBM SPSS Statistics Score 8.0 out of 10
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Posit Score 10.0 out of 10
Enterprises	IBM Analytics Engine Score 8.6 out of 10	Posit Score 10.0 out of 10

User Ratings
	Apache Spark	SAS Enterprise Guide
Likelihood to Recommend	9.0 (0 ratings)	5.3 (0 ratings)
Likelihood to Renew	10.0 (0 ratings)	8.0 (0 ratings)
Usability	8.0 (0 ratings)	5.0 (0 ratings)
Support Rating	8.7 (0 ratings)	5.3 (0 ratings)
Implementation Rating	- (0 ratings)	7.0 (0 ratings)

User Testimonials
	Apache Spark	SAS Enterprise Guide
Likelihood to Recommend	Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not have such a wide range of support. Choose it when you need to perform data transformations on big data as offline jobs, and use MongoDB-like distributed database systems for more real-time queries. (Nitin Pasumarthy, Software Engineer)	For writing longer code to shape data for complicated reports, the clean UI is helpful. For exploring data, though, SAS Studio would be better suited given its easier interface for GUI graph building. (Verified User, Anonymous)
Pros	It performs a conventional disk-based process when the data sets are too large to fit into memory, which is very useful because, regardless of the size of the data, it is always possible to store them. It has great speed and the ability to join multiple types of databases and run different types of analysis applications; this is super useful as it reduces work times. Apache Spark uses the data storage model of Hadoop and can be integrated with other big data frameworks such as HBase, MongoDB, and Cassandra; this is very useful because it is compatible with the multiple frameworks the company has and thus allows us to unify all the processes. (Carla Borges, Consultor Tecnico - Java Developer and Php Developer)	It can load a huge amount of data compared to R Studio and Excel. Data processing speed is very fast: millions of records are loaded into this software very easily, and data manipulation is also very easy. Built-in statistical functions and procedures make it very comfortable to use for non-analytics professionals as well. (Akshaya Bhardwaj, Consultant)
Cons	Memory management is very weak. PySpark is not as robust as Scala with Spark. Spark master HA is needed; it is not as highly available as it should be. Locality should not be a necessity, though it does help; I would prefer no locality. (Anson Abraham, Data Czar)	I would like to see more advanced interaction with external databases, so that ongoing queries can be killed from SAS. As of now, you can stop pretty much any ongoing process except one running on a remote database (killing SAS/EG doesn't stop the remote process). When creating prompts for programs, it would be nice to have conditional prompts (based on the selection of other prompts). The prompts are clearly a recent feature and constantly under development, but I wish they were more powerful. More of a SAS metadata issue: when loading SAS/EG (first connection to the server), it takes a few seconds, which feels like a long time. I really don't understand why the initialization of the session can take so long. Don't get me wrong, this has no real impact on productivity, but that 10 s delay feels like an eternity when you want to run some code in a new session. (Mathieu Gaouette, Consultant BI senior - spécialité SAS)
Likelihood to Renew	Capacity for computing data in a cluster, and fast speed. (Steven Li, Senior Software Developer (Consultant))	On account of the current user experience and the organization-wide acceptance. (Rohit Narang, Assistant Vice President)
Usability	If the team looking to use Apache Spark is not used to debugging and tweaking job settings to ensure maximum optimization, it can be frustrating. However, the documentation and the support of the community on the internet can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the scenarios where it can be used. (Verified User, Anonymous)	It's not all bad, but I don't believe that an enterprise purchase of SAS is worth the expense considering the widely available set of tools in the data analytics space at the moment. In my company, it's a good tool because others use it. Otherwise, I wouldn't purchase a new set of it because it doesn't have some of the better analytical functions in it. (Ben Holmes, Senior Data Scientist)
Support Rating	1. It integrates very well with Scala or Python. 2. Its SQL interoperability is very easy to understand. 3. Apache Spark is way faster than other competing technologies. 4. The support from the Apache community for Spark is huge. 5. Execution times are faster compared to others. 6. There are a large number of forums available for Apache Spark. 7. Code for Apache Spark is simpler and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications. (Yogesh Mhasde, Technical Manager)	Although I use SAS support for information on functions, these are SAS related, and I haven't really come across anything that is specifically for SAS EG. (Verified User, Anonymous)
Implementation Rating	No answers on this topic	I've not worked hands-on with the implementation team, but there were no escalations barring a few hiccups in the deployment due to a change in requirements and adoption to our company's remote servers. (Rohit Narang, Assistant Vice President)
Alternatives Considered	We used Surprise Kit for one of the other research works. It is more fine-tuned to recommendation systems and their algorithms. Apache Spark has MLlib for the majority of ML problems, whereas software like Surprise Kit is suitable for the specific task of recommendations only. (Ananth Gouri, Assistant Professor)	Python-based platforms like Pandas or Spark are also very good at displaying data and doing exploratory analysis. I definitely prefer them to SAS EG. It's just too slow, and doesn't let you peek into the data very easily. Lots of clicking, and I'd rather just write some code than do clicking. (Ben Holmes, Senior Data Scientist)
Return on Investment	Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark. Easy adoption: having multiple departments use the same underlying technology, even if the use cases are very different, allows for more commonality amongst applications, which definitely makes the operations team happy. Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs. (Verified User, Anonymous)	Faster decision making, through powerful big data handling functionalities. Faster operations on a daily basis: once the project tree is built, unskilled personnel can use it in their daily operation. You don’t need to choose SAS EG if you are not going to be handling big data (such as over 1 million rows and 50 columns). You need skilled personnel to build the initial project tree. (Verified User, Anonymous)