Apache Spark vs. IBM SPSS Statistics

Overview
| Product | Rating | Most Used By | Product Summary | Starting Price |
| --- | --- | --- | --- | --- |
| Apache Spark | Score 8.7 out of 10 | N/A | N/A | N/A |
| IBM SPSS Statistics | Score 8.4 out of 10 | N/A | SPSS Statistics is a software package used for statistical analysis. It is now officially named "IBM SPSS Statistics". Companion products in the same family are used for survey authoring and deployment (IBM SPSS Data Collection), data mining (IBM SPSS Modeler), text analytics, and collaboration and deployment (batch and automated scoring services). | $99 per month |
Pricing
Editions & Modules
Apache Spark: No answers on this topic
IBM SPSS Statistics:
| Edition | Price | Terms |
| --- | --- | --- |
| Subscription | $99.00 | per month |
| Base | $3,610 | one-time fee per user |
| Standard | $7,960 | one-time fee per user |
| Professional | $15,900 | one-time fee per user |
| Premium | $23,800 | one-time fee per user |
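To compare the monthly subscription with the one-time licenses listed above, a rough break-even calculation can help. The sketch below is only an illustration: it assumes the $99 subscription is functionally comparable to a given one-time edition (the listing does not say which), and it ignores taxes, discounts, support, and upgrade fees. The function name `break_even_months` is made up for this example.

```python
import math

# Prices taken from the listing above (USD).
SUBSCRIPTION_PER_MONTH = 99.00
ONE_TIME_FEES = {
    "Base": 3610,
    "Standard": 7960,
    "Professional": 15900,
    "Premium": 23800,
}


def break_even_months(one_time_fee, monthly_fee=SUBSCRIPTION_PER_MONTH):
    """Return the first whole month at which cumulative subscription
    payments reach or exceed the one-time fee (hypothetical helper)."""
    return math.ceil(one_time_fee / monthly_fee)


if __name__ == "__main__":
    for edition, fee in ONE_TIME_FEES.items():
        months = break_even_months(fee)
        print(f"{edition}: {months} months (about {months / 12:.1f} years)")
```

Under these assumptions, subscription payments match the Base license after 37 months, so the one-time fee only pays off for users who expect to keep the software longer than that.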
Offerings
| Pricing Offerings | Apache Spark | IBM SPSS Statistics |
| --- | --- | --- |
| Free Trial | No | Yes |
| Free/Freemium Version | No | No |
| Premium Consulting/Integration Services | No | No |
| Entry-level Setup Fee | No setup fee | No setup fee |
Best Alternatives
| Company Size | Apache Spark | IBM SPSS Statistics |
| --- | --- | --- |
| Small Businesses | No answers on this topic | IBM SPSS Modeler (Score 7.8 out of 10) |
| Medium-sized Companies | Cloudera Manager (Score 9.7 out of 10) | Alteryx (Score 9.0 out of 10) |
| Enterprises | IBM Analytics Engine (Score 8.8 out of 10) | IBM SPSS Modeler (Score 7.8 out of 10) |
User Ratings
| Rating | Apache Spark | IBM SPSS Statistics |
| --- | --- | --- |
| Likelihood to Recommend | 9.9 (24 ratings) | 8.5 (84 ratings) |
| Likelihood to Renew | 10.0 (1 rating) | 8.6 (22 ratings) |
| Usability | 10.0 (3 ratings) | 8.0 (14 ratings) |
| Availability | - (0 ratings) | 6.0 (1 rating) |
| Performance | - (0 ratings) | 6.0 (1 rating) |
| Support Rating | 8.7 (4 ratings) | 6.4 (12 ratings) |
| Implementation Rating | - (0 ratings) | 8.7 (7 ratings) |
| Configurability | - (0 ratings) | 5.0 (1 rating) |
| Ease of integration | - (0 ratings) | 5.0 (1 rating) |
| Product Scalability | - (0 ratings) | 5.0 (1 rating) |
| Vendor post-sale | - (0 ratings) | 5.0 (1 rating) |
| Vendor pre-sale | - (0 ratings) | 5.0 (1 rating) |
User Testimonials
Likelihood to Recommend
Apache
Well suited: For local runs on most datasets and for non-production systems, scalability is not a problem at all. Being able to include data from multiple types of data sources is an added advantage. MLlib is a decent built-in library that can be used for most ML tasks.
Less appropriate: We worked on a recommender system (RecSys) where the music dataset was more than 300 GB in size, and we ran into memory-related issues, including occasional memory errors. MLlib also lacks support for advanced analytics and deep-learning frameworks. Understanding the internals of Apache Spark is very difficult for beginners.
IBM
SPSS is well suited for the following:
1) User Behavior Analysis: SPSS handles large datasets to analyze user behavior data.
2) Customer Satisfaction / Foundational Surveys: SPSS facilitates analysis of quantitative data from satisfaction surveys, keeping us informed about customer needs and preferences.
3) A/B test analysis: SPSS provides statistical tools for A/B test analysis, which helps us optimize the user experience of our products.
Scenarios where SPSS is less appropriate:
1) Qualitative Data Analysis: I do not use SPSS for open-ended survey responses or qualitative data.
2) Live/in-vivo data analysis: SPSS is not ideal for real-time data processing.
3) Complex Data Integration: SPSS isn't the best fit for complex data integration tasks.
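The A/B test analysis mentioned in the review above typically comes down to comparing two conversion rates. As an illustration only, and not a reproduction of any SPSS procedure, a two-proportion z-test can be sketched in plain Python. The function name and the sample counts below are made up for this example.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions,
    using the pooled standard error. Returns (z, p_value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution:
    # P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: variant A converts 120 of 1000 users,
    # variant B converts 150 of 1000 users.
    z, p = two_proportion_z_test(120, 1000, 150, 1000)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone; the threshold to use (0.05 is common) is a choice for the analyst.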
Pros
Apache
  • Apache Spark makes processing very large data sets possible. It handles these data sets in a fairly quick manner.
  • Apache Spark does a fairly good job implementing machine learning models for larger data sets.
  • Apache Spark seems to be rapidly advancing software, with new features making it ever more straightforward to use.
IBM
  • SPSS has been around for quite a while and has amassed a large suite of functionality. One of its longest-running features is the ability to automate SPSS via scripting, AKA "syntax." There is a very large community of practice on the internet who can help newbies to quickly scale up their automation abilities with SPSS. And SPSS allows users to save syntax scripting directly from GUI wizards and configuration windows, which can be a real life-saver if one is not an experienced coder.
  • Many statistics package users are doing scientific research with an eye to publish reproducible results. SPSS allows you to save datasets and syntax scripting in a common format, facilitating attempts by peer reviewers and other researchers to quickly and easily attempt to reproduce your results. It's very portable!
  • SPSS has both legacy and modern visualization suites baked into the base software, giving users a gentle learning curve when it comes to outputting charts and graphs. It's very easy to start with the canned look and feel of an exported chart, and then tweak a saved copy to change just about everything, from colors, legends, and axis scaling to orientation, labels, and grid lines. And when you've got a chart or graph set up the way you like, you can export it as an image file, or create a template syntax to apply to new visualizations going forward.
  • SPSS makes it easy for even beginner-level users to create statistical coding fields to support multidimensional analysis, ensuring that you never need to destructively modify your dataset.
  • In closing, SPSS's long and successful tenure ensures that just about any question a new user may have about it can be answered with a modicum of Google-fu. There are even several fully-fledged tutorial websites out there for newbie perusal.
Cons
Apache
  • Memory management is very weak.
  • PySpark is not as robust as Scala with Spark.
  • Spark master high availability (HA) is needed; it is not as highly available as it should be.
  • Data locality should not be a necessity, although it does help. I would prefer not to depend on locality.
IBM
  • It would be beneficial to have AMOS as part of the SPSS package instead of having to purchase it separately.
  • It would be beneficial to have other statistical tests, such as PROCESS, included in the standard SPSS tests instead of needing to run syntax to install them.
  • My datasets tend to be small, and I have never had any issues using SPSS. I have heard that SPSS may not be optimal for handling large datasets.
Likelihood to Renew
Apache
Its capacity for computing data in a cluster, and its speed.
IBM
Both money and time are essential for success in terms of return on investment for any kind of research-based project work. Entering and analyzing data from a Likert-scale questionnaire is very easy with IBM SPSS. With the help of IBM SPSS, data entry and data analysis for my research were very fast and reliable. Output from SPSS is very easy to interpret for data analysis and findings.
Usability
Apache
The only thing I dislike about Spark's usability is the learning curve: there are many actions and transformations. However, its wide range of uses for ETL processing, its ease of integration, and its multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
IBM
Probably because I have been using it for so long: I have used all, or at least almost all, of the modules, and the way SPSS works is second nature to me, like swimming to a fish.
Reliability and Availability
Apache
No answers on this topic
IBM
SPSS tends to crash when I try to process a lot of data, which slows me down when I have a large amount to get through.
Performance
Apache
No answers on this topic
IBM
SPSS does the job, but it can be slow. I do have to plan a lot of time to get through a huge amount of data.
Support Rating
Apache
1. It integrates very well with Scala or Python.
2. Its SQL interoperability is very easy to understand.
3. Apache Spark is much faster than competing technologies.
4. The support from the Apache community for Spark is extensive.
5. Execution times are faster compared to others.
6. There are a large number of forums available for Apache Spark.
7. Code for Apache Spark is readily available and easy to access.
8. Many organizations use Apache Spark, so many solutions are available for existing applications.
IBM
I have not contacted IBM SPSS for support myself. However, our IT staff has, while trying to get the SPSS Text Analytics module to work. The issue was never resolved, but I'm not sure whether the problem was on IT's end or on SPSS's end.
Implementation Rating
Apache
No answers on this topic
IBM
Have a plan for managing the yearly upgrade cycle. Most users work in the desktop version, so there needs to be a mechanism for either pushing out new versions of the software or a key manager to deal with updated licensing keys. If you have a lot of users this needs to be planned for in advance.
Alternatives Considered
Apache
All of the above systems work quite well on big data transformations, whereas Spark really shines with its broader API support and its ability to read from and write to multiple data sources. Using Spark, one can easily switch between declarative, imperative, and functional programming styles based on the situation. It also doesn't need special data ingestion or indexing pre-processing like Presto. Combined with Jupyter Notebooks (https://github.com/jupyter-incubator/sparkmagic), one can develop Spark code interactively in Scala or Python.
IBM
I have used R when I didn't have access to SPSS. It takes me longer because I'm terrible at syntax but it is powerful and it can be enjoyable to only have to wrestle with syntax and not a difficult UI.
Scalability
Apache
No answers on this topic
IBM
I am neutral because I have not had to look into scalability, since I am using it as a student.
Return on Investment
Apache
  • Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark.
  • Easy adoption: having multiple departments use the same underlying technology, even when the use cases are very different, allows for more commonality among applications, which definitely makes the operations team happy.
  • Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs.
IBM
  • IBM SPSS has allowed me to quickly analyze data for research.
  • IBM SPSS has allowed me to complete analyses in order to submit research findings to conferences and complete manuscripts.
  • IBM SPSS has enabled me to meet research objectives set out in grant proposals.