Apache Spark vs. Spotfire

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Spark	Score 9.1 out of 10	N/A	Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.	N/A
Spotfire	Score 8.2 out of 10	N/A	Spotfire, formerly known as TIBCO Spotfire, is a visual data science platform that combines visual analytics, data science, and data wrangling, so users can analyze data at-rest and at-scale to solve complex industry-specific problems.	N/A

Pricing

Editions & Modules

No answers on this topic

Offerings

Pricing Offerings	Apache Spark	Spotfire
Free Trial	No	Yes
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	Yes

Entry-level Setup Fee

No setup fee

Additional Details

—

For Enterprise engagements, contact Spotfire directly for a custom price quote.

Community Pulse
	Apache Spark	Spotfire
Considered Both Products	No answer on this topic	A few that are not listed are Metabase and ReDash; they are both open source. I like Spotfire the best by far. I was surprised how far behind it Tableau is. I could just never get the feel for Tableau, while I really enjoyed working in Spotfire. The open-source ones are nice … (Verified User, Engineer; chose Spotfire; incentivized review)

Features

Platform Connectivity

	Apache Spark	Spotfire
Overall	-	7.2 (8 ratings), 15% below category average
Connect to Multiple Data Sources	- (0 ratings)	7.8 (8 ratings)
Extend Existing Data Sources	- (0 ratings)	7.4 (8 ratings)
Automatic Data Format Detection	- (0 ratings)	7.8 (8 ratings)
MDM Integration	- (0 ratings)	6.0 (5 ratings)

Data Exploration

	Apache Spark	Spotfire
Overall	-	9.1 (8 ratings), 8% above category average
Visualization	- (0 ratings)	9.0 (8 ratings)
Interactive Data Analysis	- (0 ratings)	9.2 (8 ratings)

Data Preparation

	Apache Spark	Spotfire
Overall	-	7.4 (8 ratings), 10% below category average
Interactive Data Cleaning and Enrichment	- (0 ratings)	7.2 (8 ratings)
Data Transformations	- (0 ratings)	8.0 (8 ratings)
Data Encryption	- (0 ratings)	7.0 (5 ratings)
Built-in Processors	- (0 ratings)	7.5 (5 ratings)

Platform Data Modeling

	Apache Spark	Spotfire
Overall	-	7.6 (8 ratings), 10% below category average
Multiple Model Development Languages and Tools	- (0 ratings)	7.5 (7 ratings)
Automated Machine Learning	- (0 ratings)	8.5 (5 ratings)
Single platform for multiple model development	- (0 ratings)	7.6 (8 ratings)
Self-Service Model Delivery	- (0 ratings)	6.7 (6 ratings)

Model Deployment

	Apache Spark	Spotfire
Overall	-	7.4 (7 ratings), 14% below category average
Flexible Model Publishing Options	- (0 ratings)	7.8 (7 ratings)
Security, Governance, and Cost Controls	- (0 ratings)	7.0 (7 ratings)

Best Alternatives
	Apache Spark	Spotfire
Small Businesses	No answers on this topic	Jupyter Notebook Score 8.6 out of 10
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Posit Score 10.0 out of 10
Enterprises	IBM Analytics Engine Score 7.2 out of 10	Posit Score 10.0 out of 10

User Ratings
	Apache Spark	Spotfire
Likelihood to Recommend	9.0 (24 ratings)	8.4 (351 ratings)
Likelihood to Renew	10.0 (1 rating)	9.6 (30 ratings)
Usability	8.0 (4 ratings)	8.0 (27 ratings)
Availability	- (0 ratings)	9.0 (14 ratings)
Performance	- (0 ratings)	7.1 (14 ratings)
Support Rating	8.7 (4 ratings)	8.7 (27 ratings)
In-Person Training	- (0 ratings)	8.3 (52 ratings)
Online Training	- (0 ratings)	9.0 (55 ratings)
Implementation Rating	- (0 ratings)	8.4 (17 ratings)
Configurability	- (0 ratings)	7.1 (3 ratings)
Ease of integration	- (0 ratings)	7.0 (2 ratings)
Product Scalability	- (0 ratings)	7.0 (4 ratings)
Vendor post-sale	- (0 ratings)	5.0 (1 rating)
Vendor pre-sale	- (0 ratings)	5.0 (1 rating)

User Testimonials
	Apache Spark	Spotfire
Likelihood to Recommend	Well suited: For most local runs of datasets and non-prod systems, scalability is not a problem at all. Including data from multiple types of data sources is an added advantage. MLlib is a decent built-in library that can be used for most ML tasks. Less appropriate: We had to work on a RecSys where the music dataset we used was around 300+ GB in size, and we faced memory-based issues, including occasional memory errors. MLlib also lacks support for advanced analytics and deep-learning frameworks. Understanding the internals of Apache Spark is very difficult for beginners. (Ananth Gouri, Assistant Professor; incentivized review)	A high level of data integration is available: it supports various data sources. Collaboration features allow users to give access to the dashboard and merge data analytics with other team members. It can meet the demands of both small and large enterprises. Customized dashboards and reports are provided to meet specific needs, and extensibility is supported through APIs and custom scripts. (Simran Verma, Software Developer; incentivized review)
Pros	Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues; faster execution times compared to Hadoop and Pig Latin; an easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner; interoperability between SQL and Scala/Python styles of munging data. (Nitin Pasumarthy, Software Engineer; incentivized review)	It has the best coding integration (Python, R) of any BI product; the ability to work with very large datasets (10 million+) is better than competitors; export options are more complete and have better functionality; the data canvas is the best tool to join and transform data vs. competitors. (Jim Putnam, Director, Advanced Analytics and Data Science; incentivized review)
Cons	Memory management is very weak. PySpark is not as robust as Scala with Spark. Spark master HA is needed; it is not as highly available as it should be. Locality should not be a necessity, though it does bring improvement; I would prefer no locality requirement. (Anson Abraham, Data Czar; incentivized review)	The donut chart is a powerful illustration, but it should be much simpler to build in Spotfire; there are a lot of steps involved just to build a simple donut chart. Table calculations (like row or column differences) should be made simple, or there should be a drag-and-drop function for table calculations, with no need for scripting. Information Links should be changed: if new columns are added to the table, just refreshing the data should capture the new columns, with no extra step to add them. (Mark Edralin, Marketing Database Analyst; incentivized review)
Likelihood to Renew	Capacity to compute data in a cluster, and fast speed. (Steven Li, Senior Software Developer (Consultant))	Easy to distribute information throughout the enterprise using the webplayer. Ad hoc analysis is possible throughout the enterprise using business author in the webplayer or the thick client. Low level of support needed by the IT team; access interfaces with LDAP and numerous other authentication methods. Possible to continually extend the platform with JavaScript, R scripts, HTML, and custom extensions. Ability to standardize data logic through pre-built queries in the Information Designer, so everyone in the enterprise is using the same logic. Tagging and bookmarking data allows for quick sharing of insights. Integration with numerous data sources: flat files, databases, big data, images, etc. Much improved mapping capability, including the ability to apply data points over any image. (Brent Meyers, Data Visualization Developer)
Usability	If the team looking to use Apache Spark is not used to debugging and tweaking job settings to ensure maximum optimization, it can be frustrating. However, the documentation and the community support on the internet can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the scenarios where it can be used. (Verified User, Anonymous; incentivized review)	Basic tasks like generating meaningful information from large sets of raw data are very easy. The next step of linking to multiple live data sources, linking those tables, and performing on-the-fly analysis of the imported data is understandably more difficult. (Brock Robertson, Reservoir Engineer; incentivized review)
Reliability and Availability	No answers on this topic	Even though it's a rather stable and predictable tool that's also fast, it does have some bugs and inconsistencies that shut down the system. Depending on the details, this could happen as often as 2-3 times a week, especially during the development period. (Alex Naumov, Global Pricing and Marketing Operations Lead, Analytics & Research)
Performance	No answers on this topic	Generally, the Spotfire client runs with very good performance. There are factors that could affect performance, but these normally have to do with loading large analysis files from the library if the database is located some distance away and your global network is not optimal. Once you have your data table(s) loaded in the client application, the application usually performs quite well. (Verified User, Anonymous)
Support Rating	1. It integrates very well with Scala or Python. 2. Its SQL interoperability is very easy to understand. 3. Apache is way faster than other competing technologies. 4. Support from the Apache community for Spark is huge. 5. Execution times are faster compared to others. 6. There are a large number of forums available for Apache Spark. 7. Code for Apache Spark is readily available and easy to access. 8. Many organizations use Apache Spark, so many solutions are available for existing applications. (Yogesh Mhasde, Technical Manager)	Support has been helpful with issues. Support seems to know their product and its capabilities. It would also seem that they have a good sense of the context of the problem: where we are going with this issue and what we want the end outcome to be. (Tim Daciuk, Product Manager - Mobile Computing Analytics Cloud Platform; incentivized review)
In-Person Training	No answers on this topic	The instructor was very in depth and provided relevant training to business users on how to create visualizations. They showed us how to alter settings and filter views, and provided resources for future questions. However, the instructor failed to cover data sources, connecting to data, etc. While it was helpful to see how users can use the data to create reports, they failed to properly instruct us on how to get the dataset in to begin with. We are still trying to figure out connections to certain databases (we have multiple different types). (Verified User, Anonymous; incentivized review)
Online Training	No answers on this topic	The online training is good and provides a good base of knowledge. The video demonstrations were well done and easy to follow along. The provided exercises are good as well, but I think there could be more challenging exercises. The training has also gone up in price significantly in the last 3 years (in USD, which hurts us even more in Canada), and I'm not sure it is worth the money it now costs (it is worth how much it cost 3 years ago, but not double that). (Emilie Wheeler, Operations Analyst II; incentivized review)
Implementation Rating	No answers on this topic	The original architecture I created for our implementation had only a particular set of internal business units in mind. Over the years, Spotfire gained in popularity in our company and was being utilized across many more business units. Soon, its usage went beyond what the original architectural implementation could provide. We've since learned about how the product is used by the different teams and are currently in the middle of rolling out a new architecture. I suggest: (1) Have clearly defined service level agreements with all the teams that will use Spotfire. Your business intelligence group might only need availability during normal working hours, but your production support group might need 24/7 availability. If these groups share one Spotfire server, maintenance of that server might be a problem. (2) Know the different types of data you will be working with. One group might be working with "public" data while another group might work with sensitive data. Design your Library accordingly and with the proper permissions. (3) Know the roles of the users of Spotfire. Will there only be a small set of report writers, or does everyone have write access to the Library? (4) ALWAYS add a timestamp prompt to your reports. You don't want multiple users opening a report that will try to pull down millions of rows of data to their local workstations. Another option, of course, is to just hard code a time range in the backing database view (e.g., where activity_date >= sysdate - 90), but I'd rather educate/train the user base if possible. (5) This probably goes without saying, but if possible, point to a separate reporting database or a logical standby database. You don't want the company pounding on your primaries and taking down your order system. (Michael Soliman, Senior Database Developer)
Alternatives Considered	Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing. (Verified User, Anonymous; incentivized review)	Spotfire is significantly ahead of both products in ETL and data ingestion capability. Spotfire also has substantially better visualizations than Power BI, and although the native visualizations aren't as flexible as in Tableau, Spotfire enables users to create completely custom JavaScript visualizations, which neither Tableau nor Power BI offers. Tableau and Power BI are likely only superior to Spotfire with respect to embedded analysis on a website. (Verified User, Anonymous)
Scalability	No answers on this topic	In an enterprise architecture, if Spotfire Advanced Data Services (Composite Studio) is used, data marts can be managed optimally and scalability from a data perspective is great. As web player/consumer capacity is directly proportional to RAM, if the enterprise can handle the RAM requirement and accommodate failover mechanisms appropriately, it is definitely scalable. (Verified User, Anonymous; incentivized review)
Return on Investment	Business leaders are able to make data-driven decisions. Business users are now able to access data in near real time; before using Spark, they had to wait at least 24 hours for data to be available. The business is able to come up with new product ideas. (Surendranatha Reddy Chappidi, Senior Data Engineer; incentivized review)	It is costly, so it is not suitable for small-scale implementations. Dashboards are only as good as the developer, so experience is needed to get the most out of it. You need to be on at least Spotfire 11 to implement out-of-the-box visualizations. Integration with Python and R is a game changer; it comes in very handy for onboarding data scientists without much hassle. Performance is exceptionally good. Secure. (Verified User, Anonymous; incentivized review)