Likelihood to Recommend
Well suited: Most locally run datasets and non-production systems; scalability is not a problem at all. Being able to include data from multiple types of data sources is an added advantage. MLlib is a decent built-in library that can be used for most ML tasks.
Less appropriate: We had to work on a RecSys where the music dataset we used was 300+ GB in size. We faced memory-based issues, and a few times we also got memory errors. Also, MLlib does not support advanced analytics or deep-learning frameworks. Understanding the internals of Apache Spark is very difficult for beginners.
A high level of data integration is available; it supports various data sources. Collaboration features allow users to share access to dashboards and merge data analytics with other team members. It can meet the demands of both small and large enterprises. Customized dashboards and reports meet specific needs, and the product can be extended through APIs and custom scripts.
Pros
- Rich APIs for data transformation make it very easy to transform and prepare data in a distributed environment without worrying about memory issues
- Faster execution times compared to Hadoop and Pig Latin
- Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
- Interoperability between SQL and Scala / Python styles of munging data

- It has the best coding integration (Python, R) of any BI product
- The ability to work with very large datasets (10 mil+) is better than competitors
- Export options are more complete and have better functionality
- The data canvas is the best tool to join and transform data vs. competitors
Jim Putnam Director, Advanced Analytics and Data Science
Cons
- Memory management is very weak.
- PySpark is not as robust as Scala with Spark.
- Spark master HA is needed; it is not as highly available as it should be.
- Locality should not be a necessity, though it does help. I would prefer not to depend on locality.

- The donut chart is a powerful illustration and should be simple to build in Spotfire, but it takes many steps to build even a simple one.
- Table calculations (like row or column differences) should be made simple, or there should be a drag-and-drop function for them, with no need for scripting.
- Information Links should be changed: if new columns are added to a table, just refreshing the data should capture them, with no extra step to add the column.

Likelihood to Renew
Capacity for computing data in a cluster, and fast speed.
Steven Li Senior Software Developer (Consultant)
- Easy to distribute information throughout the enterprise using the webplayer.
- Ad hoc analysis is possible throughout the enterprise using business author in the webplayer or the thick client.
- Low level of support needed from the IT team. Access interfaces with LDAP and numerous other authentication methods.
- Possible to continually extend the platform with JavaScript, R scripts, HTML, and custom extensions.
- Ability to standardize data logic through pre-built queries in the Information Designer. Everyone in the enterprise is using the same logic.
- Tagging and bookmarking data allows for quick sharing of insights.
- Integration with numerous data sources: flat files, databases, big data, images, etc.
- Much improved mapping capability. Also includes the ability to apply data points over any image.
Usability
If the team looking to use Apache Spark is not used to debugging and tweaking job settings to get the best performance, it can be frustrating. However, the documentation and the community support available online can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the scenarios where it can be used.
Basic tasks like generating meaningful information from large sets of raw data are very easy. The next step of linking to multiple live data sources and linking those tables and performing on-the-fly analysis of the imported data is understandably more difficult.
Reliability and Availability
Even though it's a rather stable and predictable tool that's also fast, it does have some bugs and inconsistencies that shut down the system. Depending on the details, it could happen as often as 2-3 times a week, especially during the development period.
Alex Naumov Global Pricing and Marketing Operations Lead, Analytics & Research
Performance
Generally, the Spotfire client runs with very good performance. Some factors can affect performance, but these normally have to do with loading large analysis files from the library when the database is located some distance away and your global network is not optimal. Once your data table(s) are loaded in the client application, performance is usually quite good.
Support Rating
1. It integrates very well with Scala or Python.
2. Its SQL interoperability is very easy to understand.
3. Apache Spark is way faster than other competing technologies.
4. The support from the Apache community for Spark is huge.
5. Execution times are faster compared to others.
6. There are a large number of forums available for Apache Spark.
7. Code for Apache Spark is readily available and easy to access.
8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Support has been helpful with issues. Support seems to know their product and its capabilities. They also seem to have a good sense of the context of the problem: where we are going with the issue and what we want the end outcome to be.
Tim Daciuk Product Manager - Mobile Computing Analytics Cloud Platform
In-Person Training
The instructor went into great depth and provided relevant training to business users on how to create visualizations. They showed us how to alter settings and filter views, and provided resources for future questions. However, the instructor failed to cover data sources, connecting to data, etc. While it was helpful to see how users can use the data to create reports, they failed to properly instruct us on how to get the dataset in to begin with. We are still trying to figure out connections to certain databases (we have multiple different types).
Online Training
The online training is good and provides a good base of knowledge. The video demonstrations were well done and easy to follow. The provided exercises are good as well, but I think there could be more challenging ones. The training has also gone up in price significantly in the last 3 years (in USD, which hurts us even more in Canada), and I'm not sure it is worth the money it now costs (it is worth what it cost 3 years ago, but not double that).
Implementation Rating
The original architecture I created for our implementation had only a particular set of internal business units in mind. Over the years, Spotfire gained in popularity in our company and was being utilized across many more business units. Soon, its usage went beyond what the original architectural implementation could provide. We've since learned about how the product is used by the different teams and are currently in the middle of rolling out a new architecture. I suggest:
- Have clearly defined service level agreements with all the teams that will use Spotfire. Your business intelligence group might only need availability during normal working hours, but your production support group might need 24/7 availability. If these groups share one Spotfire server, maintenance of that server might be a problem.
- Know the different types of data you will be working with. One group might be working with "public" data while another group might work with sensitive data. Design your Library accordingly and with the proper permissions.
- Know the roles of the users of Spotfire. Will there only be a small set of report writers, or does everyone have write access to the Library?
- ALWAYS add a timestamp prompt to your reports. You don't want multiple users opening a report that will try to pull down millions of rows of data to their local workstations. Another option, of course, is to just hard code a time range in the backing database view (e.g., where activity_date >= sysdate - 90), but I'd rather educate/train the user base if possible.
- This probably goes without saying, but if possible, point to a separate reporting database or a logical standby database. You don't want the company pounding on your primaries and taking down your order system.

Alternatives Considered
Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
Spotfire is significantly ahead of both products in ETL and data ingestion capability. Spotfire also has substantially better visualizations than Power BI, and although the native visualizations aren't as flexible as in Tableau, Spotfire enables users to create completely custom JavaScript visualizations, which neither Tableau nor Power BI has. Tableau and Power BI are likely only superior to Spotfire with respect to embedded analysis on a website.
Scalability
In an enterprise architecture with Spotfire Advanced Data Services (Composite Studio), data marts can be managed optimally, and scalability from a data perspective is great. As web player/consumer usage is directly proportional to RAM, if the enterprise can handle the RAM requirement and accommodate failover mechanisms appropriately, it is definitely scalable.
Return on Investment
- Business leaders are able to make data-driven decisions
- Business users are able to access data in near real time now. Before using Spark, they had to wait at least 24 hours for data to be available
- The business is able to come up with new product ideas

- It is costly, so not suitable for small-scale implementations
- Dashboards are only as good as the developer, so experience is needed to get the most out of it
- You need to be on Spotfire 11 at least to implement out-of-the-box visualizations
- Integration with Python and R is a game changer; it comes in very handy for onboarding data scientists without much hassle
- Performance is exceptionally good
- Secure