Likelihood to Recommend
Well suited: Most local runs on datasets and non-prod systems, where scalability is not a problem at all. Being able to include data from multiple types of data sources is an added advantage. MLlib is a decent built-in library that can be used for most ML tasks.
Less appropriate: We had to work on a RecSys where the music dataset we used was 300+ GB in size. We faced memory-related issues, and a few times we also got memory errors. Also, the MLlib library does not support advanced analytics or deep-learning frameworks. Understanding the internals of Apache Spark is very difficult for beginners.
Tableau Desktop is one of the finest tools available in the market, with such a wide range of capabilities in its suite that it is easy to generate insights. Further, if optimally designed, its reports are fairly simple to understand, yet capable enough to allow changes at the required levels. One can create a variety of visualizations as required by the business or the clients. The data pipelines in the backend are very robust. Tableau Desktop also provides options to develop reports in developer mode, which is one of its finest features for embedding and executing even the most complex logic. It is easy to operate, simple to navigate, and easy for users to understand.
Pros
Apache Spark makes processing very large data sets possible, and it handles these data sets fairly quickly.
Apache Spark does a fairly good job implementing machine learning models for larger data sets.
Apache Spark seems to be rapidly advancing software, with new features making it ever more straightforward to use.

An excellent tool for data visualization that presents information in an appealing visual format.
An exceptional platform for storing and analyzing data in an organization of any size.
Through interactive parameters, it enables real-time interaction with the user; it is also easy to learn and to get support from the community.

Cons
Memory management is very weak.
PySpark is not as robust as Scala with Spark.
Spark master HA is needed; it is not as highly available as it should be.
Locality should not be a necessity, though it does help. We would prefer no locality requirement.

Formatting the data to work correctly in graphical presentations can be time-consuming.
Daily data extracts can run slowly, depending on how much data is required and the source of the data.
The desktop version is required for advanced functionality; editing on [the] Tableau server allows only limited features.

Likelihood to Renew
Capacity for computing data in a cluster, and fast speed.
Steven Li Senior Software Developer (Consultant)
Our use of Tableau Desktop is still fairly low, and will continue over time. The only real concern is around cost of the licenses, and I have mentioned this to Tableau and fully expect the development of more sensible models for our industry. This will remove any impediment to expansion of our use.
Usability
The only thing I dislike about Spark's usability is the learning curve: there are many actions and transformations. However, its wide range of uses for ETL processing, its ease of integration, and its multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
Tableau Desktop has proven to be a lifesaver in many situations. Once we've completed the initial setup, it's simple to use. It has all of the features we need to quickly and efficiently synthesize our data. Tableau Desktop has advanced capabilities to improve our company's data structure and enable self-service for our employees.
Reliability and Availability
When used as a stand-alone tool, Tableau Desktop has unlimited uptime, which is always nice. When used in conjunction with Tableau Server, this tool has as much uptime as your server admins are willing to give it. All in all, I've never had an issue with Tableau's availability.
Performance
Tableau Desktop's performance is solid. You can really dig into a large dataset in the form of a spreadsheet, and it exhibits similarly good performance when accessing a moderately sized Oracle database. I noticed that with Tableau Desktop 9.3, performance using a spreadsheet started to slow at around 75K rows by about 60 columns. This was easily remedied by creating an extract and pushing it to Tableau Server, where performance became lightning fast.
Support Rating
1. It integrates very well with Scala or Python.
2. Its SQL interoperability is very easy to understand.
3. Apache Spark is way faster than competing technologies.
4. The support from the Apache community for Spark is huge.
5. Execution times are faster compared to others.
6. There are a large number of forums available for Apache Spark.
7. Code for Apache Spark is readily available and easy to access.
8. Many organizations use Apache Spark, so many solutions are available for existing applications.
I have never really used support much, to be honest. I think support is not very user-friendly to search and use. I did have an encounter with them once, and it required a bit of back and forth over licensing before reaching a resolution. They did solve my issue, though.
In-Person Training
It is admittedly hard to train a group of people with disparate levels of ability coming in, but the software is so easy to use that this is not a huge problem; anyone who can follow simple instructions can catch up pretty quickly.
Online Training
The training for new users is quite good because it is organized topic by topic, and the best part is that it also has video tutorials, which are very helpful.
Implementation Rating
Again, training is the key and the company provides a lot of example videos that will help users discover use cases that will greatly assist their creation of original visualizations. As with any new software tool, productivity will decline for a period. In the case of Tableau, the decline period is short and the later gains are well worth it.
David Fickes Decision Sciences - Modeling, Simulation & Analysis
Alternatives Considered
All the above systems work quite well on big data transformations, whereas Spark really shines with its broader API support and its ability to read from and write to multiple data sources. Using Spark, one can easily switch between declarative, imperative, and functional programming styles based on the situation. Also, it doesn't need special data ingestion or indexing pre-processing like Presto. Combining it with Jupyter Notebooks (https://github.com/jupyter-incubator/sparkmagic), one can develop Spark code interactively in Scala or Python.
If we did not have legacy tools already set up, I would switch the visualization method to open source software via PyCharm, Atom, and Visual Studio IDE. These IDEs cannot directly help you visualize the data, but you can use many Python packages to do so through them.
Scalability
Tableau Desktop's scalability is really limited to the scale of your back-end data systems. If you want to pull down an extract and work quickly in-memory, in my application it scaled to a few tens of millions of rows using the in-memory engine. But it's really only limited by your back-end data store if you have, or are willing to invest in, an optimized SQL store or a purpose-built query engine like Vertica or Netezza or something similar.
Return on Investment
Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark.
Easy adoption: having multiple departments use the same underlying technology, even if the use cases are very different, allows for more commonality among applications, which definitely makes the operations team happy.
Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs.

Tableau was acquired years ago, and has provided good value with the content created.
Ongoing maintenance costs for the platform, for both desktop and server licensing, have made the continuing value questionable when compared to other offerings in the marketplace.
Users have largely been satisfied with the content, but not with the overall performance. This is due to a combination of factors, including the performance of the Tableau engines as well as development deficiencies.