Likelihood to Recommend Organizations which already implemented on-premise Hadoop based Cloudera Data Platform (CDH) for their Big Data warehouse architecture will definitely get more value from seamless integration of Cloudera Data Science Workbench (CDSW) with their existing CDH Platform. However, for organizations with hybrid (cloud and on-premise) data platform without prior implementation of CDH, implementing CDSW can be a challenge technically and financially.
Read full review Dataiku DSS is very well suited to handle large datasets and projects which requires a huge team to deliver results. This allows users to collaborate with each other while working on individual tasks. The workflow is easily streamlined and every action is backed up, allowing users to revert to specific tasks whenever required. While Dataiku DSS works seamlessly with all types of projects dealing with structured datasets, I haven't come across projects using Dataiku dealing with images/audio signals. But a workaround would be to store the images as vectors and perform the necessary tasks.
Read full review Pros One single IDE (browser based application) that makes Scala, R, Python integrated under one tool For larger organizations/teams, it lets you be self reliant As it sits on your cluster, it has very easy access of all the data on the HDFS Linking with Github is a very good way to keep the code versions intact Read full review The intuitiveness of this tool is very good. Click or Code - If you are a coder, you can code. If you are a manager, you can wrangle with data with visuals The way you can control things, the set of APIs gives a lot of flexibility to a developer. Read full review Cons Installation is difficult. Upgrades are difficult. Licensing options are not flexible. Read full review Read full review Usability As I have described earlier, the intuitiveness of this tool makes it great as well as the variety of users that can use this tool. Also, the plugins available in their repository provide solutions to various data science problems.
Read full review Support Rating Cloudera Data Science Workbench has excellence online resources support such as documentation and examples. On top of that the enterprise license also comes with SLA on opening a ticket to Cloudera Services and support for complaint handling and troubleshooting by email or through a phone call. On top of that it also offers additional paid training services.
Read full review The support team is very helpful, and even when we discover the missing features, after providing enough rational reasons and requirements, they put into it their development pipeline for the future release.
Read full review Alternatives Considered Both the tools have similar features and have made it pretty easy to install/deploy/use. Depending on your existing platform (Cloudera vs. Azure) you need to pick the Workbench. Another observation is that Cloudera has better support where you can get feedback on your questions pretty fast (unlike MS). As its a new product, I expect MS to be more efficient in handling customers questions.
Read full review Strictly for Data Science operations,
Anaconda can be considered as a subset of Dataiku DSS. While
Anaconda supports Python and R programming languages, Dataiku also provides this facility, but also provides GUI to creates models with just a click of a button. This provides the flexibility to users who do not wish to alter the model hyperparameters in greater depths. Writing codes to extract meaningful data is time consuming compared to Dataiku's ability to perform feature engineering and data transformation through click of a button.
Read full review Return on Investment Paid off for demonstration purposes. Read full review Given its open source status, only cost is the learning curve, which is minimal compared to time savings for data exploration. Platform also ease tracking of data processing workflow, unlike Excel. Build-in data visualizations covers many use cases with minimal customization; time saver. Read full review ScreenShots