TrustRadius
Dataiku

Overview

What is Dataiku?

Dataiku is a French startup and its product, DSS, is a challenger to market incumbents and features some visual tools to assist in building workflows.


Pricing

  • Discover: Contact sales team (Cloud)
  • Business: Contact sales team (Cloud)
  • Enterprise: Contact sales team (Cloud)

Entry-level set up fee?

  • No setup fee

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting/Integration Services

Product Demos

  • Dataiku DSS Demo: End-to-End [Portuguese] (YouTube)
  • Demo Dataiku DSS [Thai] (YouTube)
  • End-to-End ML with Dataiku DSS (YouTube)
  • Learn Data science Fast and Easy without code - Dataiku Demo (YouTube)

Product Details

Dataiku Technical Details

Deployment Types: On-premise, Software as a Service (SaaS), Cloud, or Web-Based
Operating Systems: Windows, Linux, Mac, VirtualBox / VMWare
Mobile Application: No

Frequently Asked Questions

DataRobot, H2O.ai, and KNIME Analytics Platform are common alternatives for Dataiku.

Reviewers rate Connect to Multiple Data Sources, Extend Existing Data Sources, and Automatic Data Format Detection highest, each with a score of 10.

The most common users of Dataiku are from Enterprises (1,001+ employees).


Reviews and Ratings

(21)

Community Insights

TrustRadius Insights are summaries of user sentiment data from TrustRadius reviews and, when necessary, 3rd-party data sources.

Versatile Data Handling: Users have praised Dataiku DSS for its versatility in handling various data sources, including Python, R, SQL, and built-in tools. Some reviewers found this ability to transform unorganized data into valuable information through intuitive dashboards to be a crucial feature.

Manageable Data Pipelines: The presence of inbuilt recipes in Dataiku DSS has made data pipelines more manageable for users. This modular approach to pipeline creation and the availability of pre-built recipes for data transformation have been appreciated by several reviewers.

Ease of Use: Many users have highlighted the ease of use of Dataiku DSS. Two factors contribute to its user-friendly nature: the most commonly used operations are available directly as 'recipes', and the visual flow helps users keep track of their work intuitively.

  1. User Interface: Some users have mentioned that the user interface of Dataiku DSS could use some improvements as it is not intuitive or easy to navigate. They have found it challenging to locate certain features and perform tasks efficiently.

  2. Limited support for multi-file projects: Several users have expressed frustration with the limited support for using multi-file projects as a recipe or pipeline in Dataiku DSS. They feel that this feature is not robust enough, making it difficult to handle complex workflows involving multiple files.

  3. Processing time for multi-user purposes: A number of users have experienced prolonged and stressful processing times when using Dataiku DSS for multi-user purposes, even with reduced users and training data. This has resulted in delays and inefficiencies in their workflow management.

Users highly recommend using Anaconda and RStudio for data transformation and analysis, as they believe these tools are the best in the industry for these tasks.

DSS is recommended over other tools for handling big data and sharing flows. Users suggest that it provides better functionality and performance compared to its competitors.

To gain a better understanding of all the components of the data, users suggest getting certification and using a data catalog. This helps users navigate and comprehend complex datasets more effectively.

Reviews

Score 10 out of 10
Vetted Review
Verified User
Incentivized
Dataiku DSS is used in my team to perform various tasks ranging from data preprocessing to machine learning model creation. It provides a one-stop solution to fetch data from different sources such as Amazon S3, SQL Server databases, etc. and merge them onto a single platform. We use Dataiku DSS to perform data imputation, data cleaning and feature engineering to prepare datasets for creating machine learning models. We also extract business insights (data analytics) using various statistical methods and visual representations such as scatter plots, histograms, boxplots, etc. Furthermore, we create optimized ML models that are used to predict/forecast target variables and drive business decisions.
  • Allows users to collaborate and monitor individual tasks
  • Caters to both types of analysts, coders and non-coders, alike
  • Integrate graphs and plots with visualization tools such as Tableau
  • Its community support is very limited at the moment
  • Complex to integrate with automation tools such as Blue Prism
Dataiku DSS is very well suited to handling large datasets and projects that require a huge team to deliver results. It allows users to collaborate with each other while working on individual tasks. The workflow is easily streamlined and every action is backed up, allowing users to revert to specific tasks whenever required.
While Dataiku DSS works seamlessly with all types of projects dealing with structured datasets, I haven't come across projects using Dataiku dealing with images/audio signals. But a workaround would be to store the images as vectors and perform the necessary tasks.
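The workaround mentioned above, storing images as vectors, can be sketched as follows. This is a minimal illustration in plain Python, not a Dataiku API: the function names `flatten_image` and `to_record` and the `px_<i>` column naming are hypothetical, and the image is assumed to be a small grayscale grid of pixel values already loaded into memory.

```python
def flatten_image(pixels):
    """Flatten a 2D grid of grayscale pixel values into one row-major vector."""
    if not pixels or not pixels[0]:
        raise ValueError("image must contain at least one pixel")
    width = len(pixels[0])
    if any(len(row) != width for row in pixels):
        raise ValueError("all rows must have the same width")
    return [value for row in pixels for value in row]


def to_record(image_id, pixels):
    """Turn one image into a flat record (one column per pixel).

    The 'px_<i>' column names are an assumption made for this sketch.
    """
    record = {"image_id": image_id}
    for i, value in enumerate(flatten_image(pixels)):
        record[f"px_{i}"] = value
    return record


if __name__ == "__main__":
    tiny_image = [[0, 255], [128, 64]]  # hypothetical 2x2 grayscale image
    print(to_record("img_001", tiny_image))
```

Each image then becomes one row of a structured dataset, which is the kind of data the reviewer says the platform handles well. Flattening discards the two-dimensional layout, so this suits simple models better than ones that rely on spatial structure.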
  • Very friendly interface for users
  • All data analytics services provided on a single platform
  • Keeps track of all models created and every action performed on a dataset
Platform Connectivity (4): 75% (7.5)
    Connect to Multiple Data Sources: 100% (10.0)
    Extend Existing Data Sources: 100% (10.0)
    Automatic Data Format Detection: 100% (10.0)
    MDM Integration: N/A
Data Exploration (2): 100% (10.0)
    Visualization: 100% (10.0)
    Interactive Data Analysis: 100% (10.0)
Data Preparation (4): 100% (10.0)
    Interactive Data Cleaning and Enrichment: 100% (10.0)
    Data Transformations: 100% (10.0)
    Data Encryption: 100% (10.0)
    Built-in Processors: 100% (10.0)
Platform Data Modeling (4): 87.5% (8.8)
    Multiple Model Development Languages and Tools: 50% (5.0)
    Automated Machine Learning: 100% (10.0)
    Single platform for multiple model development: 100% (10.0)
    Self-Service Model Delivery: 100% (10.0)
Model Deployment (2): 90% (9.0)
    Flexible Model Publishing Options: 90% (9.0)
    Security, Governance, and Cost Controls: 90% (9.0)
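The category figures in these rating blocks appear to be a plain average of the attribute scores, with an N/A attribute counted as zero and the result rounded half up to one decimal. That is an inference from the numbers shown on this page (for example 10, 10, 10 and N/A giving 7.5, and 5, 10, 10, 10 giving 8.8), not a documented TrustRadius formula. A minimal sketch under that assumption, with the hypothetical function name `category_score`:

```python
from decimal import Decimal, ROUND_HALF_UP


def category_score(attribute_scores):
    """Average attribute scores (0-10 scale); None stands for N/A and counts as 0.

    Assumption: this mirrors the figures on the page, not an official formula.
    """
    if not attribute_scores:
        raise ValueError("at least one attribute score is required")
    total = sum(
        (Decimal(str(s)) if s is not None else Decimal(0)) for s in attribute_scores
    )
    mean = total / Decimal(len(attribute_scores))
    # Round half up so that 6.25 becomes 6.3, as displayed in the ratings.
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(category_score([10, 10, 10, None]))  # Platform Connectivity
    print(category_score([5, 10, 10, 10]))     # Platform Data Modeling
```

`Decimal` with `ROUND_HALF_UP` is used because Python's built-in `round` rounds exact halves to the nearest even digit, which would turn 6.25 into 6.2 rather than the 6.3 shown in one of the reviews.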
  • Customer satisfaction
  • Timely project delivery
Strictly for data science operations, Anaconda can be considered a subset of Dataiku DSS. Anaconda supports the Python and R programming languages; Dataiku does too, but it also provides a GUI to create models with the click of a button. This gives flexibility to users who do not wish to tune the model hyperparameters in depth. Writing code to extract meaningful data is time-consuming compared to Dataiku's ability to perform feature engineering and data transformation at the click of a button.
Devesh Singh | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
Incentivized
I work for a client who's implementing all their data science solutions on Dataiku DSS. I have been working on implementing these solutions for my client for over a year now. Currently, it is channeled through the IT department and we have expanded across multiple departments ranging from finance to sales. We have implemented multiple time series forecasting projects, NLP, business optimization as well as customized analytics data flow for the finance department.
  • The intuitiveness of this tool is very good.
  • Click or Code - If you are a coder, you can code. If you are a manager, you can wrangle with data with visuals
  • The way you can control things, the set of APIs gives a lot of flexibility to a developer.
  • The integrated windows of frontend and backend in web applications make it cumbersome for the developer.
  • When dealing with multiple data flows, it becomes really confusing, though they have introduced a feature (Zones) to cater to this issue.
  • Bundling, exporting, and importing projects sometimes creates issues related to the code environment. If the code environment is not available, we should at least be able to import the schema of the flow.
I would recommend it because it's an amazing tool for different levels of users. From Business Analysts to Data Scientists to Managers, various employees can make use of this tool to make data-driven decisions. I'm not sure where it would be less appropriate, as I'm using it as a Data Scientist and so far it pretty much caters to my needs.
Platform Connectivity (4): 75% (7.5)
    Connect to Multiple Data Sources: 100% (10.0)
    Extend Existing Data Sources: 100% (10.0)
    Automatic Data Format Detection: 100% (10.0)
    MDM Integration: N/A
Data Exploration (2): 70% (7.0)
    Visualization: 60% (6.0)
    Interactive Data Analysis: 80% (8.0)
Data Preparation (4): 95% (9.5)
    Interactive Data Cleaning and Enrichment: 100% (10.0)
    Data Transformations: 90% (9.0)
    Data Encryption: 90% (9.0)
    Built-in Processors: 100% (10.0)
Platform Data Modeling (4): 97.5% (9.8)
    Multiple Model Development Languages and Tools: 90% (9.0)
    Automated Machine Learning: 100% (10.0)
    Single platform for multiple model development: 100% (10.0)
    Self-Service Model Delivery: 100% (10.0)
Model Deployment (2): 90% (9.0)
    Flexible Model Publishing Options: 90% (9.0)
    Security, Governance, and Cost Controls: 90% (9.0)
  • So far it has had a positive impact. Multiple departments are coming to us with their business problems.
  • I can't specifically say about ROI as I'm a developer, though I have heard this solution is economical compared to other AI/ML enterprise tools.
  • By using this tool, my client has let go of software that was used earlier, and we have created a simpler framework to replace that software.
I cannot comment on it as I have not used these tools as such.
The amazing part of Dataiku DSS is their customer service. Based on urgency and technical level, you get a reply from the Dataiku engineer when you raise a query. So far, my queries have been pretty complex to solve, so I have received solutions even from the CTO of the company as well, which is why I would describe their customer support as very good.
As I have described earlier, the intuitiveness of this tool makes it great as well as the variety of users that can use this tool. Also, the plugins available in their repository provide solutions to various data science problems.
Score 8 out of 10
Vetted Review
Verified User
Incentivized
Platform is currently being used solely by our department due to licensing cost and budget reasons.

The platform offers a one-stop shop for (almost) end-to-end development of data analytics and machine learning products, including data import, manipulation, and visualization. It's a low-code tool and supports the majority of the workflow without the need for in-depth coding skills; this is a plus for exposing the platform to a wider audience and more use cases.
  • Low-code platform.
  • Open source version includes most valuable modules.
  • User friendly documentation.
  • End product deployment.
For team(s) with varying levels of coding skills, the platform offers a one-size-fits-all solution for most data analytics and machine learning projects that are of practical use in industrial settings (e.g. time series forecasting, predictive maintenance and production optimization). In research and development work, where projects are cutting edge and no out-of-the-box solutions are available, the platform is of minimal use, since custom data ingestion and manipulation may be required.
Platform Connectivity (4): 85% (8.5)
    Connect to Multiple Data Sources: 90% (9.0)
    Extend Existing Data Sources: 90% (9.0)
    Automatic Data Format Detection: 90% (9.0)
    MDM Integration: 70% (7.0)
Data Exploration (2): 85% (8.5)
    Visualization: 90% (9.0)
    Interactive Data Analysis: 80% (8.0)
Data Preparation (4): 80% (8.0)
    Interactive Data Cleaning and Enrichment: 90% (9.0)
    Data Transformations: 90% (9.0)
    Data Encryption: 70% (7.0)
    Built-in Processors: 70% (7.0)
Platform Data Modeling (4): 62.5% (6.3)
    Multiple Model Development Languages and Tools: 70% (7.0)
    Automated Machine Learning: 60% (6.0)
    Single platform for multiple model development: 60% (6.0)
    Self-Service Model Delivery: 60% (6.0)
Model Deployment (2): 65% (6.5)
    Flexible Model Publishing Options: 60% (6.0)
    Security, Governance, and Cost Controls: 70% (7.0)
  • Given its open source status, the only cost is the learning curve, which is minimal compared to the time savings for data exploration.
  • The platform also eases tracking of the data processing workflow, unlike Excel.
  • Built-in data visualizations cover many use cases with minimal customization; a time saver.
Open source availability is a critical factor given the licensing cost of other platforms and budget reasons. Secondly, the features available in the community version cover most of the use cases, making it comparable to, or even better than, commercial versions of other software. Finally, being able to install and deploy the platform in a plug-and-play mode really accelerates its adoption across a wider audience.
The open source user community is friendly, helpful, and responsive, at times even outdoing commercial software vendors. Documentation is also top notch, and usually resolves issues without the need for human interactions. Great product design, with a focus on user experience, also makes platform use intuitive, thus reducing the need for explicit support.
Score 7 out of 10
Vetted Review
Verified User
Incentivized
Dataiku is being used as the integrated data analytics/AI/ML platform. It is a corporate-level standard solution, across multiple regions and business domains. The data scientists use this platform to develop various data pipelines and/or train AI/ML models, verify model performance and eventually deploy the model as a service to benefit business-critical IT applications (mainly serving predictive analysis/automation and integration with RPA).
  • Very intuitive and easy-to-use UI, which lets many types of users collaborate with each other easily by visualizing the same workflow.
  • Many building blocks can be reused immediately, avoiding a lot of non-standard boilerplate implementation.
  • Data pre-analysis and feature engineering assistance increase the productivity as well as the efficiency of data scientists.
  • Many data connectors support a wide range of data storage, including SQL, Teradata, Hadoop Hive, etc.
  • Support from research through to final MaaS solution deployment.
  • The visualization of the flow still has a lot of room to improve when the flow is complex.
  • The "non-coding" template/building block for deep learning lacks many important configurable parameters.
  • Lacks a unified way to apply a "design pattern" to the Python code (if we want to develop our own modules or building blocks).
Dataiku is suitable for many steps of data processing pipeline development (from data collection and filtering through to cleaning, transformation and enhancement), and it is also good for users who don't have much in-depth AI/ML knowledge to quickly jump in and try to solve some real-world problems.
Platform Connectivity (4): 85% (8.5)
    Connect to Multiple Data Sources: 100% (10.0)
    Extend Existing Data Sources: 90% (9.0)
    Automatic Data Format Detection: 90% (9.0)
    MDM Integration: 60% (6.0)
Data Exploration (2): 80% (8.0)
    Visualization: 80% (8.0)
    Interactive Data Analysis: 80% (8.0)
Data Preparation (4): 70% (7.0)
    Interactive Data Cleaning and Enrichment: 80% (8.0)
    Data Transformations: 80% (8.0)
    Data Encryption: 50% (5.0)
    Built-in Processors: 70% (7.0)
Platform Data Modeling (4): 70% (7.0)
    Multiple Model Development Languages and Tools: 70% (7.0)
    Automated Machine Learning: 70% (7.0)
    Single platform for multiple model development: 70% (7.0)
    Self-Service Model Delivery: 70% (7.0)
Model Deployment (2): 70% (7.0)
    Flexible Model Publishing Options: 70% (7.0)
    Security, Governance, and Cost Controls: 70% (7.0)
  • Dataiku provides a consistent platform, covering almost all needs from data analytics through to AI/ML.
  • This platform "glues" all departments, business flows and IT data sources together, making the data more exploitable.
Anaconda is mainly used by professional data scientists with a profound knowledge of Python coding, mainly for building new algorithm blocks or optimizations; the resulting module is then integrated into the Dataiku pipeline/workflow. Dataiku, by contrast, can also be used by other kinds of users.
The support team is very helpful, and even when we discover missing features, after we provide enough rationale and requirements, they put them into their development pipeline for a future release.
Chameleon, Cloudera DataFlow (formerly Hortonworks DataFlow), Sparx Systems Enterprise Architect