The DataRobot AI Platform is presented as a solution that accelerates and democratizes data science by automating the end-to-end journey from data to value and allows users to deploy AI applications at scale. DataRobot provides a centrally governed platform that gives users AI to drive business outcomes, that is available on the user's cloud platform-of-choice, on-premise, or as a fully-managed service. The solutions include tools providing data preparation enabling users to explore and…
$0
TensorFlow
Score 7.9 out of 10
N/A
TensorFlow is an open-source machine learning software library for numerical computation using data flow graphs. It was originally developed by Google.
I was only involved in manual model creation using Python packages such as scikit-learn and TensorFlow, and can attest that no matter how much time I spend on model creation, DataRobot will beat my manual models in accuracy and precision. Why waste time on something that is …
DataRobot can be used for risk assessment, such as predicting the likelihood of loan default. It can handle both classification and regression tasks effectively. It relies on historical data for model training. If you have limited historical data or the data quality is poor, it may not be the best choice as it requires a sufficient amount of high-quality data for accurate model building.
TensorFlow is great for most deep learning purposes. This is especially true in two domains: 1. Computer vision: image classification, object detection and image generation via generative adversarial networks. 2. Natural language processing: text classification and generation. The good community support often means that a lot of off-the-shelf models can be used to prove a concept or test an idea quickly. That, and Google's promotion of Colab, means that ideas can be shared quite freely. Training, visualizing and debugging models is very easy in TensorFlow compared to other platforms (especially the good old Caffe days). In terms of productionizing, it's a bit of a mixed bag. In our case, most of our feature building is performed via Apache Spark. This means having to convert Parquet (columnar-optimized) files to a TensorFlow-friendly format, i.e., protobufs. The lack of good JVM bindings means that our projects end up being a mix of Python and Scala. This makes it hard to reuse some of the tooling and support we wrote in Scala. This is where MXNet does better (though its Scala API could do with more work).
DataRobot uses algorithms to analyze and compare numerous machine-learning techniques and provide models that assist in company-wide decision making.
Our DataRobot program puts the strength of automated machine learning on an "even playing field" and allows us to make decisions in an extremely timely manner. The speed is consistent without being offset by errors or false negatives.
It encompasses many desired techniques that help companies in general to reconfigure into artificial-intelligence-driven firms with little to no inconvenience.
The platform itself is very complicated. It probably can't function well without being complicated, but there is a big training curve to get over before you can effectively use it. Even I'm not sure if I'm effectively using it now.
The model DataRobot suggests for deployment is often not the best model for our purposes. We've had to do a lot of testing to determine which model is best. For regression models, DataRobot does give you a MASE score but, for some reason, often doesn't suggest the model with the best MASE score.
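For readers who want to check a recommendation by hand, here is a minimal sketch of the MASE comparison described above. It assumes the standard non-seasonal definition of MASE (the model's mean absolute error divided by the in-sample mean absolute error of a naive "previous value" forecast); DataRobot's own computation may differ in its details. The function names `mase` and `best_by_mase` and the sample numbers are hypothetical and not part of any DataRobot API.

```python
from typing import Dict, Sequence


def mase(train: Sequence[float],
         actual: Sequence[float],
         forecast: Sequence[float]) -> float:
    """Mean absolute scaled error (non-seasonal form, assumed here).

    Scales the forecast's MAE by the MAE of a naive one-step
    forecast (previous value) measured on the training series.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(train) < 2:
        raise ValueError("train needs at least two points for the naive baseline")

    # In-sample error of the naive forecast y_hat[t] = y[t-1].
    naive_mae = sum(abs(train[i] - train[i - 1])
                    for i in range(1, len(train))) / (len(train) - 1)
    if naive_mae == 0:
        raise ValueError("naive baseline error is zero; MASE is undefined")

    model_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return model_mae / naive_mae


def best_by_mase(train: Sequence[float],
                 actual: Sequence[float],
                 forecasts: Dict[str, Sequence[float]]) -> str:
    """Return the name of the candidate with the lowest MASE (lower is better)."""
    return min(forecasts, key=lambda name: mase(train, actual, forecasts[name]))


if __name__ == "__main__":
    # Hypothetical holdout data and two hypothetical candidate models.
    history = [10, 12, 11, 13, 12]
    holdout = [14, 13]
    candidates = {"model_a": [13, 13], "model_b": [11, 16]}
    for name, preds in candidates.items():
        print(name, round(mase(history, holdout, preds), 3))
    print("lowest MASE:", best_by_mase(history, holdout, candidates))
```

A MASE below 1 means the model beats the naive baseline on average, so ranking candidates this way gives a quick sanity check on whichever model the platform recommends.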
The software will give you errors if output files are not entered correctly but will not exactly tell you how to fix them. Perhaps that is complicated, but being able to download a template with your data for an output file in the correct format would be nice.
Theano is perhaps a bit faster and uses less memory than TensorFlow on a given GPU, possibly due to element-wise ops. TensorFlow wins for multi-GPU support and “compilation” time.
DataRobot presents a machine-learning platform designed by data scientists from an array of backgrounds to construct and develop precise predictive models in a fraction of the time previously taken. The technology involved addresses the critical shortage of data scientists by changing the speed and economics of predictive analytics. DataRobot utilizes parallel processing to evaluate models in R, Python, Spark MLlib, H2O and other open-source libraries. It searches possible permutations of algorithms, features, transformations, processing steps and tuning to yield the best models for the dataset and predictive goal.
As I am writing this report I am working with DataRobot engineers in a complex environment and we have their full support. We are in Mexico, and it is not common to have this commitment from companies without expensive contract services. The installation is on-premise and the client does not want us to take control, and the client is also limited because of internal IT regulations, so we are just doing magic and everybody is committed.
Community support for TensorFlow is great. There's a huge community that truly loves the platform and there are many examples of development in TensorFlow. Often, when a good new technique is published, there will be a TensorFlow implementation not long after. This makes it quick to apply the latest techniques from academia straight to production-grade systems. Tooling around TensorFlow is also good. TensorBoard has been such a useful tool that I can't imagine how hard it would be to debug a deep neural network gone wrong without it.
I've done machine learning through Python before; however, having to code and test each model individually was very time consuming and required a lot of expertise. The DataRobot approach is an excellent way of getting to a well-placed starting point. You can then pick up the model from there and fine-tune further if you need to.
Keras is built on top of TensorFlow, but it is much simpler to use and more Pythonic. If you don't need fine-grained control or some of the advanced features, Keras is one of the best options. If you want to dig deeper, TensorFlow is the right choice.