Amazon SageMaker enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. Amazon SageMaker removes all the barriers that typically slow down developers who want to use machine learning.
SAP Predictive Analytics
Score 7.0 out of 10
SAP Predictive Analytics is, as the name would suggest, a statistical analysis and data mining platform that can be deployed with SAP HANA.
TensorFlow
Score 7.7 out of 10
TensorFlow is an open-source machine learning software library for numerical computation using data flow graphs. It was originally developed by Google.
(Couldn't pick R or Python packages from the list.)
Actually, I don't see SAP Predictive Analytics stacking up against other tools, but rather complementing them. On the one hand, why would we use something "more complex" to solve a "business as usual" problem, when you can use tools …
It allows for one-click processes and for things to be checked automatically before they move through the system. It also makes training easy: I am able to teach users the basic fundamentals of the tool and how it is used very easily, as it is fully managed on its own, which is incredible.
It's a great tool to merge actual data analysis (which Lumira doesn't do that well) with visualization (which Lumira does well), so it can be seen as Lumira for data analysts. However, a lot of the 'predictive' side is hidden/black box, which can be frustrating for those analysts. So you could argue it is too complex for casual users, but too 'black box' for analysts.
TensorFlow is great for most deep learning purposes. This is especially true in two domains:
1. Computer vision: image classification, object detection, and image generation via generative adversarial networks.
2. Natural language processing: text classification and generation.
The good community support often means that a lot of off-the-shelf models can be used to prove a concept or test an idea quickly. That, and Google's promotion of Colab, means that ideas can be shared quite freely. Training, visualizing, and debugging models is very easy in TensorFlow compared to other platforms (especially the good old Caffe days).
In terms of productionizing, it's a bit of a mixed bag. In our case, most of our feature building is performed via Apache Spark. This means having to convert Parquet (columnar-optimized) files to a TensorFlow-friendly format, i.e., protobufs. The lack of good JVM bindings means that our projects end up being a mix of Python and Scala, which makes it hard to reuse some of the tooling and support we wrote in Scala. This is where MXNet shines (though its Scala API could do with more work).
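The Parquet-to-protobuf conversion described above can be sketched roughly as follows — a minimal sketch assuming TensorFlow 2.x, with hypothetical feature/label columns standing in for whatever a real Spark feature pipeline would emit:

```python
import os
import tempfile
import tensorflow as tf

def row_to_example(feats, label):
    """Serialize one row of tabular data into a tf.train.Example protobuf."""
    return tf.train.Example(features=tf.train.Features(feature={
        "features": tf.train.Feature(float_list=tf.train.FloatList(value=feats)),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    })).SerializeToString()

# Write a few rows to a TFRecord file. In practice the rows would come from
# Parquet (e.g. read via pyarrow, or a Spark job writing TFRecords directly).
rows = [([0.1, 0.2, 0.3], 0), ([0.4, 0.5, 0.6], 1)]
path = os.path.join(tempfile.mkdtemp(), "data.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for feats, label in rows:
        writer.write(row_to_example(feats, label))

# Read it back as a tf.data pipeline, parsing each serialized Example.
spec = {
    "features": tf.io.FixedLenFeature([3], tf.float32),
    "label": tf.io.FixedLenFeature([1], tf.int64),
}
dataset = tf.data.TFRecordDataset(path).map(
    lambda rec: tf.io.parse_single_example(rec, spec))
```

The extra serialization hop (and the round trip through Python) is exactly the friction the reviewer is pointing at when the rest of the pipeline lives on the JVM.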
It doesn't require you to have a Ph.D. to build models!
You can use it to address a very large and wide dataset without worrying about sampling.
Automation is in the product DNA. You can prepare your data, ingest it into the "Kernel", then get insights about what was found, decide to publish it and schedule scoring tasks or model refresh in the same product.
It's very good for the hardcore programmer, but a little bit complex for a data scientist or new hire who does not have a strong programming background.
Most of the popular libraries and ML frameworks are there, but we still have to depend on them for new releases.
Theano is perhaps a bit faster and eats up less memory than TensorFlow on a given GPU, perhaps due to element-wise ops. TensorFlow wins for multi-GPU and "compilation" time.
The documentation provides an explanation of what features are available, but not necessarily what's happening behind the scenes. On the other hand, the "community" has grown since the acquisition, and most questions are properly addressed by SAP folks. Since the "product maintenance" mode announcement was made, there hasn't been much new content published except on the Smart Predict side (which is built by the SAP Predictive Analytics team).
Community support for TensorFlow is great. There's a huge community that truly loves the platform, and there are many examples of development in TensorFlow. Often, when a good new technique is published, there will be a TensorFlow implementation not long after. This makes it quick to apply the latest techniques from academia straight to production-grade systems. Tooling around TensorFlow is also good. TensorBoard has been such a useful tool; I can't imagine how hard it would be to debug a deep neural network gone wrong without TensorBoard.
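The TensorBoard workflow the reviewer praises boils down to writing summary events during training and pointing the TensorBoard UI at the log directory. A minimal TF 2.x sketch (the log directory and loss values here are illustrative, not from any real run):

```python
import os
import tempfile
import tensorflow as tf

# Illustrative log directory; a real project would use a per-run path.
logdir = os.path.join(tempfile.mkdtemp(), "run1")
writer = tf.summary.create_file_writer(logdir)

# Log one scalar per training step; TensorBoard renders these as a curve,
# which is where most "network gone wrong" debugging starts.
with writer.as_default():
    for step, loss in enumerate([1.2, 0.8, 0.5, 0.3]):
        tf.summary.scalar("train/loss", loss, step=step)
writer.flush()

# Inspect with: tensorboard --logdir <logdir>
```

The same writer can also record histograms of weights and images of inputs, which is what makes TensorBoard useful beyond loss curves.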
Amazon SageMaker took the heavy lifting out of building and creating models. It allowed our organization to use our current system for integration, and essentially added a feature that helps data scientists and IT professionals at all levels in our department and the company as a whole. The training was simple as well.
We have typically used Spotfire for data analysis but decided to move to SAP Business Objects due to its innate connection with SAP. I found Lumira to be good for visualizations but it is not meant for data analysis. Therefore, we have introduced Predictive Analytics to see if it can fill that gap. So far, it's been far less intuitive than Spotfire to get started, and as far as I am aware so far, it does not bring many additional capabilities. I do, however, like that it utilizes the Lumira look/feel and integrates very well.
Keras is built on top of TensorFlow, but it is much simpler to use and more Pythonic. If you don't want to focus on too many details, or don't need fine-grained control over advanced features, Keras is one of the best options; if you want to dig in deeper, TensorFlow is the right choice.