Likelihood to Recommend

If you need a managed big data megastore with native integration with a highly optimized Apache Spark engine and with MLflow, go for Databricks Lakehouse Platform. It is a breeze to use, and analytics capabilities are supported out of the box. You will find it a bit difficult to manage code in notebooks, but you will get used to it soon.
TensorFlow is great for most deep learning purposes. This is especially true in two domains:
1. Computer vision: image classification, object detection, and image generation via generative adversarial networks
2. Natural language processing: text classification and generation

The good community support often means that a lot of off-the-shelf models can be used to prove a concept or test an idea quickly. That, and Google's promotion of Colab, means that ideas can be shared quite freely. Training, visualizing, and debugging models is very easy in TensorFlow compared to other platforms (especially the good old Caffe days).

In terms of productionizing, it's a bit of a mixed bag. In our case, most of our feature building is performed via Apache Spark. This means having to convert Parquet (columnar optimized) files to a TensorFlow-friendly format, i.e., protobufs. The lack of good JVM bindings means that our projects end up being a mix of Python and Scala, which makes it hard to reuse some of the tooling and support we wrote in Scala. This is where MXNet does better (though its Scala API could do with more work).
Pros

- Process raw data in One Lake (S3) env to relational tables and views
- Share notebooks with our business analysts so that they can use the queries and generate value out of the data
- Try out PySpark and Spark SQL queries on raw data before using them in our Spark jobs
- Modern-day ETL operations made easy using Databricks
- Provide access mechanism for different sets of customers

- A vast library of functions for all kinds of tasks: text, images, tabular, video, etc.
- Amazing community helps developers obtain knowledge faster and get unblocked in this active development space.
- Integration of high-level libraries like Keras and Estimators makes it really simple for a beginner to get started with neural network based models.

Cons

- Connect my local code in Visual Studio Code to my Databricks Lakehouse Platform cluster so I can run the code on the cluster. The old databricks-connect approach has many bugs and is hard to set up. The new Databricks Lakehouse Platform extension for Visual Studio Code doesn't allow developers to debug their code line by line (we can only run the code).
- Maybe have a specific Databricks Lakehouse Platform IDE that users can use to develop locally.
- Visualization in MLflow experiments can be enhanced.

- RNNs are still a bit lacking compared to Theano.
- Cannot handle sequence inputs.
- Theano is perhaps a bit faster and eats up less memory than TensorFlow on a given GPU, perhaps due to element-wise ops. TensorFlow wins for multi-GPU and "compilation" time.

Usability

It is an amazing platform for designing experiments and delivering deep-dive analyses that require executing highly complex queries. It also allows sharing information and insights across the company through shared workspaces while keeping them secure. In terms of graph generation and interaction, the UI and UX could improve.
Support of multiple components and ease of development.
Support Rating

One of the best customer and technology support experiences I have had in my career. You get what you pay for, and you get the Rolls-Royce. It reminds me of SAS customer support in the 2000s, when the tools were reaching some limits and their engineers wanted to know more about what we were doing, long before "data science" was even a name. Databricks truly embraces the partnership with its customers and helps them with any given challenge.
Community support for TensorFlow is great. There's a huge community that truly loves the platform, and there are many examples of development in TensorFlow. Often, when a good new technique is published, there will be a TensorFlow implementation not long after. This makes it quick to apply the latest techniques from academia straight to production-grade systems. Tooling around TensorFlow is also good. TensorBoard has been such a useful tool that I can't imagine how hard it would be to debug a deep neural network gone wrong without it.
Implementation Rating

Use of cloud for better execution power is recommended.
Alternatives Considered

Compared to Synapse and Snowflake, Databricks provides a much better development experience and deeper configuration capabilities. It works out of the box but still allows intricate customisation of the environment. I find Databricks very flexible and resilient at the same time, while Synapse and Snowflake feel more limited in terms of configuration and connectivity to external tools.
Keras is built on top of TensorFlow, but it is much simpler to use and more Pythonic. If you don't need fine-grained control or advanced features, Keras is one of the best options; if you want to dig deeper, TensorFlow is the right choice.
Return on Investment

- The ability to spin up a big data platform with little infrastructure overhead allows us to focus on business value, not admin.
- Databricks can terminate or time out instances, which helps manage cost.
- The ability to quickly access data scenarios that are typically hard to build is a strength.

- Learning is a bit difficult and takes a lot of time.
- Developing or implementing a whole neural network is time consuming, as you have to write everything yourself.
- Once you have learned it, it makes getting good results very easy.