Microsoft's Azure Data Factory is a service built for all data integration needs and skill levels. It is designed to allow the user to easily construct ETL and ELT processes code-free within the intuitive visual environment, or write one's own code. Visually integrate data sources using more than 80 natively built and maintenance-free connectors at no added cost. Focus on data—the serverless integration service does the rest.
N/A
Databricks Data Intelligence Platform
Score 8.8 out of 10
N/A
Databricks offers the Databricks Lakehouse Platform (formerly the Unified Analytics Platform), a data science platform and Apache Spark cluster manager. The Databricks Unified Data Service provides a platform for data pipelines, data lakes, and data platforms.
$0.07 per DBU
Snowflake
Score 8.7 out of 10
N/A
The Snowflake Cloud Data Platform is the eponymous data warehouse from the company in San Mateo, a cloud and SQL-based data warehouse that aims to allow users to unify, integrate, analyze, and share previously siloed data in secure, governed, and compliant ways. With it, users can securely access the Data Cloud to share live data with customers and business partners, and connect with other organizations doing business as data consumers, data providers, and data service providers.
Databricks [Lakehouse Platform (Unified Analytics Platform)] can work with all data types in their original format while Snowflake requires additional structures to fit the data before loading it. Databricks is open source so potential is far greater.
Compared to Synapse & Snowflake, Databricks provides a much better development experience, and deeper configuration capabilities. It works out-of-the-box but still allows you intricate customisation of the environment. I find Databricks very flexible and resilient at the same …
Databricks is a true all-in-one platform, and at the time of implementation, it had more features available to us, making it a clear choice over Snowflake. Moving our workloads from local computing to the servers in Databricks gave our start-up staff a great quality of life …
We use these tools for applications they are better suited for than Snowflake. For example, MS Fabric has powerful agentic AI capabilities; Redshift is our go-to choice for the TMT vertical within the organization; and Databricks is the default choice for AI/ML applications.
We particularly liked Snowflake's security model as well as its unique storage (whereby everything is essentially a pointer to immutable micro-partitions, which is the key behind its zero-copy cloning, its secure sharing, its time travel, etc.), and also how it separates …
Snowflake is much faster and easier for writing queries and pulling data. But the visualization part of Snowflake is not as good as theirs. Also, Snowflake only supports SQL queries but not Python or other languages. So basically Snowflake is the expert in its field but not suitable …
I evaluated Redshift and Panoply when making the choice for Snowflake. Panoply is built on Redshift, so the two are equal in drawbacks: Redshift requires a cluster to be running 24/7 for your data to live there. We produce terabytes of data every day, so this was not an option …
The best scenario is the ETL process. The flexibility and connectivity are outstanding. For our environment, SAP data connectivity with Azure Data Factory offers very limited features compared to SAP Datasphere. Due to the limited modelling capacity of the tool, we use Databricks for data modelling and cleaning. Using multiple tools could have been avoided if ADF had modelling capabilities.
Medium to large data throughput shops will benefit the most from Databricks Spark processing. Smaller use cases may find the barrier to entry a bit too high for casual use. Some of the overhead of kicking off a Spark compute job can actually lead to your workloads taking longer, but past a certain point the performance returns cannot be beat.
Snowflake is well suited when you have to store your data and you want easy scalability and increase or decrease the storage per your requirement. You can also control the computing cost, and if your computing cost is less than or equal to 10% of your storage cost, then you don't have to pay for computing, which makes it cost-effective as well.
Snowflake scales appropriately, allowing you to manage expense at peak and off-peak times for data pulls, data retrieval, and data-centric processing jobs
Snowflake offers a marketplace solution that allows you to sell and subscribe to different data sources
Snowflake manages concurrency better in our trials than other premium competitors
Snowflake has little to no setup and ramp up time
Snowflake offers online training for various employee types
Granularity of Errors: Sometimes, Azure Data Factory provides error messages that are too generic or vague for us, making it challenging to pinpoint the exact cause of a pipeline failure. Enhanced error messages with more actionable details would greatly assist us as users in debugging our pipelines.
Pipeline Design UI: In my experience, the visual interface for designing pipelines can become cluttered, especially when dealing with complex workflows or numerous activities. I think a more intuitive and scalable design interface would improve usability. In my opinion, features like zoom, better alignment tools, or grouping capabilities could make intricate designs more manageable.
Native Support: While Azure Data Factory does support incremental data loads, in my experience, the setup can be somewhat manual and complex. I think native and more straightforward support for Change Data Capture, especially from popular databases, would simplify the process of capturing and processing only the changed data, making regular data updates more efficient.
Do not force customers to renew for the same or a higher amount to avoid losing unused credits. Already paid credits should not expire (at least within a reasonable time frame), independent of renewal deal size.
Snowflake is very cost effective, and we also like the fact that we can stop, start, and spin up additional processing engines as we need to. We also like the fact that it's easy to connect our SQL IDEs to Snowflake and write our queries in the environment that we are used to.
So far the product has performed as expected. We were noticing some performance issues, but they were largely Synapse related. This has led to a shift from Synapse to Databricks. Overall this has delayed our analytic platform. Once Databricks becomes fully operational, Azure Data Factory will be critical to our environment and future success.
Because it is an amazing platform for designing experiments and delivering deep-dive analyses that require executing highly complex queries. It also allows us to share information and insights across the company through its shared workspaces while keeping them secure.
In terms of graph generation and interaction, it could improve its UI and UX.
Because being able to query tons of data in a few seconds is incredible. It also gives you a lot of functions to format and transform data right in your query, which is ideal when building data models in BI tools like Power BI, and it is available as a connector in the most widely used BI tools.
We have not needed to engage with Microsoft much on Azure Data Factory, but they have been responsive and helpful when needed. That said, we have not had a major emergency or outage requiring their intervention. The score of seven reflects that they have done well so far but have not yet proved their support on a significant issue.
Some of the best customer and technology support that I have ever experienced in my career. You get what you pay for, and you get the Rolls Royce. It reminds me of the customer support of SAS in the 2000s, when the tools were reaching some limits and their engineers wanted to know more about what we were doing, long before "data science" was even a name. Databricks truly embraces the partnership with its customers and helps them with any given challenge.
We have had terrific experiences with Snowflake support. They have drilled into queries and given us tremendous detail and helpful answers. In one case they even figured out how a particular product was interacting with Snowflake, via its queries, and gave us detail to go back to that product's vendor because the Snowflake support team identified a fault in its operation. We got it solved without lots of back-and-forth or finger-pointing because the Snowflake team gave such detailed information.
Azure Data Factory helps us automate and schedule jobs according to customer demands and fire ETL triggers when the need arises. Anyone can define a workflow with the Azure Data Factory UI designer tool and easily test the systems. It has also helped us automate the same workflow with programming languages like Python or automation tools like Ansible. Numerous connectivity options, be it a database or a storage account, help us move data to the cloud or to on-premises systems.
The most important differentiating factor for Databricks Lakehouse Platform from these other platforms is support for ACID transactions and the time travel feature. Also, native integration with managed MLflow is a plus. EMR, Cloudera, and Hortonworks are not as optimized when it comes to Spark Job Execution. Other platforms need to be self-managed, which is another huge hassle.
I have had experience using one other database management system at my previous workplace. What Snowflake provides is more user-friendly consoles, suggestions while writing a query, ease of access to connect to various BI platforms to analyze, [and a] more robust system to store a large amount of data. All these functionalities give Snowflake the edge.