Microsoft's Blob Storage system on Azure is designed to make unstructured data available to customers anywhere through REST-based object storage.
$0.01 per GB per month
Azure Data Factory
Score 8.4 out of 10
N/A
Microsoft's Azure Data Factory is a service built for all data integration needs and skill levels. It is designed to allow the user to easily construct ETL and ELT processes code-free within the intuitive visual environment, or write one's own code. Visually integrate data sources using more than 80 natively built and maintenance-free connectors at no added cost. Focus on data—the serverless integration service does the rest.
Within Azure, Blob Storage is the storage service to use, and in my view it offers more, or finer-grained, configuration options than S3. My recommendation is to check in detail what is offered. Because Blob Storage is more or less a Microsoft-exclusive product, its interoperability is more limited than that of S3, for example. S3 is more widely adopted, so if you cannot rule out a migration from one cloud provider to another, expect additional effort.
Well-suited Scenarios for Azure Data Factory (ADF):
Hybrid Data Sources - When an organization has data sources spread across on-premises databases and cloud storage solutions, I think Azure Data Factory is excellent for integrating these sources.
Large-scale Transformations - Azure Data Factory's integration with Azure Databricks allows it to handle large-scale data transformations effectively, leveraging the power of distributed processing.
Scheduled ETL/ELT - For regular ETL or ELT processes that need to run at specific intervals (daily, weekly, etc.), I think Azure Data Factory's scheduling capabilities are very handy.
Less Appropriate Scenarios for Azure Data Factory:
Real-time Data Streaming - Azure Data Factory is primarily batch-oriented.
Simple Data Copy Tasks - For straightforward data copy tasks without the need for transformation or complex workflows, in my opinion, using Azure Data Factory might be overkill; simpler tools or scripts could suffice.
Advanced Data Science Workflows - While Azure Data Factory can handle data prep and transformation, in my experience, it's not designed for in-depth data science tasks. I think for advanced analytics, machine learning, or statistical modeling, integration with specialized tools would be necessary.
It allows copying data from many types of data sources, such as on-premises files, Azure databases, Excel, JSON, Azure Synapse, and APIs, to the desired destination.
A single linked service can be reused across multiple pipelines and data loads.
It also allows running SSIS packages, which makes it an easy-to-use ETL and ELT tool.
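The point about reusing a linked service can be illustrated with a small sketch. It models definitions as Python dictionaries loosely following the JSON shape Azure Data Factory uses for linked services, datasets, and pipelines. All names (BlobStoreLS, SalesRaw, and so on) are hypothetical, the connection string is a placeholder, and the Copy activity is simplified (a real one also needs source and sink settings), so treat this as an illustration rather than a deployable definition.

```python
# Hypothetical, simplified ADF-style definitions. Names are illustrative only.
linked_service = {
    "name": "BlobStoreLS",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {"connectionString": "<connection-string>"},  # placeholder
    },
}


def make_dataset(name, folder):
    """Build a dataset that points at the shared linked service."""
    return {
        "name": name,
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": linked_service["name"],
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": "data",
                    "folderPath": folder,
                }
            },
        },
    }


def make_copy_pipeline(name, source_dataset, sink_dataset):
    """Build a pipeline with one (simplified) Copy activity."""
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": "CopyData",
                    "type": "Copy",
                    "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
                    "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
                }
            ]
        },
    }


def linked_services_for(pipeline, datasets_by_name):
    """Resolve which linked services a pipeline touches through its datasets."""
    names = set()
    for activity in pipeline["properties"]["activities"]:
        for ref in activity.get("inputs", []) + activity.get("outputs", []):
            dataset = datasets_by_name[ref["referenceName"]]
            names.add(dataset["properties"]["linkedServiceName"]["referenceName"])
    return names


datasets = [
    make_dataset("SalesRaw", "sales/raw"),
    make_dataset("SalesClean", "sales/clean"),
    make_dataset("OrdersRaw", "orders/raw"),
    make_dataset("OrdersClean", "orders/clean"),
]
datasets_by_name = {d["name"]: d for d in datasets}

pipelines = [
    make_copy_pipeline("LoadSales", "SalesRaw", "SalesClean"),
    make_copy_pipeline("LoadOrders", "OrdersRaw", "OrdersClean"),
]

for p in pipelines:
    print(p["name"], sorted(linked_services_for(p, datasets_by_name)))
```

Both pipelines resolve to the same linked service, so the connection details live in one place and a credential change only has to be made once.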
Blob storage is fairly simple, with several different options/settings that can be configured. The file explorer has enhanced its usability. Some areas could be improved, such as providing more details or stats on how many times a file has been accessed. It is an obvious choice if you're already using Azure/Entra.
So far the product has performed as expected. We noticed some performance issues, but they were largely Synapse-related. This has led to a shift from Synapse to Databricks, which has delayed our analytics platform overall. Once Databricks becomes fully operational, Azure Data Factory will be critical to our environment and future success.
Microsoft has improved its customer service over the years. The ability to chat about an issue, get a callback, schedule a call, or work with an architecture team (for free) is a huge plus. I can get mentorship and guidance on where to go with my environment without pushy sales tactics. This is very refreshing. Typically support can get me to where I need to be on the first contact, which is also nice.
We have not needed to engage with Microsoft much on Azure Data Factory, but they have been responsive and helpful when needed. That said, we have not had a major emergency or outage requiring their intervention. The score of seven reflects that they have done well so far but have not yet proven their support on a significant issue.
Azure Blob Storage was used only because we were already using it for other projects, and it has a good reputation for being a reliable cloud provider. It also has widespread regional availability and allows for data replication. It can also be easily accessed via the API or by the console, which makes it a solid, user-friendly option.
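As a rough illustration of the API access mentioned above: each blob is addressed by a URL built from the storage account, container, and blob name. The sketch below only constructs that URL. The account, container, and blob names are made up, and a real request would also need authorization (for example a SAS token, a shared key, or Microsoft Entra ID), which is not shown here.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Return the endpoint URL for a blob (no authorization attached)."""
    # Blob names may use '/' as a virtual directory separator, so keep it unescaped.
    encoded_name = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded_name}"


# Hypothetical names, for illustration only.
print(blob_url("exampleaccount", "reports", "2024/q1 summary.csv"))
# prints https://exampleaccount.blob.core.windows.net/reports/2024/q1%20summary.csv
```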
Easy integration with other Microsoft software, high processing speed, very flexible cost, and a high level of security help Microsoft Azure products and services stack up well against similar products.
Azure Blob Storage is just way cheaper than anything we could afford to do on-prem. Forecasting spend is way easier with predictable growth than it is with large capital expenditures every few years, and that ability to grow or shrink dynamically simplifies things.
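To make the forecasting point concrete, here is a minimal sketch of a spend projection under linear growth. It assumes the $0.01 per GB per month list price shown in the listing above. The starting volume and growth figures are invented for illustration, and real bills also depend on access tier, redundancy, region, operations, and egress, none of which are modeled here.

```python
def forecast_storage_cost(start_gb, monthly_growth_gb, months, price_per_gb=0.01):
    """Return the projected storage charge for each month, assuming linear growth."""
    costs = []
    for month in range(months):
        stored_gb = start_gb + monthly_growth_gb * month
        costs.append(round(stored_gb * price_per_gb, 2))
    return costs


# Hypothetical workload: 5 TB today, growing by 500 GB every month.
costs = forecast_storage_cost(start_gb=5000, monthly_growth_gb=500, months=12)
print("first month:", costs[0])
print("last month:", costs[-1])
print("year total:", round(sum(costs), 2))
```

Because the charge tracks stored volume month by month, the same function can be rerun with a negative growth figure to see the effect of shrinking the data set.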