TrustRadius: an HG Insights company

Azure Data Factory

Score: 8.1 out of 10

67 Reviews and Ratings

What is Azure Data Factory?

Microsoft's Azure Data Factory is a service built for all data integration needs and skill levels. It is designed to allow the user to easily construct ETL and ELT processes code-free within the intuitive visual environment, or write one's own code. Visually integrate data sources using more than 80 natively built and maintenance-free connectors at no added cost. Focus on data—the serverless integration service does the rest.

Categories & Use Cases

Top Performing Features

  • Connect to traditional data sources

    Ability to connect to traditional data sources like relational databases, flat files, XML files and packaged applications

    Category average: 8.9

  • Simple transformations

    Simple data transformations are calculations, data type conversions, aggregations and search and replace operations

    Category average: 8.9

  • Connect to Big Data and NoSQL

    Ability to connect to non-traditional data sources like Hadoop and other big data technologies, and NoSQL databases

    Category average: 7.6

Areas for Improvement

  • Metadata management

    Automated discovery of metadata with ability to synchronize and share metadata with other tools like Master Data Management

    Category average: 7.5

  • Data model creation

    Ability to create and maintain data models using a graphical tool to define relationships between data

    Category average: 8.4

  • Integration with data quality tools

    Integration with tools for cleansing, parsing and normalizing data according to business rules

    Category average: 7.4

Azure Data Factory: a universal pipeline

Use Cases and Deployment Scope

We live in a world where half of the data for analytics comes from SAP and half from non-SAP sources. We use Azure Data Factory to load non-SAP data from different source systems into our Azure lakehouse. The project follows the medallion architecture: Azure Data Factory takes data from multiple sources and stores it in the bronze layer. Since our SAP Datasphere cannot connect to non-SAP sources as well or as natively as Azure Data Factory, we use Azure Data Factory for these scenarios. Further modelling of the data in the next layers (silver and gold) is done in Azure Databricks, where the final data product is created. Azure Data Factory also helps in applying transformations while loading the data from the different source systems. Datasphere often relies on ODBC/JDBC/OData connectivity, whereas Azure Data Factory provides maintenance-free connectors for our web applications (like the partner portal), cloud applications (like one crm), on-prem Oracle systems, and also NoSQL databases like MongoDB. To summarize, Azure Data Factory is used in our organisation to ingest non-SAP data from different sources into our bronze layer, where Databricks then cleans and curates it for data product creation. Without Azure Data Factory, connecting the data from these different sources wouldn't have been possible, because SAP Datasphere has limitations when it comes to connecting to non-SAP source systems.
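ADF pipelines are authored as JSON definitions. As a hedged sketch of the bronze-layer ingestion pattern described above, the following builds one such definition as a Python dict; every name (pipeline, datasets, source type, path convention) is illustrative, not taken from the review.

```python
# Hypothetical sketch of an ADF pipeline definition (ADF pipelines are
# stored as JSON) that copies data from a non-SAP source into the bronze
# layer of a medallion-style lakehouse. All names are invented.
import json

bronze_ingest_pipeline = {
    "name": "pl_ingest_crm_to_bronze",
    "properties": {
        "activities": [
            {
                "name": "CopyCrmAccounts",
                "type": "Copy",
                "inputs": [{"referenceName": "ds_crm_accounts", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "ds_bronze_parquet", "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "RestSource"},
                    "sink": {"type": "ParquetSink"},
                },
            }
        ],
        # Parameter so one pipeline can write dated bronze partitions,
        # e.g. bronze/crm/accounts/2024-01-15/
        "parameters": {"loadDate": {"type": "string"}},
    },
}

print(json.dumps(bronze_ingest_pipeline, indent=2))
```

Silver- and gold-layer modelling would then happen downstream in Databricks, as the review describes; ADF's job ends once the raw copy lands in bronze.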

Pros

  • Connectivity with other cloud environments like Salesforce
  • Connectivity with unstructured data and big data systems
  • Reduces data islands
  • Azure Data Factory perfectly handles the huge volume of JSON data from our global apps and services

Cons

  • Error details when file processing fails are not clear
  • Connectivity with the SAP S/4 system is not as good as Datasphere's
  • Since Azure Data Factory just transfers data, it lacks the capacity to identify bad data. It is just a simple data transfer tool from point A to B

Return on Investment

  • The drag-and-drop interface is a positive feature that allows end users to create data pipelines
  • The no-/low-code approach sometimes causes a spaghetti situation
  • Cost-efficient: it is serverless, and the pay-per-use pricing is attractive
  • No built-in metadata management or data governance

Usability

Alternatives Considered

SAP BW/4HANA, SAP Business Warehouse, SAP Datasphere, Informatica PowerCenter (legacy) and Anaplan

Other Software Used

SAP Datasphere, Anaplan, Azure Databricks

Azure Data Factory - data integration tool to build your Cloud Data Platform.

Use Cases and Deployment Scope

Azure Data Factory is a data integration technology used to integrate data into the cloud, especially the Azure cloud. Today, any organization wants to integrate its data into a cloud data platform. This cloud platform serves data for various purposes, such as sharing it with downstream applications, developing analytics, and even building AI applications on it. Azure Data Factory integrates data from various sources, such as on-prem applications, databases, and files, into this data platform.

Pros

  • Data Ingestion - it works very well with numerous data sources.
  • Data pipeline orchestration: It is a generic, popular tool for orchestrating data pipelines.
  • Works well in Azure ecosystem, Azure services and data platforms like Databricks.
  • It is a serverless and scalable solution for cloud data integration.

Cons

  • Data Factory provides Data Flows for transformation, but they are not ideal for complex data transformations.
  • The cost of Data Factory depends on the number of pipelines and transformations used.
  • Azure Data Factory is efficient and good for parallel data pipeline runs, but not ideal for very large volumes of data.

Return on Investment

  • Reduces the cost of moving data to the cloud.
  • Efficient for monitoring and maintaining the cloud data platform.
  • No code, low code.

Usability

Alternatives Considered

Informatica Cloud Data Quality

Other Software Used

Azure Databricks, GitLab, Oracle Data Integrator (ODI)

Overall helpful product that works as advertised.

Use Cases and Deployment Scope

We use SHIR to pull records from on-premises databases and store them in ADLS storage. From ADLS, data is brought into Databricks for analytics use. There are roughly 50 different pipelines in each environment, with 3 separate environments. Code is stored and deployed from Azure DevOps. Alerting is handled via LogicMonitor and Azure Functions.
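With ~50 similar pipelines per environment across 3 environments, the "storing infrastructure as code" approach mentioned below usually means generating pipeline configurations rather than hand-editing them. A minimal sketch, assuming invented table, environment, and storage-account names:

```python
# Hedged sketch: generate per-environment SHIR-to-ADLS copy-pipeline
# configs from a table list. Table/environment/account names are invented,
# not taken from the review.
tables = ["customers", "orders", "invoices"]
environments = ["dev", "test", "prod"]

def pipeline_config(env: str, table: str) -> dict:
    """Config for one pipeline that copies an on-prem table into ADLS."""
    return {
        "name": f"pl_{env}_copy_{table}",
        "integrationRuntime": f"shir-{env}",  # self-hosted IR per environment
        "source": {"type": "SqlServerTable", "table": table},
        "sink": {"path": f"abfss://raw@{env}lake.dfs.core.windows.net/{table}/"},
    }

configs = [pipeline_config(e, t) for e in environments for t in tables]
print(len(configs))  # 3 environments x 3 sample tables = 9 configs
```

The generated dicts can then be serialized to JSON and deployed from version control (Azure DevOps, in the reviewer's setup), keeping the three environments consistent.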

Pros

  • Step by step processes.
  • Storing infrastructure as code.
  • Alerting on job failures.
  • SHIR

Cons

  • Learning curve for pipeline creation interface.
  • Alerting isn't necessarily built in. We had to work around this to meet team needs.
  • With Git enabled, some features can only be done via Git, while others need to be done via the portal.

Return on Investment

  • Still working on ROI. Development is ongoing after some non-Azure Data Factory related changes.

Usability

Other Software Used

Microsoft Exchange, Microsoft Azure, Microsoft Azure Key Vault, Nerdio

One of the best and most reliable ETL & ELT platforms for pulling data from multiple sources

Use Cases and Deployment Scope

One of the best data integration tools for both ETL and ELT. I have been using ADF for the last 6+ years, and it has helped me extract several data feeds within our organization that meet our specific business needs. The tool provides many features such as Move and Transform, Data Explorer, Azure Functions, Databricks, Data Lake Analytics, Blob Storage, linked services, Machine Learning, and Power Query.

Pros

  • It allows copying data from various types of data sources, like on-premises files, Azure databases, Excel, JSON, Azure Synapse, APIs, etc., to the desired destination.
  • We can reuse a linked service across multiple pipelines and data loads.
  • It also allows running SSIS packages, which makes it an easy-to-use ETL & ELT tool.
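The linked-service reuse the reviewer praises typically comes from parameterizing one linked service and binding it per dataset. A sketch of that JSON shape built as Python dicts; the server, database, and dataset names are all invented:

```python
# Illustrative sketch of one parameterized linked service shared by
# several datasets, mirroring how ADF stores these objects as JSON.
# All names and the connection string are hypothetical.
linked_service = {
    "name": "ls_azure_sql",
    "properties": {
        "type": "AzureSqlDatabase",
        "parameters": {"dbName": {"type": "String"}},
        "typeProperties": {
            "connectionString": (
                "Server=tcp:myserver.database.windows.net;"
                "Database=@{linkedService().dbName};"
            )
        },
    },
}

def dataset_for(db_name: str, table: str) -> dict:
    """Dataset that binds the shared linked service to one database/table."""
    return {
        "name": f"ds_{db_name}_{table}",
        "properties": {
            "linkedServiceName": {
                "referenceName": "ls_azure_sql",
                "type": "LinkedServiceReference",
                "parameters": {"dbName": db_name},
            },
            "type": "AzureSqlTable",
            "typeProperties": {"tableName": table},
        },
    }

sales = dataset_for("salesdb", "orders")
hr = dataset_for("hrdb", "employees")
```

One connection definition, many pipelines and data loads: only the `dbName` parameter changes per dataset, which is what makes the reuse cheap.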

Cons

  • For complex JSON, mapping nested attributes is not easy to flatten out
  • Data Factory V1 does not offer as good an implementation experience as V2
  • Working with on-premises solutions is sometimes not too friendly, because you will need to set up a VPN
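To make the nested-JSON complaint concrete: what ADF's mapping-data-flow Flatten step has to do for each nested attribute and array index can be sketched in a few lines of plain Python. This is a comparison sketch, not ADF code:

```python
# Minimal pure-Python flattener for nested JSON-like records, shown as a
# point of comparison for the flattening the review finds painful in ADF.
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into dot/index-keyed scalar leaves."""
    if isinstance(obj, dict):
        out = {}
        for k, v in obj.items():
            out.update(flatten(v, f"{prefix}{k}."))
        return out
    if isinstance(obj, list):
        out = {}
        for i, v in enumerate(obj):
            out.update(flatten(v, f"{prefix}{i}."))
        return out
    return {prefix[:-1]: obj}  # strip the trailing dot from the key

record = {"id": 1, "address": {"city": "Oslo"}, "tags": ["a", "b"]}
print(flatten(record))
# {'id': 1, 'address.city': 'Oslo', 'tags.0': 'a', 'tags.1': 'b'}
```

Every level of nesting multiplies the column mappings a data flow must declare, which is why deeply nested payloads get unwieldy in a visual tool.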

Return on Investment

  • ADF makes the whole ETL process very simple and manageable.
  • It saves a lot of cost and time.
  • Solving data ingestion with an ELT approach.
  • Compact storage formats help us a lot when dealing with big data problems.

Alternatives Considered

AWS Glue

Other Software Used

Fivetran, Talend Data Integration, Informatica PowerCenter

Azure Databricks

Use Cases and Deployment Scope

We use it as an orchestration platform for Databricks notebooks. We have also used it as an ETL tool for loading CSV files into a SQL Server-based database.
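The two uses described above combine naturally in one pipeline: a Copy activity for the CSV-to-SQL load, followed by a Databricks notebook activity for transformation. A hedged sketch of that pipeline JSON, with invented activity and notebook names:

```python
# Hypothetical sketch of an ADF orchestration pipeline: copy CSVs into
# SQL Server, then run a Databricks notebook once the copy succeeds.
# Activity names and the notebook path are illustrative.
pipeline = {
    "name": "pl_load_and_transform",
    "properties": {
        "activities": [
            {
                "name": "CopyCsvToSql",
                "type": "Copy",
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "SqlSink"},
                },
            },
            {
                "name": "RunTransformNotebook",
                "type": "DatabricksNotebook",
                # Dependency: run only after the copy step succeeds
                "dependsOn": [
                    {"activity": "CopyCsvToSql", "dependencyConditions": ["Succeeded"]}
                ],
                "typeProperties": {"notebookPath": "/etl/transform_sales"},
            },
        ]
    },
}

print([a["name"] for a in pipeline["properties"]["activities"]])
```

The `dependsOn` edge is what makes ADF an orchestration engine here: the notebook activity is gated on the upstream load rather than scheduled independently.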

Pros

  • Orchestration engine
  • Low code Data pipeline
  • Logic apps integration

Cons

  • Error flagging: error-code details are not specific; we especially faced this during Azure Table loads
  • Missing a data exploration feature similar to Synapse Data Explorer
  • Missing the ability to orchestrate or create Stream Analytics jobs

Return on Investment

  • No code / low code makes development easier
  • An easier orchestration platform
  • Lots of different services available to plug in and connect

Alternatives Considered

Azure Synapse Analytics (Azure SQL Data Warehouse) and Oracle Data Integrator

Other Software Used

Azure Synapse Analytics (Azure SQL Data Warehouse), Databricks Lakehouse Platform (Unified Analytics Platform), Azure Blob Storage