TrustRadius
Apache Airflow

Overview

What is Apache Airflow?

Apache Airflow is an open source tool that can be used to programmatically author, schedule and monitor data pipelines using Python and SQL. Created at Airbnb as an open-source project in 2014, Airflow was brought into the Apache Software Foundation’s…

Pricing

N/A
Unavailable

Entry-level set up fee?

  • No setup fee

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting/Integration Services

Alternatives Pricing

N/A
Unavailable
What is Control-M?

Control-M from BMC is a platform for integrating, automating, and orchestrating application and data workflows in production across complex hybrid technology ecosystems. It provides deep operational capabilities, delivering speed, scale, security, and governance.

What is Appy Pie?

Appy Pie is a diversified no-code development platform. It offers app and web development, helpdesk support, chatbot building, design features, and integration that are helpful when starting, running, or growing a business.

Product Demos

Getting Started with Apache Airflow

Apache Airflow | Build your custom operator for twitter API


Features

Workload Automation

Workload automation tools manage event-based scheduling and resource management across a wide variety of applications, databases, and architectures.

9.8
Avg 8.5

Product Details

What is Apache Airflow?

Apache Airflow Video

What's coming in Airflow 2.0?

Apache Airflow Technical Details

Operating Systems: Unspecified
Mobile Application: No

Frequently Asked Questions

Apache Airflow is an open-source tool that can be used to programmatically author, schedule, and monitor data pipelines using Python and SQL. Created at Airbnb as an open-source project in 2014, Airflow was brought into the Apache Software Foundation’s Incubator Program in 2016 and announced as a Top-Level Apache Project in 2019. It is used as a data orchestration solution, with over 140 integrations and community support.
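
As a rough illustration of what authoring a pipeline as code means, the sketch below declares three hypothetical tasks (`extract`, `transform`, `load`) and their dependencies in plain Python, then runs them in dependency order with the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). It does not use Airflow's own API; the task names, the `TASKS` mapping, and the `run_pipeline` helper are assumptions made for this example.

```python
from graphlib import TopologicalSorter


def extract():
    # Stand-in for reading rows from a source system.
    return [1, 2, 3]


def transform(rows):
    # Stand-in for a transformation step.
    return [r * 10 for r in rows]


def load(rows):
    # Stand-in for writing rows to a target system.
    return f"loaded {len(rows)} rows"


# Hypothetical pipeline definition: task name -> (callable, upstream task names).
TASKS = {
    "extract": (extract, []),
    "transform": (transform, ["extract"]),
    "load": (load, ["transform"]),
}


def run_pipeline(tasks):
    """Run every task after the tasks it depends on; return the order and results."""
    # TopologicalSorter expects each node mapped to the set of its predecessors.
    graph = {name: set(upstream) for name, (_, upstream) in tasks.items()}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        func, upstream = tasks[name]
        # Pass each upstream task's result as a positional argument.
        results[name] = func(*(results[u] for u in upstream))
    return order, results
```

Calling `run_pipeline(TASKS)` returns the execution order together with each task's result. A real orchestrator adds scheduling, retries, and monitoring on top of this kind of dependency ordering.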

Reviewers rate Multi-platform scheduling and Central monitoring highest, with a score of 10.

The most common users of Apache Airflow are from Enterprises (1,001+ employees).


Reviews From Top Reviewers

(1-5 of 6)

Apache Airflow

Rating: 8 out of 10
April 04, 2022
PM
Vetted Review
Verified User
Apache Airflow
1 year of experience
We use Apache Airflow to manage our ETL pipelines and to monitor them programmatically. I have been helping the data team create pipelines with Apache Airflow.
  • We use it as our workflow management system.
  • Managing the ETL pipelines.
  • We can manage task scheduling as code and need not monitor it, as there is no data in and out.
Cons
  • They should add time-based scheduling too, not only event-based.
  • They do not store the metadata, so we are not able to analyze the workflows.
  • They only support Python as of now for writing scripted pipelines.
We were using it to manage the workflows for our ETL pipelines as code, so Airflow was very helpful.

We used it to manage processes for ETL pipelines

Rating: 9 out of 10
July 05, 2022
Vetted Review
Verified User
Apache Airflow
3 years of experience
We use Apache Airflow to streamline the data pipelines, create workflows according to the needs of the project, and monitor the overall functionality. In addition, we use Apache Airflow to solve the problem of retrieving data from Hive before creating the workflow in its entirety. It's also used for automation.
  • In charge of the ETL processes.
  • As there is no incoming or outgoing data, we may handle the scheduling of tasks as code and avoid the requirement for monitoring.
Cons
  • There is no way to assess the processes because they do not keep the metadata.
  • Python is currently the only language supported for creating programmed pipelines.
  • They need to implement both event-based and time-based scheduling.
I handle our pipeline scheduling and monitoring. I had minimal problems with Apache Airflow. It's well-suited for data engineers who are responsible for the creation of the data workflows. It is also best suited for the scheduling of the workflow; it allows us to execute Python scripts as well. Finally, Apache Airflow is best suited for the circumstances in which we need a scalable solution.

A very nice job scheduler of DAGs that could become even better

Rating: 7 out of 10
June 13, 2022
NG
Vetted Review
Verified User
Apache Airflow
3 years of experience
We use Apache Airflow in GCP as part of Cloud Composer to run all our ETL jobs.
  • Scheduling jobs
  • Graphing job flow, dependencies, and retries
  • Nice UI for visualization
Cons
  • Instead of using a Storage bucket as a source, it would be nice if the DAGs could be pulled directly from a private Git repo
  • Upgrade process could be smoother
If you are using GCP, you can use Apache Airflow very easily by using Cloud Composer which is the managed service for Airflow. If you need to deploy it yourself, installation and setup could be tricky.

Apache Airflow for Automation and scheduling

Rating: 9 out of 10
May 05, 2022
Vetted Review
Verified User
Apache Airflow
3 years of experience
We use Apache Airflow to streamline our data pipelines: creating workflows, scheduling them as needed, and monitoring them. We are solving the problem of fetching data from Hive and then creating the complete workflow, and we also use it for automation.
  • Smart Automation
  • Highly Scalable
  • Complex Workflow
  • Easy integration with other systems
Cons
  • Documentation
  • GUI can be improved
  • Reliability issues
Apache Airflow is best suited to data engineers creating data workflows and to scheduling those workflows. We can also run Python code with Apache Airflow, and it suits situations where we need a scalable solution. Monitoring can be done easily.

Apache Airflow software

Rating: 9 out of 10
June 21, 2022
Vetted Review
Verified User
Apache Airflow
1 year of experience
Apache Airflow is used for the scheduling and orchestration of data pipelines or workflows. Orchestration of data pipelines refers to the sequencing, coordination, scheduling, and managing of complex data pipelines from diverse sources. It is also helpful when your data pipelines change slowly (days or weeks – not hours or minutes), are related to a specific time interval, or are pre-scheduled.
  • Scheduling of data pipelines or workflows.
  • Orchestration of data pipelines or workflows.
Cons
  • Not intuitive for new users.
  • Setting up Airflow architecture for production is NOT easy.
Ease of use: you only need a little Python knowledge to get started. Open-source community: Airflow is free and has a large community of active users. Apache Airflow is used for the scheduling and orchestration of data pipelines or workflows. Orchestration of data pipelines refers to the sequencing, coordination, scheduling, and managing of complex data pipelines from diverse sources.
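
To make the point about pipelines "related to a specific time interval" concrete, here is a minimal sketch that splits a date range into fixed daily intervals, one per scheduled run. It is plain Python rather than Airflow's scheduler; `run_intervals` and its parameters are hypothetical names chosen for this example.

```python
from datetime import datetime, timedelta


def run_intervals(start, end, step=timedelta(days=1)):
    """Yield (interval_start, interval_end) pairs that tile [start, end)."""
    current = start
    # Only emit complete intervals; a partial trailing interval is skipped.
    while current + step <= end:
        yield current, current + step
        current += step


# Example: three daily runs covering June 1 through June 3, 2022.
runs = list(run_intervals(datetime(2022, 6, 1), datetime(2022, 6, 4)))
```

Each pair could then be handed to a pipeline run so that it processes only the data belonging to its own interval, which fits slowly changing, pre-scheduled workloads better than minute-by-minute streaming.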