Dataflow Eliminating ETL Infrastructure Overhead
February 06, 2026

Anonymous | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Overall Satisfaction with Google Cloud Dataflow

We use Google Cloud Dataflow as the primary ETL engine for our billing application. Raw financial data lands in Cloud Storage as Excel files, and Dataflow pipelines then handle data cleansing, schema mapping, and validation. We use Dataflow's batch processing to transform this unstructured data into structured datasets in BigQuery. This automatically triggers generation of a new invoice, which is kept ready for download.
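To make the three stages concrete, here is a minimal sketch of cleansing, schema mapping, and validation written as plain Python functions over one spreadsheet row. It is an illustration only, not the reviewer's actual pipeline: the column names, the `COLUMN_MAP` mapping, and the validation rules are all assumptions. Functions of this shape could be applied per element inside a pipeline step.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical mapping from spreadsheet headers to warehouse column names.
COLUMN_MAP = {
    "Customer Name": "customer_name",
    "Invoice Amt": "amount",
    "Currency": "currency",
}


def cleanse(row):
    """Strip stray whitespace from header names and string values."""
    return {
        str(key).strip(): value.strip() if isinstance(value, str) else value
        for key, value in row.items()
    }


def map_schema(row):
    """Rename known columns to the target schema and drop unknown ones."""
    return {COLUMN_MAP[key]: value for key, value in row.items() if key in COLUMN_MAP}


def validate(row):
    """Return a list of problems; an empty list means the row is valid."""
    errors = []
    if not row.get("customer_name"):
        errors.append("missing customer_name")
    try:
        if Decimal(str(row.get("amount"))) < 0:
            errors.append("negative amount")
    except InvalidOperation:
        errors.append("amount is not a number")
    return errors


def transform(row):
    """Run all three stages and return (mapped_row, errors)."""
    mapped = map_schema(cleanse(row))
    return mapped, validate(mapped)
```

Rows that come back with a non-empty error list can be held back from the warehouse load, which matters when the output feeds invoices.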

Pros

  • Dataflow gives us the exactly-once processing we require for our invoices, where accuracy is very important.
  • The native connectors for BigQuery and Cloud Storage, along with the BQtoStorage templates, made our job easy because we didn't have to write custom templates.
  • We chose Google Cloud Dataflow because of its unified stream and batch processing capabilities, as we are working on stream processing for the data we get from Google in Billing Exports.

Cons

  • We would like more templates for BigQuery and App Engine. The template options are limited, which restricts what we can do with them.
  • I would like native connectors for Excel (XLSX) to reduce the need for custom wrappers in financial pipelines.
  • Debugging Google Cloud Dataflow using only logs in Cloud Logging can be overwhelming sometimes, and it’s not always obvious which specific element in the flow caused a failure. It takes a lot of time.
  • It has automated our workflow and data enrichment steps, which were very resource- and time-hungry.
  • Unlike traditional ETL tools that require a 24/7 server, Dataflow scales to zero when there are no files in GCS, which is very important for us.
  • With the Apache Beam SDK you can write a pipeline once and handle the entire GCS-to-BigQuery flow.
It really saved a lot of time, and its flexibility gives you infrastructure that is future-proof for most use cases, whether streaming or batch data. With this you can also avoid resource-heavy big data offerings.
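On the debugging point in the cons above, one common way to find the element that caused a failure is a dead-letter pattern: catch the exception per element and keep the failing input next to its error instead of relying on logs alone. The sketch below shows the idea in plain Python. The helper names `run_with_dead_letter` and `parse_amount` and the record fields are hypothetical, and this is not a Dataflow feature; in an Apache Beam pipeline the same idea is typically expressed by routing failures to a separate tagged output.

```python
def run_with_dead_letter(elements, step):
    """Apply `step` to each element; return (successes, dead_letters).

    Each dead-letter entry keeps the original element, its position,
    and the error text, so the offending input can be inspected directly.
    """
    good, dead = [], []
    for index, element in enumerate(elements):
        try:
            good.append(step(element))
        except Exception as exc:  # broad on purpose: capture everything for triage
            dead.append({
                "index": index,
                "element": element,
                "error": f"{type(exc).__name__}: {exc}",
            })
    return good, dead


def parse_amount(record):
    """Hypothetical step: require an invoice id and a numeric amount."""
    return {"invoice_id": record["invoice_id"], "amount": float(record["amount"])}


records = [
    {"invoice_id": "A1", "amount": "10.5"},
    {"invoice_id": "A2", "amount": "n/a"},  # fails: amount is not numeric
    {"amount": "3"},                        # fails: invoice_id is missing
]
good, dead = run_with_dead_letter(records, parse_amount)
```

Writing the `dead` entries to their own table or file gives a direct list of bad inputs to review, rather than a search through log lines.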

Do you think Google Cloud Dataflow delivers good value for the price?

Yes

Are you happy with Google Cloud Dataflow's feature set?

Yes

Did Google Cloud Dataflow live up to sales and marketing promises?

Yes

Did implementation of Google Cloud Dataflow go as expected?

Yes

Would you buy Google Cloud Dataflow again?

Yes

It is best in cases where you have both batch and streaming data, or where you have batch data today and expect streaming data in the future. In those cases Dataflow is very good.
It also fits well when most of your infrastructure is on GCP.
It might not be a good fit if you are already on AWS or Azure, or if you want in-depth control over security and management. In that case you can use Apache Beam directly instead of Dataflow.

Google Cloud Dataflow Feature Ratings

Real-Time Data Analysis: 8
Visualization Dashboards: 5
Data Ingestion from Multiple Data Sources: 9
Low Latency: 9
Integrated Development Tools: 6
Data wrangling and preparation: 7
Linear Scale-Out: 8
Machine Learning Automation: 6
Data Enrichment: 8
