AWS Glue vs. Google Cloud Dataflow

Overview
AWS Glue
Rating: Score 8.6 out of 10
Most Used By: N/A
Product Summary: AWS Glue is a managed extract, transform, and load (ETL) service designed to make it easy for customers to prepare and load data for analytics. With it, users can create and run an ETL job in the AWS Management Console. Users point AWS Glue to data stored on AWS, and AWS Glue discovers data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, data is immediately searchable, queryable, and available for ETL.
Starting Price: $0.44 (billed per second, 1 minute minimum)

Google Cloud Dataflow
Rating: Score 8.8 out of 10
Most Used By: N/A
Product Summary: Google offers Cloud Dataflow, a managed streaming analytics platform for real-time data insights, fraud detection, and other purposes.
Starting Price: N/A
Pricing
Editions & Modules
AWS Glue: per DPU-Hour, $0.44, billed per second, 1 minute minimum
Google Cloud Dataflow: No answers on this topic
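The per-second billing with a one-minute minimum can be made concrete with a small calculation. The sketch below assumes (the page does not state this) that the cost of a job run is the number of DPUs times the billed duration in hours times the $0.44 DPU-hour rate, and that the one-minute minimum applies to each job run. The function name and its parameters are hypothetical.

```python
DPU_HOUR_RATE_USD = 0.44   # per DPU-hour, from the pricing listed above
MIN_BILLED_SECONDS = 60    # "1 minute minimum" (assumed to apply per job run)


def estimate_glue_job_cost(dpus: int, runtime_seconds: float) -> float:
    """Estimate the cost in USD of one job run under the stated assumptions."""
    if dpus <= 0 or runtime_seconds < 0:
        raise ValueError("dpus must be positive and runtime_seconds non-negative")
    # Per-second billing, but never less than the one-minute minimum.
    billed_seconds = max(runtime_seconds, MIN_BILLED_SECONDS)
    dpu_hours = dpus * billed_seconds / 3600
    return dpu_hours * DPU_HOUR_RATE_USD


if __name__ == "__main__":
    # A 10-DPU job running for 10 minutes, and a 2-DPU job running for 30 seconds.
    print(f"{estimate_glue_job_cost(10, 600):.4f}")
    print(f"{estimate_glue_job_cost(2, 30):.4f}")
```

Under these assumptions, a 10-DPU job that runs for 10 minutes would cost roughly $0.73, and any run shorter than a minute is billed as a full minute.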
Offerings
Pricing Offerings
Offering | AWS Glue | Google Cloud Dataflow
Free Trial | No | No
Free/Freemium Version | No | No
Premium Consulting/Integration Services | No | No
Entry-level Setup Fee | No setup fee | No setup fee
Community Pulse
AWS Glue | Google Cloud Dataflow
Considered Both Products
AWS Glue
Chose AWS Glue
We already use AWS services, so AWS Glue was the first choice for us. However, when comparing ETL job creation and processing time, other services are much faster.
Google Cloud Dataflow

No answer on this topic

Features
Streaming Analytics
Comparison of Streaming Analytics features of AWS Glue and Google Cloud Dataflow
Feature | AWS Glue | Google Cloud Dataflow
Streaming Analytics (overall) | - | 7.3 (2 Ratings), 9% below category average
Real-Time Data Analysis | 0 (0 Ratings) | 8.0 (2 Ratings)
Visualization Dashboards | 0 (0 Ratings) | 5.0 (1 Ratings)
Data Ingestion from Multiple Data Sources | 0 (0 Ratings) | 9.0 (2 Ratings)
Low Latency | 0 (0 Ratings) | 9.0 (2 Ratings)
Integrated Development Tools | 0 (0 Ratings) | 6.0 (1 Ratings)
Data wrangling and preparation | 0 (0 Ratings) | 7.0 (1 Ratings)
Linear Scale-Out | 0 (0 Ratings) | 8.0 (2 Ratings)
Machine Learning Automation | 0 (0 Ratings) | 6.0 (2 Ratings)
Data Enrichment | 0 (0 Ratings) | 8.0 (2 Ratings)
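As a quick sanity check on the numbers above, the 7.3 section score for Google Cloud Dataflow is consistent with a plain average of its nine feature scores, and "9% below category average" would then imply a category average of about 8.0. Both readings are assumptions, since the page does not say how either figure is computed. A small sketch:

```python
from statistics import mean

# Google Cloud Dataflow feature scores as listed in the Streaming Analytics section.
dataflow_scores = {
    "Real-Time Data Analysis": 8.0,
    "Visualization Dashboards": 5.0,
    "Data Ingestion from Multiple Data Sources": 9.0,
    "Low Latency": 9.0,
    "Integrated Development Tools": 6.0,
    "Data wrangling and preparation": 7.0,
    "Linear Scale-Out": 8.0,
    "Machine Learning Automation": 6.0,
    "Data Enrichment": 8.0,
}

# Assumption: the section score is an unweighted mean, rounded to one decimal.
section_score = round(mean(dataflow_scores.values()), 1)

# Assumption: "9% below category average" means score = average * (1 - 0.09).
implied_category_average = round(section_score / (1 - 0.09), 1)

print(section_score, implied_category_average)
```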
Best Alternatives
Company Size | AWS Glue | Google Cloud Dataflow
Small Businesses | IBM SPSS Modeler, Score 9.2 out of 10 | IBM Streams (discontinued), Score 9.0 out of 10
Medium-sized Companies | IBM InfoSphere Information Server, Score 8.0 out of 10 | Confluent, Score 9.3 out of 10
Enterprises | IBM InfoSphere Information Server, Score 8.0 out of 10 | Spotfire Streaming, Score 5.2 out of 10
User Ratings
Metric | AWS Glue | Google Cloud Dataflow
Likelihood to Recommend | 8.8 (10 ratings) | 8.0 (1 rating)
Usability | 9.2 (3 ratings) | - (0 ratings)
Support Rating | 7.0 (1 rating) | - (0 ratings)
User Testimonials
AWS Glue | Google Cloud Dataflow
Likelihood to Recommend
Amazon AWS
One of AWS Glue's most notable features for creating and transforming data is its Data Catalog. Beyond that, support, scheduling, and automated schema recognition make it superior to its competitors. It also integrates perfectly with other AWS tools. The main limitation may be integration with systems outside the AWS environment: it works flawlessly with existing AWS services but not with other products. Another potential limitation is that Glue runs on Spark, which means the engineer needs to be conversant with it.
Google
It is best suited to cases where you have both batch and streaming data, or where you have batch data now and expect streaming data in the future. It is also a good fit when most of your infrastructure is on GCP. It might not be a good choice if you are already on AWS or Azure, or if you want in-depth control over security and management; in that case you can use Apache Beam directly instead of Dataflow.
Pros
Amazon AWS
  • It is extremely fast, easy, and intuitive. Although it is a suite of services, it takes very little time to get comfortable with it.
  • As it is a managed service, one need not take care of a lot of underlying details. The identification of data schema, code generation, customization, and orchestration of the different job components allow developers to focus on the core business problem without worrying about infrastructure issues.
  • It is a pay-as-you-go service, so there is no need to provision any capacity in advance, which makes scheduling much easier.
Google
  • Streaming and real-time workloads
  • Batch processing
  • Auto scaling
  • Flexible pricing
Cons
Amazon AWS
  • The stream schema registries feature is hard to use efficiently.
  • The Connections feature could offer more connectors.
  • The crucial problem with AWS Glue is that it only works with AWS.
Google
  • More templates for BigQuery and App Engine are needed. There are only limited template options, which can restrict what we are able to use.
  • I would like native connectors for Excel (XLSX) to reduce the need for custom wrappers in financial pipelines.
  • Debugging Google Cloud Dataflow using only logs in Cloud Logging can be overwhelming sometimes, and it’s not always obvious which specific element in the flow caused a failure. It takes a lot of time.
Usability
Amazon AWS
While it is easy to set up and manage monitoring for large datasets, its complexity can be a barrier for new users. Strengths include integration with the AWS ecosystem, managed monitoring (dashboards and monitoring tools for AWS Glue are generally easy to set up and maintain), and automated data pipelines: it automates data pipeline creation, making it efficient for certain kinds of data integration.
Google
It really saved a lot of time, and its flexibility can give you infrastructure that is future-proof for most use cases, whether streaming or batch data. With it, you can also avoid using resource-heavy big data offerings.
Support Rating
Amazon AWS
Amazon responds in good time once a ticket has been raised, but we need to raise tickets frequently because very few code samples are available, and they do not cover all scenarios.
Google
No answers on this topic
Alternatives Considered
Amazon AWS
AWS Glue is a fully managed ETL service that automates many ETL tasks, making it easier to set up. AWS Glue simplifies ETL through a visual interface and automated code generation.
Google
Google Cloud Dataproc, Cloud Data Fusion
Return on Investment
Amazon AWS
  • We use Glue for our ETL needs. Its ease of use with our other AWS services gives us a 100% ROI.
  • One missing piece was compatibility with other data sources. We found a workaround by making S3 our only data source, so our dependency on other data sources is also shrinking.
Google
  • Cost savings from no longer managing our own data center for ETL servers
  • Consumption-based pricing
  • With the auto-scaling feature, we were able to expand components to support the workload