TrustRadius: an HG Insights company

AWS Glue Reviews and Ratings

Rating: 8.6 out of 10

Reviews

10 Reviews

AWS Glue Partner Review

Rating: 10 out of 10
Incentivized

Use Cases and Deployment Scope

As an AWS Advanced Consulting Partner, we use AWS Glue in many of our data and analytics solutions. We have implemented it at major enterprises in the Philippines in the media, telecommunications, logistics, and fintech industries. One such company aims to centralize its operational raw data, which contains various shipping details, in a data lake on the AWS platform. The architecture must automate data extraction from an API. The data lake should also be visualized with QuickSight to provide graphical details, and the generated dashboards are to be embedded into the customer web portal. AWS services implemented: Lambda, S3, Glue, Athena, QuickSight, EventBridge.

Pros

  • After data cleansing, the team also implemented the best practices for using AWS platform services as a data lake, such as job bookmarking for AWS Glue jobs, proper delimiters for the AWS Glue crawlers, partitioning in Amazon S3, and conversion to Parquet files for compression and faster query times in Amazon Athena.
  • Data modernization through combining data from multiple sources into functioning datasets, rebuilding the data warehouse (DW), and restructuring data sources.
  • Aims to lessen customer complaints, eliminate manual data extraction requests via SR from different data sources, and increase the accuracy, consistency, and speed of the reconciliation process.
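The S3 partitioning practice mentioned above can be illustrated with a minimal sketch. It assumes a Hive-style `year=/month=/day=` key layout, which crawlers and Athena commonly recognize as partitions; the prefix, file name, and helper name `partitioned_key` are hypothetical and not taken from the review.

```python
from datetime import date


def partitioned_key(prefix: str, event_date: date, filename: str) -> str:
    """Build a Hive-style partitioned S3 object key (year=/month=/day=)."""
    return (
        f"{prefix.strip('/')}/"
        f"year={event_date.year:04d}/"
        f"month={event_date.month:02d}/"
        f"day={event_date.day:02d}/"
        f"{filename}"
    )


# Hypothetical prefix and file name, for illustration only.
print(partitioned_key("raw/shipments", date(2024, 1, 5), "part-0000.parquet"))
```

Writing objects under keys like this lets a query that filters on year, month, or day read only the matching prefixes rather than the whole data set.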

Cons

  • Faster processing in cases where data is not partitioned efficiently
  • Cost optimization and pricing
  • Developer experience for new users

Likelihood to Recommend

Operational Excellence: A customer asked for guidance mostly on the path from data ingestion to transformation. We also advised the customer to use Amazon CloudWatch to monitor their own AWS Glue jobs, since we ourselves rely mostly on CloudWatch information when we resolve their Glue job errors.

AWS Glue-The ETL Friend.

Rating: 9 out of 10
Incentivized

Use Cases and Deployment Scope

Utilized AWS Glue for the ETL process in a healthcare domain, where data from claims-related 837 and remittance-related 835 JSON files was ingested into Delta tables. Used it for data cleansing, validation, and processing as per business needs. During processing, AWS Glue uses Apache Spark environments to run transformation scripts in Python.
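The validation step described here might look roughly like the following sketch in plain Python. The field names (`claim_id`, `patient_id`, `billed_amount`) and the rules are hypothetical stand-ins, not the reviewer's actual 837/835 schema, and the check is shown on a single record rather than inside a Spark job.

```python
import json

# Hypothetical required fields; a real 837/835-derived schema will differ.
REQUIRED_FIELDS = ("claim_id", "patient_id", "billed_amount")


def validate_claim(record: dict) -> list[str]:
    """Return a list of validation errors for one claim record (empty if valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    amount = record.get("billed_amount")
    if amount not in (None, ""):
        try:
            if float(amount) < 0:
                errors.append("billed_amount is negative")
        except (TypeError, ValueError):
            errors.append("billed_amount is not numeric")
    return errors


raw = '{"claim_id": "C1", "patient_id": "", "billed_amount": "-5"}'
print(validate_claim(json.loads(raw)))
```

Records that return a non-empty error list could be routed to a quarantine location instead of the target tables.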

Pros

  • ETL business logic.
  • Monitoring.
  • Data Lineage.
  • Data migration.

Cons

  • An AI-based agent to answer data-related questions.
  • A data quality (DQ) tool for data quality checks as per business rules.

Likelihood to Recommend

AWS Glue is a perfect choice for data engineers to perform ETL in a manageable way.

Vetted Review
AWS Glue
3 years of experience

Think of ETL: Think AWS GLUE

Rating: 7 out of 10
Incentivized

Use Cases and Deployment Scope

We have certain transformation needs, and we use Glue to fulfil them. As it is an integrated service from AWS itself, connectivity to other AWS services is pretty seamless. As it is fully serverless, we don’t have to worry about the infrastructure either. It can crawl our sources for us, so we don’t have to track updates in our sources ourselves; we learn about them automatically. So it is a powerful ETL tool for us.

Pros

  • Scale up and scale down easily
  • Seamless connectivity with other AWS services
  • Cost-effective, as you pay only for what you use.
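The pay-for-what-you-use point can be made concrete with a rough estimate. This sketch assumes billing proportional to DPU-hours; the default rate of 0.44 per DPU-hour is an illustrative assumption, not a quoted price, and any minimum billing duration is ignored.

```python
def estimate_job_cost(dpus: int, runtime_seconds: float,
                      rate_per_dpu_hour: float = 0.44) -> float:
    """Estimate a job's cost as DPU-hours times an assumed hourly rate."""
    dpu_hours = dpus * runtime_seconds / 3600
    return round(dpu_hours * rate_per_dpu_hour, 4)


# 10 DPUs for 15 minutes is 2.5 DPU-hours at the assumed rate.
print(estimate_job_cost(dpus=10, runtime_seconds=900))
```

Check the current pricing for your region before relying on a number like this.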

Cons

  • Its integration with other cloud vendors is a bit difficult.
  • If it supported non-SQL-based databases as well, it would be more powerful.
  • Real-time data synchronisation with the data source is missing.

Likelihood to Recommend

For integration within AWS services, it’s a great ETL tool that works perfectly with our S3 buckets and SQL DB.

But when it comes to integrating it with multi-cloud infrastructure, its connectivity and compatibility are lacking. For example, if we have to connect to IBM Cloud objects, there is no way to have our data source in another cloud vendor.

Vetted Review
AWS Glue
3 years of experience

Software developer

Rating: 7 out of 10

Use Cases and Deployment Scope

My main concern with AWS Glue is the high cost of Glue jobs. When I worked with 5000 datasets, I faced some performance issues relative to the cost, so they need to work on performance. Even so, using this method reduced the time we spent. AWS Glue integrates with services like Amazon Redshift, Amazon Athena, and Amazon QuickSight, enabling organizations to analyze data with their preferred analytics tools.

Pros

  • Data integration
  • Data transformation
  • Job scheduling

Cons

  • Complex transformations
  • Debugging and monitoring
  • Custom connectors

Likelihood to Recommend

AWS Glue is well-suited for data warehousing scenarios where you need to extract, transform, and load data into a centralized repository like Amazon Redshift. It simplifies ETL. It's a great choice for preparing data in data lakes, especially when dealing with diverse data sources and formats. Glue can help normalize and structure data for analytics.

AWS Glue ETL tool

Rating: 8 out of 10
Incentivized

Use Cases and Deployment Scope

We use AWS Glue to create ETL pipelines for transforming and moving data from different data sources like S3, Snowflake, and PostgreSQL to Redshift and vice versa. Executing Spark jobs is really easy, as Glue auto-generates code that securely establishes connections with source and target databases and helps with data cleansing, such as deduplication and validation. As it is serverless, it automatically scales the memory resources required to run the Spark Glue job up and down.
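The deduplication step mentioned above can be sketched in plain Python. This is not Glue's auto-generated code; it assumes each row carries a business key and a sortable version field, and the column names `id` and `updated_at` are hypothetical.

```python
def deduplicate(rows, key="id", version="updated_at"):
    """Keep only the most recent row per key, preserving first-seen key order."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[version] > latest[k][version]:
            latest[k] = row
    return list(latest.values())


rows = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 1, "updated_at": "2024-02-01"},
    {"id": 2, "updated_at": "2024-01-15"},
]
print(deduplicate(rows))
```

In a Spark job the same idea is usually expressed with a window or a drop-duplicates operation over the key columns.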

Pros

  • Execution of Spark jobs
  • Scaling of memory resources
  • Crawling the schemas

Cons

  • Incremental data sync
  • Real time data triggers
  • Grouping of small files

Likelihood to Recommend

Glue is well suited to ETL operations and jobs, especially when we want to extract or transform data stored in the AWS cloud. It is very well integrated with other AWS services, and it is easy to establish connections. We can schedule the crawlers or run them on demand.

Vetted Review
AWS Glue
3 years of experience

Great for ETL and batch processing

Rating: 8 out of 10
Incentivized

Use Cases and Deployment Scope

1) In my current use case, we mainly use AWS Glue for extract, transform, load (ETL) to process batch data on a daily basis. 2) The main benefit Glue provides is that it integrates easily with other AWS services like S3, RDS, and Athena. 3) The pricing model is pay-as-you-go. 4) The main business problem Glue solves is that we can ingest data, perform ETL on top of it, and create Spark or Python shell jobs.

Pros

  • Extract, transform, load
  • AWS Glue Data Catalog
  • Triggers
  • Workflows

Cons

  • The stream schema registry feature is hard to use efficiently.
  • The Connections feature could offer more connectors.
  • The crucial problem with AWS Glue is that it only works with AWS.

Likelihood to Recommend

Well suited: when you want to transform your data, Glue provides its own built-in transformations, including PII masking if you prefer a no-code approach. A second scenario is when you want to integrate Glue with other AWS services and also want to run Spark on Glue for faster processing. Less appropriate: when you want to integrate with services outside of AWS. It also does not support Java as of now, so if your resources are Java-based, you cannot run them.
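For readers who do want a code approach to PII masking, a minimal sketch follows. It is not Glue's built-in transform; the helper names `mask_email` and `pseudonymize` and the masking rules are hypothetical.

```python
import hashlib


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a recognizable address, so mask it entirely
    return f"{local[0]}***@{domain}"


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a salted SHA-256 digest, truncated to 16 hex characters."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


print(mask_email("jane.doe@example.com"))
print(pseudonymize("123-45-6789", salt="example-salt"))
```

Masking keeps a value partly readable for support work, while pseudonymization keeps it joinable across tables without exposing the original.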

Vetted Review
AWS Glue
1 year of experience

AWS Glue is a good data catalog and integration service

Rating: 9 out of 10
Incentivized

Use Cases and Deployment Scope

We heavily rely on AWS Glue for cataloging our data objects (tables and views). We use AWS Glue as our Data Catalog and use it in our data pipelines to sync external and internal data sources. We also utilize AWS Glue to auto-generate SQL-based ETL based on AWS Glue catalog objects.

Pros

  • Create schemas, tables and views (data catalog).
  • Sync external and internal data sources.
  • Auto-generate SQL-based data pipelines, based on AWS Glue catalog objects.

Cons

  • It is very difficult (almost impossible) to scale
  • We sometimes get throttled by service limitations.
  • AWS Glue crawlers sometimes mismatch the data in the files

Likelihood to Recommend

AWS Glue is a mature product, which helps organizations start their journey with data exploration and analysis. AWS Glue has many great features, like a data catalog, jobs, crawlers, helping non-engineers to handle data and build a data lake.

Vetted Review
AWS Glue
4 years of experience

Unmatchable serverless computing.

Rating: 9 out of 10
Incentivized

Use Cases and Deployment Scope

The automation of numerous tasks, including logging, alerting, monitoring, etc., is made possible by AWS Glue. Additionally, it is economical because you only pay for the resources you actually use. One of AWS Glue's most notable and helpful features is its data catalog, which aids in the generation and transformation of data.

Pros

  • Helps in data creation and transformations.
  • Automation of the data schema recognition.
  • Support and scheduling of the data schema.

Cons

  • Integration with systems outside of the AWS environment.
  • Glue runs on Spark, so the engineer should be familiar with it.

Likelihood to Recommend

One of AWS Glue's most notable features, its data catalog, aids in the creation and transformation of data. Beyond that, support, scheduling, and the automation of data schema recognition make it superior to its competitors. It also integrates perfectly with other AWS tools. The main restriction may be integration with systems outside of the AWS environment: it functions flawlessly with the current AWS services but not with other products. Another potential restriction that comes to mind is that Glue runs on Spark, which means the engineer needs to be conversant with it.

Vetted Review
AWS Glue
3 years of experience

AWS Glue - The managed ETL service for your data

Rating: 9 out of 10
Incentivized

Use Cases and Deployment Scope

We use AWS Glue for ETL of healthcare data. The input data comes from different source systems and therefore in different formats. With the help of AWS Glue jobs, we translate the data into a common format. Using Python scripts and the scheduled job feature, the data is fetched periodically, processed by the Python script, converted to the Parquet format, and stored in an S3 bucket. The Glue catalog generates the schema of the stored data and allows Amazon Athena to query it for analytics purposes.
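Translating differently formatted inputs into one common format can be sketched as below. The two source layouts (a CSV export and a nested JSON feed) and every field name are hypothetical; the review does not describe the actual schemas, and the Parquet and S3 steps are omitted.

```python
import csv
import io
import json


def from_csv(text: str) -> list[dict]:
    """Map a hypothetical CSV export onto the common record format."""
    return [
        {"patient_id": row["PatientID"], "visit_date": row["VisitDate"]}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text: str) -> list[dict]:
    """Map a hypothetical nested JSON feed onto the common record format."""
    return [
        {"patient_id": str(item["patient"]["id"]), "visit_date": item["date"]}
        for item in json.loads(text)
    ]


csv_text = "PatientID,VisitDate\nP1,2024-01-05\n"
json_text = '[{"patient": {"id": 2}, "date": "2024-01-06"}]'
combined = from_csv(csv_text) + from_json(json_text)
print(combined)
```

Once every source yields the same record shape, the downstream conversion and cataloging steps only have to handle a single schema.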

Pros

  • It is extremely fast, easy, and intuitive. Though it is a suite of services, it takes little time to get comfortable with it.
  • As it is a managed service, one need not take care of a lot of underlying details. The identification of data schema, code generation, customization, and orchestration of the different job components allows the developers to focus on the core business problem without worrying about infrastructure issues.
  • It is a pay-as-you-go service, so there is no need to provision any capacity in advance. This makes scheduling much easier.

Cons

  • The sample code should cover more scenarios; the samples are quite basic. However, you can find good pointers from the internet, the AWS community, and tickets.
  • AWS Glue runs on Apache Spark. So, to take the best of the AWS Glue service, the developer should have a good idea of Apache Spark.

Likelihood to Recommend

When the data which requires ETL has different formats, schema, and volume, this service suits them best. So, when the volume is not consistent (typical use-case of healthcare and online shopping), AWS Glue can be the prime choice. When the data is available in both batch and streaming mode, the developer needs to generate a separate codebase. This increases the source code management efforts. So, prefer to go with Glue when the nature of the data is the same (either batched or streamed).

AWS Glue : a fully managed ETL service

Rating: 9 out of 10
Incentivized

Use Cases and Deployment Scope

AWS Glue is one of the more straightforward and quick cloud-based ETL tools. It comes under the umbrella of AWS services. We use AWS Glue to analyze an extensive data set of USA-based clinics and hospitals. It offers HIPAA compliance for sensitive data. It comes with support for Python scripts and a scheduler, and works very well with other AWS services like S3 and RDS.

Pros

  • Very quick for ETL jobs.
  • Both a UI and a command-line interface, with very few steps to create and schedule an ETL job.

Cons

  • Sample code is very basic and not available for most scenarios.

Likelihood to Recommend

AWS Glue is best if your organization is dealing with large and sensitive data like medical records. It comes with a scheduler and easy deployment for AWS users. The data catalog keeps the reference of the data in a well-structured format. If you already use AWS services, then AWS Glue is the best choice; otherwise, it's not a simple one to deploy.

Vetted Review
AWS Glue
3 years of experience