AWS Glue vs. Amazon Redshift

AWS Glue

42 Reviews and Ratings

Amazon Redshift

217 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
AWS Glue	Score 8.7 out of 10	N/A	AWS Glue is a managed extract, transform, and load (ETL) service designed to make it easy for customers to prepare and load data for analytics. With it, users can create and run an ETL job in the AWS Management Console. Users point AWS Glue to data stored on AWS, and AWS Glue discovers data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, data is immediately searchable, queryable, and available for ETL.	$0.44 per DPU-Hour, billed per second, 1 minute minimum
Amazon Redshift	Score 8.9 out of 10	N/A	Amazon Redshift is a hosted data warehouse solution, from Amazon Web Services.	$0.24 per GB per month

Pricing

Editions & Modules
Product	Edition or Module	Price
AWS Glue	per DPU-Hour	$0.44, billed per second, 1 minute minimum
Amazon Redshift	Redshift Managed Storage	$0.24 per GB per month
Amazon Redshift	Current Generation	$0.25 - $13.04 per hour
Amazon Redshift	Previous Generation	$0.25 - $4.08 per hour
Amazon Redshift	Redshift Spectrum	$5.00 per terabyte of data scanned
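
To make the listed rates easier to compare, here is a rough cost sketch in Python. It uses only the prices quoted above ($0.44 per DPU-Hour with per-second billing and a 1 minute minimum, $0.24 per GB per month, $5.00 per terabyte scanned). The function names are hypothetical, and the sketch assumes the minimum applies to each job run and that partial seconds round up; actual bills may differ by region, job type, and other charges not listed here.

```python
import math

# Rates copied from the pricing listed above.
GLUE_RATE_PER_DPU_HOUR = 0.44
GLUE_MIN_BILLED_SECONDS = 60          # "1 minute minimum"
REDSHIFT_STORAGE_PER_GB_MONTH = 0.24  # Redshift Managed Storage
SPECTRUM_RATE_PER_TB = 5.00           # Redshift Spectrum, per TB scanned


def glue_job_cost(dpus, runtime_seconds):
    """Estimate one AWS Glue job run: DPUs x billed hours x rate.

    Assumption: the 1 minute minimum applies per run and partial
    seconds are rounded up.
    """
    billed_seconds = max(math.ceil(runtime_seconds), GLUE_MIN_BILLED_SECONDS)
    return dpus * billed_seconds / 3600 * GLUE_RATE_PER_DPU_HOUR


def redshift_storage_cost(gb_stored, months=1):
    """Estimate Redshift Managed Storage cost for a given volume."""
    return gb_stored * months * REDSHIFT_STORAGE_PER_GB_MONTH


def spectrum_scan_cost(tb_scanned):
    """Estimate Redshift Spectrum cost for the data scanned by queries."""
    return tb_scanned * SPECTRUM_RATE_PER_TB


if __name__ == "__main__":
    # 10 DPUs running for 15 minutes (900 seconds).
    print(f"Glue job: ${glue_job_cost(10, 900):.2f}")                  # Glue job: $1.10
    # 500 GB held in managed storage for one month.
    print(f"Redshift storage: ${redshift_storage_cost(500):.2f}")      # Redshift storage: $120.00
    # Queries that scan 2.5 TB through Spectrum.
    print(f"Spectrum scan: ${spectrum_scan_cost(2.5):.2f}")            # Spectrum scan: $12.50
```

A job shorter than one minute is billed as a full minute under this assumption, so `glue_job_cost(2, 30)` and `glue_job_cost(2, 60)` return the same amount. Hourly node charges for Redshift clusters are not modeled because the listing gives only a range.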

Offerings

Pricing Offerings	AWS Glue	Amazon Redshift
Free Trial	No	No
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	—	—

Community Pulse
Considered Both Products

AWS Glue (reviewers who chose AWS Glue)
[Sonny Carlos, Head of Cloud and Data Business; incentivized] Glue is easier especially if you are already in AWS. It easily integrates to other AWS services. Compliments well with Amazon Athena, S3, and Lake Formation. Compared to Snowflake, it is also much much cheaper and you don't have to build outside AWS. Support is also good if you …
[Verified User, Anonymous; incentivized] Azure Databricks and Snowflake
[Verified User, Anonymous; incentivized] Informatica Intelligent Cloud Integration Services and Informatica PowerCenter
[Ashutosh Mishra; incentivized] AWS Glue is a fully managed ETL service that automates many ETL tasks, making it easier to set AWS Glue simplifies ETL through a visual interface and automated code generation.
[Verified User, Anonymous; incentivized] AWS Glue is easier to use and has more and better features compared to it. And more documentation and tutorials and labs are widely available on the internet about AWS Glue which in turn helps in easier implementation of the spark jobs. Auto scaling is an added advantage. It's …
[Verified User, Anonymous; incentivized] The main reason we choose AWS Glue over Talend open studio 1) Does not support Spark 2) Run only on java 3) not really feasible solution for heavy workloads 4) most of the cases need customer support 5) no proper documentation is available
[Verified User, Anonymous; incentivized] AWS Glue is a managed service. It was easier for us to integrate it into our stack since we are already an AWS shop. It saved us the headache of managing a 3rd part service.
[Verified User, Anonymous; incentivized] The cataloging of data objects is the best in the case of AWS Glue. We use AWS Glue in all of our data pipelines to sync external and internal data sources and to automatically produce SQL-based ETL based on AWS Glue catalog objects. Integration with Amazon products is the …
[Apurv Doshi, Practice Head - Labs (Innovation and R&D); incentivized] Glue comes in form of a managed service. However, the AWS data pipeline puts additional responsibility to manage the infrastructure. We were not requiring fine-grained control of the hardware which the AWS data pipeline provides. We also want to park our data on DynamoDB. AWS …
[Verified User, Anonymous; incentivized] We are already in AWS services, so AWS glue is the first choice for us. But for the comparison of ETL job making and process time, it's way faster for other services.

Amazon Redshift (reviewers who chose Amazon Redshift)
[Anshuman Varshney, Engineering Manager; incentivized] Amazon Managed Streaming for Apache Kafka (Amazon MSK)
[Verified User, Anonymous; incentivized] Amazon Redshifts has fewer features but at the same time, you also have some gains once it is running on AWS Cloud and it is really easy to set up. Besides that, in our case, it is a bit cheaper and we don't really need the extra features that you can find on Snowflake. Another …
[Verified User, Anonymous; incentivized] Amazon Redshift, BigQuery, and Snowflake are all fully managed data warehouse services that are designed to handle large volumes of structured data and support business intelligence and analytics efforts. However, Amazon Redshift has the upper hand with its cost-effective …
[Dileep Kumar, Principal Data Scientist; incentivized] Biggest advantage of Amazon Redshift is it's part of the aws ecosystem. When tuned well it is also very cheap compared to something like snowflake. And compared to spark or databricks, Amazon Redshift is a solid warehouse that's well suited for tabular data. We use it for user …
[Narayan Motamarri, Staff Data Engineer; incentivized] We evaluated [Amazon] Redshift vs BigQuery vs Amazon EMR, back in 2014. Back then BigQuery cost was slightly higher than that of [Amazon] Redshift price structure. Amazon EMR, needs lots more management (Admin tasks) and EMR is designed to be ephemeral and not designed to be a …
[Prashast Vaish, Decision Scientist; incentivized] Redshift is better cost wise and also since the whole ecosystem is set in AWS, it is wise to use redshift
[Verified User, Anonymous; incentivized] Redshift leapfrogged Hive back when Hive was trying to figure out how to implement indexes, providing a more stable, standardized (postgres), easy to use (any postgres client), easier to administer, and scalable solution for querying server logs and raw usage data. Now, …
[Verified User, Anonymous; incentivized] Amazon Redshift is one of the fastest service offerings available in the market now. Plus you get an advantage of using a cutting edge compute service offering from AWS. Other technologies are fast but not as good as Amazon Redshift, I would say. Our business is interested in …
[Bojan Sovilj, 1st walker; incentivized] 1. Redshift has better compression (automated) consuming less space then competitors 2. Automated Vacuum Delete for having consistent performance 3. AWS introduced ra3 node types for simple separation of compute and storage
[Duncan Hernandez, Sr. Analyst, Business Intelligence; incentivized] Its definitely an improvement on all fronts for our business needs. Again, our MySQL server was really slow and we needed a more efficient solution. It was a major upgrade, but it is much more expensive than an in house server. It was expected but I'd say that lots of headaches …
[Verified User, Anonymous; incentivized] Amazon Redshift supports multiple data formats including multiple structured data formats. And it is easy to implement a cluster if you do not have knowledge of data lake solution. Also when you do not need a lot of resources, you can just scale down so you do not have to spend …
[Verified User, Anonymous; incentivized] The best advantage for us was the easy way to integrate our current solution in AWS to Amazon Redshift.
[Verified User, Anonymous; incentivized] Google BigQuery, PostgreSQL and Snowflake
[Jay Padhya, Data scientist; incentivized] Amazon Redshift has a better UI, hands down. And it is easy to integrate with bigger tools like Talend. It has many issues when it comes to understanding the architect perspective like Toad, which has a better UI for architect data together. However, that is because we are not …
[Verified User, Anonymous; incentivized] We like Snowflake for its separation of computing and storage and also the separation of data warehouse different users. We replaced Redshift with Snowflake. However, Snowflake is great for its pay for performance kind of methodology.
[Arthur Zubarev, Senior Business Intelligence Consultant; incentivized] Azure SQL Database was discarded because of a less attractive licensing, costs, plus its integrates poorly with many of the Azure offerings as say Azure Data Factory - it is not a true ETL yet. Also, the rest of the tools used were of Open Source type and it did not look like a …
[Verified User, Anonymous; incentivized] The main reason we chose Redshift was because of the cost-effectiveness of running and maintaining the warehouse.
[Akshaya Bhardwaj, Consultant; incentivized] It works on the cloud and we use the platform Dbeaver which is very unique and easy to maintain. There are very limited tools of this kind but the security issues are pretty high within those tools.
[Verified User, Anonymous; incentivized] As our applications are hosted on AWS service, Redshift is the best option for us. Also, it provide a near to real-time performance on limited datasets and less complex queries. High availability is the major concern for any growing business and AWS is the best option for this. …
[Vibhakar Prasad, Data Engineer Consultant; incentivized] No comment on this.
[Verified User, Anonymous; incentivized] We are currently on Redshift, because it was out before Snowflake. However, Snowflake looks promising. It's the new shiny toy that gives options that Redshift does not provide for. The big thing is that storage and compute can be scaled separately, whereas you cannot do that in …
[Verified User, Anonymous; incentivized] Most of our stack is on AWS, so while Snowflake and BigQuery was a viable option from a performance perspective, it was easier to integrate with RedShift. We considered hosting SQL Server on AWS or using Amazon RDS (Postgres or MySQL), however, the self-service aspect of …
[Jacob Biguvu, Database Engineer/DBA (Cloud and On-Premises); incentivized] Snowflake supports semi-structured data types and provided solutions to manage/process the semi-structured data. It supported sharing data between the different accounts and makes it easy in the scale and scale down process. Snowflake doesn't limit users on the database.
[Verified User, Anonymous; incentivized] Amazon Redshift is much easier to set up and start using. It interacts well with the PostgreSQL client (psql) and shares certain basic data dictionary, and people familiar with PostgreSQL feel right at home. The cluster is part of AWS services offering, and it works well with …
[Gavin Hackeling, Data Scientist; incentivized] Some organizations use PostgreSQL as an OLAP store. PostgreSQL offers a modern SQL dialect, data types, and features that Redshift lacks. RDS is a great managed PostgreSQL product. However, PostgreSQL is a poor choice for a data warehouse. It's row-oriented storage requires …

Best Alternatives
	AWS Glue	Amazon Redshift
Small Businesses	IBM SPSS Modeler Score 9.5 out of 10	Google BigQuery Score 8.7 out of 10
Medium-sized Companies	IBM InfoSphere Information Server Score 8.0 out of 10	Snowflake Score 8.7 out of 10
Enterprises	IBM InfoSphere Information Server Score 8.0 out of 10	Snowflake Score 8.7 out of 10

User Ratings
	AWS Glue	Amazon Redshift
Likelihood to Recommend	9.0 (0 ratings)	9.0 (0 ratings)
Usability	9.4 (0 ratings)	9.0 (0 ratings)
Support Rating	7.0 (0 ratings)	9.0 (0 ratings)

User Testimonials
	AWS Glue	Amazon Redshift
Likelihood to Recommend	When the data which requires ETL has different formats, schema, and volume, this service suits them best. So, when the volume is not consistent (typical use-case of healthcare and online shopping), AWS Glue can be the prime choice. When the data is available in both batch and streaming mode, the developer needs to generate a separate codebase. This increases the source code management efforts. So, prefer to go with Glue when the nature of the data is the same (either batched or streamed). [Apurv Doshi, Practice Head - Labs (Innovation and R&D); incentivized]	If the number of connections is expected to be low, but the amounts of data are large or projected to grow it is a good solutions especially if there is previous exposure to PostgreSQL. Speaking of Postgres, Redshift is based on several versions old releases of PostgreSQL so the developers would not be able to take advantage of some of the newer SQL language features. The queries need some fine-tuning still, indexing is not provided, but playing with sorting keys becomes necessary. Lastly, there is no notion of the Primary Key in Redshift so the business must be prepared to explain why duplication occurred (must be vigilant for) [Arthur Zubarev, Senior Business Intelligence Consultant; incentivized]
Pros	After data cleansing, the team also implemented the best practices for using AWS platform services as a Data Lake, such as job bookmarking for AWS Glue jobs, proper delimiter for the AWS Glue crawlers, partitioning in AWS S3, and transformation to parquet file for compression and faster querying time in Amazon Athena. Data modernization through combining data from multiple sources into functioning datasets, rebuilding DW, and restructuring data sources. Aims to lessen customer complaints, eliminate manual data extraction requests via SR from different data sources, and increase accuracy, consistency and speed up reconciliation process. [Sonny Carlos, Head of Cloud and Data Business; incentivized]	Redshift is fully managed. Small teams do not have the resources to maintain a cluster. CloudWatch metrics are provided out-of-the-box, and it is easy to configure alarms. Redshift's console allows you to easily inspect and manage queries, and manage the performance of the cluster. Redshift is ubiquitous; many products (e.g., ETL services) integrate with it out-of-the-box. Writing .csvs to S3 and querying them through Redshift Spectrum is convenient. [Gavin Hackeling, Data Scientist; incentivized]
Cons	Its integration with other cloud vendors is a bit difficult. If it can support non SQL based databases as well, it would be powerful. Real time data synchronisation in data source is missing. [Verified User, Anonymous; incentivized]	It could benefit from adding data integrity and programming tools common to other database management systems. Amazon Redshift is based on PostgreSQL 8.0.2. That version of PostgreSQL was released in December 2006. While PostgreSQL was much improved since then, the new features were not implemented in Redshift. Many basic features are missing from it. Primary keys can be declared but not enforced. Referential integrity (foreign keys) can be declared but not enforced. UNIQUE and CHECK constraints are not supported and cannot be declared. IDENTITY can be declared on a column, and Redshift will put unique values into it. However: IDENTITY values in the newly inserted rows won’t be incremental or sequential. To implement a sequential number, you need to write your own custom code. There are no stored procedures in Redshift. We are writing SQL script files, and then parsing and running them one statement at a time from a Python program. This also enabled us to implement execution-time error logging. In SQL scripts, to check for the row count of affected rows, a complicated join query against some system tables or views has to be executed. Data Control Language (DCL) does not exist. No statements like IF, WHILE, DO, RAISERROR, etc. On performance of views… Views do not “pass-through” a query parameter which is a potential problem for performance. When selecting against a view with the WHERE clause outside of the view, the inner query of the view will be executed first without consideration for the WHERE clause, and only then the WHERE clause will be applied. Certain clauses of SQL work many times faster than other clauses. So be careful and test your statements for performance earlier rather than later, especially if working with a large data set. There was a situation when DELETE FROM JOIN was unacceptably slow. Replacing JOIN with the USING clause made DELETE instantaneous. [Michael Romm, Principal Data Architect; incentivized]
Usability	I personally found it very usable for a data engineer's day job, particularly for performing ETL and managing the data pipelines. [Verified User, Anonymous; incentivized]	Overall it serves all our aspects of data management like data cleaning, data manipulation, and data reporting on the cloud platform. We can create stored procedures and triggers in it very easily as all the options are self suggested in it. We can easily attach the results of ARS to the other tools as well for drawing the statistical results. [Akshaya Bhardwaj, Consultant; incentivized]
Support Rating	Amazon responds in good time once the ticket has been generated but needs to generate tickets frequent because very few sample codes are available, and it's not cover all the scenarios. [Verified User, Anonymous; incentivized]	The support was great and helped us in a timely fashion. We did use a lot of online forums as well, but the official documentation was an ongoing one, and it did take more time for us to look through it. We would have probably chosen a competitor product had it not been for the great support [Verified User, Anonymous; incentivized]
Alternatives Considered	The cataloging of data objects is the best in the case of AWS Glue. We use AWS Glue in all of our data pipelines to sync external and internal data sources and to automatically produce SQL-based ETL based on AWS Glue catalog objects. Integration with Amazon products is the other advantage. [Verified User, Anonymous; incentivized]	We evaluated [Amazon] Redshift vs BigQuery vs Amazon EMR, back in 2014. Back then BigQuery cost was slightly higher than that of [Amazon] Redshift price structure. Amazon EMR, needs lots more management (Admin tasks) and EMR is designed to be ephemeral and not designed to be a data store. [Amazon] Redshift was ideal with the price structure, performance and ROI[.] [Narayan Motamarri, Staff Data Engineer; incentivized]
Return on Investment	Positive Impact :- after ETL we can able to do some kind of automation Negative :- At some point of time it can hamper the cost but not really [Verified User, Anonymous; incentivized]	It allows for an almost seamless integration of our data which can then be used by other departments for analytical purposes. No in house resources are needed for keeping the data alive and performing backup/migration tasks of the data in its end state. [Brendan McKenna, Senior Developer; incentivized]
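
One reviewer in the Cons row describes parsing SQL script files and running them one statement at a time from a Python program, with execution-time error logging. The reviewer's code is not shown, so the following is only a minimal sketch of that general pattern. It makes several assumptions: Python's built-in `sqlite3` module stands in for a Redshift connection so the example can run anywhere, `split_statements` and `run_script` are hypothetical names, statements are assumed to end at a line end, and `sqlite3.complete_statement` (SQLite's own tokenizer) is used as an approximation for deciding where a statement ends.

```python
import logging
import sqlite3

logger = logging.getLogger("sql_script_runner")


def split_statements(script):
    """Split a SQL script into statements, one list entry per statement.

    Lines are accumulated until SQLite's tokenizer reports a complete
    statement, so semicolons inside string literals do not split early.
    """
    statements, buffer = [], ""
    for line in script.splitlines():
        if not line.strip() and not buffer:
            continue  # skip blank lines between statements
        buffer += line + "\n"
        if sqlite3.complete_statement(buffer):
            statements.append(buffer.strip())
            buffer = ""
    if buffer.strip():
        statements.append(buffer.strip())  # trailing text without ';'
    return statements


def run_script(conn, script):
    """Run each statement separately; log failures and keep going.

    Returns a list of (statement_number, rowcount, error_message) tuples,
    where error_message is None for statements that succeeded.
    """
    results = []
    for number, statement in enumerate(split_statements(script), start=1):
        try:
            cursor = conn.execute(statement)
            conn.commit()
            results.append((number, cursor.rowcount, None))
        except sqlite3.Error as exc:
            conn.rollback()
            logger.error("statement %d failed: %s", number, exc)
            results.append((number, -1, str(exc)))
    return results


if __name__ == "__main__":
    demo_script = """
    CREATE TABLE events (id INTEGER, name TEXT);
    INSERT INTO events VALUES (1, 'a;b'), (2, 'c');
    DELETE FROM missing_table;
    UPDATE events SET name = 'z'
    WHERE id = 2;
    """
    connection = sqlite3.connect(":memory:")
    for number, rowcount, error in run_script(connection, demo_script):
        print(number, rowcount, error)
```

Running statements individually means a failure can be logged with its position in the script, and the driver's `rowcount` gives the number of affected rows for INSERT, UPDATE, and DELETE without a separate query. Whether to continue after an error, as this sketch does, or stop at the first failure is a design choice that depends on the pipeline.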