A hands-free data warehouse
October 12, 2018

Gavin Hackeling | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User

Overall Satisfaction with Amazon Redshift

We use Amazon Redshift and Redshift Spectrum for our data warehouse. Our production transactional datastores are continuously replicated to Redshift and transformed into fact tables. Redshift is maintained by the data team, but it is used by analysts on most teams, including business intelligence, product, and customer support. Redshift is our source of truth; it provides information about business processes that the team needs to make decisions.
  • Redshift is fully managed. Small teams do not have the resources to maintain a cluster. CloudWatch metrics are provided out-of-the-box, and it is easy to configure alarms.
  • Redshift's console makes it easy to inspect and manage queries and to manage the cluster's performance.
  • Redshift is ubiquitous; many products (e.g., ETL services) integrate with it out-of-the-box.
  • Writing CSV files to S3 and querying them through Redshift Spectrum is convenient.
  • We've experienced some problems with hanging queries on Redshift Spectrum/external tables. We've had to roll back to an old version of Redshift while we wait for AWS to provide a patch.
  • Redshift's dialect is most similar to that of PostgreSQL 8. It lacks many modern features and data types.
  • Constraints are not enforced. We must rely on other means to verify the integrity of transformed tables.
  • It is essential for all teams to refer to the same source of truth, and Redshift serves as that source. Product managers and analysts can use a variety of clients to answer their own questions, so data analysts are not overwhelmed with ad-hoc queries.
  • Redshift is fully managed; our engineers spend their time building features rather than maintaining infrastructure.
  • It is often simpler to write objects to S3 than load data into a table; Spectrum provides a useful shortcut for tedious engineering work.
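Because constraints are not enforced, the integrity checks mentioned above have to be run as ordinary queries after each transformation. The following is a minimal sketch of what such checks might look like, not a description of our actual pipeline. The table and column names (`dim_customer`, `fact_order`, `customer_id`, `order_id`) are hypothetical, and an in-memory SQLite database stands in for a Redshift connection so that the example runs on its own; the two queries use only standard SQL.

```python
import sqlite3

# Hypothetical tables. Each query returns the rows that an enforced
# constraint would have rejected.
DUPLICATE_KEYS = """
    SELECT customer_id, COUNT(*) AS n
    FROM dim_customer
    GROUP BY customer_id
    HAVING COUNT(*) > 1
"""

ORPHANED_ROWS = """
    SELECT f.order_id
    FROM fact_order AS f
    LEFT JOIN dim_customer AS d ON d.customer_id = f.customer_id
    WHERE d.customer_id IS NULL
"""


def find_violations(conn):
    """Return rows that would violate a primary key or a foreign key."""
    cur = conn.cursor()
    cur.execute(DUPLICATE_KEYS)
    duplicates = cur.fetchall()
    cur.execute(ORPHANED_ROWS)
    orphans = cur.fetchall()
    return {"duplicate_keys": duplicates, "orphaned_rows": orphans}


def build_demo_db():
    """Create a small in-memory database (a stand-in for the warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER, name TEXT);
        CREATE TABLE fact_order (order_id INTEGER, customer_id INTEGER);
        INSERT INTO dim_customer VALUES (1, 'a'), (2, 'b'), (2, 'b-dup');
        INSERT INTO fact_order VALUES (10, 1), (11, 3);
    """)
    return conn


if __name__ == "__main__":
    print(find_violations(build_demo_db()))
```

In the demo data, customer 2 appears twice and order 11 refers to a customer that does not exist, so both checks report a violation. A scheduled job could run checks like these and raise an alert whenever either result is non-empty.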
Some organizations use PostgreSQL as an OLAP store. PostgreSQL offers a modern SQL dialect, data types, and features that Redshift lacks. RDS is a great managed PostgreSQL product. However, PostgreSQL is a poor choice for a data warehouse. Its row-oriented storage requires careful schema and index design to ensure analytical queries perform adequately.
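The difference between the two storage layouts can be illustrated with a toy model. The sketch below is not how either engine is implemented; it only counts how many stored values an aggregate over a single column has to touch in a row-oriented layout versus a column-oriented one when no index is available. The table contents and function names are made up for illustration.

```python
# Toy table: each row has four fields, but the query needs only "amount".
ROWS = [
    {"order_id": 1, "customer_id": 7, "status": "paid", "amount": 20},
    {"order_id": 2, "customer_id": 8, "status": "paid", "amount": 35},
    {"order_id": 3, "customer_id": 7, "status": "void", "amount": 5},
]


def sum_row_oriented(rows, column):
    """Scan whole rows: every field of every row is read."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)
        total += row[column]
    return total, values_read


def to_columns(rows):
    """Pivot the rows into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def sum_column_oriented(columns, column):
    """Scan a single column: only the requested values are read."""
    values = columns[column]
    return sum(values), len(values)


if __name__ == "__main__":
    print(sum_row_oriented(ROWS, "amount"))
    print(sum_column_oriented(to_columns(ROWS), "amount"))
```

Both functions return the same total, but the row-oriented scan reads all twelve stored values while the column-oriented scan reads three. On wide fact tables that gap grows with the number of columns, which is why a row store leans on indexes and careful schema design for analytical queries.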
Redshift is ideal for small teams. It is fully managed. CloudWatch metrics are provided out-of-the-box, and it integrates well with other AWS products, such as DMS. The Redshift console is among the better AWS consoles. Redshift offers adequate performance. Spectrum offers a convenient way to access our data lake, but we have encountered issues with recent versions.