efficient, performant data store
August 09, 2021

Narayan Motamarri | TrustRadius Reviewer
Score 10 out of 10
Vetted Review
Verified User

Overall Satisfaction with Amazon Redshift

Amazon Redshift is our Data Warehouse, where we store our processed data (Hot data) for various initiatives like BI, Analytics, Data Science, etc.

We also use Amazon Redshift Spectrum for our Data Lake, where we keep raw (unprocessed) data (Cold data) for historical analysis, trends, etc.

We store several standard tiers of data in [Amazon] Redshift:
Bronze (ETL-ed data),
Silver (Materialized Views data), and
Gold (Rollups/Aggregated/Dashboard-ready data).

  • [Amazon] Redshift has Distribution Keys. Defined correctly on your tables, they improve query performance. For instance, we can define mapping/metadata tables with the ALL distribution style, so that they are replicated across all the nodes, for fast joins and fast query results.
  • [Amazon] Redshift has Sort Keys. Defined correctly on your tables alongside the Distribution Keys above, they further improve query performance. It also has Compound Sort Keys and Interleaved Sort Keys, to support various use cases.
  • [Amazon] Redshift is derived from PostgreSQL, to which AWS added "MPP" (Massively Parallel Processing) and "Column Oriented" concepts, making it a powerful data store.
  • [Amazon] Redshift has an "Analyze" operation that can be run on tables to update the table statistics held on the leader node. These statistics describe the data in each table, and the query planner uses them to choose efficient plans. Up-to-date stats improve query performance.
  • Amazon Redshift is a Managed Service, but it is not a 100% managed service. We still need to configure its WLM (Workload Management) settings and add Query Queues to make sure its resources aren't wasted and it performs at its best all the time.
  • [Amazon] Redshift has a concept of "Vacuum", which is an operation to reclaim disk space from deleted data/tables. They recently started doing automated vacuuming. Prior to that, we had to run it at regular intervals to reclaim the space.
  • MPP (Massively Parallel Processing)
  • Column Oriented data store
  • Good Customer Support
  • Greater ROI, as it is 1/10th the cost of traditional data stores and data warehouses.
  • It is connected to Tableau and Looker dashboards, and to various reports used by Sales, Marketing, Publishers, Operations, BI, Analytics, Data Science, and Finance.
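
To make the Distribution Key, Sort Key, Analyze, and Vacuum points above more concrete, here is a minimal Python sketch that only builds the SQL text for them. It is an illustration under assumptions, not our production setup: the helper names (`create_table_sql`, `maintenance_sql`) and the tables and columns (`country_lookup`, `page_views`, `event_time`, and so on) are hypothetical. Only the `DISTSTYLE`, `DISTKEY`, `SORTKEY`, `VACUUM`, and `ANALYZE` keywords come from Redshift itself.

```python
from typing import Dict, List, Optional, Sequence


def create_table_sql(
    name: str,
    columns: Dict[str, str],
    diststyle: str = "AUTO",
    distkey: Optional[str] = None,
    sortkey: Sequence[str] = (),
    sort_style: str = "COMPOUND",
) -> str:
    """Build a Redshift CREATE TABLE statement (hypothetical helper)."""
    if diststyle not in {"AUTO", "EVEN", "KEY", "ALL"}:
        raise ValueError(f"unknown distribution style: {diststyle}")
    # A DISTKEY column goes together with DISTSTYLE KEY, and only with it.
    if (diststyle == "KEY") != (distkey is not None):
        raise ValueError("distkey must be given with, and only with, DISTSTYLE KEY")
    if sort_style not in {"COMPOUND", "INTERLEAVED"}:
        raise ValueError(f"unknown sort style: {sort_style}")

    cols = ",\n".join(f"    {col} {typ}" for col, typ in columns.items())
    parts = [f"CREATE TABLE {name} (\n{cols}\n)", f"DISTSTYLE {diststyle}"]
    if distkey is not None:
        parts.append(f"DISTKEY ({distkey})")
    if sortkey:
        parts.append(f"{sort_style} SORTKEY ({', '.join(sortkey)})")
    return "\n".join(parts) + ";"


def maintenance_sql(name: str) -> List[str]:
    """Reclaim space first, then refresh planner statistics (assumed ordering)."""
    return [f"VACUUM {name};", f"ANALYZE {name};"]


# Small mapping table: replicate it to every node so joins stay local.
country_lookup = create_table_sql(
    "country_lookup",
    {"country_code": "CHAR(2)", "country_name": "VARCHAR(64)"},
    diststyle="ALL",
)

# Large fact table: distribute on the join column, sort on the filter column.
page_views = create_table_sql(
    "page_views",
    {"event_time": "TIMESTAMP", "country_code": "CHAR(2)", "views": "BIGINT"},
    diststyle="KEY",
    distkey="country_code",
    sortkey=["event_time"],
)

print(country_lookup)
print(page_views)
print("\n".join(maintenance_sql("page_views")))
```

The printed statements can be run from any SQL client connected to the cluster. Which column makes a good Distribution Key or Sort Key depends on your own join and filter patterns, so treat the choices above as placeholders.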
We evaluated [Amazon] Redshift vs BigQuery vs Amazon EMR, back in 2014.
Back then, BigQuery's cost was slightly higher than [Amazon] Redshift's price structure.
Amazon EMR needs a lot more management (admin tasks), and EMR is designed to be ephemeral, not to be a data store.
[Amazon] Redshift was ideal in terms of price structure, performance, and ROI.

[Amazon] Redshift is well suited to various use cases, such as time series data, structured/relational data, and semi-structured data like JSON.

[Amazon] Redshift might not deliver its full performance for Graph DB use cases.