Item: Amazon Redshift
Rating: 10
Author: Narayan Motamarri

Use Cases and Deployment Scope

Amazon Redshift is our Data Warehouse, where we store our processed data (hot data) for various initiatives like BI, Analytics, Data Science, etc.

We also use Amazon Redshift Spectrum to query our Data Lake, where we keep raw (unprocessed) data (cold data) for historical analysis, trends, etc.

We store several standard tiers of data in [Amazon] Redshift:
Bronze (ETL-ed data),
Silver (Materialized Views data), and
Gold (Rollups/Aggregated/Dashboard-ready data).

Pros and Cons

[Amazon] Redshift has Distribution Keys. If you define them correctly on your tables, they improve Query performance. For instance, we can define Mapping/Meta-data tables with the ALL distribution style, so that they are replicated to every node, for fast joins and fast query results.
[Amazon] Redshift has Sort Keys. If you define them correctly on your tables, along with the Distribution Keys above, they further improve your Query performance. It also has Compound Sort Keys and Interleaved Sort Keys, to support various use cases.
[Amazon] Redshift is based on PostgreSQL; AWS then added "MPP" (Massively Parallel Processing) and "Column Oriented" storage to it, to make it a powerful data store.
[Amazon] Redshift has an "Analyze" operation that can be run on tables to update the table statistics kept on the leader node. The query planner uses these statistics about the data in each table to choose an efficient plan. Up-to-date stats improve Query performance.
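
To make the distribution and sort key points above concrete, here is a minimal Python sketch that builds the kind of DDL we mean. The helper `create_table_ddl` and the table and column names (`country_map`, `page_views`, `user_id`, `event_date`) are hypothetical, made up for illustration; only the `DISTSTYLE`, `DISTKEY`, `SORTKEY` and `ANALYZE` keywords come from Redshift SQL itself.

```python
from typing import Optional, Sequence


def create_table_ddl(
    table: str,
    columns: Sequence[str],
    diststyle: str = "AUTO",
    distkey: Optional[str] = None,
    sortkeys: Sequence[str] = (),
    interleaved: bool = False,
) -> str:
    """Build a Redshift CREATE TABLE statement with distribution and sort settings."""
    style = diststyle.upper()
    if style not in {"AUTO", "EVEN", "KEY", "ALL"}:
        raise ValueError(f"unknown distribution style: {diststyle}")
    # A distribution key only makes sense together with DISTSTYLE KEY.
    if (style == "KEY") != (distkey is not None):
        raise ValueError("DISTSTYLE KEY requires a distkey, and only KEY may set one")

    parts = ["CREATE TABLE " + table + " (\n    " + ",\n    ".join(columns) + "\n)"]
    parts.append(f"DISTSTYLE {style}")
    if distkey:
        parts.append(f"DISTKEY ({distkey})")
    if sortkeys:
        kind = "INTERLEAVED" if interleaved else "COMPOUND"
        parts.append(f"{kind} SORTKEY ({', '.join(sortkeys)})")
    return "\n".join(parts) + ";"


# Small mapping table (hypothetical): replicate it to every node for fast joins.
country_map = create_table_ddl(
    "country_map",
    ["country_code CHAR(2)", "country_name VARCHAR(64)"],
    diststyle="ALL",
)

# Large fact table (hypothetical): distribute on the join column, sort on the filter columns.
page_views = create_table_ddl(
    "page_views",
    ["user_id BIGINT", "event_date DATE", "url VARCHAR(2048)"],
    diststyle="KEY",
    distkey="user_id",
    sortkeys=["event_date", "user_id"],
)

# Refresh planner statistics after a large load.
analyze_stmt = "ANALYZE page_views;"

print(country_map)
print(page_views)
print(analyze_stmt)
```

Which column to distribute or sort on depends on your own joins and filters, so treat the choices here as placeholders rather than a recommendation.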

Amazon Redshift is a Managed Service, but it is not 100% managed. We still need to configure its WLM (Workload Management) settings and add Query Queues, to make sure its resources aren't wasted and it performs at its best all the time.
[Amazon] Redshift has a concept of "Vacuum", which is an operation to reclaim the disk space left behind by deleted data/tables. They recently started doing automated vacuuming. Prior to that, we had to run it at regular intervals to reclaim the space.
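
As a rough illustration of the two maintenance chores above, the Python sketch below builds a WLM queue definition and a manual vacuum statement. The helpers `build_wlm_config` and `vacuum_statement`, the queue names and the percentages are all hypothetical. The JSON key names (`query_group`, `query_concurrency`, `memory_percent_to_use`) follow our understanding of the `wlm_json_configuration` parameter, so check them against the current AWS documentation before using them.

```python
import json
from typing import Dict, List

# Vacuum variants this sketch accepts.
VACUUM_MODES = {"FULL", "SORT ONLY", "DELETE ONLY", "REINDEX"}


def build_wlm_config(queues: List[Dict], default_concurrency: int = 5) -> str:
    """Return a manual WLM queue list as JSON, ending with a default queue."""
    total_memory = sum(q["memory_percent"] for q in queues)
    if total_memory > 100:
        raise ValueError(f"queues claim {total_memory}% of memory, more than 100%")
    config = [
        {
            "query_group": [q["query_group"]],
            "query_concurrency": q["concurrency"],
            "memory_percent_to_use": q["memory_percent"],
        }
        for q in queues
    ]
    # The last entry has no group, so it acts as the catch-all default queue.
    config.append({"query_concurrency": default_concurrency})
    return json.dumps(config, indent=2)


def vacuum_statement(table: str, mode: str = "FULL") -> str:
    """Build a manual VACUUM statement for one table."""
    normalized = mode.upper()
    if normalized not in VACUUM_MODES:
        raise ValueError(f"unsupported vacuum mode: {mode}")
    return f"VACUUM {normalized} {table};"


# Hypothetical split: dashboards and ETL each get their own queue.
wlm_json = build_wlm_config(
    [
        {"query_group": "dashboards", "concurrency": 5, "memory_percent": 40},
        {"query_group": "etl", "concurrency": 3, "memory_percent": 40},
    ]
)
vacuum_sql = vacuum_statement("page_views", "delete only")

print(wlm_json)
print(vacuum_sql)
```

With automated vacuuming in place, a manual statement like this is something we would only reach for after an unusually large delete.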

Most Important Features

MPP (Massively Parallel Processing)
Column Oriented data store
Good Customer Support

Return on Investment

Greater ROI, as it is 1/10th the cost of traditional data stores and data warehouses.
It is connected to Tableau and Looker dashboards, and to various reports used by Sales, Marketing, Publishers, Operations, BI, Analytics, Data Science, and Finance.

Alternatives Considered

Google BigQuery and Amazon EMR (Elastic MapReduce)

We evaluated [Amazon] Redshift vs BigQuery vs Amazon EMR, back in 2014.
Back then, BigQuery's cost was slightly higher than [Amazon] Redshift's.
Amazon EMR needs a lot more management (Admin tasks), and EMR is designed to be ephemeral, not to be a data store.
[Amazon] Redshift was ideal with the price structure, performance and ROI[.]

Key Insights

Do you think Amazon Redshift delivers good value for the price?

Yes

Are you happy with Amazon Redshift's feature set?

Yes

Did Amazon Redshift live up to sales and marketing promises?

Yes

Did implementation of Amazon Redshift go as expected?

Yes

Would you buy Amazon Redshift again?

Yes

Other Software Used

Google BigQuery, Snowflake, Databricks Lakehouse Platform (Unified Analytics Platform), Amazon EMR (Elastic MapReduce)

Likelihood to Recommend

[Amazon] Redshift is suited for various use cases like time series data, structured / relational data, semi-structured data like JSON, etc.

[Amazon] Redshift might not deliver full performance for Graph DB use cases.

efficient, performant data store

Overall Satisfaction with Amazon Redshift