Amazon Redshift and Snowflake are both data warehouse software designed to store data for querying, analysis, and extraction. Mid-sized businesses are the most frequent users of both data warehouse solutions. Large enterprises are more likely to use Snowball, likely due to its ability to work with many types of data. Smaller businesses tend to use Amazon Redshift due to its usability and affordable pricing.
Amazon Redshift and Snowflake both offer essential data warehouse features, including storage and querying data, but they also have some standout features that set them apart.
Amazon Redshift provides end-to-end encryption and highly customizable security features at all price points. Additionally, Amazon Redshift offers many integrations with other technologies, particularly with technologies on the AWS platform. Lastly, Amazon Redshift requires less technical expertise to use as it has a usable interface and secure backup features.
Snowflake provides native support for JSON documents, allowing for built-in functions and querying for JSON data. Additionally, Snowflake includes robust automation tools for data vacuuming and compression, which can save time for users. Lastly, Snowflake can handle complex data types like arrays and objects, making it a great choice for businesses with a lot of complex data.
Though Amazon Redshift and Snowflake both offer robust data warehouse features, they also have some limitations that are important to consider.
Amazon Redshift has limited JSON support. By default, it splits all JSON data into strings, which can make it difficult to query and analyze. Database management for Amazon Redshift mostly has to be handled manually, as opposed to Snowflakes automation features. Lack of automation may make Amazon Redshift a poor choice for companies with small technical teams that can benefit from automation. Lastly, Amazon Redshift provides limited support for complex data types like arrays and objects.
Snowflake offers advanced security features, but they are not available at every price point. Users with small data storage needs, but important security needs may struggle to find the right Snowflake package for them. Snowflake also lacks the seamless AWS technology integrations that Amazon Redshift provides. Businesses with few technical staff members may also struggle with Snowflake’s user interface and backup features.
Amazon Redshift offers a two-month trial where users can see if they like the technology before purchasing. Beyond the trial, Amazon Redshift is priced depending on storage needs and feature requirements. Pricing can start as low as $0.25 per hour, and reach as high as $13.04 per hour.
Snowflake offers a one-month free trial for users to try the software. Pricing for Snowflake is dependent on storage needs, platform, and geographic region and starts as low as $2.00 per hour. Snowflake includes scaling pricing based on workload, so a business with a rapidly changing workload may have some savings with Snowflake.
Provided by the TrustRadius Research Team
Published on April 24, 2020
Likelihood to Recommend
- Redshift is fully managed. Small teams do not have the resources to maintain a cluster. CloudWatch metrics are provided out-of-the-box, and it is easy to configure alarms.
- Redshift's console allows you to easily inspect and manage queries, and manage the performance of the cluster.
- Redshift is ubiquitous; many products (e.g., ETL services) integrate with it out-of-the-box.
- Writing .csvs to S3 and querying them through Redshift Spectrum is convenient.
- Resources that scale up and down automatically to ensure that queries run quickly and efficiently without paying for computing power that is not being used
- Much more reliable than our previous software
- No noticable limit to query size
- Runs very quickly
- It could benefit from adding data integrity and programming tools common to other database management systems.
- Amazon Redshift is based on PostgreSQL 8.0.2. That version of PostgreSQL was released in December 2006. While PostgreSQL was much improved since then, the new features were not implemented in Redshift. Many basic features are missing from it.
- Primary keys can be declared but not enforced. Referential integrity (foreign keys) can be declared but not enforced. UNIQUE and CHECK constraints are not supported and cannot be declared.
- IDENTITY can be declared on a column, and Redshift will put unique values into it. However: IDENTITY values in the newly inserted rows won’t be incremental or sequential. To implement a sequential number, you need to write your own custom code.
- There are no stored procedures in Redshift. We are writing SQL script files, and then parsing and running them one statement at a time from a Python program. This also enabled us to implement execution-time error logging.
- In SQL scripts, to check for the row count of affected rows, a complicated join query against some system tables or views has to be executed.
- Data Control Language (DCL) does not exist. No statements like IF, WHILE, DO, RAISERROR, etc.
- On performance of views… Views do not “pass-through” a query parameter which is a potential problem for performance.
- When selecting against a view with the WHERE clause outside of the view, the inner query of the view will be executed first without consideration for the WHERE clause, and only then the WHERE clause will be applied.
- Certain clauses of SQL work many times faster than other clauses. So be careful and test your statements for performance earlier rather than later, especially if working with a large data set.
- There was a situation when DELETE FROM JOIN was unacceptably slow. Replacing JOIN with the USING clause made DELETE instantaneous.
- Very limited amount of tabs - saved queries, which requires us to store the code somewhere else and re-use existing queries.
- Performance can really be a problem if there are many users on the system at the same time.
- SnowFlake support sometimes can be hard to reach.
Likelihood to Renew
Than BigQuery: Redshift has a standard SQL interface, though recently I heard good things about BigQuery and would try it out again.
Than Hive: Hive is great if you are in the PB+ range, but latencies tend to be much slower than Redshift and it is not suited for ad-hoc applications.
Return on Investment
- Redshift has had a very positive impact on our business. It has been used to provide analytics on marketing campaigns to boost revenue.
- Redshift is instrumental in our payment collection business processes. It powers everything from who gets called to who gets sent collection emails.
- Quick installation/setup for a data warehouse solution.
- Make easy to handle/manage the various type of semi-structured data using the native solutions and provided new data modeling concepts such as schema-read data model and schema-write data model.
- Make easy and simplified the workload management using virtual warehouses and materialized views.
Premium Consulting/Integration Services—
Entry-level set up fee?
Amazon Redshift Editions & Modules
|Current Generation||$0.25 - $13.041|
|Previous Generation||$0.25 - $4.081|
|Redshift Managed Storage||$0.243|
- per hour
- per terabyte of data scanned
- per GB per month