Amazon Athena Pricing 2022
The more data you store, the faster you’ll need to access it for analytics and operations. Amazon Athena is an interactive query service that gives teams fast SQL analysis of small and large datasets stored in Amazon S3. It offers teams convenience, speed, and a lot of extra potential when used with other cloud computing applications.
What Is Amazon Athena?
Amazon Athena is an Amazon Web Services (AWS) cloud computing application. The Athena query service allows you to analyze and pull data very quickly. It uses standard SQL in a serverless environment, so there is no infrastructure for you to manage.
This interactive query service allows teams to optimize their data analytics. It integrates efficiently with other AWS services and supports arrays, window functions, and big data joins. The built-in scalability of the platform allows it to handle large datasets and both ad-hoc and complex queries while maintaining high performance.
Athena uses the Presto query engine to run SQL. The application currently accepts several data formats for storing and transforming data, including JSON, ORC, Apache Parquet, and CSV. Storing data in file formats like JSON makes it easier to populate tables with external data.
You can also unlock capabilities with Amazon Athena if you integrate with other AWS services. For example, Amazon QuickSight helps you create amazing data visualizations. You can find the full list of services you can choose to integrate with Athena here.
For a quick overview, the table below has the main points you need to know about Amazon Athena.
What Is Amazon Athena?
It is an interactive query service.
It is serverless, which means the service doesn’t require setup or management.
It can be used as an ETL tool. This means it can extract, transform, then load your data.
What Can You Use It For?
You can use it to help analyze data and run ad-hoc queries.
You can use it in tandem with other AWS services for visualizations, data modeling, and business intelligence (BI) needs.
For more information about specific use cases for Athena, see the guide here.
How Do You Get Started?
First, you need to pull data from your data sources and add it to Amazon S3.
Then you will be ready to log into the Athena console and define your schema.
From there you can begin running queries using the query editor.
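As a rough illustration of the "define your schema, then query" steps above, here is a minimal Python sketch that only builds the SQL text you might paste into the query editor. The database name `sales_db`, the table `orders`, its columns, and the bucket `example-bucket` are all hypothetical, and the sketch assumes comma-delimited files; it does not connect to AWS.

```python
def create_table_ddl(database, table, columns, s3_location):
    """Build a CREATE EXTERNAL TABLE statement for comma-delimited files in S3."""
    column_lines = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )

# Hypothetical schema and S3 location, used purely for illustration.
ddl = create_table_ddl(
    database="sales_db",
    table="orders",
    columns=[("order_id", "string"), ("status", "string"), ("total", "double")],
    s3_location="s3://example-bucket/orders/",
)

# An ad-hoc query you could run once the table is defined.
query = "SELECT status, COUNT(*) AS order_count FROM sales_db.orders GROUP BY status"

print(ddl)
print(query)
```

The table is "external" because the data stays in S3; the statement only tells Athena how to read it.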
Amazon Athena has awesome functionality, but is it the best solution for you?
What Are The Pros And Cons Of Using Amazon Athena?
Like most other AWS cloud software solutions, this is a very complex service. To help you decide whether or not you want to pursue AWS Athena, we created an easy pros and cons table. It won’t include all the information about the entire service. It just covers the points that stand out.
Amazon Athena Pros
Pricing is cost-effective because it’s based on usage, so you won’t have to pay for what you don’t need.
You can use it with other AWS services and achieve more powerful BI insights, data analysis, and reporting.
The application is complex and can meet a variety of data analysis needs.
Amazon Athena Cons
Pricing is based on usage, so you can only estimate the bill before you see it.
It requires Amazon S3, and pairing it with other AWS applications can drive up your bill.
The application isn’t beginner-friendly and does have a learning curve.
For further learning about how AWS Athena works, and what it is, see their FAQ page here, and the Amazon Athena user guide here. Below is also a service overview for using AWS Athena for Big Data analysis.
Here is a tutorial on how to perform SQL queries in AWS Athena. This will give you a comprehensive and step-by-step walkthrough of how to navigate through your data source and Athena.
At the very end of this article you will also find a list of key terms. If you or your team aren’t familiar with some of the terms you come across, you can go to the end for some extra background.
What Does Amazon Athena Cost?
The pricing model for Amazon Athena can appear simple, but that’s not quite true. Your charges are based on the bytes scanned when you run a query on your data.
Athena charges $5 per terabyte (TB) of data scanned, and each query is billed for a minimum of 10 megabytes (MB). This means the more data your queries scan, the more expensive they can be.
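To make the per-query charge concrete, here is a minimal sketch of the arithmetic, assuming the $5 per TB scanned rate and a 10 MB billing minimum per query. It uses decimal units (1 TB = one trillion bytes), which is an assumption; the function name is our own.

```python
PRICE_PER_TB_USD = 5.00          # Athena rate per TB of data scanned
MIN_BILLED_BYTES = 10 * 10**6    # assumed 10 MB minimum billed per query
BYTES_PER_TB = 10**12            # decimal terabyte (assumption)

def query_cost_usd(bytes_scanned):
    """Estimate the Athena charge for one query from the bytes it scans."""
    billed_bytes = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed_bytes / BYTES_PER_TB * PRICE_PER_TB_USD

print(query_cost_usd(10**12))    # a query that scans a full terabyte
print(query_cost_usd(1_000))     # a tiny query is still billed as 10 MB
```

Because of the minimum, a query that scans a few kilobytes costs the same as one that scans 10 MB.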
This seems pretty affordable and easy to follow, but that does not include the additional costs from other AWS applications. Amazon S3 is required for storage which means their standard charges apply.
Those rates are not only for the data you store but also for requests and data transfer. All of your query results go into an S3 bucket that has standard storage rates applied there as well. We highly recommend looking at the Amazon S3 pricing page here.
Amazon Athena Charges
Price per query: $5.00 per TB of data scanned
Amazon S3 Standard storage: $0.023 per GB for the first 50 TB each month
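The storage side of the bill can be sketched the same way. This is a rough estimate that assumes the $0.023 per GB figure above is the S3 Standard storage rate for the first 50 TB each month, treats 50 TB as 50,000 GB, and ignores request and data transfer charges. The function name is hypothetical.

```python
S3_STANDARD_USD_PER_GB_MONTH = 0.023   # rate for the first 50 TB each month
FIRST_TIER_LIMIT_GB = 50 * 1000        # 50 TB expressed in GB (decimal, assumption)

def s3_storage_cost_usd(stored_gb):
    """Estimate one month of S3 Standard storage within the first pricing tier."""
    if stored_gb > FIRST_TIER_LIMIT_GB:
        # Higher tiers have different rates that are not covered in this article.
        raise ValueError("this sketch only covers the first 50 TB tier")
    return stored_gb * S3_STANDARD_USD_PER_GB_MONTH

print(round(s3_storage_cost_usd(500), 2))   # 500 GB of source data and query results
```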
AWS recommends using columnar formats, partitioning, and compressing data for cost savings. AWS reports that teams can save around 30% to 90% by managing their data this way.
Other related applications include AWS Glue Data Catalog, AWS Lambda, Amazon Redshift, Amazon QuickSight, and Amazon SageMaker. These applications are not strictly required for you to use Amazon Athena successfully, but it’s likely that you will use them. Keep in mind that each of these applications also has other applications it depends on or pairs well with.
AWS Glue is a data integration service that you can use with AWS Athena. It’s serverless, offers both code-based and codeless interfaces, and gives data engineers the ability to extract, transform, and load (ETL) data for use in Athena. You can find its AWS pricing page here.
AWS Lambda is one of the AWS compute services. It is serverless and allows teams to run code without managing servers. You can find its AWS pricing page here, and you can find other AWS compute resources here.
Amazon Redshift is a data warehouse service that can scale to petabytes of data. One petabyte (PB) is one thousand terabytes (TB). Using this service can help you scale your data for cost savings. You can find its AWS pricing page here.
Amazon QuickSight allows teams to create data visualizations and use predictive analytics with machine learning (ML). You can use those visualizations for business intelligence (BI) analytics for your team. You can find its AWS pricing page here.
Amazon SageMaker is a data modeling program that uses machine learning. You can take your data to the next level and make predictive ML models, especially for BI needs. You can find its AWS pricing page here. We also have a long-form pricing article that goes more in depth about Amazon SageMaker here.
Amazon Athena Pricing Examples and Calculator
When the only pricing details you have are usage-based rates and individual service costs, working out an estimate becomes a tangled mess. Luckily, AWS provides examples and a calculator so you can estimate your needs more easily.
The example AWS provides covers two detailed scenarios. One scenario is an uncompressed table stored as text, where a query has to scan the entire file. The other scenario is compressed and uses a columnar format, so only the relevant column is scanned rather than every single column. The uncompressed query costs $15 and the other comes out to only $1.67.
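The two costs above can be reproduced with a short calculation. The text only gives the resulting prices, so the inputs here are assumptions: a 3 TB uncompressed file, a 3:1 compression ratio, and three equally sized columns of which the query reads one.

```python
PRICE_PER_TB_USD = 5.00      # Athena rate per TB of data scanned

# Assumed inputs; the article only states the resulting costs.
uncompressed_tb = 3.0        # size of the uncompressed text file
compression_ratio = 3.0      # 3:1 compression
column_count = 3             # equally sized columns, one of which is queried

# Text format: the whole file is scanned regardless of the columns requested.
text_cost = uncompressed_tb * PRICE_PER_TB_USD

# Compressed columnar format: only one column of the smaller file is scanned.
columnar_tb_scanned = uncompressed_tb / compression_ratio / column_count
columnar_cost = columnar_tb_scanned * PRICE_PER_TB_USD

savings = 1 - columnar_cost / text_cost

print(round(text_cost, 2))      # 15.0
print(round(columnar_cost, 2))  # 1.67
print(f"{savings:.0%}")         # 89%
```

Under these assumptions the saving is close to the top of the 30% to 90% range that AWS reports.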
For the chance to personalize your estimate you should use the pricing calculator. There you will get a closer look at the query cost for your database size. You can set up the number of queries you will run per day, or month.
Our estimate is really just a placeholder for a very small database. We used the minimum amount of data that a query is billed for, which again is 10 MB. We ran 20 queries per day, scanning a total of 200 MB.
The pricing calculator does not take into account the Amazon S3 rates for your storage or any other services you may end up using.
For this minimal setup, the calculator returned a monthly cost of $0.58 and a yearly cost of $6.96.
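For readers who want to check a calculator result by hand, here is a minimal sketch of a monthly estimate. The inputs are assumptions about how the calculator was configured: 20 queries per day, 200 MB scanned per query, a 730-hour month, and binary units (1 TB = 1024 x 1024 MB). Under that reading the sketch lands on the same figures as above.

```python
PRICE_PER_TB_USD = 5.00      # Athena rate per TB of data scanned
HOURS_PER_MONTH = 730        # assumed average month length
MB_PER_TB = 1024 * 1024      # binary units (assumption)

def monthly_query_cost_usd(queries_per_day, mb_scanned_per_query):
    """Estimate the monthly Athena query charge, excluding S3 and other services."""
    queries_per_month = queries_per_day * HOURS_PER_MONTH / 24
    tb_scanned = queries_per_month * mb_scanned_per_query / MB_PER_TB
    return tb_scanned * PRICE_PER_TB_USD

monthly = round(monthly_query_cost_usd(20, 200), 2)
yearly = round(monthly * 12, 2)

print(monthly)  # 0.58
print(yearly)   # 6.96
```

Swap in your own query volume and scan size to see how quickly the total moves.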
For a better understanding of your possible costs you should configure the calculator yourself, and if your team has a strong interest, contact them for a quote.
If you want to learn more about Amazon Athena, Amazon Web Services has a YouTube tutorial below that goes over the usage and cost reports for Amazon Athena.
What Are Alternatives To Amazon Athena?
Microsoft Azure, another major cloud computing provider, offers its own data query service. It’s called Azure Synapse Analytics, and it is an analytics platform. You can use it for data integration, as a data warehouse, for big data analytics, and of course to run queries on your data. It can be used serverless or as a dedicated service that you manage.
The platform is designed to give teams a unified experience, so they can draw conclusions using business intelligence and artificial intelligence (AI) with machine learning. The cloud software comes with a variety of premium features to optimize workloads.
DevOps can choose other programming languages to interact with your data besides SQL, like Python. You save time with codeless data integration and take advantage of their data lakes that allow you to access data from relational and nonrelational databases. For more features and abilities see their main page here.
When it comes to pricing, Azure and AWS are both usage-based, and both can run into other costs through related services. Azure Synapse’s pricing model isn’t directly based on query costs. You can prepay for Synapse Commit Units (SCUs), which represent your Synapse workloads. Teams are able to buy the SCUs in different tiers.
Tier 1 is $4,700, and the highest one, Tier 6, is $259,200. For the full pricing list go here, and for more information on SCUs go here. In comparison with AWS Athena, Azure Synapse appears far more expensive.
It’s important to keep in mind that you are purchasing a large amount of usage upfront, and whether it lasts a year or two depends entirely on your workflow. With AWS Athena, the quoted costs cover only Athena itself, not Amazon S3 or any other services you may end up needing. The pricing for AWS Athena is also highly dependent on how big your database is and how much data is scanned each time, so it can vary greatly.
It’s possible you could save more on either one, but Azure Synapse is most likely going to be more expensive. This is because the main features of the software, such as data integration and data warehousing, each have their own separate pricing. If you go to the Azure pricing calculator, you will find that it’s pre-populated, including separate pricing for the other services mentioned. The total cost for Azure Synapse with related services comes to around $19,256.81.
These are incredibly powerful services, so the cost does make sense. For AWS Athena, you only get the estimated cost of the queries, and other special capabilities require other cloud services. Azure includes access to the other services, while AWS requires you to integrate them later.
For a more direct estimate, you can play with their respective pricing calculators as well as contact their experts for a personalized quote. You can compare their user reviews right now by going to our comparison page here.
If you have used any of the platforms discussed here, please leave a review to help other buyers make informed decisions.
Key Terms
Discussion around the services of a cloud computing platform like AWS involves dense terminology. We provide a quick reference below for some terms that might be unfamiliar to you.
Structured Query Language (SQL)
SQL is a programming language used to interact with relational databases. You use the code to pull specific data from the database for analysis, reporting, and more. You write queries in the language to get specific data and metadata from your database tables. This can be names, contact information, and special dates. You can pull from any field in any table.
Query
A query is the request used to pull data from the database. It’s a code sequence, usually in the form of SQL. A query essentially asks for the data in the fields of your tables. Queries are common in relational databases because the tables hold interconnected data that you can pull together.
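To show what a query looks like in practice, here is a small, self-contained sketch. It uses Python’s built-in SQLite database rather than Athena, purely so it can run anywhere, and the `customers` table and its rows are made up for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, email TEXT, signup_date TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ada", "ada@example.com", "2022-03-14"),
        ("Grace", "grace@example.com", "2021-11-02"),
        ("Linus", "linus@example.com", "2022-06-30"),
    ],
)

# The query: ask for two fields, filtered by a date, in name order.
rows = conn.execute(
    "SELECT name, email FROM customers "
    "WHERE signup_date >= '2022-01-01' ORDER BY name"
).fetchall()
conn.close()

print(rows)  # [('Ada', 'ada@example.com'), ('Linus', 'linus@example.com')]
```

The SELECT statement names the fields you want, FROM names the table, and WHERE narrows the rows.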
What’s the difference between MB, GB, TB, and PB?
A byte is a digital unit of measurement for data storage and memory. One megabyte (MB) is a million bytes of data. One gigabyte (GB) is one billion bytes. A terabyte (TB) contains a trillion bytes, and a petabyte (PB) is one thousand terabytes.
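If you want to move between these units when reading a bill, a small helper is enough. This sketch uses the decimal definitions given above; some tools use binary units (multiples of 1,024) instead, so treat the choice as an assumption. The function name is our own.

```python
# Bytes per unit, using the decimal definitions from the text above.
BYTES_PER_UNIT = {
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
}

def convert(value, from_unit, to_unit):
    """Convert a data size between MB, GB, TB, and PB."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]

print(convert(1, "PB", "TB"))    # 1000.0
print(convert(500, "GB", "TB"))  # 0.5
```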
Interactive Query Service
An interactive query service offers an easy interface to enter requests for information and translates them into SQL commands. It requires less effort than manually coding in an open-source database. For more insight see this correspondence on StackOverflow here.
Extract Transform Load (ETL)
ETL tools extract data from a source and transform it before it’s loaded into storage or a database. This process can take longer depending on the application.
Data Lake
A data lake is a large pool of raw data kept in its original form. It’s usually stored in a central system or repository.
Compute Service
A compute service gives a team server capacity, along with storage or an API, so they can send their workload to a virtual machine. A virtual machine is its own separate computing environment, so you don’t have to risk running the workload on your own operating system. In there you can build and work on a variety of software. For more information see Amazon’s definition here, and the full list of AWS compute services here.
Serverless
Serverless in plain terms means you don’t have to run the program on your own server or manage your own server. A serverless service handles the maintenance and scaling of the software for you. For more information about serverless tech on AWS go here.
Schema
The schema of a database represents its structure. The meaning of the term schema often changes depending on the discipline you see it in.
For relational databases, it’s the blueprint of the database. This makes sense considering that a schematic is a diagram used for mapping the relations of electrical circuits. For more information about schemas see TechTarget’s definition here.
Presto AKA PrestoDB
PrestoDB is an open-source query engine. It’s incredibly powerful and fast software for running analytical queries. Created by the Facebook team, it was open-sourced under the Apache License. For more information about Presto, and why AWS loves this particular query engine, go here.