Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and we pay only for the queries that we run.
Athena is easy to use. Simply point to our data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there's no need for complex ETL jobs to prepare our data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.
Athena integrates out of the box with the AWS Glue Data Catalog. This lets us create a unified metadata repository across various services, crawl data sources to discover schemas, populate our Catalog with new and modified table and partition definitions, and maintain schema versioning.
Benefits
Start querying instantly
Athena is serverless. We can quickly query our data without having to set up and manage any servers or data warehouses. Just point to data in Amazon S3, define the schema, and start querying using the built-in query editor. Amazon Athena allows us to tap into all our data in S3 without the need to set up complex processes to extract, transform, and load the data (ETL).
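To make the "point to data, define the schema, start querying" flow concrete, here is a minimal Python sketch. It builds a CREATE EXTERNAL TABLE statement for comma-delimited files and submits SQL through the boto3 Athena client. The table name, columns, bucket paths, and the helper names build_create_table_ddl and run_query are all hypothetical and chosen only for illustration. The sketch also assumes that AWS credentials and a default region are already configured.

```python
def build_create_table_ddl(table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for comma-delimited text files.

    `columns` is a list of (name, type) pairs, e.g. [("status", "int")].
    """
    column_defs = ",\n  ".join(f"`{name}` {type_}" for name, type_ in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {column_defs}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )


def run_query(sql, database, output_location):
    """Submit a statement to Athena and return its query execution id.

    Needs boto3 and valid AWS credentials, so it is not called in this sketch.
    """
    import boto3  # imported here so the DDL helper works without boto3 installed

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical table over log files stored under an example bucket.
ddl = build_create_table_ddl(
    "web_logs",
    [("request_ip", "string"), ("status", "int")],
    "s3://example-bucket/logs/",
)
print(ddl)

# With credentials in place, the statements could then be submitted like this:
# run_query(ddl, "default", "s3://example-bucket/athena-results/")
# run_query("SELECT status, COUNT(*) FROM web_logs GROUP BY status",
#           "default", "s3://example-bucket/athena-results/")
```

Queries run asynchronously, so the returned execution id would normally be polled (for example with the client's get_query_execution call) before reading the results.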
Pay per query
With Amazon Athena, we pay only for the queries that we run. We are charged $5 per terabyte scanned by our queries. Because the charge depends on how much data a query scans, we can save from 30% to 90% on our per-query costs and get better performance by compressing, partitioning, and converting our data into columnar formats, all of which reduce the bytes each query has to read. Athena queries data directly in Amazon S3, so there are no additional storage charges beyond S3.
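The pricing arithmetic can be sketched in a few lines of Python. This is a rough estimate only: it uses the $5 per terabyte rate quoted above, assumes a binary terabyte (1024^4 bytes), and does not model any minimum charge or rounding rules that real billing may apply. The 2 TB dataset and the 80% reduction in scanned bytes are hypothetical figures picked for the example.

```python
PRICE_PER_TB_USD = 5.0    # per-terabyte rate quoted in the article
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabyte


def estimate_query_cost(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Estimate the charge in USD for a query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# Hypothetical: a query that scans a full 2 TB of uncompressed CSV.
raw_scan = 2 * BYTES_PER_TB
raw_cost = estimate_query_cost(raw_scan)

# Hypothetical: columnar storage plus compression cuts the scanned bytes by 80%.
optimized_scan = raw_scan // 5
optimized_cost = estimate_query_cost(optimized_scan)

savings = 1 - optimized_cost / raw_cost
print(f"raw: ${raw_cost:.2f}, optimized: ${optimized_cost:.2f}, saved: {savings:.0%}")
# prints "raw: $10.00, optimized: $2.00, saved: 80%"
```

The saving tracks the reduction in bytes scanned one for one, which is why compression, partitioning, and columnar formats translate directly into lower per-query costs.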
Open, powerful, standard
Amazon Athena uses Presto with ANSI SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet. Athena is ideal for quick, ad-hoc querying, but it can also handle complex analysis, including large joins, window functions, and arrays. Amazon Athena is highly available and executes queries using compute resources across multiple facilities and multiple devices in each facility. Amazon Athena uses Amazon S3 as its underlying data store, making our data highly available and durable.
Fast, really fast
With Amazon Athena, we don’t have to worry about having enough compute resources to get fast, interactive query performance. Amazon Athena automatically executes queries in parallel, so most results come back within seconds.
Thank you for reading this article; I really appreciate it. If you have any questions, feel free to leave a comment.