Airbyte & Apache Iceberg with AWS Athena #45620
I used Airbyte to set up a demo, moving data from a PostgreSQL database into Apache Iceberg. Airbyte correctly creates the folder structure in S3, but it doesn't automatically create a database and tables in AWS Athena. For demo purposes, I manually created a catalog in Athena and a database named airbyte. Then I ran the following query to create a table:

CREATE TABLE IF NOT EXISTS `airbyte`.`airbyte_raw_device_check` (
`id` int,
`device_id` int,
`created_at` timestamp,
`external_key` string
)
LOCATION 's3://bespot-airbyte-test/iceberg-airbyte/default/airbyte_raw_device_check/'
TBLPROPERTIES (
'table_type' = 'ICEBERG',
'format' = 'parquet',
'write_compression' = 'SNAPPY'
);

When I tried to query the data with:

SELECT * FROM airbyte_raw_device_check LIMIT 10;

I got no results back. What could I be missing or doing wrong? Any ideas?

Folder structure: iceberg-airbyte/
Replies: 2 comments
It sounds like you're doing most things right, but there are a few things to check that might explain why you're not seeing results.

First, make sure there are actual Parquet files in the S3 folder the table points to. If the data/ folder under airbyte_raw_device_check is empty, or if the files are missing, Athena won't return any data. Also check that the metadata/ folder contains the necessary manifest files. These are crucial for Iceberg tables to reference the data correctly.

Another thing to look into is S3 permissions. If Athena doesn't have the right access to your S3 bucket, that could be blocking the query. Sometimes either the bucket policy or the object permissions need tweaking.

If your Iceberg table is partitioned, you may also need to refresh the partitions in Athena.

Finally, since you're using Iceberg, make sure Athena is set up properly to handle Iceberg tables. Double-check that the table properties are aligned with what Iceberg expects. Once you've gone through these checks, try querying again, and it should work!
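For the first check (Parquet files under data/ and manifest files under metadata/), here is a minimal sketch of how it could be automated. It assumes you already have a list of object keys for the bucket, for example from an S3 listing call such as boto3's `list_objects_v2`. The function names `summarize_iceberg_layout` and `layout_problems` are hypothetical, and the file suffixes reflect the usual Iceberg layout, which Airbyte's writer is assumed to follow here.

```python
from typing import Dict, Iterable, List


def summarize_iceberg_layout(keys: Iterable[str], table_prefix: str) -> Dict[str, int]:
    """Count data and metadata files found under an Iceberg table prefix.

    `keys` is any iterable of S3 object keys (hypothetical input; obtain it
    from whatever S3 listing you prefer). Keys outside the prefix are ignored.
    """
    prefix = table_prefix.rstrip("/") + "/"
    summary = {"data_parquet": 0, "metadata_json": 0, "metadata_avro": 0}
    for key in keys:
        if not key.startswith(prefix):
            continue
        relative = key[len(prefix):]
        if relative.startswith("data/") and relative.endswith(".parquet"):
            summary["data_parquet"] += 1
        elif relative.startswith("metadata/") and relative.endswith(".metadata.json"):
            # Table metadata files, assumed to follow the usual naming.
            summary["metadata_json"] += 1
        elif relative.startswith("metadata/") and relative.endswith(".avro"):
            # Manifest lists and manifests are stored as Avro files.
            summary["metadata_avro"] += 1
    return summary


def layout_problems(summary: Dict[str, int]) -> List[str]:
    """Turn the counts into a list of human-readable findings."""
    problems = []
    if summary["data_parquet"] == 0:
        problems.append("no Parquet files under data/")
    if summary["metadata_json"] == 0:
        problems.append("no *.metadata.json files under metadata/")
    if summary["metadata_avro"] == 0:
        problems.append("no manifest (*.avro) files under metadata/")
    return problems


if __name__ == "__main__":
    # Prefix taken from the LOCATION in the question; the keys are made up.
    table_prefix = "iceberg-airbyte/default/airbyte_raw_device_check/"
    example_keys = [
        table_prefix + "data/part-0001.parquet",
        table_prefix + "metadata/v1.metadata.json",
    ]
    print(layout_problems(summarize_iceberg_layout(example_keys, table_prefix)))
```

An empty list from `layout_problems` only means the expected kinds of files exist. It does not confirm that Athena's table definition points at that metadata or that Athena has permission to read it.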
Thank you for the helpful comment! It turned out to be a combination of permissions and a partition issue, just as you mentioned. After adjusting the S3 permissions and refreshing the partitions, it's all working now. Thanks so much for the guidance!