Schema management for AWS Athena - a CLI tool for managing Athena table schemas as code.
- DDL as Code: Manage Athena table definitions using SQL files in a Git-friendly directory structure
- Change Preview: See exactly what will change before applying (similar to Terraform plan)
- Export Capability: Export existing tables to local SQL files
- Target Filtering: Apply changes to specific tables or databases using flexible patterns
- Parallel Execution: Fast operations with concurrent query execution
- CI/CD Ready: GitHub Action available for automated deployments
brew install riseshia/x/athenadef
Alternatively, download pre-compiled binaries from the release page, or install from source with Cargo:
cargo install --git https://github.com/riseshia/athenadef
A GitHub Action is available for automated deployments:
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: riseshia/athenadef@v0
        with:
          version: v0.1.0 # or latest
Basic workflow:
- Export existing tables (optional, to get started with existing infrastructure):
athenadef export
This creates SQL files for all your existing tables.
- Review changes:
athenadef plan
- Apply changes:
athenadef apply
The following options are available for all commands:
-c, --config <FILE> Config file path [default: athenadef.yaml]
-t, --target <TABLES> Filter tables using <database>.<table> format
--debug Enable debug logging
-h, --help Print help information
-V, --version Print version information
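These global flags can be combined with any subcommand. For example (the config path and table name below are only illustrative):
# Plan with an explicit config file, a single target table, and debug logging
athenadef plan --config path/to/athenadef.yaml --target salesdb.customers --debug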
Show what changes will be made to match your local configuration:
athenadef plan [OPTIONS]
Options:
--show-unchanged: Show tables with no changes
Example output:
Plan: 2 to add, 1 to change, 0 to destroy.
+ salesdb.new_customers
Will create table
~ marketingdb.leads
Will update table
--- remote: marketingdb.leads
+++ local: marketingdb.leads
CREATE EXTERNAL TABLE leads (
- score int,
+ score double,
+ created_at timestamp,
email string
)
Apply the changes to make your Athena tables match your local configuration:
athenadef apply [OPTIONS]
Options:
-a, --auto-approve: Skip interactive approval
--dry-run: Show what would be done without executing
Note that table updates are performed as a DROP TABLE followed by a CREATE TABLE. This means tables will be temporarily unavailable during the update process.
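If even a brief gap is a concern, an apply can be narrowed to a single table with --target so only that table is dropped and recreated at any one time (the table name below is just an example):
# Limit the apply, and therefore the drop/recreate window, to one table
athenadef apply --target salesdb.customers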
Example output:
Plan: 2 to add, 1 to change, 0 to destroy.
Do you want to perform these actions? (yes/no): yes
salesdb.new_customers: Creating...
salesdb.new_customers: Creation complete
marketingdb.leads: Modifying...
marketingdb.leads: Modification complete
Apply complete! Resources: 1 added, 1 changed, 0 destroyed.
Export existing Athena table definitions to local SQL files:
athenadef export [OPTIONS]
Options:
--overwrite: Overwrite existing files
Example output:
Exporting table definitions...
salesdb.customers: Exported to salesdb/customers.sql
salesdb.orders: Exported to salesdb/orders.sql
Export complete! 2 tables exported.
Use --target to filter operations to specific tables or databases:
# Specific table
athenadef plan --target salesdb.customers
# Multiple tables
athenadef plan --target salesdb.customers --target marketingdb.leads
# All tables in a database
athenadef plan --target salesdb.*
# Tables with same name across databases
athenadef plan --target *.customers
Organize your SQL files in a directory structure that mirrors your databases and tables:
project-root/
├── athenadef.yaml        # Configuration file
├── salesdb/              # Database name
│   ├── customers.sql     # Table definition
│   └── orders.sql
└── marketingdb/
    ├── leads.sql
    └── campaigns.sql
Each .sql file should contain a complete CREATE EXTERNAL TABLE statement:
-- customers.sql
CREATE EXTERNAL TABLE customers (
  customer_id bigint,
  name string,
  email string COMMENT 'Customer email address',
  registration_date date
)
PARTITIONED BY (
  year string,
  month string
)
STORED AS PARQUET
LOCATION 's3://your-data-bucket/customers/'
TBLPROPERTIES (
  'projection.enabled' = 'true',
  'projection.year.type' = 'integer',
  'projection.year.range' = '2020,2030',
  'projection.month.type' = 'integer',
  'projection.month.range' = '1,12',
  'projection.month.digits' = '2'
);
Create an athenadef.yaml file in your project root:
# athenadef.yaml
# Optional: Athena workgroup (default: "primary")
workgroup: "primary"
# Optional: S3 location for query results
# If not specified, uses workgroup's default output location (recommended)
# output_location: "s3://athena-results-bucket/athenadef/"
# Optional: List of databases to manage
# If specified and --target is not provided, only these databases will be processed
# This is useful to avoid scanning all databases in your account
# databases:
# - salesdb
# - marketingdb
# Optional: AWS region (uses default from AWS config if not specified)
# region: "us-west-2"
# Optional: Query timeout in seconds (default: 300)
# query_timeout_seconds: 600
# Optional: Maximum concurrent queries (default: 5)
# max_concurrent_queries: 10
See the examples directory for complete sample projects:
- examples/basic - Simple setup with a few tables
- examples/partitioned - Tables with partitions and partition projection
- examples/multi-database - Multiple databases with many tables
When using AWS managed storage (default, no output_location specified), you only need these permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:StopQueryExecution"
      ],
      "Resource": "arn:aws:athena:*:*:workgroup/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:CreateTable",
        "glue:UpdateTable",
        "glue:DeleteTable"
      ],
      "Resource": "*"
    }
  ]
}
The following S3 permissions are only required when specifying output_location in your configuration:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::your-query-results-bucket",
        "arn:aws:s3:::your-query-results-bucket/*"
      ]
    }
  ]
}
athenadef uses a simple but effective approach:
- Reads local SQL files organized in a database/table.sql structure
- Fetches current state from AWS Athena using SHOW CREATE TABLE
- Compares definitions using text-based diff (like git diff)
- Delegates SQL validation to AWS Athena (no local parsing)
- Applies changes by executing DDL statements through Athena
This design ensures:
- Simplicity: No complex SQL parsing
- Compatibility: Supports all Athena features automatically
- Reliability: SQL validation by AWS Athena itself
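For reference, the remote half of that comparison can be reproduced by hand with the AWS CLI and jq. This is only a sketch for sanity-checking a reported diff; the table and workgroup names are placeholders, and the output will usually differ in formatting because SHOW CREATE TABLE normalizes the DDL:
# Ask Athena for the current definition of one table (placeholder names)
QID=$(aws athena start-query-execution \
  --query-string "SHOW CREATE TABLE salesdb.customers" \
  --work-group primary \
  --query QueryExecutionId --output text)

# Poll until the query leaves the QUEUED/RUNNING states
while true; do
  STATE=$(aws athena get-query-execution --query-execution-id "$QID" \
    --query QueryExecution.Status.State --output text)
  [ "$STATE" = "QUEUED" ] || [ "$STATE" = "RUNNING" ] || break
  sleep 1
done

# Each result row holds one line of the DDL; compare it with the local file
aws athena get-query-results --query-execution-id "$QID" --output json \
  | jq -r '.ResultSet.Rows[].Data[0].VarCharValue' > /tmp/remote.sql
diff -u /tmp/remote.sql salesdb/customers.sql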
Configuration file not found:
# Specify config file explicitly
athenadef plan --config path/to/athenadef.yaml
AWS authentication errors:
# Check AWS credentials
aws sts get-caller-identity
# Set AWS profile
export AWS_PROFILE=your-profile
SQL syntax errors:
SQL errors are reported by Athena and include the file name and query that failed. Check the SQL syntax in your .sql files.
Enable debug logging to see detailed execution information:
athenadef plan --debug
For more detailed troubleshooting help, see the Troubleshooting Guide.
- Troubleshooting Guide - Detailed solutions for common issues
- AWS Permissions Guide - Complete IAM permissions reference and examples
- Advanced Usage - Advanced patterns, CI/CD integration, multi-environment setup
- FAQ - Frequently asked questions
- Architecture - System architecture and design
- Specification - Complete technical specification
- Technical Design - Core algorithms and implementation details
- JSON Output - JSON output format specification
- Basic Example - Simple setup with a few tables
- Partitioned Tables - Partition projection and time-series data
- Multi-Database - Managing multiple databases
Contributions are welcome! Please see our contributing guidelines for details.
This project is licensed under the MIT License - see the LICENSE file for details.
This project includes software developed by:
- aws-sdk-config: Licensed under the Apache License, Version 2.0
- aws-sdk-athena: Licensed under the Apache License, Version 2.0
- aws-sdk-s3: Licensed under the Apache License, Version 2.0