diff --git a/README.md b/README.md index 8ea337ff..41e8ae87 100644 --- a/README.md +++ b/README.md @@ -14,9 +14,7 @@ -[AutoGluon](https://auto.gluon.ai/stable/index.html) is an open-source AutoML library that trains state-of-the-art ML models on tabular, time-series, and multimodal data with just a few lines of code. AutoGluon-Cloud takes that same API and runs it on AWS — train models and serve predictions on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) without managing infrastructure or setting up a heavyweight ML environment on your local machine. - -It supports two workflows: +AutoGluon-Cloud lets you train and deploy state-of-the-art ML models in the cloud in a few lines of code. Run [AutoGluon](https://auto.gluon.ai/stable/index.html) on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) without worrying about infrastructure, dependencies, or a heavy local ML environment. It supports two workflows: - **[Train your own predictor](https://auto.gluon.ai/cloud/stable/tutorials/predictor-tabular.html)** — the same `fit → deploy → predict` workflow as local AutoGluon, with all the heavy lifting offloaded to SageMaker. - **[Run pretrained foundation models](https://auto.gluon.ai/cloud/stable/tutorials/foundation-model-timeseries.html)** — deploy state-of-the-art pretrained models like [Chronos-2](https://huggingface.co/amazon/chronos-2) for zero-shot inference, with no training required. @@ -27,7 +25,7 @@ It supports two workflows: pip install autogluon.cloud ``` -Then provision the IAM role and S3 bucket AutoGluon-Cloud needs on AWS: +Then provision the IAM role and S3 bucket AutoGluon-Cloud needs to run on AWS: ```python from autogluon.cloud import bootstrap @@ -39,6 +37,8 @@ See the [Setup tutorial](https://auto.gluon.ai/cloud/stable/tutorials/setup.html ## ⚙️ Train your own model +Train an AutoGluon predictor on your data and serve it from a SageMaker endpoint — same API as local AutoGluon, all heavy lifting on AWS. 
Full walkthrough: [tabular](https://auto.gluon.ai/cloud/stable/tutorials/predictor-tabular.html), [time series](https://auto.gluon.ai/cloud/stable/tutorials/predictor-timeseries.html). + ```python from autogluon.cloud import TabularCloudPredictor @@ -65,6 +65,8 @@ result = cloud_predictor.predict(test_data) ## 🚀 Run a pretrained foundation model +Skip training entirely — deploy a pretrained model like Chronos-2 to SageMaker and get zero-shot predictions out of the box. Full walkthrough: [time series](https://auto.gluon.ai/cloud/stable/tutorials/foundation-model-timeseries.html). + ```python from autogluon.cloud import TimeSeriesFoundationModel diff --git a/docs/index.md b/docs/index.md index daad9c90..65de5a97 100644 --- a/docs/index.md +++ b/docs/index.md @@ -34,13 +34,19 @@ Train and Deploy AutoGluon in the Cloud :::::: -[AutoGluon](https://auto.gluon.ai/stable/index.html) is an open-source AutoML library that trains state-of-the-art ML models on tabular, time-series, and multimodal data with just a few lines of code. AutoGluon-Cloud takes that same API and runs it on AWS — train models and serve predictions on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) without managing infrastructure or setting up a heavyweight ML environment on your local machine. - -It supports two workflows: +AutoGluon-Cloud lets you train and deploy state-of-the-art ML models in the cloud in a few lines of code. Run [AutoGluon](https://auto.gluon.ai/stable/index.html) on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) without worrying about infrastructure, dependencies, or a heavy local ML environment. It supports two workflows: - **[Train your own predictor](tutorials/predictor-tabular.md)** — the same `fit → deploy → predict` workflow as local AutoGluon, with all the heavy lifting offloaded to SageMaker. 
- **[Run pretrained foundation models](tutorials/foundation-model-timeseries.md)** — deploy state-of-the-art pretrained models like [Chronos-2](https://huggingface.co/amazon/chronos-2) for zero-shot inference, with no training required. +## {octicon}`package` Installation + +```bash +pip install autogluon.cloud +``` + +Before running any of the snippets below, follow the [Setup tutorial](tutorials/setup.md) to register the IAM role and S3 bucket that SageMaker will use. + ## {octicon}`gear` Train AutoGluon predictors in the cloud Full walkthrough: [Tabular](tutorials/predictor-tabular.md), [Time Series](tutorials/predictor-timeseries.md). @@ -146,17 +152,6 @@ endpoint.delete_endpoint() ::: -## {octicon}`package` Installation - -[![PyPI](https://img.shields.io/pypi/v/autogluon.cloud.svg)](https://pypi.org/project/autogluon.cloud/) -[![Python Versions](https://img.shields.io/pypi/pyversions/autogluon.cloud)](https://pypi.org/project/autogluon.cloud/) - -```bash -pip install autogluon.cloud -``` - -Before running the examples above, set up your AWS resources (IAM role + S3 bucket) by following the [Setup](tutorials/setup.md) tutorial. - ```{toctree} --- caption: Tutorials diff --git a/docs/tutorials/foundation-model-timeseries.md b/docs/tutorials/foundation-model-timeseries.md index 4d3d6c94..0e738718 100644 --- a/docs/tutorials/foundation-model-timeseries.md +++ b/docs/tutorials/foundation-model-timeseries.md @@ -17,33 +17,19 @@ That makes the workflow much simpler than [training your own time series predict AutoGluon-Cloud exposes this workflow through {py:class}`~autogluon.cloud.TimeSeriesFoundationModel`. For now it covers time series forecasting only, with models like Chronos-2 available out of the box. -```{attention} -SageMaker compute and S3 storage are billed to your AWS account. AutoGluon-Cloud is a free wrapper, but it's your responsibility to monitor usage and delete endpoints when no longer needed. 
-``` - ## Create the model -A {py:class}`~autogluon.cloud.TimeSeriesFoundationModel` needs an IAM execution role (so SageMaker can run jobs on your behalf) and an S3 bucket (to stage data and store outputs). There are two ways to supply them: - -- Use a saved config (recommended). Save the role and bucket once to `~/.autogluon/cloud.yaml` — see [Setup](./setup.md) — and subsequent constructor calls will pick them up automatically: - - ```python - from autogluon.cloud import TimeSeriesFoundationModel - - model = TimeSeriesFoundationModel(model_id="chronos-2") - ``` +```{important} +Before running any code below, follow the [Setup tutorial](./setup.md) to register the IAM role and S3 bucket that SageMaker will use. The examples assume those resources are saved in `~/.autogluon/cloud.yaml`. +``` -- Pass them at construction. Useful when you need different roles or buckets per call: +```python +from autogluon.cloud import TimeSeriesFoundationModel - ```python - model = TimeSeriesFoundationModel( - model_id="chronos-2", - role="arn:aws:iam::222222222222:role/MyAutoGluonRole", - cloud_output_path="s3://my-autogluon-bucket/ag-foundation-model", - ) - ``` +model = TimeSeriesFoundationModel(model_id="chronos-2") +``` -The examples in the rest of this tutorial reuse a single `model` object created this way. +The rest of the tutorial reuses this `model` object. ### Available models @@ -160,6 +146,96 @@ The endpoint stays active — and billed — until you delete it: endpoint.delete_endpoint() ``` +### Invoke the endpoint without AutoGluon-Cloud + +The deployed endpoint is a normal SageMaker endpoint, so you can invoke it from any AWS SDK. Unlike the trained-predictor case, foundation model endpoints always need per-request inference settings (`prediction_length`, `freq`, `target`, etc.) bundled into the payload — so plain CSV is not supported. Pick one of the two structured payload formats below. 
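Note for readers of this diff: the per-item JSON format that the added docs describe can be assembled from long-format rows with the standard library alone. The sketch below is an illustration, not part of the patch. The helper name `build_per_item_payload` and the `(item_id, timestamp, target)` row layout are assumptions; only the `inputs` / `parameters` keys and the `item_id`, `start`, `target`, `prediction_length`, and `freq` fields come from the documented payload.

```python
import json
from collections import defaultdict


def build_per_item_payload(rows, prediction_length, freq):
    """Group long-format rows into the per-item JSON payload shape.

    `rows` is an iterable of (item_id, iso_timestamp, target) tuples.
    This helper is hypothetical; it only mirrors the documented schema.
    """
    series = defaultdict(list)
    for item_id, timestamp, target in rows:
        series[item_id].append((timestamp, float(target)))

    inputs = []
    for item_id, points in series.items():
        # ISO-8601 date strings of equal length sort chronologically as text.
        points.sort(key=lambda point: point[0])
        inputs.append(
            {
                "item_id": item_id,
                "start": points[0][0],  # timestamp of the first target value
                "target": [value for _, value in points],
            }
        )
    return {
        "inputs": inputs,
        "parameters": {"prediction_length": prediction_length, "freq": freq},
    }


rows = [
    ("store_1", "2014-01-12", 145.0),
    ("store_1", "2014-01-05", 123.0),
    ("store_2", "2014-01-05", 10.0),
]
payload = build_per_item_payload(rows, prediction_length=13, freq="W")
body = json.dumps(payload).encode("utf-8")  # bytes suitable for a request body
```

Building the payload from real lists also avoids the literal `...` placeholders used in the documentation snippet, which `json.dumps` cannot serialize.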
+ +:::{dropdown} Payload formats — boto3 examples +:animate: fade-in-slide-down +:color: secondary + +**Option 1: AutoGluon-Cloud's native `application/x-autogluon` envelope.** Each DataFrame is serialized as base64-encoded parquet, with inference settings carried in `inference_kwargs`. This is what {py:meth}`autogluon.cloud.TimeSeriesEndpoint.predict` sends under the hood: + +```python +import base64 +import io +import json +import boto3 +import pandas as pd + +def df_to_b64(df: pd.DataFrame) -> str: + return base64.b64encode(df.to_parquet()).decode("ascii") + +data = pd.read_parquet("https://autogluon.s3.amazonaws.com/datasets/timeseries/retail_sales/train.parquet") +known_covariates = ( + pd.read_parquet("https://autogluon.s3.amazonaws.com/datasets/timeseries/retail_sales/test.parquet") + .drop(columns=["Sales"]) +) + +payload = { + "version": 1, + "data": df_to_b64(data), + "known_covariates": df_to_b64(known_covariates), + "inference_kwargs": { + "prediction_length": 13, + "target": "Sales", + "id_column": "id", + "timestamp_column": "timestamp", + }, +} + +client = boto3.client("sagemaker-runtime") +response = client.invoke_endpoint( + EndpointName=ENDPOINT_NAME, + ContentType="application/x-autogluon", + Accept="application/x-parquet", + Body=json.dumps(payload).encode("utf-8"), +) +forecasts = pd.read_parquet(io.BytesIO(response["Body"].read())) +``` + +**Option 2: Per-item JSON.** Each item is a JSON object with its target history and, optionally, past and future values of covariates inline. 
This is the same payload schema used by [Chronos-2 on SageMaker JumpStart](https://github.com/amazon-science/chronos-forecasting/blob/v2.2.2/notebooks/deploy-chronos-to-amazon-sagemaker.ipynb), so it's a drop-in if you already have code talking to a JumpStart endpoint: + +```python +import io +import json +import boto3 +import pandas as pd + +payload = { + "inputs": [ + { + "item_id": "store_1", + "start": "2014-01-05", # ISO timestamp of the first target value + "target": [123.0, 145.0, 167.0, ...], # historical target values + "past_covariates": { # past covariate values (same length as target) + "Promo": [0, 1, 0, ...], + "SchoolHoliday": [0, 0, 1, ...], + }, + "future_covariates": { # future values over the forecast horizon (length = prediction_length) + "Promo": [1, 0, ..., 1], + "SchoolHoliday": [0, 1, ..., 0], + }, + }, + # ... one entry per item + ], + "parameters": { + "prediction_length": 13, + "freq": "W", # required when "start" is set + }, +} + +client = boto3.client("sagemaker-runtime") +response = client.invoke_endpoint( + EndpointName=ENDPOINT_NAME, + ContentType="application/json", + Accept="application/x-parquet", + Body=json.dumps(payload).encode("utf-8"), +) +forecasts = pd.read_parquet(io.BytesIO(response["Body"].read())) +``` +::: + ### Reattaching to an existing endpoint To send requests to an endpoint that's already running (e.g. from a previous session, or one a diff --git a/docs/tutorials/predictor-tabular.md b/docs/tutorials/predictor-tabular.md index 65e0dcd1..7a1af0e2 100644 --- a/docs/tutorials/predictor-tabular.md +++ b/docs/tutorials/predictor-tabular.md @@ -1,39 +1,26 @@ # Train and Deploy a Tabular Predictor on Amazon SageMaker -```{tip} +```{note} This tutorial covers tabular classification and regression. For time series forecasting, see [Train a Time Series Predictor](./predictor-timeseries.md). 
``` AutoGluon-Cloud lets you train, deploy, and run inference with AutoGluon tabular predictors on AWS using the same APIs you'd use locally. Under the hood, it runs your jobs on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) using AWS's official [AutoGluon deep learning containers](https://aws.github.io/deep-learning-containers/reference/available_images/#autogluon-training) — so you don't manage any infrastructure yourself. -```{attention} -SageMaker compute and S3 storage are billed to your AWS account. AutoGluon-Cloud is a free wrapper, but it's your responsibility to monitor usage to avoid unexpected charges. -``` - ## Training -**Create the predictor.** A {py:class}`~autogluon.cloud.TabularCloudPredictor` needs an IAM execution role (so SageMaker can run jobs on your behalf) and an S3 bucket (to stage data and store trained artifacts). There are two ways to supply them: - -- Use a saved config (recommended). Save the role and bucket once to `~/.autogluon/cloud.yaml` — see [Setup](setup.md) — and subsequent constructor calls will pick them up automatically: - - ```python - from autogluon.cloud import TabularCloudPredictor - - cloud_predictor = TabularCloudPredictor() - ``` +```{important} +Before running any code below, follow the [Setup tutorial](setup.md) to register the IAM role and S3 bucket that SageMaker will use. The examples assume those resources are saved in `~/.autogluon/cloud.yaml`. +``` -- Pass them at construction. 
Useful when you need different roles or buckets per call: +Create the predictor: - ```python - cloud_predictor = TabularCloudPredictor( - role="arn:aws:iam::222222222222:role/MyAutoGluonRole", - cloud_output_path="s3://my-autogluon-bucket/tabular-demo", - ) - ``` +```python +from autogluon.cloud import TabularCloudPredictor -**Train.** {py:meth}`autogluon.cloud.TabularCloudPredictor.fit` runs [`TabularPredictor.fit()`](https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.fit.html) inside a remote SageMaker job — along with `train_data`, the `predictor_init_args` and `predictor_fit_args` are forwarded straight through. Training, model artifacts, and AutoGluon itself all live on the remote instance, so you don't need AutoGluon installed locally. +cloud_predictor = TabularCloudPredictor() +``` -`train_data` can be a pandas DataFrame, or a path to a local or S3 file (CSV or Parquet). In every case AutoGluon-Cloud loads the data locally and uploads it to your `cloud_output_path` bucket before kicking off the SageMaker job. +{py:meth}`TabularCloudPredictor.fit() ` runs [`TabularPredictor.fit()`](https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.fit.html) inside a remote SageMaker job — along with `train_data`, the `predictor_init_args` and `predictor_fit_args` are forwarded straight through. Training, model artifacts, and AutoGluon itself all live on the remote instance, so you don't need AutoGluon installed locally. ```python cloud_predictor.fit( @@ -44,6 +31,8 @@ cloud_predictor.fit( ) ``` +`train_data` can be a pandas DataFrame, or a path to a local or S3 file (CSV or Parquet). In every case AutoGluon-Cloud loads the data locally and uploads it to your `cloud_output_path` bucket before kicking off the SageMaker job. + ### Reattach to a training job If your local connection drops, the training job keeps running on SageMaker. 
You can reattach with another `CloudPredictor` via {py:meth}`~autogluon.cloud.TabularCloudPredictor.attach_job` as long as you have the job name — it's logged when training starts (`INFO:sagemaker:Creating training-job with name: ag-cloudpredictor-...`) and also visible in the SageMaker console. diff --git a/docs/tutorials/predictor-timeseries.md b/docs/tutorials/predictor-timeseries.md index 38098897..453733fe 100644 --- a/docs/tutorials/predictor-timeseries.md +++ b/docs/tutorials/predictor-timeseries.md @@ -1,39 +1,26 @@ # Train and Deploy a Time Series Predictor on Amazon SageMaker -```{tip} +```{note} This tutorial covers time series forecasting. For tabular classification/regression, see [Train a Tabular Predictor](./predictor-tabular.md). ``` AutoGluon-Cloud lets you train, deploy, and run inference with AutoGluon time series predictors on AWS using the same APIs you'd use locally. Under the hood, it runs your jobs on [Amazon SageMaker](https://aws.amazon.com/sagemaker/) using AWS's official [AutoGluon deep learning containers](https://aws.github.io/deep-learning-containers/reference/available_images/#autogluon-training) — so you don't manage any infrastructure yourself. -```{attention} -SageMaker compute and S3 storage are billed to your AWS account. AutoGluon-Cloud is a free wrapper, but it's your responsibility to monitor usage to avoid unexpected charges. -``` - ## Training -**Create the predictor.** A {py:class}`~autogluon.cloud.TimeSeriesCloudPredictor` needs an IAM execution role (so SageMaker can run jobs on your behalf) and an S3 bucket (to stage data and store trained artifacts). There are two ways to supply them: - -- Use a saved config (recommended). 
Save the role and bucket once to `~/.autogluon/cloud.yaml` — see [Setup](setup.md) — and subsequent constructor calls will pick them up automatically: - - ```python - from autogluon.cloud import TimeSeriesCloudPredictor - - cloud_predictor = TimeSeriesCloudPredictor() - ``` +```{important} +Before running any code below, follow the [Setup tutorial](setup.md) to register the IAM role and S3 bucket that SageMaker will use. The examples assume those resources are saved in `~/.autogluon/cloud.yaml`. +``` -- Pass them at construction. Useful when you need different roles or buckets per call: +Create the predictor: - ```python - cloud_predictor = TimeSeriesCloudPredictor( - role="arn:aws:iam::222222222222:role/MyAutoGluonRole", - cloud_output_path="s3://my-autogluon-bucket/timeseries-demo", - ) - ``` +```python +from autogluon.cloud import TimeSeriesCloudPredictor -**Train.** {py:meth}`autogluon.cloud.TimeSeriesCloudPredictor.fit` runs [`TimeSeriesPredictor.fit()`](https://auto.gluon.ai/stable/api/autogluon.timeseries.TimeSeriesPredictor.fit.html) inside a remote SageMaker job — along with `train_data`, the `predictor_init_args` and `predictor_fit_args` are forwarded straight through. Training, model artifacts, and AutoGluon itself all live on the remote instance, so you don't need AutoGluon installed locally. +cloud_predictor = TimeSeriesCloudPredictor() +``` -`train_data` can be a pandas DataFrame, or a path to a local or S3 file (CSV or Parquet). The data must be in **long format** with one row per `(item_id, timestamp)` pair plus a target column. See the [Time Series Quick Start](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-quick-start.html) for the expected schema and the [Forecasting In-Depth](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-indepth.html) tutorial for an overview of the different covariate types AutoGluon supports. 
+{py:meth}`TimeSeriesCloudPredictor.fit() ` runs [`TimeSeriesPredictor.fit()`](https://auto.gluon.ai/stable/api/autogluon.timeseries.TimeSeriesPredictor.fit.html) inside a remote SageMaker job — along with `train_data`, the `predictor_init_args` and `predictor_fit_args` are forwarded straight through. Training, model artifacts, and AutoGluon itself all live on the remote instance, so you don't need AutoGluon installed locally. ```python cloud_predictor.fit( @@ -48,6 +35,8 @@ cloud_predictor.fit( ) ``` +`train_data` can be a pandas DataFrame, or a path to a local or S3 file (CSV or Parquet). The data must be in **long format** with one row per `(item_id, timestamp)` pair plus a target column. See the [Time Series Quick Start](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-quick-start.html) for the expected schema and the [Forecasting In-Depth](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-indepth.html) tutorial for an overview of the different covariate types AutoGluon supports. + ### Fit and predict in a single job For workflows where fitting is light (e.g. fine-tuning a pretrained foundation model), {py:meth}`~autogluon.cloud.TimeSeriesCloudPredictor.fit_predict` runs both steps inside the same SageMaker job — saving the startup overhead of a second job. Predictions are generated against `train_data` and written to S3. 
@@ -106,7 +95,7 @@ Send requests to the endpoint with {py:meth}`~autogluon.cloud.TimeSeriesCloudPre ```python forecasts = cloud_predictor.predict_real_time( - "test.csv", + "train.csv", # historical observations — forecasts start from the last timestamp per item known_covariates="known_covariates.csv", # required if known_covariates_names was set static_features="static_features.csv", # optional ) @@ -127,22 +116,106 @@ cloud_predictor.cleanup_deployment() To check whether an endpoint is currently attached, call {py:meth}`~autogluon.cloud.TimeSeriesCloudPredictor.info` and look for the `endpoint` key in the returned dict. #### Invoke the endpoint without AutoGluon-Cloud -The deployed endpoint is a normal SageMaker endpoint, and you can invoke it through other methods. For example, to invoke it with boto3 directly: +The deployed endpoint is a normal SageMaker endpoint, so you can invoke it from any AWS SDK. The simplest payload is the historical observations as CSV — forecasts are generated starting from the last timestamp of each item: ```python +import io import boto3 +import pandas as pd + +train_data = pd.read_csv("train.csv") # long format with item_id, timestamp, target -client = boto3.client('sagemaker-runtime') +client = boto3.client("sagemaker-runtime") response = client.invoke_endpoint( EndpointName=ENDPOINT_NAME, - ContentType='text/csv', - Accept='application/json', - Body=test_data.to_csv() + ContentType="text/csv", + Accept="application/x-parquet", + Body=train_data.to_csv(index=False), ) +forecasts = pd.read_parquet(io.BytesIO(response["Body"].read())) +``` + +The CSV format only carries the historical observations. To pass `static_features` or `known_covariates` (required when the predictor was fit with `known_covariates_names`), use one of the structured payload formats below. 
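Note for readers of this diff: the per-item JSON format documented in this hunk states two length constraints in its comments, namely that past covariates have the same length as `target` and that future covariates have length `prediction_length`. A small client-side check can catch violations before a request is sent. The sketch below is an illustration under those stated constraints, not part of the patch; the function name `check_item_lengths` is hypothetical, and the endpoint's own validation behavior is not shown in this diff.

```python
def check_item_lengths(item, prediction_length):
    """Return a list of length problems for one per-item payload entry.

    Hypothetical helper: it only enforces the two constraints spelled out
    in the documentation comments, and returns an empty list if both hold.
    """
    problems = []
    history_length = len(item["target"])

    # Past covariates must align one-to-one with the target history.
    for name, values in item.get("past_covariates", {}).items():
        if len(values) != history_length:
            problems.append(
                f"past_covariates[{name!r}] has {len(values)} values, expected {history_length}"
            )

    # Future covariates must cover exactly the forecast horizon.
    for name, values in item.get("future_covariates", {}).items():
        if len(values) != prediction_length:
            problems.append(
                f"future_covariates[{name!r}] has {len(values)} values, expected {prediction_length}"
            )
    return problems


item = {
    "item_id": "store_1",
    "start": "2014-01-05",
    "target": [123.0, 145.0, 167.0],
    "past_covariates": {"promo": [0, 1, 0]},
    "future_covariates": {"promo": [1, 0]},
}
ok = check_item_lengths(item, prediction_length=2)
bad = check_item_lengths(item, prediction_length=3)
```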
+ +:::{dropdown} Advanced payload formats — with static_features and known_covariates +:animate: fade-in-slide-down +:color: secondary + +**Option 1: AutoGluon-Cloud's native `application/x-autogluon` envelope.** Each DataFrame is serialized as base64-encoded parquet and bundled in a single JSON object. This is what {py:meth}`~autogluon.cloud.TimeSeriesCloudPredictor.predict_real_time` sends under the hood: + +```python +import base64 +import io +import json +import boto3 +import pandas as pd -#: Print the model endpoint's output. -print(response['Body'].read().decode()) +def df_to_b64(df: pd.DataFrame) -> str: + return base64.b64encode(df.to_parquet()).decode("ascii") + +train_data = pd.read_csv("train.csv") +known_covariates = pd.read_csv("known_covariates.csv") +static_features = pd.read_csv("static_features.csv") # optional + +payload = { + "version": 1, + "data": df_to_b64(train_data), + "known_covariates": df_to_b64(known_covariates), + "static_features": df_to_b64(static_features), + "inference_kwargs": {}, # prediction_length / quantile_levels are baked in at fit time +} + +client = boto3.client("sagemaker-runtime") +response = client.invoke_endpoint( + EndpointName=ENDPOINT_NAME, + ContentType="application/x-autogluon", + Accept="application/x-parquet", + Body=json.dumps(payload).encode("utf-8"), +) +forecasts = pd.read_parquet(io.BytesIO(response["Body"].read())) +``` + +**Option 2: Per-item JSON.** Each item is a JSON object with its target history and, optionally, past and future values of covariates inline. 
This is the same payload schema used by [Chronos-2 on SageMaker JumpStart](https://github.com/amazon-science/chronos-forecasting/blob/v2.2.2/notebooks/deploy-chronos-to-amazon-sagemaker.ipynb), so it's a drop-in if you already have code talking to a JumpStart endpoint: + +```python +import io +import json +import boto3 +import pandas as pd + +payload = { + "inputs": [ + { + "item_id": "store_1", + "start": "2014-01-05", # ISO timestamp of the first target value + "target": [123.0, 145.0, 167.0, ...], # historical target values + "past_covariates": { # past values of known_covariates (same length as target) + "promo": [0, 1, 0, ...], + "holiday": [0, 0, 1, ...], + }, + "future_covariates": { # future values over the forecast horizon (length = prediction_length) + "promo": [1, 0, ..., 1], + "holiday": [0, 1, ..., 0], + }, + }, + # ... one entry per item + ], + "parameters": { + "prediction_length": 24, # must match the trained predictor's prediction_length + "freq": "W", # required when "start" is set + }, +} + +client = boto3.client("sagemaker-runtime") +response = client.invoke_endpoint( + EndpointName=ENDPOINT_NAME, + ContentType="application/json", + Accept="application/x-parquet", + Body=json.dumps(payload).encode("utf-8"), +) +forecasts = pd.read_parquet(io.BytesIO(response["Body"].read())) ``` +::: ### Batch inference @@ -150,7 +223,7 @@ To score a dataset as a one-off job, use {py:meth}`~autogluon.cloud.TimeSeriesCl ```python forecasts = cloud_predictor.predict( - "test.csv", # DataFrame, local path, or S3 URL (CSV/Parquet) + "train.csv", # historical observations — DataFrame, local path, or S3 URL (CSV/Parquet) known_covariates="known_covariates.csv", # required if known_covariates_names was set static_features="static_features.csv", # optional instance_type="ml.m5.2xlarge", diff --git a/docs/tutorials/setup.md b/docs/tutorials/setup.md index cc9374e4..648c0306 100644 --- a/docs/tutorials/setup.md +++ b/docs/tutorials/setup.md @@ -1,52 +1,27 @@ # Set Up 
AutoGluon-Cloud on AWS -AutoGluon-Cloud trains and deploys models on AWS SageMaker on your behalf. To do that, every `CloudPredictor` or `FoundationModel` you create needs two AWS resources: - -- An **IAM role** that SageMaker assumes to run training and inference jobs. -- An **S3 bucket** to stage training data and store trained models. - -You have two options for supplying them: - -1. **Save them once** to `~/.autogluon/cloud.yaml`, and AutoGluon-Cloud will pick them up automatically on every call. This is the recommended path — set it up with [`bootstrap`](#bootstrap) or [`register`](#register) below. -2. **Pass them explicitly** to each `CloudPredictor` / `FoundationModel`, e.g. `CloudPredictor(role="arn:aws:iam::...", cloud_output_path="s3://my-bucket/...")`. Useful if you need different roles or buckets per call, or if you don't want a config file on disk. - -The rest of this page covers option 1. - -## Commands - -AutoGluon-Cloud ships four commands for managing the saved configuration: - -| Command | What it does | When to use it | -|---|---|---| -| [`bootstrap`](#bootstrap) | Provisions a role and bucket via CloudFormation, then saves them. | First-time setup with no existing AWS resources. | -| [`register`](#register) | Saves an existing role and bucket without provisioning anything. | Your platform team already gave you a role and bucket. | -| [`status`](#status) | Verifies the saved resources still exist and are accessible. | Sanity-check before training, or after IAM/S3 changes. | -| [`teardown`](#teardown) | Deletes resources created by `bootstrap` and the saved config. | Cleanup when you're done with AutoGluon-Cloud. | - -Each command is available both as a CLI subcommand (`autogluon-cloud `) and as a Python function (`from autogluon.cloud import `). The sections below show both forms. 
- -## Install +First, install the `autogluon.cloud` package: ```bash -pip install -U autogluon.cloud +pip install autogluon.cloud ``` -This installs the `autogluon-cloud` CLI alongside the Python API. +AutoGluon-Cloud runs training and inference on Amazon SageMaker on your behalf. Every `CloudPredictor` or `FoundationModel` you create needs two AWS resources: +- an **IAM role** that SageMaker assumes to run training and inference jobs +- an **S3 bucket** to stage data and store trained models -## `bootstrap` +```{attention} +SageMaker compute and S3 storage are billed to your AWS account. AutoGluon-Cloud is a free wrapper, but it's your responsibility to monitor usage and delete endpoints when no longer needed. +``` + +There are three ways to supply these resources — if you're unsure, start with option 1. -Provisions an IAM role and S3 bucket via CloudFormation, then saves them to `~/.autogluon/cloud.yaml`. Use this if you don't already have AWS resources for AutoGluon-Cloud. +### 1. Create new resources with {func}`~autogluon.cloud.bootstrap` -`bootstrap` uses the [standard boto3 credential resolution order](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials) to find your AWS credentials, so anything that works for the AWS CLI or boto3 will work here (`aws configure`, `AWS_*` environment variables, an active SSO session, or an instance profile). Run: +Run this if you don't yet have an IAM role and S3 bucket set up for SageMaker. The role and bucket are provisioned on your account from a {repo-file}`CloudFormation template ` and saved under `~/.autogluon/cloud.yaml` for future calls. ::::{tab-set} -:::{tab-item} CLI -:sync: setup-cli -```bash -autogluon-cloud bootstrap -``` -::: :::{tab-item} Python :sync: setup-py ```python @@ -55,29 +30,19 @@ from autogluon.cloud import bootstrap bootstrap() ``` ::: -:::: - -The CloudFormation stack is named `ag-cloud-sagemaker` by default. 
Subsequent `CloudPredictor` calls pick the saved values up automatically. - -```{note} -Review the CloudFormation template before deploying: {repo-file}`src/autogluon/cloud/templates/ag_cloud_sagemaker.yaml`. -``` - - -## `register` - -Tells AutoGluon-Cloud to use an IAM role and S3 bucket you already have. Use this when your platform team has provisioned them for you and you want to skip CloudFormation. - -::::{tab-set} :::{tab-item} CLI :sync: setup-cli ```bash -autogluon-cloud register \ - --role arn:aws:iam::222222222222:role/MyAutoGluonRole \ - --bucket my-autogluon-bucket \ - --region us-east-1 +autogluon-cloud bootstrap ``` ::: +:::: + +### 2. Use existing resources with {func}`~autogluon.cloud.register` + +Run this if you already have an IAM role and S3 bucket that you want to use with AutoGluon-Cloud. The values are saved under `~/.autogluon/cloud.yaml` for future calls. + +::::{tab-set} :::{tab-item} Python :sync: setup-py ```python @@ -90,61 +55,39 @@ register( ) ``` ::: -:::: - -`register` makes no AWS calls — it only persists the values to `~/.autogluon/cloud.yaml`. The IAM role must trust `sagemaker.amazonaws.com` and have permissions equivalent to AWS's [`AmazonSageMakerFullAccess`](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSageMakerFullAccess.html) managed policy plus read/write access to your bucket. - - -## `status` - -Verifies that the saved IAM role, S3 bucket, and (if applicable) CloudFormation stack still exist and are accessible. 
-
-::::{tab-set}
 :::{tab-item} CLI
 :sync: setup-cli
 ```bash
-autogluon-cloud status
-```
-:::
-:::{tab-item} Python
-:sync: setup-py
-```python
-from autogluon.cloud import status
-
-reports = status()
+autogluon-cloud register \
+    --role arn:aws:iam::222222222222:role/MyAutoGluonRole \
+    --bucket my-autogluon-bucket \
+    --region us-east-1
 ```
 :::
 ::::

-`ok` means the resource exists; `ok (unverified ...)` means the caller lacks the IAM permission to verify (the resource is probably fine, but `status` couldn't confirm).
-
+The role must trust the `sagemaker.amazonaws.com` principal and have permissions equivalent to AWS's [`AmazonSageMakerFullAccess`](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSageMakerFullAccess.html) managed policy plus read/write access to your bucket. The `region` where the jobs are executed must match the bucket's region.

-## `teardown`
+### 3. Pass resources on each call

-Deletes the CloudFormation stacks created by `bootstrap` and removes `~/.autogluon/cloud.yaml`. Backends added via `register` only have their config entry removed — your existing role and bucket are left untouched.
+Skip the saved config entirely and provide the role and bucket every time you create a `CloudPredictor` or `FoundationModel`.

-::::{tab-set}
-:::{tab-item} CLI
-:sync: setup-cli
-```bash
-autogluon-cloud teardown
-```
-:::
-:::{tab-item} Python
-:sync: setup-py
 ```python
-from autogluon.cloud import teardown
+from autogluon.cloud import TabularCloudPredictor

-teardown()
+predictor = TabularCloudPredictor(
+    cloud_output_path="s3://my-autogluon-bucket/output",
+    role="arn:aws:iam::222222222222:role/MyAutoGluonRole",
+)
 ```
-:::
-::::

-```{warning}
-CloudFormation refuses to delete non-empty S3 buckets. If your bucket holds training artifacts you want to discard, empty it first with `aws s3 rm s3:// --recursive`.
-```
+Useful for one-off scripts or when you need different roles and buckets per call. The same role and bucket requirements as option 2 apply.
+
+## Managing the saved config

+Once {func}`~autogluon.cloud.bootstrap` or {func}`~autogluon.cloud.register` has written to `~/.autogluon/cloud.yaml`, you may want to check that the role and bucket are still healthy before a long training run, or clean everything up when you're done with AutoGluon-Cloud. Two helper commands cover both:

-## Where the config lives
+- {func}`~autogluon.cloud.status` checks that the saved role and bucket still exist and are accessible — handy after IAM or S3 changes.
+- {func}`~autogluon.cloud.teardown` deletes the CloudFormation stack created by {func}`~autogluon.cloud.bootstrap` and clears the saved config. Resources registered via {func}`~autogluon.cloud.register` are left untouched, since you own them.

-`bootstrap` and `register` both write to `~/.autogluon/cloud.yaml`. The file is keyed by backend, so you can have separate entries for different backends side by side. Override the directory with the `AG_CONFIG_DIR` environment variable.
+The config path can be overridden with the `AG_CONFIG_DIR` environment variable if you'd rather keep it somewhere other than `~/.autogluon/`.
diff --git a/setup.py b/setup.py
index a38bb040..6ce96935 100644
--- a/setup.py
+++ b/setup.py
@@ -113,16 +113,15 @@ def default_setup_args(*, version):
         # <2 because unlikely to introduce breaking changes in minor releases. >=1.10 because 1.10 is 3 years old, no need to support older
         "boto3>=1.10,<2",
         "packaging>=23.0,<27",
-        # updated sagemaker is required to fetch latest container info, so we don't want to cap the version too strict
-        # otherwise cloud module needs to be released to support new container
         "sagemaker>=2.126.0,<3",
-        "pyarrow>=11.0,<25",
+        "pyarrow>=19.0.1,<25",  # lower bound to avoid https://github.com/apache/arrow/issues/45283
         "PyYAML~=6.0",
         "Pillow>=10.2,<13",
+        "huggingface_hub>=0.20,<2",
+        "typing_extensions>=4.0,<5",
         # CLI dependencies (autogluon-cloud command)
         "click>=8.0,<9",
         "rich>=13.0,<15",
-        "huggingface_hub>=0.20,<2",
     ]

     extras_require = dict()
diff --git a/src/autogluon/cloud/cloud_setup.py b/src/autogluon/cloud/cloud_setup.py
index 54569409..53a901f9 100644
--- a/src/autogluon/cloud/cloud_setup.py
+++ b/src/autogluon/cloud/cloud_setup.py
@@ -95,6 +95,7 @@ def bootstrap(
         region=region,
         backend=backend,
         stack_name=stack_name,
+        session=session,
     )


@@ -105,6 +106,7 @@ def register(
     region: str,
     backend: BackendName = "sagemaker",
     stack_name: Optional[str] = None,
+    session: Optional[boto3.Session] = None,
 ) -> None:
     """Persist resource identifiers to ``~/.autogluon/cloud.yaml`` under the given backend key.

@@ -129,6 +131,8 @@ def register(
         Optional CloudFormation stack name. If you deployed the resources via your own CFN stack and want
         :func:`teardown` to be able to delete it later, pass the name here. Defaults to ``None``, meaning teardown
         will only remove the config entry, not touch AWS.
+    session
+        ``boto3.Session`` used to verify the bucket region. If ``None``, the default ambient session is used.
     """
     if backend not in SUPPORTED_BACKENDS:
         raise ValueError(f"Unsupported backend {backend!r}. Choose from {SUPPORTED_BACKENDS}.")
@@ -138,6 +142,7 @@ def register(
             f"`bucket` must be a bare bucket name without prefixes (got {bucket!r}). "
             "Pass prefixes via `cloud_output_path=` on the predictor/model instead."
         )
+    _validate_bucket_region(session=session or boto3.Session(), bucket=bucket, region=region)
     config = load_config() or CloudConfig()
     config.backends[backend] = BackendConfig(
         region=region,
@@ -308,6 +313,32 @@ def _is_permission_error(e: ClientError) -> bool:
     }


+def _validate_bucket_region(*, session: boto3.Session, bucket: str, region: str) -> None:
+    """Raise if the bucket is in a different region than ``region``. Silently skips if the bucket
+    region can't be determined (missing bucket, network issues, etc.).
+
+    Cross-region ``head_bucket`` calls return 403 even when the caller lacks ``s3:HeadBucket``
+    permission, but the response still carries the ``x-amz-bucket-region`` header — so we read it
+    from the error path too, otherwise the very mismatch this function exists to catch slips through
+    whenever the caller's role is locked down.
+    """
+    try:
+        response = session.client("s3").head_bucket(Bucket=bucket)
+        bucket_region = response["ResponseMetadata"]["HTTPHeaders"].get("x-amz-bucket-region")
+    except ClientError as e:
+        bucket_region = e.response.get("ResponseMetadata", {}).get("HTTPHeaders", {}).get("x-amz-bucket-region")
+    except BotoCoreError:
+        return
+    if not bucket_region:
+        return
+    if bucket_region != region:
+        raise ValueError(
+            f"Bucket {bucket!r} is in region {bucket_region!r}, but you registered it under {region!r}. "
+            "SageMaker requires the bucket and the job region to match. Either pass `--region "
+            f"{bucket_region}` (and run jobs there), or pick a bucket in {region!r}."
+        )
+
+
 def _check_bucket(session: boto3.Session, bucket: str) -> str:
     try:
         session.client("s3").head_bucket(Bucket=bucket)
diff --git a/src/autogluon/cloud/endpoint/timeseries_endpoint.py b/src/autogluon/cloud/endpoint/timeseries_endpoint.py
index a3d05698..a2451510 100644
--- a/src/autogluon/cloud/endpoint/timeseries_endpoint.py
+++ b/src/autogluon/cloud/endpoint/timeseries_endpoint.py
@@ -2,11 +2,11 @@

 import boto3
 import pandas as pd
-import sagemaker
 from sagemaker.predictor import Predictor

 from autogluon.common.loaders import load_pd

+from ..utils.aws_utils import setup_sagemaker_session
 from ..utils.deserializers import PandasDeserializer
 from ..utils.serializers import AutoGluonSerializationWrapper, AutoGluonSerializer

@@ -31,11 +31,9 @@ def __init__(self, endpoint_name: str, session: Optional[boto3.Session] = None):
         session
             ``boto3.Session`` used to invoke and delete the endpoint. If ``None``, the default ambient session is used.
         """
-        boto_session = session or boto3.Session()
-        sagemaker_session = sagemaker.Session(boto_session=boto_session)
         self._predictor = Predictor(
             endpoint_name=endpoint_name,
-            sagemaker_session=sagemaker_session,
+            sagemaker_session=setup_sagemaker_session(boto_session=session),
             serializer=AutoGluonSerializer(),
             deserializer=PandasDeserializer(),
         )
@@ -54,7 +52,6 @@ def predict(
         id_column: str = "item_id",
         timestamp_column: str = "timestamp",
         quantile_levels: Optional[List[float]] = None,
-        accept: str = "application/x-parquet",
     ) -> pd.DataFrame:
         """
         Run real-time prediction on the deployed endpoint.
@@ -80,8 +77,6 @@ def predict(
         quantile_levels
             List of increasing decimals between 0 and 1 specifying which quantiles to estimate.
             Defaults to ``[0.1, 0.2, ..., 0.9]``.
-        accept
-            Response format. Options: 'application/x-parquet', 'text/csv', 'application/json'.

         Returns
         -------
@@ -109,7 +104,7 @@ def predict(
             static_features=static_features,
             known_covariates=known_covariates,
         )
-        return self._predictor.predict(payload, initial_args={"Accept": accept})
+        return self._predictor.predict(payload, initial_args={"Accept": "application/x-parquet"})

     def delete_endpoint(self) -> None:
         """Delete the endpoint and its backing model + endpoint config."""
diff --git a/src/autogluon/cloud/model/foundation_model.py b/src/autogluon/cloud/model/foundation_model.py
index 0863200b..4de2c200 100644
--- a/src/autogluon/cloud/model/foundation_model.py
+++ b/src/autogluon/cloud/model/foundation_model.py
@@ -11,6 +11,7 @@
 from typing import Any, Dict, List, Literal, Optional, Union

 import pandas as pd
+from typing_extensions import Self

 from autogluon.common.utils.s3_utils import s3_path_to_bucket_prefix

@@ -60,7 +61,7 @@ class FoundationModel:
     _backend_map: Dict[str, str] = {}
     _predictor_type: str

-    def __new__(cls, model_id: str, **kwargs) -> "FoundationModel":
+    def __new__(cls, model_id: str, **kwargs) -> Self:
         if cls is not FoundationModel:
             return super().__new__(cls)
         config = get_model_config(model_id)
diff --git a/src/autogluon/cloud/utils/aws_utils.py b/src/autogluon/cloud/utils/aws_utils.py
index 8b9a276a..8a889188 100644
--- a/src/autogluon/cloud/utils/aws_utils.py
+++ b/src/autogluon/cloud/utils/aws_utils.py
@@ -12,6 +12,22 @@
 logger = logging.getLogger(__name__)


+def _resolve_sagemaker_region() -> Optional[str]:
+    """Return the SageMaker region persisted in ``~/.autogluon/cloud.yaml``, or ``None`` if no
+    config / no region is set. ``None`` lets the boto3 default chain (env vars, shared config)
+    take over."""
+    from ..backend.constant import SAGEMAKER
+
+    config = load_config()
+    if config is None:
+        return None
+    entry = config.backends.get(SAGEMAKER)
+    if entry is None or not entry.region:
+        return None
+    logger.info(f"Using region from ~/.autogluon/cloud.yaml: {entry.region}")
+    return entry.region
+
+
 def resolve_execution_role(role: Optional[str], backend_name: str) -> str:
     """Resolve the SageMaker execution role ARN.

@@ -108,6 +124,7 @@ def get_latest_amazon_linux_ami(region="us-east-1", version="al2023"):


 def setup_sagemaker_session(
+    boto_session: Optional[boto3.Session] = None,
     config: Optional[Config] = None,
     connect_timeout: int = 60,
     read_timeout: int = 60,
@@ -117,8 +134,15 @@ def setup_sagemaker_session(
     """
     Setup a sagemaker session with a given configuration

+    Region resolution (only when ``boto_session`` is not provided): read from
+    ``~/.autogluon/cloud.yaml`` if set, otherwise fall back to the boto3 default chain (env vars,
+    shared config). Raises if no region can be resolved at all.
+
     Parameters
     ----------
+    boto_session
+        Pre-built ``boto3.Session`` to wrap. If provided, region resolution is skipped and the
+        session is used as-is.
     config
         A botocore.Config object providing the intended configuration
         https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
@@ -148,5 +172,13 @@ def setup_sagemaker_session(
         if retries is None:
             retries = {"max_attempts": 20}
         config = Config(connect_timeout=connect_timeout, read_timeout=read_timeout, retries=retries, **kwargs)
-    sm_boto = boto3.client("sagemaker", config=config)
-    return sagemaker.Session(sagemaker_client=sm_boto)
+    if boto_session is None:
+        boto_session = boto3.Session(region_name=_resolve_sagemaker_region())
+    if boto_session.region_name is None:
+        raise ValueError(
+            "AWS region could not be resolved. Set it in `~/.autogluon/cloud.yaml` (e.g. via "
+            "`autogluon-cloud register --region `), set the `AWS_DEFAULT_REGION` env var, "
+            "or configure a default region in `~/.aws/config`."
+        )
+    sm_boto = boto_session.client("sagemaker", config=config)
+    return sagemaker.Session(boto_session=boto_session, sagemaker_client=sm_boto)
diff --git a/tests/unittests/general/test_cloud_setup.py b/tests/unittests/general/test_cloud_setup.py
index 3d854657..305325f5 100644
--- a/tests/unittests/general/test_cloud_setup.py
+++ b/tests/unittests/general/test_cloud_setup.py
@@ -1,8 +1,10 @@
 """Tests for the ``autogluon.cloud`` setup API."""

 import logging
+from unittest.mock import MagicMock

 import pytest
+from botocore.exceptions import ClientError

 from autogluon.cloud import bootstrap, register, status, teardown
 from autogluon.cloud.config import (
@@ -111,6 +113,72 @@ def test_register_normalizes_bucket(bucket):
     assert load_config().backends["sagemaker"].bucket == "my-bucket"


+def test_register_rejects_bucket_in_other_region():
+    session = MagicMock()
+    session.client.return_value.head_bucket.return_value = {
+        "ResponseMetadata": {"HTTPHeaders": {"x-amz-bucket-region": "ap-southeast-1"}}
+    }
+    with pytest.raises(ValueError, match="ap-southeast-1"):
+        register(
+            role="arn:aws:iam::111122223333:role/x",
+            bucket="b1",
+            region="us-east-1",
+            session=session,
+        )
+
+
+def test_register_accepts_bucket_in_matching_region():
+    session = MagicMock()
+    session.client.return_value.head_bucket.return_value = {
+        "ResponseMetadata": {"HTTPHeaders": {"x-amz-bucket-region": "us-east-1"}}
+    }
+    register(
+        role="arn:aws:iam::111122223333:role/x",
+        bucket="b1",
+        region="us-east-1",
+        session=session,
+    )
+    assert load_config().backends["sagemaker"].bucket == "b1"
+
+
+def test_register_rejects_bucket_when_head_bucket_403_carries_region_header():
+    """Cross-region head_bucket returns 403 even without s3:HeadBucket perm, but the response
+    still carries the bucket region header. We must read it from the error path, otherwise the
+    mismatch slips through whenever the caller's role is locked down."""
+    session = MagicMock()
+    session.client.return_value.head_bucket.side_effect = ClientError(
+        error_response={
+            "Error": {"Code": "403", "Message": "Forbidden"},
+            "ResponseMetadata": {"HTTPHeaders": {"x-amz-bucket-region": "ap-southeast-1"}},
+        },
+        operation_name="HeadBucket",
+    )
+    with pytest.raises(ValueError, match="ap-southeast-1"):
+        register(
+            role="arn:aws:iam::111122223333:role/x",
+            bucket="b1",
+            region="us-east-1",
+            session=session,
+        )
+
+
+def test_register_skips_when_head_bucket_error_has_no_region_header():
+    """If neither the success path nor the error response carries the bucket region header,
+    we silently skip validation rather than block legitimate setup attempts."""
+    session = MagicMock()
+    session.client.return_value.head_bucket.side_effect = ClientError(
+        error_response={"Error": {"Code": "404", "Message": "Not Found"}, "ResponseMetadata": {}},
+        operation_name="HeadBucket",
+    )
+    register(
+        role="arn:aws:iam::111122223333:role/x",
+        bucket="b1",
+        region="us-east-1",
+        session=session,
+    )
+    assert load_config().backends["sagemaker"].bucket == "b1"
+
+
 def test_register_records_stack_name_when_given():
     register(
         role="arn:aws:iam::111122223333:role/x",
@@ -141,6 +209,7 @@ def client(self, service):
         "autogluon.cloud.cloud_setup._provision_stack",
         lambda session, stack_name, backend: ("arn:aws:iam::123:role/r", "ag-cloud-bucket"),
     )
+    monkeypatch.setattr("autogluon.cloud.cloud_setup._validate_bucket_region", lambda **kw: None)
     with caplog.at_level("INFO", logger="autogluon.cloud.cloud_setup"):
         bootstrap(backend="sagemaker", stack_name="my-stack")

@@ -165,6 +234,7 @@ def test_bootstrap_returns_none(monkeypatch):
         "autogluon.cloud.cloud_setup._provision_stack",
         lambda session, stack_name, backend: ("arn:...", "b"),
     )
+    monkeypatch.setattr("autogluon.cloud.cloud_setup._validate_bucket_region", lambda **kw: None)
     assert bootstrap() is None


@@ -188,6 +258,7 @@ def fake_provision(session, stack_name, backend):
         lambda s: (FakeSession(), "123456789012"),
     )
     monkeypatch.setattr("autogluon.cloud.cloud_setup._provision_stack", fake_provision)
+    monkeypatch.setattr("autogluon.cloud.cloud_setup._validate_bucket_region", lambda **kw: None)
     bootstrap(backend="ray_aws")
     assert captured["stack_name"] == "ag-cloud-ray-aws"
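
Reviewer note: the region check added in `_validate_bucket_region` reads the `x-amz-bucket-region` header from both the success response and the error response. The sketch below illustrates that idea in isolation and is not the code from this change. It assumes a dependency-free setup: `FakeClientError`, `region_from_response`, and `check_bucket_region` are hypothetical names, with `FakeClientError` standing in for `botocore.exceptions.ClientError` so the example runs without AWS libraries or credentials.

```python
from typing import Optional
from unittest.mock import MagicMock


class FakeClientError(Exception):
    """Hypothetical stand-in for botocore.exceptions.ClientError (assumption, keeps the sketch dependency-free)."""

    def __init__(self, response: dict):
        super().__init__(response.get("Error", {}).get("Message", ""))
        self.response = response


def region_from_response(response: dict) -> Optional[str]:
    """Read the x-amz-bucket-region header from a response-shaped dict, if present."""
    return response.get("ResponseMetadata", {}).get("HTTPHeaders", {}).get("x-amz-bucket-region")


def check_bucket_region(s3_client, bucket: str, region: str) -> None:
    """Raise ValueError on a confirmed mismatch; skip silently when the bucket region is unknown."""
    try:
        bucket_region = region_from_response(s3_client.head_bucket(Bucket=bucket))
    except FakeClientError as e:
        # Error responses can still carry the region header, so look there as well.
        bucket_region = region_from_response(e.response)
    if bucket_region and bucket_region != region:
        raise ValueError(f"Bucket {bucket!r} is in {bucket_region!r}, not {region!r}.")


# Mocked client whose head_bucket fails with 403 but still reports the bucket region.
client = MagicMock()
client.head_bucket.side_effect = FakeClientError(
    {
        "Error": {"Code": "403", "Message": "Forbidden"},
        "ResponseMetadata": {"HTTPHeaders": {"x-amz-bucket-region": "ap-southeast-1"}},
    }
)
try:
    check_bucket_region(client, bucket="b1", region="us-east-1")
    outcome = "accepted"
except ValueError as err:
    outcome = f"rejected: {err}"
print(outcome)
```

Skipping when no header is available mirrors the behaviour exercised by `test_register_skips_when_head_bucket_error_has_no_region_header`: an unknown region is treated as "cannot verify" rather than as a failure.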