Support attaching SQLite files residing on S3 #198

Closed
staticlibs wants to merge 4 commits into duckdb:main from staticlibs:s3_attach

@staticlibs

Member

When a DuckDB file on S3 is attached as read-only, the remote file system instance (`S3FileSystem` from the `httpfs` extension) is used to perform remote scans over that file. This does not work for SQLite files, because the file path is passed to `sqlite3_open_v2`, which doesn't know about remote file systems.

As a remote `ATTACH` is supposed to be read-only, a possible workaround is to download the whole `.sqlite` file locally and then attach it as read-only.

This PR implements auto-download of remote SQLite files (`https://`, `s3://`, `abfss://`, etc.) into a system temp directory and opens the file from there (the file is deleted on `DETACH`). It is intended for files of reasonable size in scenarios where manual download is not possible or convenient, for example opening a remote DuckLake catalog (residing on S3) from a BI tool.
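For readers who want the idea in a few lines, here is a rough Python sketch of the same download-then-open-read-only flow. It is not the code in this PR (the extension is C++ and goes through DuckDB's file system layer); the helper name `attach_remote_sqlite` and the temp file name are made up for illustration, and the standard library's `urllib` only covers URLs such as `https://` or `file://`, so `s3://` and `abfss://` would need a different client.

```python
import shutil
import sqlite3
import tempfile
import urllib.request
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def attach_remote_sqlite(url: str):
    """Hypothetical helper: download `url`, open it read-only, clean up on exit."""
    tmp_dir = tempfile.mkdtemp(prefix="remote_sqlite_")
    local_path = Path(tmp_dir) / "downloaded.sqlite"
    conn = None
    try:
        # Fetch the whole file up front; only sensible for files of reasonable size.
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
        # mode=ro makes SQLite reject writes, mirroring a read-only ATTACH.
        conn = sqlite3.connect(f"{local_path.as_uri()}?mode=ro", uri=True)
        yield conn
    finally:
        if conn is not None:
            conn.close()
        # Counterpart of deleting the temp file on DETACH.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Used as `with attach_remote_sqlite(url) as conn: ...`, the local copy lives only as long as the `with` block, which roughly corresponds to the time between `ATTACH` and `DETACH`.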

Testing: a new test is added that opens an SQLite file from GitHub; DuckLake catalogs on `s3://` and `abfss://` were also tested manually.

Fixes: duckdb/ducklake#912

@taniabogatsch taniabogatsch requested a review from Mytherin June 22, 2026 10:55
@Mytherin

Contributor

Thanks - getting this to work is cool but I think the vfs direction seems more promising for this. See #66 - maybe that can be picked up again / reworked to get this to work? That has a bunch of other nice outcomes (e.g. making this extension work in WASM as well).

@staticlibs staticlibs marked this pull request as draft June 22, 2026 19:12
@staticlibs

Member Author

VFS indeed will be much nicer than this workaround, closing this one in favour of it.

@staticlibs staticlibs closed this Jun 23, 2026
@ak2k

ak2k commented Jun 24, 2026

Contributor

> I think the vfs direction seems more promising for this. See #66 - maybe that can be picked up again / reworked to get this to work? That has a bunch of other nice outcomes (e.g. making this extension work in WASM as well).

Just wanted to mention that there's what I believe is now a complete VFS implementation in #154.

Development

Successfully merging this pull request may close these issues.

"Unsupported parameter for SQLite Attach: storage_version"
