Problem
Bundle commands (`tkdbr bundle validate/deploy/run`) need to execute the Databricks CLI in the caller's current working directory (where `databricks.yml` lives). The daemon runs as `_toolkit`, which cannot traverse the user's home directory (`drwxr-x---`), so `chdir()` to the user's project path fails with `Permission denied`.
Two workarounds exist and both have drawbacks:
- Broad home ACL (`chmod +a "_toolkit allow search" ~/`): gives `_toolkit` (and every subprocess it spawns, including third-party CLIs) read access to world-readable files across the entire home directory. Undermines the isolation guarantee.
- Local execution via `BundleContext` (current v0.2.4): the daemon serialises credentials (`DATABRICKS_TOKEN`, etc.) into a JSON response that the agent-UID process receives. Credentials briefly cross the UID boundary — a regression against the core guarantee that agents never see credentials.
Status (2026-05-09)
Sticking with the `BundleContext` local-execution approach for now. The short-lived scoped token alternative below (Option 1) is the preferred next step before — or instead of — implementing SCM_RIGHTS.
Proposed solution: SCM_RIGHTS file descriptor passing
The client opens the target directory and passes the open file descriptor to the daemon alongside the request, using POSIX ancillary data (`SCM_RIGHTS`) on the Unix socket. The daemon calls `fchdir(fd)` — no path traversal required — and spawns the subprocess, which inherits the working directory.
```
Client (agent UID)                       Daemon (_toolkit UID)
──────────────────                       ─────────────────────
open("/Users/.../fnz-data-migration")
  → dirfd
send_request(json, ancillary: dirfd) ──► recv_request()
                                         fchdir(dirfd)
                                         spawn("databricks bundle validate")
                                           ↳ cwd = /Users/.../fnz-data-migration
                                           ↳ DATABRICKS_TOKEN injected by daemon
```
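The round trip above can be sketched end-to-end. The daemon is Rust/tokio, but Python's stdlib wraps the same `sendmsg()`/`recvmsg()` machinery (`socket.send_fds`/`socket.recv_fds`, Python 3.9+, Unix only), which keeps the sketch short; the op name and `/tmp` directory are placeholders.

```python
import json
import os
import socket
import stat

# Stand-in for the Unix socket between agent-UID client and _toolkit daemon.
client, daemon = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Client side: open the project directory and attach the dirfd to the request.
dirfd = os.open("/tmp", os.O_RDONLY | os.O_DIRECTORY | os.O_CLOEXEC)
request = json.dumps({"op": "bundle_validate"}).encode() + b"\n"
socket.send_fds(client, [request], [dirfd])  # one sendmsg(): payload + SCM_RIGHTS

# Daemon side: receive payload and ancillary fd in a single recvmsg().
data, fds, _flags, _addr = socket.recv_fds(daemon, 4096, maxfds=1)
received_fd = fds[0]

# Validate the capability: it must actually be a directory.
assert stat.S_ISDIR(os.fstat(received_fd).st_mode)

os.close(dirfd)
os.close(received_fd)
```

Note that the received fd is a fresh descriptor in the daemon process (the kernel duplicates it), so the client closing its copy does not revoke the daemon's.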
Why this is the right approach:
- No ACL changes required — the daemon never needs traverse rights on the user's home
- Credentials stay daemon-side — `BundleContext` with token never sent to agent process
- The client proves it has access to the directory by handing over an open fd — classic capability-based security
- Scoped: the daemon gets access to exactly one directory for exactly one invocation
Implementation notes
- Wire protocol change: the current protocol is newline-delimited JSON over a plain Unix stream socket. SCM_RIGHTS requires sending ancillary data alongside the normal data in a single `sendmsg()`/`recvmsg()` call. Options:
- Extend the protocol so bundle ops send one message with both JSON payload and ancillary fd
- Or add a dedicated pre-flight message: client sends fd first, daemon acks, then normal JSON request follows
- Tokio support: Tokio's `UnixStream` exposes `try_io()` for raw fd operations. Sending ancillary data requires dropping to `sendmsg` via the `nix` or `rustix` crates. See the `tokio-passfd` or `sendfd` crates for prior art.
- Daemon side: after `recvmsg()`, extract the fd from the `cmsg` ancillary buffer, validate it is a directory (`fstat(fd).is_dir()`), call `fchdir(fd)`, close the fd, then spawn the subprocess.
- Fallback: if no fd is passed (older client), daemon falls back to current behaviour (fails with a clear error if cwd is needed).
- Scope: only bundle ops need this. Other ops (query, jobs, etc.) have no cwd dependency.
Implementation traps (do not understate these)
- Daemon multithreading vs. `fchdir`. `fchdir(2)` mutates process cwd on most systems, so two concurrent bundle requests in the tokio runtime will race. Don't call `fchdir` on the daemon thread — use `posix_spawn` with `posix_spawn_file_actions_addfchdir_np` (Linux glibc; macOS 10.15+) so cwd is set atomically per-spawn.
- Artifact ownership. `databricks bundle deploy` writes into `.databricks/`, terraform state, wheel/build outputs in the cwd. Under SCM_RIGHTS those files end up owned by `_toolkit`. The user can read them but can't `rm -rf .databricks/` to recover from a corrupt cache without sudo. Real ergonomic regression that will surface in normal use.
- Out-of-cwd file references. Bundles often reference `../shared/...`, notebook paths in sibling repos, etc. With cwd granted but no traversal rights up the tree, those reads fail. Would need the protocol extended to multiple dirfds, or accept the limitation.
- Tokio + ancillary data. `sendfd`/`passfd` work, but require breaking out of `AsyncRead`/`AsyncWrite` for a single `recvmsg` call and re-entering — the existing newline-framed read loop has to be split. More intrusive than it sounds.
- CLOEXEC. Set close-on-exec on the dirfd before spawning so the child doesn't inherit the directory handle.
Alternative solutions
Option 1: Short-lived scoped token (preferred next step)
Keep the current `BundleContext` shape, but the daemon mints a time-limited, scope-narrowed Databricks token instead of returning the long-term one. Databricks supports OAuth M2M / `token-create` with a TTL; hand the agent a ~10-minute token scoped to the workspace.
- Credential still crosses the UID boundary (worse than SCM_RIGHTS on paper).
- But blast radius is bounded by TTL + scope — the standard industry mitigation (AWS STS, GCP impersonation).
- Zero protocol surgery, no tokio gymnastics, artifacts stay user-owned, no home-dir ACL.
- Works for any future tool with the same cwd problem.
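A sketch of what the daemon-side minting could look like, assuming the Databricks Token REST API (`POST /api/2.0/token/create`, which accepts `lifetime_seconds` and `comment`) — verify the endpoint against current docs, and note true scope-narrowing would go through OAuth M2M instead. The helper name and TTL are illustrative.

```python
import json

TOKEN_TTL_SECONDS = 600  # ~10 minutes, per the note above

def scoped_token_request(workspace_url: str, op: str) -> tuple[str, bytes]:
    """Build the (url, body) pair for minting a TTL-bounded token.
    Hypothetical helper; the daemon would POST this with its own
    long-term credential and return only the short-lived result."""
    body = json.dumps({
        "lifetime_seconds": TOKEN_TTL_SECONDS,
        "comment": f"tkdbr bundle {op} (auto-expires)",
    }).encode()
    return f"{workspace_url}/api/2.0/token/create", body

url, body = scoped_token_request("https://example.cloud.databricks.com", "deploy")
```

The long-term credential never leaves the daemon; only the response's short-lived token is placed into the `BundleContext` handed to the agent.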
Option 2: SCM_RIGHTS + privilege downgrade in spawned child
After `posix_spawn` with the dirfd, have the spawned `databricks` run as the caller's UID (known from `getpeereid`), not `_toolkit`. Solves artifact ownership cleanly, but requires the daemon to `setuid` to arbitrary UIDs — i.e. running as root or shipping a setuid helper. Significant added attack surface; probably not worth it.
Option 3: Hybrid — SCM_RIGHTS for cwd + scoped token for credential
Belt-and-braces. Eliminates both home-traversal need and limits credential exposure if SCM_RIGHTS is bypassed. Best end state, most implementation work.
Security properties

| Property | Broad ACL | BundleContext (v0.2.4) | SCM_RIGHTS | Scoped token |
| --- | --- | --- | --- | --- |
| Long-term credentials stay daemon-side | ✅ | ✅ | ✅ | ✅ |
| No transient credential in agent UID | ✅ | ❌ | ✅ | ⚠️ bounded by TTL |
| No broad home dir access | ❌ | ✅ | ✅ | ✅ |
| Scoped to specific directory | ❌ | ✅ | ✅ | ✅ |
| Artifacts owned by user | ✅ | ✅ | ❌ | ✅ |
| Works on macOS + Linux | ✅ | ✅ | ✅ | ✅ |
| Implementation cost | low | low | high | low |
References
- `sendmsg()`/`SCM_RIGHTS`: https://man7.org/linux/man-pages/man3/cmsg.3.html
- `sendfd` crate: https://crates.io/crates/sendfd
- `rustix` crate (low-level unix): https://crates.io/crates/rustix