Lightweight SSH orchestration for multi-server environments.
- Individual server access via named symlinks (
srv1,dev,backup) - Cluster-wide commands with sequential or parallel execution
- Configuration-driven server management via
hosts.conf - Host connectivity testing with reachability checks
- Ad-hoc launchers without config file editing
- Local-only restrictions for host-specific access control
- Relay failover on SSH connection failure via ProxyCommand (nc)
# 1. Install
git clone https://github.com/OkusiAssociates/oknav.git
cd oknav && sudo make install
# 2. Configure servers
sudo nano /etc/oknav/hosts.conf
# server1.example.com srv1 (oknav)
# server2.example.com srv2 (oknav)
# 3. Create symlinks
sudo oknav install
# 4. Use
srv1 uptime # Individual server
oknav uptime # All cluster servers
oknav -p df -h # Parallel executiongit clone https://github.com/OkusiAssociates/oknav.git
cd oknav
sudo make installInstalls to:
| Location | Contents |
|---|---|
/usr/local/share/oknav/ |
Package files |
/usr/local/bin/ |
Executable symlinks |
/etc/oknav/hosts.conf |
Server configuration |
/usr/local/share/man/man1/oknav.1 |
Manual page |
/etc/bash_completion.d/oknav |
Bash completion |
sudo make uninstallConfiguration in /etc/oknav/ is preserved on uninstall. Remove manually if no longer needed.
Server definitions live in /etc/oknav/hosts.conf:
# Format: FQDN primary-alias [alias2...] [(options)]
# Production cluster
server1.example.com srv1 server1 (oknav)
server2.example.com srv2 server2 (oknav)
server3.example.com srv3 server3 (oknav)
# Development (restricted to workstation)
devbox.local dev devbox (oknav,local-only:workstation)
# Backup (excluded from normal cluster operations)
backup.local bak backup (oknav,exclude)
# Direct access only (not in cluster)
adhoc.example.com adhoc
| Option | Description |
|---|---|
oknav |
Include in cluster operations (primary alias only) |
exclude |
Exclude from cluster operations (still accessible directly) |
local-only:HOST |
Restrict access to specified hostname |
- Primary alias: First alias listed is used for cluster operations
- Combined options: Separate with commas:
(oknav,local-only:myhost) - Ad-hoc entries: Servers without
(oknav)are accessible but not in cluster - Config priority:
$OKNAV_HOSTS_CONF>/etc/oknav/hosts.conf>$SCRIPT_DIR/hosts.conf
Connect to servers via symlinks that resolve through hosts.conf:
srv1 [OPTIONS] [command]| Option | Description |
|---|---|
-r, --root |
Connect as root |
-u, --user USER |
Connect as specified user |
-c, --connect-timeout SECS |
SSH connect timeout (default: 10) |
-d, --dir |
Preserve current working directory |
-R, --relay HOST |
Override relay host for this invocation |
--no-relay |
Disable relay failover for this invocation |
-D, --debug |
Show connection parameters |
-V, --version |
Show version |
-h, --help |
Show help |
Examples:
srv1 # Interactive shell
srv1 uptime # Execute command
srv1 -r # Root shell
srv1 -rd # Root shell, current directory
srv1 -u deploy git pull # Run as specific user
srv1 "df -h | grep data" # Complex commands (quote them)
srv1 --relay relay-host uptime # Override relay host (used on failure)
srv1 --no-relay uptime # Disable relay failoverWhen an SSH connection fails (exit code 255), ok_master can automatically retry through a relay host using ProxyCommand (nc). This is opt-in and machine-specific.
Resolution priority (highest first):
| Priority | Source | Description |
|---|---|---|
| 1 | --no-relay flag |
Disable relay for this invocation |
| 2 | --relay HOST / -R HOST flag |
Override relay for this invocation |
| 3 | OKNAV_RELAY env var |
Override relay (set to none to disable) |
| 4 | /etc/oknav/relay.conf |
Persistent machine-specific default |
| 5 | No relay configured | Direct connection only (original behavior) |
relay.conf format: First non-blank, non-comment line; first word is used as the SSH target (hostname, user@host, or SSH config Host name).
Behavior: When no relay is configured, ok_master uses exec ssh with zero overhead (original behavior). When a relay is configured, it tries a direct connection first and only relays on exit code 255. Other exit codes pass through unchanged.
Examples:
srv1 --relay jump-host uptime # Override relay host (used on failure)
srv1 --no-relay uptime # Disable relay for this invocation
OKNAV_RELAY=none srv1 uptime # Disable relay via env
srv1 -D uptime # Debug output shows relay statusExecute commands across all servers marked with (oknav):
oknav [OPTIONS] <command>
oknav [OPTIONS] -- <command> # Force command mode| Option | Description |
|---|---|
-p, --parallel |
Execute simultaneously |
-d, --dir |
Preserve current working directory |
-c, --connect-timeout SECS |
SSH connect timeout (default: 10) |
-t, --timeout SECS |
Per-server execution timeout (default: 120) |
-x, --exclude-host HOST |
Exclude server (repeatable) |
-D, --debug |
Show server discovery details |
-- |
Force command mode (bypass subcommand detection) |
-V, --version |
Show version |
-h, --help |
Show help |
Examples:
oknav uptime # Sequential execution
oknav -p df -h # Parallel execution
oknav -d ls # Run in current directory
oknav -pd pwd # Parallel, preserve directory
oknav -pt 10 uptime # Parallel + 10s timeout
oknav -x srv1 -x srv2 uptime # Exclude multiple servers
oknav -D hostname # Debug: show discovered servers
oknav -- list /tmp # Run 'list /tmp' (not subcommand)| Subcommand | Purpose |
|---|---|
install |
Manage symlinks from hosts.conf |
add |
Create ad-hoc launcher |
remove |
Remove launcher |
list |
List installed host symlinks |
help |
Show usage help |
Create symlinks in /usr/local/bin for all hosts.conf aliases:
sudo oknav install # Create/update symlinks
oknav install --dry-run # Preview changes
sudo oknav install --remove-stale # Remove ok*-prefixed symlinks not in hosts.conf
sudo oknav install --clean-local # Remove ok*-prefixed dev symlinks from script dir| Option | Description |
|---|---|
-n, --dry-run |
Preview changes |
--remove-stale |
Remove ok*-prefixed symlinks not in hosts.conf |
--clean-local |
Remove ok*-prefixed symlinks from script directory |
-h, --help |
Show help |
Note: Aliases that conflict with subcommand names (install, add, remove, list, help) trigger a warning during oknav install. Use oknav -- <alias> <cmd> to execute commands on such servers.
Create ad-hoc launchers without editing hosts.conf:
# Add launcher (hostname used as alias)
sudo oknav add ai.okusi.id
# Add with custom aliases
sudo oknav add ai.okusi.id ai ok-ai
# Remove launchers
sudo oknav remove ai ok-ai| Option | Description |
|---|---|
-n, --dry-run |
Preview changes |
-h, --help |
Show help |
Notes:
- Hostname must be resolvable (DNS or
/etc/hosts) - Ad-hoc hosts are NOT included in cluster operations
- Use
hosts.conffor cluster membership
Show all host symlinks in /usr/local/bin pointing to ok_master:
oknav list # List all hosts
oknav list -R # Test SSH connectivity
oknav list -R -p # Parallel connectivity testing| Option | Description |
|---|---|
-R, --reachable |
Test SSH connectivity (3s connect, 10s total per host) |
-p, --parallel |
Test hosts in parallel (use with -R) |
-h, --help |
Show help |
Output format:
srv1 hosts.conf
srv2 hosts.conf
ai ad-hoc
With -R (reachability testing):
srv1 hosts.conf ✓
srv2 hosts.conf ✓
ai ad-hoc ✗
Direct output without prefixes:
$ srv1 uptime
09:15:23 up 45 days, 3:22, 2 users, load average: 0.15, 0.12, 0.10
Prefixed output showing server origin:
$ oknav uptime
+++srv1:
09:15:23 up 45 days, 3:22, 2 users, load average: 0.15, 0.12, 0.10
+++srv2:
09:15:24 up 32 days, 14:55, 0 users, load average: 0.08, 0.03, 0.01
In parallel mode (-p), output is collected and displayed in server order after all complete.
make lint# Run all tests
make test
# Run specific test file
bats tests/oknav.bats
# Filter by test name
bats tests/oknav.bats --filter "parallel"| Tool | Option | Description |
|---|---|---|
ok_master |
-r |
Root user |
ok_master |
-u USER |
Specific user |
ok_master |
-c SECS |
SSH connect timeout (default: 10) |
ok_master |
-d |
Preserve directory |
ok_master |
-R HOST |
Override relay host |
ok_master |
--no-relay |
Disable relay failover |
ok_master |
-D |
Debug mode |
ok_master |
-V |
Show version |
ok_master |
-h |
Show help |
oknav |
-p |
Parallel execution |
oknav |
-d |
Preserve directory |
oknav |
-c SECS |
SSH connect timeout (default: 10) |
oknav |
-t SECS |
Per-server execution timeout (default: 120) |
oknav |
-x HOST |
Exclude host (repeatable) |
oknav |
-D |
Debug mode |
oknav |
-- |
Force command mode |
oknav |
-V |
Show version |
oknav |
-h |
Show help |
install |
-n |
Dry run |
install |
--remove-stale |
Remove stale ok* symlinks |
install |
--clean-local |
Remove local ok* symlinks |
add |
-n |
Dry run |
remove |
-n |
Dry run |
list |
-R |
Test reachability |
list |
-p |
Parallel testing |
| Code | Tool | Meaning |
|---|---|---|
| 0 | both | Success |
| 1 | both | General error |
| 2 | ok_master | Local-only constraint violated |
| 22 | both | Invalid option (EINVAL) |
| 42 | ok_master | Direct execution (must use symlink) |
Per-server status codes (from timeout(1), shown as output messages by oknav — absorbed, not propagated as exit codes):
| Code | Meaning |
|---|---|
| 124 | Execution timeout reached |
| 125 | Timeout command error |
| 126 | Command not found |
| Variable | Description |
|---|---|
OKNAV_CONNECT_TIMEOUT |
Override default SSH connect timeout (default: 10) |
OKNAV_HOSTS_CONF |
Override hosts.conf location |
OKNAV_RELAY |
Override relay host for ok_master (set to none to disable) |
OKNAV_TARGET_DIR |
Override target directory for oknav list (defaults to /usr/local/bin) |
XDG_RUNTIME_DIR |
Temp file location (falls back to /tmp) |
Installed:
/usr/local/share/oknav/
├── oknav # Cluster orchestrator
├── ok_master # Individual server handler
├── common.inc.sh # Shared functions
└── VERSION # Installed version
/etc/oknav/
├── hosts.conf # Server configuration
└── relay.conf # Relay failover host (optional)
/usr/local/bin/
├── oknav # → /usr/local/share/oknav/oknav
├── ok_master # → /usr/local/share/oknav/ok_master
└── srv1, srv2... # → /usr/local/share/oknav/ok_master
Source:
oknav/
├── oknav # Cluster orchestrator
├── ok_master # Individual server handler
├── common.inc.sh # Shared functions
├── Makefile # BCS1212 installer
├── hosts.conf.example # Configuration template
├── oknav.1 # Man page
├── oknav.bash_completion
└── tests/ # BATS test suite
- Commands execute via sudo with SSH key authentication
- Parallel operations affect multiple servers simultaneously
- Local-only restrictions prevent access from unauthorized hosts
- Test first: Run commands on individual servers before cluster-wide
- Quote carefully: Use quotes for commands with pipes or special characters
- Review changes: Use
-Ddebug mode to verify server discovery - Limit blast radius: Use
-xto exclude servers when testing - Secure keys: Protect SSH keys and rotate regularly
- Restrict access: Use
local-only:HOSTfor sensitive servers
"Command not found"
- Ensure symlinks exist:
ls -la /usr/local/bin/srv* - Run
sudo oknav installto create symlinks
SSH connection failures
- Verify SSH key authentication:
ssh user@server echo OK - Check hostname resolution:
getent hosts server1.example.com - Use
-Dfor debug output (shows relay status) - Try
--no-relayto test direct connection if relay is configured
Timeout errors
- Connect timeout (
-c): controls SSH handshake (default: 10s) - Execution timeout (
-t): per-server SSH call timeout (default: 120s) - Use
-c SECSfor unreachable hosts to fail fast - Use
-t SECSfor long-running commands - In sequential mode, total wall-clock can reach N ×
-t SECS
"No servers found"
- Verify
hosts.confhas entries with(oknav)option - Check
(local-only:HOST)restrictions match current hostname - Use
oknav -Dto see server discovery
Local-only restrictions
- Servers with
(local-only:HOST)only work from that specific host - Check current hostname:
hostname
# Show discovered servers and settings
oknav -D hostname
# Verify individual server resolution
srv1 -D whoami
# Test connectivity with short timeout
oknav -t 5 echo OK
# List hosts with reachability check
oknav list -RGPL-3. See LICENSE file for details.