Add support for High Availability VPC #3
High Availability Architecture: Introduces an Nginx load balancer and dynamic Docker Compose logic to support running multiple VPC API servers in parallel (toggled via DSTACK_VPC_HA_MODE).
Database Persistence & S3 Replication: Integrates Litestream to replicate the Headscale SQLite database to S3 in real time, and implements automated backup and restore of noise_private.key to preserve the server identity across redeployments.
Self-Healing Client Nodes: Upgrades node scripts with retry logic, connection timeouts, and an auto-recovery mechanism that triggers re-registration if VPN authentication fails.
Operational Stability: Implements container restart limits to prevent infinite crash loops, updates health checks to support the new load balancer, and bumps core dependencies (Docker, OpenSSL) to the latest versions.
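For reviewers, the self-healing behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the node scripts in this PR: `connect_with_recovery`, `AuthError`, the `connect` and `reregister` callables, and the backoff parameters are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable


class AuthError(Exception):
    """Hypothetical error raised when VPN authentication is rejected."""


def connect_with_recovery(
    connect: Callable[[], None],
    reregister: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Try to connect, re-registering on auth failure.

    Returns the number of attempts used; raises RuntimeError when all
    attempts are exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except AuthError:
            # Authentication failed: assume the stored identity is stale
            # and register again before the next attempt.
            reregister()
        except (ConnectionError, TimeoutError):
            # Transient network failure: just back off and retry.
            pass
        if attempt < max_attempts:
            # Exponential backoff, capped at max_delay seconds.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

Capping the number of attempts mirrors the container restart limits mentioned under Operational Stability: a node that cannot recover fails visibly instead of looping forever.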
https://www.notion.so/jasnahcom/DStack-VPC-High-Availability-Architecture-Changes-2e229a6526bf80a38ea9e5aaef7cbd1a
Testing
Test Environment
Scenario 1: Initial Deployment & Backup
Scenario 2: VPC Server Redeploy (Fresh CVM)
Scenario 3: New Node After Restore