Bug Description
After restarting a standalone Docker deployment of Hindsight, the container prints this warning on startup:
⚠️ WARNING: pg0 data directory exists at /home/hindsight/.pg0 but no PG_VERSION found.
This may indicate data corruption or an incomplete previous shutdown.
If you see all migrations running from scratch after this, your data may have been lost.
See: https://github.com/vectorize-io/hindsight/issues/675
However, in this environment the warning appears to be a false positive.
The actual embedded pg0 PostgreSQL data directory is not /home/hindsight/.pg0 itself. It is:
/home/hindsight/.pg0/instances/hindsight/data
That real data directory does contain a valid PostgreSQL layout, including:
PG_VERSION
base/
global/
pg_wal/
postgresql.conf
Hindsight then starts successfully, the database connects, migrations complete normally, /health returns healthy, and existing API data remains accessible.
So the startup warning currently implies possible corruption/data loss even when the embedded database is healthy and the real PG_VERSION file exists deeper under the nested pg0 instance layout.
This seems to come from the startup integrity check being too shallow for pg0's current directory structure.
Additional Context
Environment:
- Host: macOS 15.7.7
- Runtime: OrbStack / Docker Compose
- Hindsight mode: standalone container with embedded pg0
- API port: 8888
- Control Plane port: 9999
Relevant local bind mount structure on the host:
/Users/jagaliano/.config/appdata/hindsight/
  installation/
  instances/
    hindsight/
      instance.json
      data/
        PG_VERSION
        base/
        global/
        pg_wal/
        ...
instance.json confirms the real data directory:
{
"data_dir": "/home/hindsight/.pg0/instances/hindsight/data"
}
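As a rough illustration of how the configured path could be read from that file without extra tooling, here is a minimal sketch. The helper name read_pg0_data_dir is hypothetical, and the sed expression assumes the "data_dir" key and its string value sit on a single line, as in the excerpt above. It is not a general JSON parser.

```shell
# Hypothetical helper: print the "data_dir" value from a pg0 instance.json.
# Assumes the key and its string value are on one line (naive, not a JSON parser).
read_pg0_data_dir() {
    sed -n 's/.*"data_dir"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Example usage, guarded so it is a no-op when the file does not exist.
# The path follows the layout described in this report.
instance_json="${HOME}/.pg0/instances/hindsight/instance.json"
if [ -f "$instance_json" ]; then
    data_dir="$(read_pg0_data_dir "$instance_json")"
    if [ -f "$data_dir/PG_VERSION" ]; then
        echo "PG_VERSION found in configured data_dir: $data_dir"
    else
        echo "No PG_VERSION in configured data_dir: $data_dir"
    fi
fi
```

If jq happens to be available in the image, it would be a more robust way to read the value. Whether it is installed there is not something this report confirms.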
The current startup script in the container appears to check too shallowly. In /app/start-all.sh:
PG0_DATA_DIR="${HOME}/.pg0"
if [ -d "$PG0_DATA_DIR" ]; then
    # Look for actual PostgreSQL data directories (pg0 creates subdirs per instance)
    if compgen -G "$PG0_DATA_DIR"/*/PG_VERSION > /dev/null 2>&1; then
        echo "✅ Existing pg0 data directory detected at $PG0_DATA_DIR"
    elif [ "$(ls -A "$PG0_DATA_DIR" 2>/dev/null)" ]; then
        echo "⚠️ WARNING: pg0 data directory exists at $PG0_DATA_DIR but no PG_VERSION found."
        ...
    fi
fi
In this deployment the real PG_VERSION is under:
$HOME/.pg0/instances/hindsight/data/PG_VERSION
So the current glob:
$PG0_DATA_DIR/*/PG_VERSION
misses the valid nested data directory and triggers a false-positive warning.
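The mismatch can be reproduced outside the container. The sketch below builds a throwaway directory that mimics the nested layout from this report, then counts how many existing files each glob resolves to. The names demo_root and count_matches are made up for the demonstration.

```shell
# Build a temporary tree that mimics the nested pg0 layout described above.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/instances/hindsight/data" "$demo_root/installation"
touch "$demo_root/instances/hindsight/data/PG_VERSION"

# Hypothetical helper: count how many of the given paths actually exist.
# An unmatched glob stays a literal string, which does not exist, so it counts as 0.
count_matches() {
    n=0
    for f in "$@"; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# One directory level, as in the current check.
shallow=$(count_matches "$demo_root"/*/PG_VERSION)
# The nested instance layout.
nested=$(count_matches "$demo_root"/instances/*/data/PG_VERSION)

echo "shallow glob matches: $shallow"   # prints 0
echo "nested glob matches: $nested"     # prints 1

rm -rf "$demo_root"
```

The single `*` only spans one path component, so it can reach `instances/PG_VERSION` but never `instances/hindsight/data/PG_VERSION`.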
Related issues / context:
This report is narrower: the warning appears even when the database is healthy and the real nested pg0 data directory contains PG_VERSION.
Possible Fix
Update the startup integrity check to detect the actual nested pg0 instance layout, for example by checking one of these patterns:
$HOME/.pg0/instances/*/data/PG_VERSION
or by reading instance.json first and validating the configured data_dir directly.
That would avoid warning on healthy embedded pg0 deployments that use the current nested instance structure.
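A minimal sketch of the glob-based variant follows. It is not a tested patch against start-all.sh: the helper name pg0_has_pg_version is hypothetical, the elided part of the warning is left out, and it keeps the existing one-level pattern alongside the nested one on the assumption that older layouts may still need it.

```shell
# Hypothetical helper: succeed if PG_VERSION exists in either the one-level
# layout the current check expects or the nested instances/<name>/data layout.
pg0_has_pg_version() {
    for candidate in "$1"/*/PG_VERSION "$1"/instances/*/data/PG_VERSION; do
        # An unmatched glob stays literal and fails the -f test.
        if [ -f "$candidate" ]; then
            return 0
        fi
    done
    return 1
}

PG0_DATA_DIR="${PG0_DATA_DIR:-${HOME}/.pg0}"
if [ -d "$PG0_DATA_DIR" ]; then
    if pg0_has_pg_version "$PG0_DATA_DIR"; then
        echo "✅ Existing pg0 data directory detected at $PG0_DATA_DIR"
    elif [ "$(ls -A "$PG0_DATA_DIR" 2>/dev/null)" ]; then
        echo "⚠️ WARNING: pg0 data directory exists at $PG0_DATA_DIR but no PG_VERSION found."
    fi
fi
```

Using a plain loop with `-f` instead of `compgen -G` keeps the check independent of bash-specific builtins, though either approach should work in the existing script.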
Steps to Reproduce
- Run Hindsight standalone in Docker with embedded pg0 and a bind mount at:
-v /path/on/host:/home/hindsight/.pg0
- Allow Hindsight/pg0 to initialize normally so the actual PostgreSQL data directory is created under a nested instance path such as:
/home/hindsight/.pg0/instances/hindsight/data
- Confirm the real data directory contains PG_VERSION.
- Restart the container.
- Observe startup logs.
- Hindsight prints the warning saying /home/hindsight/.pg0 has no PG_VERSION.
- Despite that warning, verify that:
  - PostgreSQL starts successfully
  - Hindsight reports database connected
  - /health returns healthy
  - existing banks/operations remain accessible
Expected Behavior
If the actual embedded pg0 PostgreSQL data directory exists and contains PG_VERSION, Hindsight should not emit a corruption/data-loss warning.
The integrity check should follow the real pg0 instance layout and detect valid nested data directories such as:
/home/hindsight/.pg0/instances/<instance-name>/data/PG_VERSION
At minimum, the warning should not be shown when the database is demonstrably valid and starts normally.
Actual Behavior
Hindsight emits the corruption/data-loss warning based on the top-level mount path, /home/hindsight/.pg0.
Even though the actual data directory is nested deeper and already contains PG_VERSION, the warning is still printed.
In this environment the subsequent startup succeeds normally:
- PostgreSQL starts
- migrations complete
- API becomes healthy
- existing operation history remains readable
This makes the warning misleading and may cause operators to believe the volume or database is corrupted when it is not.
Version
Hindsight v0.6.2
LLM Provider
Other