fix(memory): health-gate Ollama embeddings, fall back to cloud (OPENHUMAN-TAURI-B7)#1555

Open
CodeGhost21 wants to merge 3 commits into tinyhumansai:main from CodeGhost21:fix/ollama-embed-health-gate

Conversation


@CodeGhost21 CodeGhost21 commented May 12, 2026

Summary

  • When a user has opted into local Ollama embeddings (local_ai.runtime_enabled + local_ai.usage.embeddings) but the daemon isn't running at localhost:11434, the memory factory now probes the daemon once at construction time and falls back to the cloud embedder for that session.
  • Replaces the previous per-embed Sentry flood (226 events/day, 0 impacted users for OPENHUMAN-TAURI-B7) with one low-cardinality Sentry event under memory.ollama_health_gate per session — genuine misconfigurations still surface, but the noise stops.
  • New tests cover the probe and the fallback decision; existing behaviour for the cloud-only and healthy-Ollama paths is unchanged.

Problem

Sentry issue OPENHUMAN-TAURI-B7 fires when OllamaEmbedding::embed can't connect to http://localhost:11434. That's an expected state when a user toggled local embeddings in Settings but doesn't actually have Ollama running (forgot to start it, crashed, uninstalled, port conflict, …). Every embed retry generated another Sentry event — 226 in 24 hours with zero real product impact, while the user silently got a broken-feeling app with no signal that local AI failed.

The naïve fix (silence the error) was wrong: it hides the diagnostic from us and from the user. The real fix is to never even attempt the Ollama call when the daemon isn't reachable, and to surface the misconfiguration once — not 226 times.

Solution

In src/openhuman/memory/store/factories.rs:

  1. New probe_ollama_reachable(base_url) -> bool (async): issues a 2-second GET <base_url>/api/tags and returns true iff the daemon responds 2xx. Transport failures, timeouts, and non-2xx all return false.
  2. New effective_embedding_settings_probed(...) (async): wraps the existing effective_embedding_settings with a probe-and-fallback step. Falls back to (cloud, embedding-v1, 1024) when the intended provider is ollama but the probe fails. Calls report_error once with domain=memory operation=ollama_health_gate so Sentry still gets the signal — just at session granularity, not per-embed.
  3. create_memory_full now runs the probe inline via tokio::task::block_in_place + Handle::current().block_on(...). The core always runs under a multi-thread tokio runtime so this is safe; when called with no runtime present (sync unit-test contexts), the probe is skipped and the intended provider is used (preserves pre-gate behaviour for those callers).
  4. Re-exported the new public functions through memory::store::mod and memory::mod.
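The fallback decision in step 2 reduces to a small pure function. A minimal sketch (the `EmbeddingSettings` tuple shape and the `gate_embedding_settings` name are illustrative, not the actual types in factories.rs):

```rust
/// Illustrative stand-in for the resolved embedding settings:
/// (provider, model, dimensions).
type EmbeddingSettings = (&'static str, &'static str, u32);

/// Hypothetical cloud defaults mirroring the PR's fallback tuple.
const CLOUD_FALLBACK: EmbeddingSettings = ("cloud", "embedding-v1", 1024);

/// Given the settings the user opted into and the probe result, decide
/// what the session actually uses. Only the `ollama` provider is
/// health-gated; every other provider passes through untouched.
fn gate_embedding_settings(
    intended: EmbeddingSettings,
    ollama_reachable: bool,
) -> EmbeddingSettings {
    match intended {
        ("ollama", _, _) if !ollama_reachable => CLOUD_FALLBACK,
        other => other,
    }
}

fn main() {
    // Cloud-only config: probe result is irrelevant.
    assert_eq!(gate_embedding_settings(CLOUD_FALLBACK, false), CLOUD_FALLBACK);
    // Opted into Ollama and the daemon answers: keep local settings.
    assert_eq!(
        gate_embedding_settings(("ollama", "nomic-embed-text", 768), true),
        ("ollama", "nomic-embed-text", 768)
    );
    // Opted into Ollama but the daemon is down: fall back to cloud.
    assert_eq!(
        gate_embedding_settings(("ollama", "nomic-embed-text", 768), false),
        CLOUD_FALLBACK
    );
}
```

Keeping the decision pure like this is what makes the fallback testable without a live daemon: the probe supplies the boolean, the function supplies the policy.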

The probe URL honours OPENHUMAN_OLLAMA_BASE_URL / OLLAMA_HOST env vars the same way the lifecycle's existing ollama_base_url() helper does, so both points always agree on the daemon address.
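That precedence can be sketched as follows (the helper name comes from the PR text; the `OLLAMA_HOST` normalisation is an assumption about the real implementation, not a copy of it):

```rust
use std::env;

/// Resolve the Ollama daemon address the probe should target.
/// Priority mirrors the PR's parity note:
/// OPENHUMAN_OLLAMA_BASE_URL > OLLAMA_HOST > default.
fn ollama_base_url() -> String {
    if let Ok(url) = env::var("OPENHUMAN_OLLAMA_BASE_URL") {
        if !url.is_empty() {
            return url;
        }
    }
    if let Ok(host) = env::var("OLLAMA_HOST") {
        if !host.is_empty() {
            // OLLAMA_HOST may be a bare host:port; normalise to a URL.
            if host.starts_with("http://") || host.starts_with("https://") {
                return host;
            }
            return format!("http://{}", host);
        }
    }
    "http://localhost:11434".to_string()
}

fn main() {
    env::remove_var("OPENHUMAN_OLLAMA_BASE_URL");
    env::remove_var("OLLAMA_HOST");
    assert_eq!(ollama_base_url(), "http://localhost:11434");

    env::set_var("OLLAMA_HOST", "127.0.0.1:11500");
    assert_eq!(ollama_base_url(), "http://127.0.0.1:11500");

    // The explicit override wins over OLLAMA_HOST.
    env::set_var("OPENHUMAN_OLLAMA_BASE_URL", "http://10.0.0.5:11434");
    assert_eq!(ollama_base_url(), "http://10.0.0.5:11434");
}
```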

Submission Checklist

  • Tests added or updated — 7 new tests in memory::store::factories::tests: probe healthy/unreachable/non-2xx; probed-settings cloud passthrough; opted-in healthy keeps Ollama; opted-in unreachable falls back to cloud.
  • Diff coverage ≥ 80% — added paths (probe_ollama_reachable, effective_embedding_settings_probed, fallback branch in create_memory_full) are exercised by the new tests; cloud-only path covered by the existing probed_settings_keep_cloud_when_provider_is_cloud.
  • Coverage matrix updated — N/A: behaviour-only change to existing memory factory path; no feature added/removed/renamed.
  • All affected feature IDs from the matrix are listed under ## Related — N/A.
  • No new external network dependencies introduced — the probe targets the daemon the user explicitly opted into.
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A.
  • Linked issue closed via Closes #NNN — N/A: tracked in Sentry (Fixes OPENHUMAN-TAURI-B7), not GitHub Issues.

Impact

  • Desktop only (memory factory is invoked from the core sidecar).
  • Sentry signal: OPENHUMAN-TAURI-B7 stops firing per-embed; a single memory.ollama_health_gate event per session takes its place. Net reduction: ~226× per affected user/day.
  • User experience: users who toggled local embeddings without a running daemon now get a working app on the cloud embedder instead of silent retrieval failures. They'll still hit local AI for chat/heartbeat if those are opted-in — only the embedder is health-gated here.
  • Performance: one 2-second-bounded HTTP probe at memory-factory time per session (only when Ollama is opted-in). Negligible vs. the cost of the failed embed retries it replaces.
  • Migration / compat: no on-disk changes; no config schema changes; existing call sites unmodified.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

  • Branch: fix/ollama-embed-health-gate
  • Commit SHA: 9aecc99549e19fd1971d9c43c026e0d8a71aa2ff

Validation Run

  • pnpm --filter openhuman-app format:check — passed in pre-push hook
  • pnpm typecheck — passed in pre-push hook
  • Focused tests: cargo test --lib memory::store::factories — 8 passed, 0 failed; cargo test --lib openhuman::embeddings — 105 passed, 0 failed
  • Rust fmt/check: cargo fmt --check clean; cargo check --tests clean (only pre-existing unrelated warnings)
  • Tauri fmt/check (if changed): N/A — no Tauri shell files touched

Validation Blocked

  • N/A

Behavior Changes

  • Intended behavior change: when local-AI embeddings are opted-in but the Ollama daemon is unreachable, the embedder transparently falls back to the cloud provider for that session instead of generating per-embed errors.
  • User-visible effect: users with the bad config now have a working memory pipeline. The fallback is logged at warn level locally and Sentry gets exactly one event per session.

Parity Contract

  • Legacy behavior preserved: cloud-only and healthy-Ollama paths are byte-identical to before. The existing effective_embedding_settings is unchanged; the probe lives at the factory call site and in a new async sibling.
  • Guard/fallback/dispatch parity checks: the env-var override logic in ollama_base_url_for_probe mirrors local_ai::ollama_api::ollama_base_url (priority: OPENHUMAN_OLLAMA_BASE_URL > OLLAMA_HOST > default).

Duplicate / Superseded PR Handling

Summary by CodeRabbit

  • New Features

    • Added a reachability probe for the local AI (Ollama) service during startup.
  • Improvements

    • Automatic fallback to cloud embeddings when local AI is unreachable to preserve continuity.
    • Reduced noisy health reporting by gating duplicate error reports.
  • Tests

    • Added tests simulating local AI reachable/unreachable scenarios and host-redaction checks.

Review Change Stack

…UMAN-TAURI-B7)

When a user has opted into local AI for embeddings
(`local_ai.runtime_enabled = true` + `local_ai.usage.embeddings = true`)
but the Ollama daemon isn't actually running at `localhost:11434`, the
previous code instantiated `OllamaEmbedding` anyway and every embed
call fired a Sentry event: 226 in a single day with zero impacted
users — pure noise drowning real signals.

Root-cause fix: probe `<base_url>/api/tags` once at memory-factory
time (2s timeout). If Ollama responds, behaviour is unchanged. If it
doesn't, fall back to the cloud embedder for this session and report
**one** Sentry event with low cardinality (`memory.ollama_health_gate`)
so a genuine misconfiguration still surfaces while embed calls never
even reach the dead daemon.

The probe is driven through `tokio::task::block_in_place +
Handle::current().block_on(...)` from the sync factory, which is safe
under the core's multi-thread runtime and harmless in sync test
contexts (where no runtime is present → probe is skipped).

New public surface:
- `memory::probe_ollama_reachable(base_url) -> bool` (async)
- `memory::effective_embedding_settings_probed(...)` (async)

8 new unit tests cover the probe (healthy/unreachable/non-2xx) and
the fallback decision (cloud-only paths untouched, opted-in healthy
keeps Ollama, opted-in unreachable falls back to cloud).

Fixes OPENHUMAN-TAURI-B7
@CodeGhost21 CodeGhost21 requested a review from a team May 12, 2026 12:51

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4f22c771-a7f2-4c33-9baf-2fa2761f61ca

📥 Commits

Reviewing files that changed from the base of the PR and between 2dc8bc7 and 5baf7fc.

📒 Files selected for processing (3)
  • src/openhuman/memory/mod.rs
  • src/openhuman/memory/store/factories.rs
  • src/openhuman/memory/store/mod.rs
✅ Files skipped from review due to trivial changes (1)
  • src/openhuman/memory/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/memory/store/mod.rs
  • src/openhuman/memory/store/factories.rs

📝 Walkthrough

Walkthrough

Adds an async Ollama /api/tags reachability probe and a probed embedding-settings resolver that falls back to cloud embeddings when Ollama is unreachable; integrates a blocking probe shim into create_memory_full, re-exports the new APIs, and adds Axum-backed tests and unit tests.

Changes

Ollama Health Check and Embedding Settings

  • Ollama probe + health gate (src/openhuman/memory/store/factories.rs): adds a once-per-process Sentry latch and reporter, base-URL and host-redaction helpers, probe_ollama_reachable (short-timeout GET /api/tags), and effective_embedding_settings_probed, which probes Ollama and returns the cloud fallback when unreachable.
  • Blocking probe shim & factory integration (src/openhuman/memory/store/factories.rs): adds probe_ollama_reachable_blocking (a sync wrapper that uses tokio::task::block_in_place when on a multi-thread runtime) and updates create_memory_full to health-gate embedding-provider selection before instantiating UnifiedMemory.
  • Public API re-exports (src/openhuman/memory/store/mod.rs, src/openhuman/memory/mod.rs): re-exports probe_ollama_reachable and effective_embedding_settings_probed from factories through the store and memory public modules.
  • Tests, EnvGuard, and mock Ollama server (src/openhuman/memory/store/factories.rs): adds EnvGuard for test env isolation, an Axum mock /api/tags server, async tests for reachable/unreachable/non-2xx probe outcomes, tests for effective_embedding_settings_probed fallback/preserve behavior, a redaction unit test, and a once-per-process gate unit test.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant effective_embedding_settings_probed
  participant probe_ollama_reachable
  participant OllamaAPI
  participant report_ollama_health_gate_once

  Caller->>effective_embedding_settings_probed: request (MemoryConfig, Option<LocalAiConfig>)
  effective_embedding_settings_probed->>effective_embedding_settings_probed: resolve intended (provider, model, dims)
  alt intended provider == "ollama"
    effective_embedding_settings_probed->>probe_ollama_reachable: GET <base_url>/api/tags (short timeout)
    probe_ollama_reachable->>OllamaAPI: HTTP GET /api/tags
    alt 2xx
      OllamaAPI-->>probe_ollama_reachable: 2xx
      probe_ollama_reachable-->>effective_embedding_settings_probed: reachable
      effective_embedding_settings_probed-->>Caller: return local embeddings
    else non-2xx / timeout
      OllamaAPI-->>probe_ollama_reachable: error/timeout
      probe_ollama_reachable-->>effective_embedding_settings_probed: unreachable
      effective_embedding_settings_probed->>report_ollama_health_gate_once: record once-per-process event
      effective_embedding_settings_probed-->>Caller: return cloud fallback embeddings
    end
  else
    effective_embedding_settings_probed-->>Caller: return intended settings
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

A rabbit taps the daemon's door,
Sniffs the tags, then checks once more.
If Ollama answers, local we'll keep,
If it is quiet, to cloud we leap.
🐰🔎

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately summarizes the main change (adding a health-gate probe for Ollama embeddings with fallback to cloud); it is concise, specific, and clearly conveys the primary objective.
  • Docstring Coverage — ✅ Passed: docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Linked Issues Check — ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check — ✅ Passed: check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/memory/store/factories.rs`:
- Around line 69-78: The debug logs in probe_ollama_reachable (and similar
probes) print the full `url`/`base_url` which may include secrets; replace those
direct uses with a redacted form (e.g., origin or host:port only) by adding a
small helper (e.g., redact_url or canonical_host_port) and use that helper in
the `log::debug!` calls inside `probe_ollama_reachable` match arms and the other
referenced locations (around lines 148-166 and 294-308) so that logs show a
stable, grep-friendly prefix (e.g., "[memory::factory] probe_ollama_reachable")
plus the redacted host instead of full URL or query/userinfo.
- Around line 159-167: The code currently calls
crate::core::observability::report_error unconditionally in the Ollama fallback
branches (seen in effective_embedding_settings_probed and the other branch
around lines 301-309), causing repeated "memory.ollama_health_gate" reports;
wrap those report_error calls with a single-session guard (e.g., a module-level
static AtomicBool or OnceCell like OLLAMA_HEALTH_REPORTED) and only call
report_error when compare-and-swap flips it from false→true, setting the flag
afterwards so subsequent constructions/calls skip reporting; apply the same
guarded logic to both fallback locations so the error is emitted once per
session.
- Around line 432-470: These tests mutate the process-global
OPENHUMAN_OLLAMA_BASE_URL and currently race; fix by serializing them and
restoring the prior env var: add a serial test guard (e.g., annotate both test
functions probed_settings_fall_back_to_cloud_when_ollama_unreachable and
probed_settings_keep_ollama_when_daemon_responds with a serialization attribute
such as #[serial] or use a global Mutex) and at the start of each test capture
the previous value with let prev =
std::env::var_os("OPENHUMAN_OLLAMA_BASE_URL"); then set the test value, run
effective_embedding_settings_probed / start_mock_ollama as before, and in a
finally-like cleanup restore the original with if let Some(v) = prev {
std::env::set_var("OPENHUMAN_OLLAMA_BASE_URL", v) } else {
std::env::remove_var("OPENHUMAN_OLLAMA_BASE_URL") } so the env is not clobbered
for other tests.
- Around line 254-265: probe_ollama_reachable_blocking currently calls
tokio::task::block_in_place unguarded which will panic on a current-thread
runtime; update probe_ollama_reachable_blocking to check handle.runtime_flavor()
and if it is not RuntimeFlavor::MultiThread return true (skip probe), then only
call tokio::task::block_in_place when the runtime flavor is MultiThread,
mirroring the pattern used in builder.rs and keeping the existing behavior when
skipping the probe.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 033a8d15-617f-4697-aea4-7a446856f2c0

📥 Commits

Reviewing files that changed from the base of the PR and between 15c7442 and 9aecc99.

📒 Files selected for processing (3)
  • src/openhuman/memory/mod.rs
  • src/openhuman/memory/store/factories.rs
  • src/openhuman/memory/store/mod.rs

- Sentry report is now fired at most once per process via
  `OLLAMA_HEALTH_REPORTED` atomic flag. Both fallback sites (the
  async `effective_embedding_settings_probed` and the sync
  `create_memory_full`) funnel through `report_ollama_health_gate_once`
  so re-instantiating memory across agents/sessions can't recreate
  the per-embed flood the gate exists to suppress.
- `probe_ollama_reachable_blocking` now checks `runtime_flavor()` and
  skips the probe on a current-thread runtime — `block_in_place`
  panics there. Multi-thread path is unchanged.
- Env-mutating tests now hold `TEST_ENV_LOCK` and restore the previous
  value of `OPENHUMAN_OLLAMA_BASE_URL` via an `EnvGuard` RAII helper,
  so they can no longer race each other or pollute sibling tests.
- New test `ollama_health_gate_reports_at_most_once_per_process`
  asserts the one-shot contract directly.
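An EnvGuard along the lines described above can be sketched with std alone (a sketch assuming the guard wraps one variable; the real helper in factories.rs may differ in shape):

```rust
use std::env;
use std::ffi::OsString;

/// RAII guard: sets an env var for the test body and restores the
/// previous value (or removes the var) on drop, even on panic.
struct EnvGuard {
    key: &'static str,
    prev: Option<OsString>,
}

impl EnvGuard {
    fn set(key: &'static str, value: &str) -> Self {
        let prev = env::var_os(key);
        env::set_var(key, value);
        EnvGuard { key, prev }
    }
}

impl Drop for EnvGuard {
    fn drop(&mut self) {
        match self.prev.take() {
            Some(v) => env::set_var(self.key, v),
            None => env::remove_var(self.key),
        }
    }
}

fn main() {
    env::set_var("OPENHUMAN_OLLAMA_BASE_URL", "http://original:1");
    {
        let _g = EnvGuard::set("OPENHUMAN_OLLAMA_BASE_URL", "http://mock:2");
        assert_eq!(env::var("OPENHUMAN_OLLAMA_BASE_URL").unwrap(), "http://mock:2");
    }
    // Guard dropped: original value restored.
    assert_eq!(env::var("OPENHUMAN_OLLAMA_BASE_URL").unwrap(), "http://original:1");
}
```

Note the guard only restores the value; serializing the tests themselves still needs the TEST_ENV_LOCK mutex, since the process environment is shared across threads.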

Url-redaction finding skipped: Ollama base URLs are local user-
configured infrastructure (Ollama has no auth model). Stripping
host/path would harm diagnosis on the `ollama_health_gate` event,
which is the whole point of keeping that signal in Sentry.

Refs OPENHUMAN-TAURI-B7

@graycyrus graycyrus left a comment


Review — fix(memory): health-gate Ollama embeddings (OPENHUMAN-TAURI-B7)

Approach is sound — probing once at factory time and falling back to cloud is the right fix for the 226-event Sentry flood. The AtomicBool latch, RuntimeFlavor guard, and EnvGuard RAII are all well-executed. A few items worth addressing below.

Verified / looks good

  • AtomicBool ordering (AcqRel/Acquire) is correct for a once-per-process latch
  • RuntimeFlavor::MultiThread check before block_in_place — matches Tokio docs
  • 2s timeout for a localhost probe — reasonable
  • EnvGuard + TEST_ENV_LOCK — proper test isolation
  • 8 new tests cover probe (healthy/unreachable/non-2xx), fallback paths, once-per-process
  • Module layout correct — no standalone files at src/openhuman/ root
  • No hardcoded PII in test fixtures

}
}
"http://localhost:11434".to_string()
}

[major] ollama_base_url_for_probe duplicates local_ai::ollama_base_url verbatim

This is a line-for-line copy of crate::openhuman::local_ai::ollama_base_url() which already lives in local_ai/ollama_api.rs and is pub(crate). Two implementations of the same env-var lookup will diverge — if someone adds a third env var override or changes precedence in ollama_api.rs, the memory probe will silently use a different URL than every other Ollama call in the process.

The comment on line 62–65 acknowledges this ("keeping the lookup logic in sync") but that sync guarantee is manual and fragile.

Suggested change
}
fn ollama_base_url_for_probe() -> String {
// Delegate to the canonical implementation so the probe always
// agrees with the rest of the Ollama machinery on the daemon address.
crate::openhuman::local_ai::ollama_base_url()
}

message.as_str(),
"memory",
"ollama_health_gate",
&[("base_url", base_url), ("fallback", "cloud")],

[major] base_url sent as a Sentry tag — high-cardinality, may contain credentials

The full base_url is passed as a Sentry indexed tag via ("base_url", base_url). If the URL includes userinfo (http://user:secret@host:11434), it gets indexed in Sentry. Even without credentials, full IP/hostname makes the tag high-cardinality, preventing Sentry from grouping across users.

CodeRabbit flagged the log-line variant of this — the Sentry tag is a second, distinct instance. The full URL is fine in the error message body for diagnostics; the tag should be a low-cardinality redacted form (e.g. host:port only):

let host_tag = base_url
    .trim_start_matches("https://")
    .trim_start_matches("http://")
    .split(['/', '@', '?', '#'])
    .next()
    .unwrap_or("unknown");
crate::core::observability::report_error(
    message.as_str(),
    "memory",
    "ollama_health_gate",
    &[("ollama_host", host_tag), ("fallback", "cloud")],
);
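For illustration, a self-contained variant of the suggested trimming (not the merged redact_ollama_host). One caveat with the snippet above: splitting on '@' and taking the first piece would keep the userinfo rather than the host, so this sketch cuts at path/query/fragment first and then keeps what follows any '@':

```rust
/// Reduce a base URL to a low-cardinality host[:port] tag.
fn host_tag(base_url: &str) -> &str {
    // Strip the scheme, then cut at the first path/query/fragment delimiter.
    let authority = base_url
        .trim_start_matches("https://")
        .trim_start_matches("http://")
        .split(['/', '?', '#'])
        .next()
        .unwrap_or("");
    // Drop any userinfo: "user:pass@host:port" -> "host:port".
    let host = authority.rsplit('@').next().unwrap_or("");
    if host.is_empty() { "unknown" } else { host }
}

fn main() {
    assert_eq!(host_tag("http://localhost:11434"), "localhost:11434");
    assert_eq!(host_tag("http://user:secret@10.0.0.5:11434/api"), "10.0.0.5:11434");
    assert_eq!(host_tag("https://ollama.lan/api/tags?x=1#frag"), "ollama.lan");
    assert_eq!(host_tag("192.168.1.2:11434"), "192.168.1.2:11434");
    assert_eq!(host_tag(""), "unknown");
}
```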

}
} else {
intended
};

[major] Probe+fallback logic duplicated between effective_embedding_settings_probed and create_memory_full

These two paths are supposed to behave identically but they diverge:

  1. effective_embedding_settings_probed (async, line 172) always probes.
  2. create_memory_full (sync, here) uses probe_ollama_reachable_blocking which silently skips the probe and returns true on current-thread runtimes — so the health-gate is bypassed entirely for those callers and the Sentry flood can resurface.

Also, the fallback tuple ("cloud", DEFAULT_CLOUD_EMBEDDING_MODEL, DEFAULT_CLOUD_EMBEDDING_DIMENSIONS) is written out in both places — if the cloud defaults change, one site could be missed.

Consider either:

  • Making create_memory_full async and calling effective_embedding_settings_probed directly, or
  • Extracting the fallback tuple into a cloud_embedding_fallback() helper to keep both sites in sync

/// Kept deliberately small and side-effect-free so it can be called from
/// the memory factory's startup path without pulling in the full
/// `local_ai::service::ollama_admin` machinery.
pub async fn probe_ollama_reachable(base_url: &str) -> bool {

[minor] probe_ollama_reachable is pub but has no external callers

Re-exported all the way to memory/mod.rs as a public API, but the only callers in this diff are internal (effective_embedding_settings_probed and probe_ollama_reachable_blocking). Making it public implies a stable contract, but the doc doesn't cover the indirect OLLAMA_HEALTH_REPORTED side-effect callers might trigger.

Consider pub(crate) to match how ollama_base_url is scoped in local_ai, unless there's a planned external caller.

/// re-create the per-embed flood the gate exists to suppress — just with a
/// different message. The first failed probe trips this flag; subsequent
/// probes log at debug level and skip the Sentry report.
static OLLAMA_HEALTH_REPORTED: AtomicBool = AtomicBool::new(false);

[minor] OLLAMA_HEALTH_REPORTED — no #[cfg(test)] reset helper

The latch is never reset between tests. ollama_health_gate_reports_at_most_once_per_process (line 542) manually resets it, which works — but if a different test triggers the fallback path first (e.g. probed_settings_fall_back_to_cloud_when_ollama_unreachable), the latch is already tripped and the once-per-process invariant can't be verified.

A small #[cfg(test)] fn reset_health_gate_for_test() helper would make this explicit and prevent accidental ordering dependencies across the test suite.
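The latch plus reset helper the comment asks for can be sketched with std atomics alone (names follow the PR's description; the real code additionally routes the winning path through report_error):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Once-per-process latch for the health-gate Sentry report.
static OLLAMA_HEALTH_REPORTED: AtomicBool = AtomicBool::new(false);

/// Returns true only for the caller that wins the flip from false to
/// true; every later caller sees the latch already tripped. The winner
/// is the one that would send the single Sentry event; losers would
/// just log at debug level.
fn report_ollama_health_gate_once() -> bool {
    OLLAMA_HEALTH_REPORTED
        .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

/// Test-only escape hatch so fallback-path tests stay
/// ordering-independent across the suite.
#[cfg(test)]
fn reset_health_gate_for_test() {
    OLLAMA_HEALTH_REPORTED.store(false, Ordering::Release);
}

fn main() {
    assert!(report_ollama_health_gate_once());  // first failed probe: report
    assert!(!report_ollama_health_gate_once()); // subsequent probes: suppressed
}
```

The compare_exchange makes the flip atomic, so concurrent factory constructions cannot both decide they won the report.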

- Delegate `ollama_base_url_for_probe` to `local_ai::ollama_base_url` so the
  probe always agrees with the rest of the Ollama machinery on the daemon
  address (was a verbatim duplicate that would silently diverge).
- Redact the Sentry `base_url` tag to a low-cardinality `host[:port]` value
  via `redact_ollama_host`; full URL stays in the message body. Avoids
  indexing credentials/userinfo and prevents per-instance tag explosion.
- Extract `cloud_embedding_fallback()` so the sync (`create_memory_full`)
  and async (`effective_embedding_settings_probed`) gate sites can't drift
  if the cloud defaults change.
- Demote `probe_ollama_reachable` to `pub(crate)` (matches
  `local_ai::ollama_base_url`) and drop its re-exports from `memory::store`
  and `memory`. External callers should use `effective_embedding_settings_probed`.
- Add `#[cfg(test)] reset_health_gate_for_test()` so any test exercising a
  fallback path can be ordering-independent. Both the once-per-process test
  and the unreachable-fallback test now call it; the once-per-process test
  also takes `TEST_ENV_LOCK` to serialize with the latch-tripping `probed_*`
  tests (caught a real parallel-execution race in this run).
- Add unit test for `redact_ollama_host` covering scheme, userinfo, path,
  query, fragment, scheme-less, and empty inputs.
@CodeGhost21
Contributor Author

Pushed 5baf7fc0 addressing all five review items.

# Comment Resolution
  1. ollama_base_url_for_probe duplicates local_ai::ollama_base_url: now delegates to the canonical impl, so both points always agree on the daemon address.
  2. Full base_url as a Sentry tag (cardinality / credentials): added redact_ollama_host(); the tag is now ollama_host=host[:port], with the full URL kept only in the message body. A unit test covers scheme/userinfo/path/query/fragment/scheme-less/empty inputs.
  3. Probe+fallback duplicated between async and sync paths: extracted a cloud_embedding_fallback() helper used by both call sites, so cloud-default changes can't drift. The sync-path block_in_place skip on current-thread runtimes is intentional (it only hits tests; production is multi-thread) and remains documented at the helper.
  4. probe_ollama_reachable pub with no external callers: demoted to pub(crate) (matching local_ai::ollama_base_url's scope) and removed the re-exports from memory::store and memory. The stable external API is effective_embedding_settings_probed.
  5. No #[cfg(test)] reset helper for OLLAMA_HEALTH_REPORTED: added reset_health_gate_for_test(), called by both probed_settings_fall_back_to_cloud_when_ollama_unreachable and ollama_health_gate_reports_at_most_once_per_process. The once-per-process test additionally takes TEST_ENV_LOCK — this caught a real ordering race in this run (a parallel-suite reset between the two report_*_once calls flaked the suppression assertion).

cargo test --lib memory::store::factories → 10 passed, 0 failed locally. cargo check --tests and cargo fmt --check clean.

The two Rust Core jobs that hung 4h+ on the previous run were force-cancelled by @senamakel and weren't assertion failures — the fresh CI on this push should resolve them.
