perf: move inference logging to background writer thread #43

isaacbmiller wants to merge 1 commit into main from
Conversation
Replace synchronous per-request file I/O with a queue-based background writer that batches up to 50 entries per flush. Uses a dedicated sentinel object (not `None`) for clean shutdown. `log_inference()` now never blocks the caller.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 43d4161ec1
```python
key = (str(logs_dir), program_name)
if key not in grouped:
    grouped[key] = []
grouped[key].append(json.dumps(log_entry))
```
Catch serialization errors before they kill the log writer

`_flush_entries()` now calls `json.dumps(log_entry)` outside the try block, so any non-JSON-serializable value in `inputs`, `outputs`, or `lm_calls` raises `TypeError` and terminates the background thread. After that, all subsequent inference logs are silently dropped because nothing drains `_log_queue`. The previous synchronous path caught this exception and continued serving requests, so this is a regression: a single malformed payload can break metrics/training log collection for the rest of the process.
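One way to restore the old resilience, sketched under assumptions: the queue carries `(logs_dir, program_name, log_entry)` tuples, and each program logs to a `.jsonl` file (the file naming here is illustrative, not the PR's actual scheme). Serializing each entry inside its own try/except means a bad payload is skipped instead of killing the thread.

```python
import json
import os

def _flush_entries(entries):
    """Write a batch of (logs_dir, program_name, log_entry) tuples."""
    grouped = {}
    for logs_dir, program_name, log_entry in entries:
        try:
            line = json.dumps(log_entry)
        except (TypeError, ValueError):
            # Skip non-JSON-serializable payloads; keep the writer alive.
            continue
        grouped.setdefault((str(logs_dir), program_name), []).append(line)
    for (dir_str, program_name), lines in grouped.items():
        # File naming is an assumption for this sketch.
        path = os.path.join(dir_str, f"{program_name}.jsonl")
        with open(path, "a") as f:
            f.write("\n".join(lines) + "\n")
```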
Dropping -- `log_inference` does one sub-millisecond file append per multi-second LLM request, well under 0.1% of request latency. The background writer adds 130 lines of queue/thread/sentinel/batching complexity to save microseconds.
Split from #38 (4/5)
Changes
Replace synchronous per-request file I/O in `log_inference()` with a queue-based background writer thread. The writer batches up to 50 entries per flush and groups writes by program to minimize file opens.

Before
Every `log_inference()` call opened the log file, wrote one JSON line, and closed it -- on the request thread/event loop. A minimal sketch of this path follows.
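For contrast, a sketch of that synchronous path (the file naming is assumed; the per-call error handling reflects the reviewer's note above that this path caught serialization errors and kept serving):

```python
import json
import os

def log_inference(logs_dir, program_name, log_entry):
    # Open, append one JSON line, close -- all on the request thread.
    path = os.path.join(str(logs_dir), f"{program_name}.jsonl")
    try:
        with open(path, "a") as f:
            f.write(json.dumps(log_entry) + "\n")
    except (TypeError, ValueError):
        pass  # a bad payload was dropped, but requests kept flowing
```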
After

`log_inference()` enqueues a tuple and returns immediately. A daemon thread drains the queue and writes in batches, as in the sketch below.
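A minimal sketch of the enqueue/drain pattern, assuming the queued tuple is `(logs_dir, program_name, log_entry)` and reusing the `_flush_entries()` sketch above; this is illustrative, not the PR's exact code:

```python
import queue
import threading

_MAX_BATCH = 50               # up to 50 entries per flush
_log_queue: queue.Queue = queue.Queue()
_SENTINEL = object()          # shutdown marker; never a user value

def log_inference(logs_dir, program_name, log_entry):
    # Never blocks the caller: enqueue and return immediately.
    _log_queue.put((logs_dir, program_name, log_entry))

def _writer_loop():
    while True:
        item = _log_queue.get()          # block until work (or sentinel) arrives
        if item is _SENTINEL:
            _drain_remaining()           # see the shutdown sketch below
            break
        batch = [item]
        while len(batch) < _MAX_BATCH:   # opportunistically grow the batch
            try:
                nxt = _log_queue.get_nowait()
            except queue.Empty:
                break
            if nxt is _SENTINEL:
                _log_queue.put(_SENTINEL)  # re-queue so the outer loop exits
                break
            batch.append(nxt)
        _flush_entries(batch)

_writer_thread = threading.Thread(target=_writer_loop, daemon=True)
```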
Simplifications vs original PR

- `_SENTINEL = object()` instead of `None` -- prevents accidental shutdown if `None` is enqueued
- `_drain_remaining()` helper for clarity (see the sketch after this list)
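Continuing the sketch, one way the sentinel and `_drain_remaining()` might fit together (`start_writer()`/`stop_writer()` are hypothetical names; the real lifecycle hooks live in `src/dspy_cli/server/app.py`):

```python
def _drain_remaining():
    # After the sentinel is seen, flush whatever is still queued.
    leftover = []
    while True:
        try:
            item = _log_queue.get_nowait()
        except queue.Empty:
            break
        if item is not _SENTINEL:
            leftover.append(item)
    if leftover:
        _flush_entries(leftover)

def start_writer():
    _writer_thread.start()

def stop_writer(timeout: float = 5.0):
    # A dedicated object() means an accidentally enqueued None
    # can never be mistaken for the shutdown signal.
    _log_queue.put(_SENTINEL)
    _writer_thread.join(timeout)
```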
Files changed

- `src/dspy_cli/server/logging.py` -- background writer + queue
- `src/dspy_cli/server/app.py` -- start/stop writer in lifecycle

Testing
All 123 existing tests pass.