Summary
codexUI can become extremely slow over time while codexui.service grows to multi-GB cgroup memory on a host that is otherwise healthy.
In my case, the service repeatedly climbed into the ~6-10 GB range and the UI became sluggish enough that users complained it was barely usable. Restarting the service immediately restored responsiveness and dropped memory back to tens of MB.
The important part: this did not look like a classic anonymous-heap leak in Node or the Codex app-server process. The footprint was overwhelmingly file-backed cache inside the service cgroup.
Actual behavior
During bad windows, I observed:
- codexUI becomes server-side slow / laggy
- `codexui.service` memory grows to roughly 6.0-6.5 GiB, with peaks of 7.6-8.2 GiB
- a later recurrence reached about 9.6 GiB, with a peak around 11.2 GiB
- `memory.stat` showed mostly file cache, not anon heap
  - one sample: `anon 389500928`, `file 6010097664`
  - a later sample: anon ~253 MB, file ~9.27 GB
- the host itself stayed healthy
  - load average around 1
  - `free -h` still showed ~24 GiB available
  - no codexUI cgroup OOM events
  - local HTTP health checks still returned `200 OK` even while the UI felt degraded
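The anon/file split above comes straight out of the cgroup's `memory.stat`. As a small illustration (the parsing below is generic cgroup v2 line-splitting, not codexUI code; the sample values are the ones quoted above), the file share of that first sample is over 90%:

```javascript
// Parse the anon/file counters out of a cgroup v2 memory.stat dump
// and compute what fraction of the footprint is file-backed cache.
function parseMemoryStat(statText) {
  const counters = {};
  for (const line of statText.trim().split("\n")) {
    const [key, value] = line.split(" ");
    counters[key] = Number(value);
  }
  const anon = counters.anon ?? 0;
  const file = counters.file ?? 0;
  return { anon, file, fileShare: file / (anon + file) };
}

// First sample from the incident: ~372 MiB anon vs ~5.6 GiB file.
const sample = "anon 389500928\nfile 6010097664";
console.log(parseMemoryStat(sample));
```

During the bad windows this ratio stayed above 0.9, which is what makes a plain heap leak an unlikely explanation.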
After restart:
- responsiveness returned immediately
- `MemoryCurrent` dropped to ~24-31 MB
- `TasksCurrent` dropped to 11
Expected behavior
codexUI should stay responsive even with a large Codex history, and it should not accumulate multi-GB file-backed memory inside the service cgroup just from normal UI usage.
Why I think this is codexUI-side, not just “my host is low on RAM”
The host had plenty of free memory and low load during the incident. The memory growth was isolated to codexui.service, and restart of just that service reliably fixed the symptom.
Also, the visible long-lived processes did not account for the full footprint:
- the Node process was modest
- the Codex app-server RSS was only a few hundred MB
- but the service cgroup memory was 6-10+ GB
That points much more toward repeated large file reads / cache accumulation than a straightforward process heap leak.
Evidence from code: likely hot paths
I do not want to overclaim a single exact root cause, but several concrete code paths in 0.1.78 look capable of causing this with a large `~/.codex` history.
1) Thread search builds a full in-memory index by reading every thread with `includeTurns: true`
In the packaged backend bundle:
- `dist-cli/index.js:5402-5459`
- `dist-cli/index.js:6085-6096`
`loadAllThreadsForSearch(appServer)`:
- pages through `thread/list`
- collects every thread
- then calls `thread/read` with `includeTurns: true` for every thread
- extracts message text into `searchableText`
- stores the docs in a map for later search
Relevant shape:

```js
const response = await appServer.rpc("thread/list", { archived: false, limit: 100, sortKey: "updated_at", cursor })
...
const readResponse = await appServer.rpc("thread/read", { threadId: thread.id, includeTurns: true })
const messageText = extractThreadMessageText(readResponse)
```
Then `/codex-api/thread-search` does:

```js
const index = await getThreadSearchIndex()
```

which lazily builds that full index on the first non-empty search.
If a user has a large history, this can become very expensive.
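To make that cost concrete, here is a simplified sketch of the index build loop (the `rpc` method and the `thread/list` / `thread/read` shapes follow the snippets quoted above; the fake app server is invented purely to count calls). The key property: the number of full `includeTurns: true` reads equals the total thread count, all before the first search result can be returned:

```javascript
// Simplified sketch of the full-index build: page through thread/list,
// then issue one includeTurns read per thread. With N threads, that is
// N full-history reads up front.
async function buildSearchIndex(appServer) {
  const docs = new Map();
  let cursor;
  do {
    const page = await appServer.rpc("thread/list", { limit: 100, cursor });
    for (const thread of page.threads) {
      // One full read per thread, turns included.
      const full = await appServer.rpc("thread/read", {
        threadId: thread.id,
        includeTurns: true,
      });
      docs.set(thread.id, full.turns.map((t) => t.text).join("\n"));
    }
    cursor = page.nextCursor;
  } while (cursor);
  return docs;
}

// Hypothetical in-memory app server, just to count the reads.
function fakeAppServer(threadCount) {
  const server = { reads: 0 };
  server.rpc = async (method, params) => {
    if (method === "thread/list") {
      return {
        threads: Array.from({ length: threadCount }, (_, i) => ({ id: `t${i}` })),
        nextCursor: null,
      };
    }
    server.reads += 1;
    return { turns: [{ text: `turn text for ${params.threadId}` }] };
  };
  return server;
}
```

Even at 100 threads per page, a few thousand threads means a few thousand `thread/read` round trips, each pulling full turn content into memory.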
2) Thread live-state / file-change endpoints read the session log file from disk as full UTF-8 text
In the packaged backend bundle:
- `dist-cli/index.js:5526-5545`
- `dist-cli/index.js:5563-5584`
Both endpoints call `thread/read` with `includeTurns: true`, then, if a session path exists, they do:

```js
const sessionLogRaw = await readFile3(sessionPath, "utf8")
```

That means a large rollout/session log can be slurped into memory as a single full string.
3) Account routes appear to trigger background refresh / inspection work
In the packaged backend bundle:
- `dist-cli/index.js:768-845`

`GET /codex-api/accounts` calls:

```js
const state = await scheduleAccountsBackgroundRefresh()
```
Separately, during one slowdown window, the service cgroup also contained hot Chrome/Playwright-like children and temp scripts such as:
- `/tmp/chatgpt-account-check.mjs`
- `/tmp/chatgpt-assets.mjs`
So there may also be account/inspection-related churn contributing to the bloat.
My environment had large history files, which likely amplified the problem
Examples from the same machine:
- `~/.codex/sessions/.../rollout-...jsonl` at 704,253,891 bytes
- another rollout file at 79,693,927 bytes
- `~/.codex/log/codex-tui.log` at 76,790,328 bytes

With files of that size, endpoints that repeatedly call `thread/read` with `includeTurns: true` and/or `readFile(sessionPath, "utf8")` can plausibly generate a lot of cache churn.
Minimal reproduction direction
I do not yet have a single 100%-minimal upstream repro script, but the live pattern is repeatable enough that I think the bug is real:
- Use codexUI on a machine with a large existing `~/.codex` history.
- Browse threads / use the UI normally for a while.
- If thread search is used, trigger a non-empty `/codex-api/thread-search` query.
- Observe `codexui.service` memory over time.
- The service can become much slower while cgroup memory climbs into multi-GB territory, mostly as `file`, not `anon`.
Suggested fix directions
A few ideas that seem worth investigating:
- Avoid full-history thread search indexing
  - do not call `thread/read` with `includeTurns: true` for every thread just to build the first search index
  - cap or page aggressively
  - consider incremental indexing or title/preview-only search by default
- Avoid reading huge session logs as one full string
  - stream, tail, or bound reads
  - put size limits around `readFile(sessionPath, "utf8")`
- Add backpressure / bounds for large-history installations
  - size-aware limits
  - cache eviction
  - guardrails when `~/.codex` contains very large rollout files
- Add observability
  - log when full thread-index builds start / finish
  - log how many threads and how many total bytes were read
  - log when account inspection spawns helper processes
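As one concrete shape for the cache-eviction idea (entirely hypothetical, not existing codexUI code), a byte-bounded LRU over log/thread content would cap the working set regardless of history size:

```javascript
// Hypothetical byte-bounded LRU cache: once total cached bytes exceed
// maxBytes, evict least-recently-used entries. A Map preserves insertion
// order, so re-inserting on access maintains recency ordering.
class BoundedCache {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.totalBytes = 0;
    this.entries = new Map();
  }
  get(key) {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Refresh recency by moving the entry to the end.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.entries.has(key)) {
      this.totalBytes -= this.entries.get(key).length;
      this.entries.delete(key);
    }
    this.entries.set(key, value);
    this.totalBytes += value.length;
    // Evict oldest entries, but always keep the newest one.
    while (this.totalBytes > this.maxBytes && this.entries.size > 1) {
      const oldestKey = this.entries.keys().next().value;
      this.totalBytes -= this.entries.get(oldestKey).length;
      this.entries.delete(oldestKey);
    }
  }
}
```

With a cap of, say, 256 MB, a surge of large session-log reads would recycle the same bounded budget instead of letting the cgroup footprint grow with history size.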
Why this is not a duplicate
I searched current issues before filing.
The only remotely adjacent issue I found was "Use Codex session index titles instead of preview text for thread names". That issue is about naming/title sources, not service slowness or multi-GB memory/file-cache growth.
I did not find an existing issue specifically covering:
- codexUI slowing down badly over time
- `codexui.service` climbing to 6-10+ GB
- memory dominated by `file` rather than `anon`
- the likely interaction with large `~/.codex` histories and full-thread reads
Environment
- codexUI: 0.1.78
- Codex CLI: 0.120.0
- Node: v22.22.2
- OS: Linux netcup-clawd 6.17.0-14-generic x86_64 (Ubuntu)
- service start shape: `/usr/bin/node .../dist-cli/index.js --port 15999 --password <set> --no-open --no-tunnel --no-login /root/.hermes/workspace`
Current healthy-ish post-restart state, for comparison:
- `MemoryCurrent=582430720`
- `MemoryPeak=1503264768`
- `TasksCurrent=43`
- `systemctl --user status` showed `Memory: 555M (peak: 1.3G)`
Bottom line
This looks like a real performance bug in codexUI when pointed at a large Codex history:
- the service becomes perceptibly slow
- cgroup memory can balloon into multi-GB territory
- the footprint is mostly file-backed cache, not classic process heap
- restart fixes it immediately, but the issue recurs
Happy to provide more data if helpful, but I wanted to get the core evidence and likely code hotspots upstream first.