Summary
On Windows, an out_forward output pointing at an upstream that stays unreachable accumulates a large number of retry timers (each a socketpair + libevent event on the engine event base). The monkey libevent backend (lib/monkey/mk_core/mk_event_libevent.c) collects ready events in a fixed-size ctx->fired array that is allocated once at loop creation (256 entries for the engine loop) and never grows, while cb_event appends to it with no bounds check. When more than queue_size events become ready in a single event_base_loop() pass, cb_event writes past the end of the array — an out-of-bounds heap write that corrupts adjacent allocations or, when it reaches the guard/unmapped page, faults directly.
Confirmed on v5.0.6 (latest release) under full page heap: the faulting write lands exactly on the guard page immediately after ctx->fired, and the debugger names the overrun variable fired. The same defect is also present in v4.0.13 (identical fault, identical Windows FAILURE_ID_HASH), so this is a long-standing issue, not a recent regression.
Environment
- Fluent Bit v5.0.6: official Windows x64 build (fluent-bit-5.0.6-win64, FileVersion 5.0.6.0); source tag v5.0.6. Crash reproduced in ~7–22 minutes.
- Windows Server 2019: 10.0.17763, x64, 4 procs.
- Output: forward with TLS + Upstream, multiple workers; continuous input (tail / Windows event log).
- libevent (bundled) built without thread locking (evthread_use_* is never called).
- Also reproduced on v4.0.13 (see "Also present in v4.0.13").
Reproduction
Point an out_forward output at a black-hole address so that every connect attempt runs into the timeout and retries pile up. The crash was first seen in production while the destination was unavailable; the black-hole address only shortens the time to reproduce.
[OUTPUT]
    Name        forward
    Match       *
    # black hole, no RST
    Host        10.255.255.1
    Port        24224
    # unlimited retries -> many concurrent retry timers
    Retry_Limit false
Drive continuous input so the engine keeps scheduling flushes/retries. The process crashes after minutes. Capture with:
procdump -accepteula -ma -t -e -w fluent-bit.exe C:\dumps
Root cause
The engine event loop is created with a fixed size:
evl = mk_event_loop_create(256); /* src/flb_engine.c */
which allocates the fired array exactly once and records its capacity:
/* _mk_event_loop_create() */
ctx->fired = mk_mem_alloc_z(sizeof(struct mk_event) * size); /* size = 256 */
ctx->queue_size = size;
_mk_event_add() registers further fds into libevent (event_new(... cb_event, event); event_add(...)) without ever growing ctx->fired or queue_size. The number of registered events is therefore unbounded, but the fired array stays at 256.
cb_event() appends one entry per ready event, with no bounds check:
/* cb_event(), mk_event_libevent.c */
i = ctx->fired_count;
fired = &ctx->fired[i];
fired->fd = event->fd; /* line 99 */
fired->mask = mask; /* line 100 */
fired->data = event;
ctx->fired_count++;
fired_count is reset to 0 before each loop and counts up across all events fired in that pass:
/* _mk_event_wait_with_flags() */
ctx->fired_count = 0;
event_base_loop(ctx->base, flags);
When more than queue_size (256) events become ready in a single pass, &ctx->fired[fired_count] walks past the allocation and cb_event corrupts whatever follows it on the heap. (The same unchecked append exists in _mk_event_inject().)
Faulting dump (v5.0.6, without page heap)
fluent_bit!cb_event+0xa8 [mk_event_libevent.c @ 100] <-- mov [rax+8],ecx
fluent_bit!event_persist_closure+0x2f6 [libevent/event.c @ 1580]
fluent_bit!event_process_active_single_queue [libevent/event.c @ 1639]
fluent_bit!event_process_active [libevent/event.c @ 1738]
fluent_bit!event_base_loop+0x296 [libevent/event.c @ 1961]
fluent_bit!_mk_event_wait_with_flags+0x3a [mk_event_libevent.c @ 456]
fluent_bit!mk_event_wait / flb_engine_start [src/flb_engine.c @ 1141]
rax = 0x0000023e9628eff8, write to [rax+8] = 0x0000023e9628f000 (next, unmapped page), ecx = 1 (MK_EVENT_READ). struct mk_event is { int fd; int type; uint32_t mask; ... }, so offset 8 is mask — the faulting instruction is exactly fired->mask = mask, with fired = &ctx->fired[fired_count] at the end of the 256-entry allocation.
Confirmed under full page heap (v5.0.6)
Re-run with full page heap enabled (gflags /p /enable fluent-bit.exe /full; NTGLOBALFLAG: 2000000, APPLICATION_VERIFIER_LOADED: 1):
fluent_bit!cb_event+0x9d [mk_event_libevent.c @ 99] mov dword ptr [rax],ecx
FAULTING_LOCAL_VARIABLE_NAME: fired
FAILURE_BUCKET_ID: INVALID_POINTER_WRITE_AVRF_c0000005_fluent-bit.exe!cb_event
rax = 0x0000025a923e6000 is exactly page-aligned — the page-heap guard page placed immediately after the ctx->fired allocation. With page heap the fault now occurs on the first field write of the entry (fired->fd = event->fd, line 99, ecx = the fd value) rather than fired->mask, because the whole entry now starts past the array end. The debugger names the overrun target directly: FAULTING_LOCAL_VARIABLE_NAME: fired. The guard-page alignment places this write at index 256 of the 256-entry array — the append for the 257th simultaneously-ready event in one event_base_loop pass. This is a definitive heap buffer overrun of ctx->fired, not a use-after-free.
Also present in v4.0.13
Under full page heap, v4.0.13 faults identically (cb_event, line 99, FAULTING_LOCAL_VARIABLE_NAME: fired, write on the guard page after ctx->fired) and Windows assigns it the same FAILURE_ID_HASH as the v5.0.6 page-heap crash — i.e. it is classified as the same defect. Without page heap the overrun surfaced in v4.0.13 as roaming corruption of adjacent structures (a libevent timer min-heap and the engine event priority queue, with stray fd-range integers and partial-pointer overwrites), consistent with struct mk_event entries written past ctx->fired. The defect is unchanged across releases.
Secondary defect — timeout teardown (also present in v5.0.6)
Independent of the overflow, the timer teardown double-closes the read-end fd and has two owners freeing the same ev_map:
- _mk_event_timeout_destroy() closes event->fd (= ev_map->pipe[0]) without resetting it to -1, then calls _mk_event_del(), which closes ev_map->pipe[0] again. On Windows the fd is reused immediately, so the second close can hit a socket now owned by another event_base.
- cb_timeout() self-frees ev_map on send failure while _mk_event_del() also frees it (double-free / UAF).
Worth fixing in the same pass, but it is not the source of the corruption demonstrated above.
Suggested fixes
1. Bound / grow the fired array (primary)
The number of events that can fire in one event_base_loop() pass equals the number of registered events, which is unbounded — so the fixed-capacity fired array must grow with it. Both append sites (cb_event and _mk_event_inject) need the guard, so factor it into one helper in mk_event_libevent.c:
/* Append a fired event, growing ctx->fired on demand so it can never
 * overflow when more events fire in one loop pass than queue_size. */
static inline int mk_event_fired_push(struct mk_event_ctx *ctx,
                                      evutil_socket_t fd, int mask,
                                      struct mk_event *event)
{
    struct mk_event *tmp;
    int new_size;

    if (ctx->fired_count >= ctx->queue_size) {
        new_size = (ctx->queue_size > 0) ? (ctx->queue_size * 2) : 256;
        tmp = mk_mem_realloc(ctx->fired, sizeof(struct mk_event) * new_size);
        if (tmp == NULL) {
            return -1; /* OOM: drop rather than overflow */
        }
        ctx->fired = tmp;
        ctx->queue_size = new_size;
    }

    ctx->fired[ctx->fired_count].fd = fd;
    ctx->fired[ctx->fired_count].mask = mask;
    ctx->fired[ctx->fired_count].data = event;
    ctx->fired_count++;
    return 0;
}
cb_event then becomes:
static void cb_event(evutil_socket_t fd, short flags, void *data)
{
    int mask = 0;
    struct mk_event *event = data;
    struct ev_map *map = event->data;

    if (flags & EV_READ) mask |= MK_EVENT_READ;
    if (flags & EV_WRITE) mask |= MK_EVENT_WRITE;

    mk_event_fired_push(map->ctx, event->fd, mask, event);
}
and the append block in _mk_event_inject:
    event->mask = mask;
    if (mk_event_fired_push(ctx, event->fd, mask, event) == 0) {
        loop->n_events++;
    }
    return 0;
The mk_mem_realloc happens inside cb_event during event_base_loop(), but that is safe: no pointer into ctx->fired is cached across cb_event calls (each call re-indexes ctx->fired[ctx->fired_count]), libevent holds no pointer into it, and the consumer reads ctx->fired only after the loop returns. Simply raising the static 256 in mk_event_loop_create() is not a fix — it only moves the threshold.
2. Single-owner timeout teardown (secondary)
static inline int _mk_event_timeout_destroy(struct mk_event_ctx *ctx, void *data)
{
    if (data == NULL) {
        return 0;
    }

    /* _mk_event_del() is the single owner: it closes both pipe ends, sets them
     * to -1, and frees the event + ev_map exactly once. Do NOT pre-close
     * event->fd here -- it aliases ev_map->pipe[0] and would be closed twice
     * (the second close can hit a fd already reused by another event_base). */
    return _mk_event_del(ctx, (struct mk_event *) data);
}

static void cb_timeout(evutil_socket_t fd, short flags, void *data)
{
    uint64_t val = 1;
    struct ev_map *ev_map = data;

    /* Signal only. Lifetime is owned solely by the explicit destroy path; never
     * free here, or it races with _mk_event_del() and double-frees ev_map. */
    (void) send(ev_map->pipe[1], (char *) &val, sizeof(uint64_t), 0);
}
This makes the explicit destroy path the sole owner. It assumes every timeout is torn down via mk_event_timeout_destroy(); any timeout that relied on cb_timeout's self-cleanup (on read-end close) should be reviewed before adopting this.
Capture notes (for maintainers)
The overflow is confirmed under full page heap (see "Confirmed under full page heap" above): the fault lands on the guard page immediately after ctx->fired, and the debugger identifies the overrun variable as fired. !heap -p -a <addr-inside-block> shows the offending allocation originates from _mk_event_loop_create (mk_mem_alloc_z(sizeof(struct mk_event) * 256)).
Workaround for affected users (mitigation, not a fix)
Keep the number of simultaneously-ready events on the engine loop well under the 256-entry fired capacity:
- lower storage.max_chunks_up below 256 (default 128): caps the concurrent up-chunk/task/timer population
- finite Retry_Limit on the forward output: failed chunks leave the retry population instead of accumulating
- storage.total_limit_size to bound the per-output backlog (drops oldest chunks)
- log_level info to cut log-pipe pressure
These keep the load under the overflow threshold but do not remove the bug; safety depends on chunk size and burst patterns. Lowering log_level alone did not prevent the crash in testing.
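Put together, the mitigations might look like the classic-mode fragment below. The numeric values and the host name are illustrative assumptions, not tested recommendations, and whether the storage.* keys take effect depends on the storage setup in use.

```text
[SERVICE]
    log_level             info
    # illustrative value, chosen only to stay well below 256
    storage.max_chunks_up 64

[OUTPUT]
    Name                     forward
    Match                    *
    # placeholder host
    Host                     upstream.example.internal
    Port                     24224
    # finite retries (illustrative value)
    Retry_Limit              5
    # illustrative backlog cap
    storage.total_limit_size 500M
```

Tighter limits trade data loss during long outages for a smaller retry population, so the values need to be chosen per deployment.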