Conversation

Member

@martinzink martinzink commented Dec 18, 2025

I went ahead and did some testing so we can set a sensible default for this new configurable value. Based on my testing, the processing overhead seems negligible. Unfortunately, the memory gains are also not what one would expect (they are flow specific).

I created a bunch of flows and tried them with various cache sizes.

  • Simple GenerateFlowFile → LogAttribute, 5MB text (random)
    • Processing throughput remained the same while achieving a 10% lower memory footprint on average, but the memory usage was not consistent
  • TailFile (30k lines) → MergeFile → LogAttribute, with GetFile → LogAttribute (to increase memory usage)
    • Processing time of the TailFile remained within run-to-run variance, with a 20-30% lower memory footprint

I’ve also noticed that compaction and periodic malloc_trim calls (https://man7.org/linux/man-pages/man3/malloc_trim.3.html) make the memory footprint even smaller (much more noticeably in some cases).
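For illustration, a periodic trim could be sketched roughly like this. It is not the code from this PR: `TrimScheduler` and `trimIfDue` are hypothetical names, and the sketch assumes the caller already has a loop or timer from which to invoke it. `malloc_trim` is a glibc extension, so the call is guarded and becomes a no-op elsewhere.

```cpp
#include <chrono>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim is a glibc extension
#endif

// Hypothetical helper (not taken from this PR): decides whether enough time
// has passed since the last trim and, if so, asks glibc to release free
// heap memory back to the operating system.
class TrimScheduler {
 public:
  using Clock = std::chrono::steady_clock;

  explicit TrimScheduler(Clock::duration interval) : interval_(interval) {}

  // Returns true if a trim was attempted on this call, false if it was skipped
  // because the interval has not elapsed yet.
  bool trimIfDue(Clock::time_point now) {
    if (has_run_ && now - last_run_ < interval_) {
      return false;
    }
    has_run_ = true;
    last_run_ = now;
#if defined(__GLIBC__)
    // pad = 0: keep only the minimum amount of free memory at the top of the heap.
    malloc_trim(0);
#endif
    return true;
  }

 private:
  Clock::duration interval_;
  Clock::time_point last_run_{};
  bool has_run_ = false;
};
```

A caller would construct it once, e.g. `TrimScheduler scheduler(std::chrono::minutes(1));`, and call `scheduler.trimIfDue(TrimScheduler::Clock::now())` from an existing periodic task. The one-minute interval is only a placeholder, not a value I measured.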

Methodology
Processing time was measured via log messages; memory footprint was measured via Prometheus AgentStatus metrics.
The baseline was setting the new variable to an empty string (which basically mimics the current main behaviour).
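As a rough sketch of how the percentages above can be derived from the scraped memory samples: compare the mean footprint of a baseline run with the mean of a run that has the cache enabled. The function names are hypothetical and this is not the exact script I used, only the arithmetic.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Arithmetic mean of a series of memory samples (e.g. bytes per scrape).
double meanOf(const std::vector<double>& samples) {
  if (samples.empty()) {
    throw std::invalid_argument("no samples");
  }
  return std::accumulate(samples.begin(), samples.end(), 0.0) /
         static_cast<double>(samples.size());
}

// Relative reduction of the mean footprint, in percent.
// Positive means the candidate run used less memory than the baseline.
double relativeReductionPercent(const std::vector<double>& baseline,
                                const std::vector<double>& candidate) {
  const double baseline_mean = meanOf(baseline);
  if (baseline_mean == 0.0) {
    throw std::invalid_argument("baseline mean is zero");
  }
  return 100.0 * (baseline_mean - meanOf(candidate)) / baseline_mean;
}
```

Comparing means hides the inconsistency I mentioned for the first flow, so looking at the spread of the samples as well is probably worthwhile.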

I couldn’t measure any performance loss, but I only tried a small set of relatively simple flows. Since this PR now changes the default behavior, it’s important that we emphasise this change, and the method to disable it, in the release notes.

@martinzink martinzink marked this pull request as draft January 8, 2026 16:03
@martinzink martinzink marked this pull request as ready for review January 26, 2026 09:36
@martinzink martinzink requested review from fgerlits and szaszm January 26, 2026 09:36