Skip to content

uffd: add support for UFFD_EVENT_REMOVE events#1896

Open
bchalios wants to merge 18 commits intomainfrom
feat/free-page-reporting
Open

uffd: add support for UFFD_EVENT_REMOVE events#1896
bchalios wants to merge 18 commits intomainfrom
feat/free-page-reporting

Conversation

@bchalios
Copy link
Contributor

@bchalios bchalios commented Feb 11, 2026

Enabling Firecracker free-page-reporting feature requires us to handle remove events (UFFD_EVENT_REMOVE) in our userfaultfd handler. These events are triggered whenever Firecracker calls madvise(MADV_DONTNEED) (or similar) on a range of guest memory addresses.

The main thing that changes on our logic is that page faults in a page that has previously been removed need to be served with a zero page rather than a page from the snapshot file.

This commit changes the page fault serving logic to:

  1. Introduce tracking of the state of every page in the guest's memory mappings.
  2. Add logic to handle the new UFFD_EVENT_REMOVE event
  3. Modify existing logic to take into account current state when deciding how to handle each page fault

This is dependent on the part of #1858 that enables free page reporting on the Firecracker side.


Note

High Risk
High risk because it changes low-level userfaultfd page-fault handling and VM startup configuration, where subtle race/IOCTL edge cases could cause sandbox boot failures or memory correctness issues.

Overview
This PR updates the sandbox stack to support Firecracker free page reporting by bumping the default Firecracker version to v1.14.1 and introducing an end-to-end FreePageReporting toggle (API/proto, template build configs, and CLI with version-based defaulting). It significantly refactors the userfaultfd handler to process UFFD_EVENT_REMOVE, track per-page state, and serve removed pages as zero-filled rather than snapshot-backed data, including new UFFD zero-page support and adjusted memory-mapping helpers/tests to match the new flow.

Written by Cursor Bugbot for commit 5fa75fe. This will update automatically on new commits. Configure here.

@bchalios bchalios force-pushed the feat/free-page-reporting branch 2 times, most recently from a1d0a02 to 03e2966 Compare February 11, 2026 21:20
@bchalios bchalios force-pushed the feat/free-page-reporting branch 5 times, most recently from b766eb3 to 4d197e4 Compare February 12, 2026 23:48
@bchalios bchalios force-pushed the feat/free-page-reporting branch 2 times, most recently from e639acf to 88420f2 Compare February 13, 2026 00:45
@bchalios bchalios force-pushed the feat/free-page-reporting branch 8 times, most recently from d2e5d09 to 8ea97e4 Compare February 17, 2026 15:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades Firecracker from v1.12.1 to v1.14.1 and adds support for UFFD_EVENT_REMOVE events in the userfaultfd handler, which is required to support Firecracker's free-page-reporting feature. When Firecracker calls madvise(MADV_DONTNEED) on guest memory ranges, the kernel emits REMOVE events; subsequent page faults on those addresses must be zero-filled rather than served from the snapshot file.

Changes:

  • Adds UFFD_EVENT_REMOVE handling in the UFFD handler with a new pageTracker to track per-page state (unfaulted, faulted, removed), replaces the old block.Tracker with this new tracker, and introduces deferred fault retry logic for EAGAIN.
  • Bumps Firecracker default version to v1.14.1_aba5ab3 across feature flags, seeds, and CI, and regenerates the go-swagger client/models for the new Firecracker v1.14.1 OpenAPI spec (adding balloon hinting, pmem, serial device, memory hotplug, and structured CPU config).
  • Adds FreePageReporting config flag propagated from the API layer through the orchestrator to Firecracker, enabled by default for v1.14+ sandboxes.

Reviewed changes

Copilot reviewed 68 out of 68 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
userfaultfd/userfaultfd.go Core UFFD handler: adds REMOVE event processing, page state tracking, deferred fault retry, zero-fill for removed pages
userfaultfd/page_tracker.go New per-page state map (unfaulted/faulted/removed)
userfaultfd/deferred.go Thread-safe collection for pagefaults that returned EAGAIN
userfaultfd/prefault.go Extracted prefault logic from main handler
userfaultfd/fd.go New UFFD constants/types: UFFD_EVENT_REMOVE, UFFDIO_ZEROPAGE, UffdRemove, etc.
memory/mapping.go Simplified GetOffset/GetHostVirtAddr to return one fewer value
block/tracker.go + tracker_test.go Deleted (replaced by pageTracker)
fc/firecracker.yml OpenAPI spec bumped to v1.14.1 with new endpoints and definitions
fc/models/* Regenerated go-swagger models for new Firecracker API
fc/client/operations/* Regenerated go-swagger client operations
sandbox/fc/client.go New enableFreePageReporting method
sandbox/fc/process.go Passes freePageReporting flag to process creation
sandbox/sandbox.go Config struct extended with FreePageReporting
api/.../sandbox_features.go HasFreePageReporting() version check
api/.../create_template.go Passes free-page-reporting flag when creating templates
feature-flags/flags.go New v1.14 Firecracker version constant
tests/integration/seed.go Seed data updated to v1.14.1
.github/actions/build-sandbox-template/action.yml CI updated to v1.14.1

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +288 to +326
u.wg.Go(func() error {
// The RLock must be called inside the goroutine to ensure RUnlock runs via defer,
// even if the errgroup is cancelled or the goroutine returns early.
// This check protects us against race condition between marking the request for prefetching and accessing the prefetchTracker.
u.settleRequests.RLock()
defer u.settleRequests.RUnlock()

var copyMode CULong
var accessType block.AccessType

// Performing copy() on UFFD clears the WP bit unless we explicitly tell
// it not to. We do that for faults caused by a read access. Write accesses
// would anyways cause clear the write-protection bit.
if pf.flags&UFFD_PAGEFAULT_FLAG_WRITE == 0 {
copyMode = UFFDIO_COPY_MODE_WP
accessType = block.Read
} else {
accessType = block.Write
}

handled, err := u.faultPage(
ctx,
addr,
offset,
source,
fdExit.SignalExit,
copyMode,
)
if err != nil {
return err
}

if handled {
u.pageTracker.setState(addr, addr+u.pageSize, faulted)
} else {
deferred.push(pf)
}

u.prefetchTracker.Add(offset, accessType)
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a race condition between the goroutine that calls u.pageTracker.setState(addr, addr+u.pageSize, faulted) (line 321) and the main event loop that reads u.pageTracker.get(addr) (line 276). The pageTracker itself is mutex-protected for individual operations, but between the time the main loop reads a page as unfaulted, dispatches a goroutine, and the goroutine eventually marks it faulted, a second pagefault for the same address can be dispatched and both goroutines can race to call faultPage for the same address. While EEXIST is handled (line 376), this scenario can also cause the deferred.push(pf) path to be taken by both, resulting in indefinite re-queueing. More critically, in the same goroutine, deferred.push(pf) is called (line 323) without holding the settleRequests lock, while the main loop drains deferred (line 235) without coordinating with in-flight goroutines, meaning a deferred fault could be drained and forgotten before the goroutine that pushed it is done executing.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure right now if this can be a problem, so we should recheck.

Copy link
Contributor Author

@bchalios bchalios Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time we see there are available events to read from UFFD, we wait for all Go routines in the workgroup to complete before we start handling the events of the outer iteration. That happens at line 186.

}

if !handled {
span.RecordError(fmt.Errorf("page already faulted"))
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Prefault function in prefault.go (line 37-39) records an error to the span when handled is false (EAGAIN case), treating it as an error event. However, when faultPage returns (false, nil), it means the write returned EAGAIN and should be retried — it is not an error condition. The span should not record this as an error; at most a span attribute or event noting the deferral would be appropriate.

Suggested change
span.RecordError(fmt.Errorf("page already faulted"))
span.AddEvent("prefault deferred: page already faulted or write returned EAGAIN")

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should clean this up.

Comment on lines +122 to +124
pagefaults = append(pagefaults, (*UffdPagefault)(unsafe.Pointer(&arg[0])))
case UFFD_EVENT_REMOVE:
removes = append(removes, (*UffdRemove)(unsafe.Pointer(&arg[0])))
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In readEvents, the buf slice is reused across loop iterations, but the parsed UffdRemove and UffdPagefault pointers (at lines 122-124) point directly into the buffer's underlying memory via unsafe.Pointer. Since buf is overwritten by each subsequent syscall.Read call, all previously appended pointers now alias the same buffer memory and will reflect the data of the last-read event rather than the event they were parsed from.

Each parsed struct must be copied into its own allocation (e.g. v := *(*UffdPagefault)(unsafe.Pointer(&arg[0])); pagefaults = append(pagefaults, &v)) so that every pointer in the returned slices points to independent memory. Without this fix, processing multiple events in a single readEvents call leads to all events appearing identical (the last-read event), causing incorrect page state tracking and potentially silent data corruption in the guest VM's memory.

Suggested change
pagefaults = append(pagefaults, (*UffdPagefault)(unsafe.Pointer(&arg[0])))
case UFFD_EVENT_REMOVE:
removes = append(removes, (*UffdRemove)(unsafe.Pointer(&arg[0])))
v := *(*UffdPagefault)(unsafe.Pointer(&arg[0]))
pagefaults = append(pagefaults, &v)
case UFFD_EVENT_REMOVE:
v := *(*UffdRemove)(unsafe.Pointer(&arg[0]))
removes = append(removes, &v)

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a vaild one. We only allocate buf once and then reuse it across iterations, where we take a pointer to its arg field. I think all events will be pointing to the same memory. The solution is to move the allocation inside the loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's not. Indeed we're re-using the same buffer (buf) but the arg bit is something we're getting calling getMsgArg which returns a [24]byte value, which essentially copies the msg.arg field. It's not the same memory. I have validated this, by manually inspecting the addresses of the various variables when the loop handles more than one events.

Comment on lines +407 to +418
func (pt *pageTracker) faultedOffsets(ma *memory.Mapping) ([]uint64, error) {
pt.mu.RLock()
defer pt.mu.RUnlock()

offsets := make([]uint64, 0, len(pt.m))
for addr := range pt.m {
offset, err := ma.GetOffset(addr)
if err != nil {
return nil, fmt.Errorf("address %#x not in mapping: %w", addr, err)
}
offsets = append(offsets, uint64(offset))
}
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cross_process_helpers_test.go, the faultedOffsets helper iterates over all pages in pt.m regardless of their state, including pages in the removed state. This means removed pages will appear in the faulted offsets output, which may cause test assertions to match incorrectly. Only pages in the faulted state should be included in the faulted offsets list.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is ok, because for this test we want all pages that were faulted and to be removed it had to be faulted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, we might want to change this if we are adding a new test for the removed.

case UFFD_EVENT_PAGEFAULT:
pagefaults = append(pagefaults, (*UffdPagefault)(unsafe.Pointer(&arg[0])))
case UFFD_EVENT_REMOVE:
removes = append(removes, (*UffdRemove)(unsafe.Pointer(&arg[0])))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Event pointers reference ephemeral stack memory

High Severity

readEvents appends pointers created from arg := getMsgArg(msg), but arg is a per-iteration local copy. The stored *UffdPagefault and *UffdRemove pointers outlive that stack value, so later reads can use corrupted event data or repeated last-event values.

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already noted in another comment.

switch state := u.pageTracker.get(addr); state {
case faulted:
// TODO: Can we skip faulting pages that are already faulted? How does that play with prefaulting?
continue
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated faults are skipped indefinitely

Medium Severity

When a page is marked faulted, later pagefault events for that same address are ignored with continue. If a mapped page becomes missing again (for example after reclaim/swap behavior), the new fault is never resolved, leaving the guest thread blocked on an unhandled userfault.

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO there to resolve this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually use swap on the host? If not, the logic here is correct. The only reason why a page would become missing is because we previously removed it, and we are tracking that. If we do enable swap this whole logic is wrong. Actually it already was wrong, even before adding support for UFFD_EVENT_REMOVE. Imagine a page has been written. This means its contents have changed comparing to what is in the snapshot file. If that page gets swapped out, we have no way of knowing what data we should provide for it. I'm not really sure how swapping would interplay with UFFD at all.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 06d14431fe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ValentaTomas
Copy link
Member

We should add a test that explicitly tests the remove (and faulted?) flows in the uffd.

case UFFD_EVENT_PAGEFAULT:
pagefaults = append(pagefaults, (*UffdPagefault)(unsafe.Pointer(&arg[0])))
case UFFD_EVENT_REMOVE:
removes = append(removes, (*UffdRemove)(unsafe.Pointer(&arg[0])))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uffd event pointers reference invalid stack memory

High Severity

readEvents stores pointers to arg, but arg is a per-iteration local copy ([24]byte). After the loop, those pointers no longer reliably reference valid event data. Serve then reads pf.flags, pf.address, and remove ranges from invalid memory, which can mis-handle faults or fail unpredictably under load.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure here, we should recheck.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we move buf inside the loop then Go will understand that it escapes from the function's context and will just place it in heap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Event if we don't move buf inside the loop, Go is able to track that the memory (of arg) escapes the function context and puts it in the heap.

span.RecordError(fmt.Errorf("page already faulted"))
}

return nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefault silently succeeds after deferred write failure

Medium Severity

faultPage now returns handled=false on unix.EAGAIN, but Prefault treats !handled as non-fatal and still returns nil. This allows prefault requests to report success even when no page was mapped, leaving memory unfaulted and shifting failures to later runtime faults.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this is that we get EAGAIN because of a UFFD_EVENT_REMOVE landing (concurrently) in the queue. When we get an EAGAIN during pre-faulting we don't explicitly handle a page fault, so ignoring here should be fine. I'm assuming here that when pre-faulting happens, we don't run the normal handler loop, right? If that's not the case, we might get issues if we are trying to "prefault" a page and handle a page fault to the same page at the same time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do run the normal loop at the time of the prefaulting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've spent some time thinking this. I don't think there's a problem with running prefaulter and on-demand faulting loop concurrently (as long as the page state tracker is handled/coordinated correctly). I can elaborate offline if you want why that is the case.

In any case, it's fine returning nil here. If EAGAIN happens, we will just not pre-fault the page. There isn't any functional problem with it.

// would anyways cause clear the write-protection bit.
if pf.flags&UFFD_PAGEFAULT_FLAG_WRITE == 0 {
copyMode = UFFDIO_COPY_MODE_WP
accessType = block.Read
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong mode passed to zeropage ioctl

High Severity

Read faults set mode to UFFDIO_COPY_MODE_WP, and removed 4KB pages are handled via UFFDIO_ZEROPAGE. zero() forwards that mode directly to UFFDIO_ZEROPAGE, which does not use UFFDIO_COPY_MODE_WP. This can make zero-fill faults fail and break handling of UFFD_EVENT_REMOVE for standard pages.

Additional Locations (2)

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should recheck if the zero does not handle mode wp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that it does. I will re-check

deferred.push(pf)
}

u.prefetchTracker.Add(offset, accessType)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefetch tracker updated for deferred (unhandled) page faults

Medium Severity

u.prefetchTracker.Add(offset, accessType) is called unconditionally after faultPage, even when handled is false (i.e., the fault was deferred due to EAGAIN). This means the prefetch tracker records offsets for pages that haven't actually been served yet. When the deferred fault is later retried and succeeds, prefetchTracker.Add will be called a second time for the same offset. Previously, this call only happened after a successful copy. It likely needs to be inside the if handled block.

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is valid; we should fix it.

// would anyways cause clear the write-protection bit.
if accessType != block.Write {
copyMode |= UFFDIO_COPY_MODE_WP
writeErr = u.fd.copy(addr, u.pageSize, b, mode)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nil source reaches default switch causing nil dereference

Medium Severity

In faultPage, the switch only handles source == nil for header.PageSize and header.HugepageSize. If source is nil (page was removed) and u.pageSize doesn't match either constant, execution falls through to the default case, which calls source.Slice(...) on a nil interface, causing a panic. The code assumes only two page sizes exist, but this isn't validated here.

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The page should not be a different size, but maybe we can add check somewhere to prevent accidens.

offset,
directDataSource{data, int64(u.pageSize)},
nil,
// TODO: What mode should we pass here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bchalios What should we pass here? Should we use mode WP if we are prefaulting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to pass COPY_WP? That's what we do with read page faults.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the other comment, the prefault can run at the same time as the normal serve loop, so we should sync about what needs to be done to make this correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, here we should pass COPY_MODE_WP. Otherwise we will clear the write protection bit and subsequent snapshots will potentially be bigger.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

deferred.push(pf)
}

u.prefetchTracker.Add(offset, accessType)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefetch tracker records offset for unhandled deferred faults

Medium Severity

When faultPage returns handled=false (the EAGAIN path), the pagefault is correctly pushed to deferred for retry, but u.prefetchTracker.Add(offset, accessType) is called unconditionally afterward. This records the offset in prefetch data even though the page was never actually served. When the deferred fault eventually succeeds on a later iteration, prefetchTracker.Add is called again, double-counting the offset. The prefetchTracker.Add call needs to be inside the if handled branch.

Fix in Cursor Fix in Web

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check this.

bchalios and others added 18 commits March 5, 2026 11:37
Bump the swagger file for Firecracker to v1.14 and regenerate
APIs/models.

Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Add support for Firecracker v1.14 and make it the default.

Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
As we're going to handle UFFD_EVENT_REMOVE events triggerred by
Firecracker, we need to keep track of the state of all the guest memory
pages.

Theis commit introduces 3 states:

* Unfaulted - A page that has not been faulted yet.
* Faulted   - A page that we have previously faulted in.
* Removed   - A page that we have received a remove event for and
              haven't faulted in since.

It also adds the necessary book keeping of page state in all the memory
regions of the guest, along with methods for retrieving and setting the
state of pages.

Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Import a few more bindings that we'll need for handling remove events.

Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Handle UFFD_EVENT_REMOVE events from the file descriptor. These events
are triggerred when Firecracker calls madvise() with MADV_DONTNEED on
some memory range that we are tracking. This Firecracker behaviour is
support with version 1.14.0 onwards using the free page reporting and
hinting features of balloon devices.

What this means for us is that, we need to track removed pages because
subsequent page faults need to be served with a zero page.

Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Balloon devices provide memory reclamation facilities through free page
reporting, as a more efficient mechanism (in terms of latency and CPU
time) than traditional ballooning.

Free page reporting instructs the guest to periodically report memory
that has been freed, so that we can reclaim it in the host side. It is
enabled before starting the sandbox and does not require any further
host-side orchestration.

Enable free page reporting for all new templates using Firecracker
versions >= v1.14.0. Also, allow users to optionally disable it for
these versions. Older Firecracker versions don't support the feature.
Trying to enable it for those, will return an error.

Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
Signed-off-by: Babis Chalios <babis.chalios@e2b.dev>
@bchalios bchalios force-pushed the feat/free-page-reporting branch from 10a0cfd to 5fa75fe Compare March 6, 2026 19:11
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

  • ✅ Fixed: Nil source dereference when zero-filling non-standard page sizes
    • faultPage now handles all source == nil page sizes by zero-filling in a dedicated branch, preventing the nil dereference and silent defer loop.
  • ✅ Fixed: Test helper faultedOffsets returns all states, not just faulted
    • The test helper now filters pageTracker entries to include only pages in the faulted state before converting addresses to offsets.
  • ✅ Fixed: Prefault silently succeeds on EAGAIN without mapping page
    • Prefault now returns an error when faultPage reports handled == false, so deferred EAGAIN cases are no longer reported as successful prefaults.

Create PR

Or push these changes by commenting:

@cursor push b8ac3ea454
Preview (b8ac3ea454)
diff --git a/packages/orchestrator/internal/sandbox/uffd/userfaultfd/cross_process_helpers_test.go b/packages/orchestrator/internal/sandbox/uffd/userfaultfd/cross_process_helpers_test.go
--- a/packages/orchestrator/internal/sandbox/uffd/userfaultfd/cross_process_helpers_test.go
+++ b/packages/orchestrator/internal/sandbox/uffd/userfaultfd/cross_process_helpers_test.go
@@ -409,7 +409,11 @@
 	defer pt.mu.RUnlock()
 
 	offsets := make([]uint64, 0, len(pt.m))
-	for addr := range pt.m {
+	for addr, state := range pt.m {
+		if state != faulted {
+			continue
+		}
+
 		offset, err := ma.GetOffset(addr)
 		if err != nil {
 			return nil, fmt.Errorf("address %#x not in mapping: %w", addr, err)

diff --git a/packages/orchestrator/internal/sandbox/uffd/userfaultfd/prefault.go b/packages/orchestrator/internal/sandbox/uffd/userfaultfd/prefault.go
--- a/packages/orchestrator/internal/sandbox/uffd/userfaultfd/prefault.go
+++ b/packages/orchestrator/internal/sandbox/uffd/userfaultfd/prefault.go
@@ -35,7 +35,10 @@
 	}
 
 	if !handled {
-		span.RecordError(fmt.Errorf("page already faulted"))
+		err = fmt.Errorf("page fault was deferred and not mapped")
+		span.RecordError(err)
+
+		return err
 	}
 
 	return nil

diff --git a/packages/orchestrator/internal/sandbox/uffd/userfaultfd/userfaultfd.go b/packages/orchestrator/internal/sandbox/uffd/userfaultfd/userfaultfd.go
--- a/packages/orchestrator/internal/sandbox/uffd/userfaultfd/userfaultfd.go
+++ b/packages/orchestrator/internal/sandbox/uffd/userfaultfd/userfaultfd.go
@@ -360,6 +360,8 @@
 		writeErr = u.fd.zero(addr, u.pageSize, mode)
 	case source == nil && u.pageSize == header.HugepageSize:
 		writeErr = u.fd.copy(addr, u.pageSize, header.EmptyHugePage, mode)
+	case source == nil:
+		writeErr = u.fd.zero(addr, u.pageSize, mode)
 	default:
 		b, dataErr := source.Slice(ctx, offset, int64(u.pageSize))
 		if dataErr != nil {
This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.

// would anyways cause clear the write-protection bit.
if accessType != block.Write {
copyMode |= UFFDIO_COPY_MODE_WP
writeErr = u.fd.copy(addr, u.pageSize, b, mode)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nil source dereference when zero-filling non-standard page sizes

Medium Severity

In faultPage, when source is nil (removed page needing zero-fill), only 4KB (header.PageSize) and 2MB (header.HugepageSize) are handled explicitly. The default branch calls source.Slice() which would panic on a nil source. The panic recovery silently swallows the error, returning (false, nil), which causes the fault to be endlessly deferred via deferred.push(pf), permanently blocking the faulting thread.

Fix in Cursor Fix in Web

slices.Sort(offsets)
}

return offsets, nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test helper faultedOffsets returns all states, not just faulted

Low Severity

faultedOffsets iterates over all entries in pageTracker.m with for addr := range pt.m without filtering by pageState. Pages in removed state are included in the results alongside genuinely faulted pages. This would cause tests exercising the new remove event flow to report incorrect faulted offsets.

Fix in Cursor Fix in Web

span.RecordError(fmt.Errorf("page already faulted"))
}

return nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefault silently succeeds on EAGAIN without mapping page

Medium Severity

When faultPage returns (false, nil) due to EAGAIN (concurrent remove event), Prefault treats it as a non-error by recording "page already faulted" and returning nil. The page is not actually mapped, but the caller believes prefaulting succeeded. Unlike the Serve loop which retries via deferred.push, Prefault has no retry mechanism for EAGAIN.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants