Skip to content

feat(nimble): Plumb FileIoContext through TabletReader direct preadv calls#684

Open
mkarrmann wants to merge 2 commits into
facebookincubator:mainfrom
mkarrmann:export-D103114180
Open

feat(nimble): Plumb FileIoContext through TabletReader direct preadv calls#684
mkarrmann wants to merge 2 commits into
facebookincubator:mainfrom
mkarrmann:export-D103114180

Conversation

@mkarrmann
Copy link
Copy Markdown

@mkarrmann mkarrmann commented May 1, 2026

Summary:
TabletReader performs file IO outside the normal BufferedInput read path: postscript/footer bootstrap reads, coalesced metadata loads through MetadataInput, index metadata/data reads, and stripe data loads. Those reads can reach ReadFile implementations such as WSReadFile, which record filesystem runtime stats from FileIoContext::ioStats. When TabletReader and its helper readers use the default FileIoContext{}, those filesystem stats are dropped even though the original BufferedInput has the operator's context.

This change carries the BufferedInput's FileIoContext through the current TabletReader architecture:

  • TabletReader::configureOptions(options, bufferedInput) extracts bufferedInput->getInputStream()->fileIoContext() when a BufferedInput is supplied.
  • TabletReader::Options and TabletReader store that FileIoContext.
  • Footer/postscript rawRead(...) calls pass the context to ReadFile::pread(...).
  • MetadataInput stores the context and forwards it to coalesced ReadFile::preadv(...) calls.
  • IndexLookup::Options carries the same context, and index metadata/data input construction uses it for direct metadata reads and file read options.
  • Stripe-data loads continue to pass the same context to direct file_->preadv(...) calls.
  • The selective Nimble reader path (ReaderBase::create) now passes its BufferedInput into configureOptions so the context is wired up automatically.
  • The TrackingReadFile test helper now increments trackingReadFile.pread/trackingReadFile.preadv counters on the supplied FileIoContext::ioStats, enabling the new tests to assert that the context reaches the underlying ReadFile.

Depends on the prior commit (feat(dwio): Add public fileIoContext() accessor to ReadFileInputStream) for exposing the FileIoContext stored inside ReadFileInputStream.

Compatibility:

  • Source-compatible: all new parameters default to FileIoContext{}/nullptr where existing callers do not have a BufferedInput.
  • Behavior-compatible for standard ReadFile implementations that ignore FileIoContext::ioStats.
  • Implementations that consume FileIoContext::ioStats, such as WSReadFile, now see the caller's filesystem stats context for TabletReader metadata and data reads.

Differential Revision: D103114180

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 1, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented May 1, 2026

@mkarrmann has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103114180.

@meta-codesync meta-codesync Bot changed the title Plumb FileIoContext through direct file_->preadv calls Plumb FileIoContext through direct file_->preadv calls (#684) May 18, 2026
@mkarrmann mkarrmann force-pushed the export-D103114180 branch from 0e76cb2 to 8232ad8 Compare May 18, 2026 22:28
mkarrmann added a commit to mkarrmann/nimble that referenced this pull request May 18, 2026
…ator#684)

Summary:

`FileIOContext` is used for instrumentation and setting config on filesystem calls.

Problem: currently, `TabletReader`'s direct `preadv` calls don't include the `FileIOContext`. Internally at Meta, this means their stats are not included  for certain monitoring.

Solution: Support directly passing a `FileIOContext`, which is used for all direct filesystem calls.

## Alternatives Considered

Instead of adding `FileIoContext` as a parameter to `configureOptions`, we could instead reuse the `FileIOContext` included in the `BufferedInput`. (This is what Claude originally wanted to do :))

I opted to decouple the `FileIOContext` for the direct filesystem calls from `BufferedInput`.

Differential Revision: D103114180
mkarrmann added a commit to mkarrmann/velox that referenced this pull request May 18, 2026
Summary:

Publicly expose `InputStream`'s `FileIOContext`.

## Motivation

The Nimble TabletReader bypasses `BufferedInput` at times, making direct filesystem calls. At Meta, we wish to reuse the same FileIOContext for the direct filesystem calls, for instrumentation (see facebookincubator/nimble#684 for context on Nimble side).

Velox Reader Factories take only a `BufferedInput`, not `FileIOContext`, for making filesystem calls. So, in Meta's internal Nimble reader factory, I need to directly access the `FileIOContext` from the `BufferedInput`.

Alternatively, we could refactor all Reader Factories to accept a `FileIOContext`. However, that change would have much larger scope. This change feels reasonable.

Reviewed By: xiaoxmeng

Differential Revision: D103114177
@meta-codesync meta-codesync Bot changed the title Plumb FileIoContext through direct file_->preadv calls (#684) Plumb FileIoContext through direct file_->preadv calls May 27, 2026
@mkarrmann mkarrmann force-pushed the export-D103114180 branch from 8232ad8 to b66d629 Compare May 27, 2026 05:33
…calls

Summary:
TabletReader performs file IO outside the normal BufferedInput read path: postscript/footer bootstrap reads, coalesced metadata loads through MetadataInput, index metadata/data reads, and stripe data loads. Those reads can reach ReadFile implementations such as WSReadFile, which record filesystem runtime stats from FileIoContext::ioStats. When TabletReader and its helper readers use the default FileIoContext{}, those filesystem stats are dropped even though the original BufferedInput has the operator's context.

This change carries the BufferedInput's FileIoContext through the current TabletReader architecture:

- `TabletReader::configureOptions(options, bufferedInput)` extracts `bufferedInput->getInputStream()->fileIoContext()` when a BufferedInput is supplied.
- `TabletReader::Options` and `TabletReader` store that FileIoContext.
- Footer/postscript `rawRead(...)` calls pass the context to `ReadFile::pread(...)`.
- `MetadataInput` stores the context and forwards it to coalesced `ReadFile::preadv(...)` calls.
- `IndexLookup::Options` carries the same context, and index metadata/data input construction uses it for direct metadata reads and file read options.
- Stripe-data loads continue to pass the same context to direct `file_->preadv(...)` calls.
- The selective Nimble reader path (`ReaderBase::create`) now passes its BufferedInput into `configureOptions` so the context is wired up automatically.
- The `TrackingReadFile` test helper now increments `trackingReadFile.pread`/`trackingReadFile.preadv` counters on the supplied `FileIoContext::ioStats`, enabling the new tests to assert that the context reaches the underlying ReadFile.

Depends on the prior commit (feat(dwio): Add public fileIoContext() accessor to ReadFileInputStream) for exposing the FileIoContext stored inside ReadFileInputStream.

Compatibility:
- Source-compatible: all new parameters default to `FileIoContext{}`/nullptr where existing callers do not have a BufferedInput.
- Behavior-compatible for standard ReadFile implementations that ignore FileIoContext::ioStats.
- Implementations that consume FileIoContext::ioStats, such as WSReadFile, now see the caller's filesystem stats context for TabletReader metadata and data reads.

Differential Revision: D103114180
@meta-codesync meta-codesync Bot changed the title Plumb FileIoContext through direct file_->preadv calls feat(nimble): Plumb FileIoContext through TabletReader direct preadv calls May 27, 2026
@mkarrmann mkarrmann force-pushed the export-D103114180 branch from b66d629 to 1b1194e Compare May 27, 2026 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant