feat(nimble): Plumb FileIoContext through TabletReader direct preadv calls#684
Open
mkarrmann wants to merge 2 commits into
Open
feat(nimble): Plumb FileIoContext through TabletReader direct preadv calls#684mkarrmann wants to merge 2 commits into
mkarrmann wants to merge 2 commits into
Conversation
|
@mkarrmann has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103114180. |
0e76cb2 to
8232ad8
Compare
mkarrmann
added a commit
to mkarrmann/nimble
that referenced
this pull request
May 18, 2026
…ator#684) Summary: `FileIOContext` is used for instrumentation and setting config on filesystem calls. Problem: currently, `TabletReader`'s direct `preadv` calls don't include the `FileIOContext`. Internally at Meta, this means their stats are not included for certain monitoring. Solution: Support directly passing a `FileIOContext`, which is used for all direct filesystem calls. ## Alternatives Considered Instead of adding `FileIoContext` as a parameter to `configureOptions`, we could instead reuse the `FileIOContext` included in the `BufferedInput`. (This is what Claude originally wanted to do :)) I opted to decouple the `FileIOContext` for the direct filesystem calls from `BufferedInput`. Differential Revision: D103114180
mkarrmann
added a commit
to mkarrmann/velox
that referenced
this pull request
May 18, 2026
Summary: Publicly expose `InputStream`'s `FileIOContext`. ## Motivation The Nimble TabletReader bypasses `BufferedInput` at times, making direct filesystem calls. At Meta, we wish to reuse the same FileIOContext for the direct filesystem calls, for instrumentation (see facebookincubator/nimble#684 for context on Nimble side). Velox Reader Factories take only a `BufferedInput`, not `FileIOContext`, for making filesystem calls. So, in Meta's internal Nimble reader factory, I need to directly access the `FileIOContext` from the `BufferedInput`. Alternatively, we could refactor all Reader Factories to accept a `FileIOContext`. However, that change would have much larger scope. This change feels reasonable. Reviewed By: xiaoxmeng Differential Revision: D103114177
8232ad8 to
b66d629
Compare
…calls
Summary:
TabletReader performs file IO outside the normal BufferedInput read path: postscript/footer bootstrap reads, coalesced metadata loads through MetadataInput, index metadata/data reads, and stripe data loads. Those reads can reach ReadFile implementations such as WSReadFile, which record filesystem runtime stats from FileIoContext::ioStats. When TabletReader and its helper readers use the default FileIoContext{}, those filesystem stats are dropped even though the original BufferedInput has the operator's context.
This change carries the BufferedInput's FileIoContext through the current TabletReader architecture:
- `TabletReader::configureOptions(options, bufferedInput)` extracts `bufferedInput->getInputStream()->fileIoContext()` when a BufferedInput is supplied.
- `TabletReader::Options` and `TabletReader` store that FileIoContext.
- Footer/postscript `rawRead(...)` calls pass the context to `ReadFile::pread(...)`.
- `MetadataInput` stores the context and forwards it to coalesced `ReadFile::preadv(...)` calls.
- `IndexLookup::Options` carries the same context, and index metadata/data input construction uses it for direct metadata reads and file read options.
- Stripe-data loads continue to pass the same context to direct `file_->preadv(...)` calls.
- The selective Nimble reader path (`ReaderBase::create`) now passes its BufferedInput into `configureOptions` so the context is wired up automatically.
- The `TrackingReadFile` test helper now increments `trackingReadFile.pread`/`trackingReadFile.preadv` counters on the supplied `FileIoContext::ioStats`, enabling the new tests to assert that the context reaches the underlying ReadFile.
Depends on the prior commit (feat(dwio): Add public fileIoContext() accessor to ReadFileInputStream) for exposing the FileIoContext stored inside ReadFileInputStream.
Compatibility:
- Source-compatible: all new parameters default to `FileIoContext{}`/nullptr where existing callers do not have a BufferedInput.
- Behavior-compatible for standard ReadFile implementations that ignore FileIoContext::ioStats.
- Implementations that consume FileIoContext::ioStats, such as WSReadFile, now see the caller's filesystem stats context for TabletReader metadata and data reads.
Differential Revision: D103114180
b66d629 to
1b1194e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary:
TabletReader performs file IO outside the normal BufferedInput read path: postscript/footer bootstrap reads, coalesced metadata loads through MetadataInput, index metadata/data reads, and stripe data loads. Those reads can reach ReadFile implementations such as WSReadFile, which record filesystem runtime stats from FileIoContext::ioStats. When TabletReader and its helper readers use the default FileIoContext{}, those filesystem stats are dropped even though the original BufferedInput has the operator's context.
This change carries the BufferedInput's FileIoContext through the current TabletReader architecture:
TabletReader::configureOptions(options, bufferedInput)extractsbufferedInput->getInputStream()->fileIoContext()when a BufferedInput is supplied.TabletReader::OptionsandTabletReaderstore that FileIoContext.rawRead(...)calls pass the context toReadFile::pread(...).MetadataInputstores the context and forwards it to coalescedReadFile::preadv(...)calls.IndexLookup::Optionscarries the same context, and index metadata/data input construction uses it for direct metadata reads and file read options.file_->preadv(...)calls.ReaderBase::create) now passes its BufferedInput intoconfigureOptionsso the context is wired up automatically.TrackingReadFiletest helper now incrementstrackingReadFile.pread/trackingReadFile.preadvcounters on the suppliedFileIoContext::ioStats, enabling the new tests to assert that the context reaches the underlying ReadFile.Depends on the prior commit (feat(dwio): Add public fileIoContext() accessor to ReadFileInputStream) for exposing the FileIoContext stored inside ReadFileInputStream.
Compatibility:
FileIoContext{}/nullptr where existing callers do not have a BufferedInput.Differential Revision: D103114180