Skip to content

QueryDir optimization v1#22

Open
sprasad-microsoft wants to merge 19 commits intomasterfrom
personal/sprasad/for-next
Open

QueryDir optimization v1#22
sprasad-microsoft wants to merge 19 commits intomasterfrom
personal/sprasad/for-next

Conversation

@sprasad-microsoft
Copy link
Copy Markdown
Owner

No description provided.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces an optimized SMB2 QueryDirectory path aimed at improving readdir performance, including multi-credit/large-buffer support, better handling of encrypted large responses, and updates to directory entry caching and buffer lifetime management.

Changes:

  • Add async, multi-credit QueryDirectory receive/callback path and a synchronous wrapper that can request large output buffers.
  • Extend SMB2 QueryDir “first” compound sequence to issue a second minimal QueryDirectory to detect/append additional entries.
  • Refactor cached directory-entry emission/caching helpers and introduce a new “dynamic buffer” tracking mode for correct freeing.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
fs/smb/client/transport.c Adds QueryDirectory async receive/discard logic and builds a combined response buffer.
fs/smb/client/trace.h Adds a new credit trace tag for query-dir completion.
fs/smb/client/smb2proto.h Extends QueryDirectory init API and declares new handlers.
fs/smb/client/smb2pdu.h Adds QueryDir sizing constants/macros and max buffer sizing.
fs/smb/client/smb2pdu.c Implements large-buffer QueryDirectory helper and updates QueryDirectory init/free/parse paths.
fs/smb/client/smb2ops.c Implements QueryDir callback/data handling, adds encrypted QueryDir support, updates query_dir_first/next logic.
fs/smb/client/smb2misc.c Allows larger QueryDirectory data-area lengths in size validation.
fs/smb/client/smb1ops.c Updates query_dir_next signature to match ops interface change.
fs/smb/client/readdir.c Updates query_dir_next invocation and refactors cached-dir emission/caching calls.
fs/smb/client/inode.c Adds cached-dir invalidation hooks on unlink/rmdir/rename paths.
fs/smb/client/file.c Updates closedir buffer free logic for dynamic buffers.
fs/smb/client/cifsproto.h Exposes the new QueryDirectory receive handler prototype.
fs/smb/client/cifsglob.h Extends query_dir_next ops signature; adds dynamic buffer flag and QueryDir IO context struct.
fs/smb/client/cached_dir.h Exposes cached-dir helper APIs moved out of readdir.c.
fs/smb/client/cached_dir.c Implements cached-dir helper APIs and adds additional cached-dir open validation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread fs/smb/client/transport.c Outdated
struct cifs_query_dir_io *qd_io = mid->callback_data;

length = cifs_discard_remaining_data(server);
dequeue_mid(server, mid, qd_io->result != 0);
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cifs_query_dir_discard() marks the mid as malformed whenever qd_io->result != 0. For QueryDirectory, STATUS_NO_MORE_FILES maps to -ENODATA (see SMB2_query_directory() handling in smb2pdu.c) and should be treated as a normal response (MID_RESPONSE_RECEIVED), not MID_RESPONSE_MALFORMED. As-is, end-of-directory will be converted into a malformed mid state and the callback will overwrite the -ENODATA with an EIO-style error. Consider passing a separate 'malformed' flag (true only for framing/size/offset corruption) and treating protocol status errors (including -ENODATA) as non-malformed.

Suggested change
dequeue_mid(server, mid, qd_io->result != 0);
/*
* qd_io->result carries QueryDirectory completion status and must not
* be used to infer transport/frame corruption. Normal protocol status
* values such as -ENODATA (end of directory) should leave the MID in a
* non-malformed state.
*/
dequeue_mid(server, mid, false);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread fs/smb/client/transport.c Outdated
dequeue_mid(server, mid, false);
mid->resp_buf = server->smallbuf;
server->smallbuf = NULL;
return length;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cifs_query_dir_receive() can return a negative value after it has already dequeued the mid (e.g., via discard_and_queue or cifs_query_dir_discard). In cifs_demultiplex_thread() a negative return causes the callback to be skipped, which means SMB2_query_directory_large() (waiting on qd_io->done) can hang indefinitely. After dequeuing the mid, ensure the receive handler returns >= 0 so the callback runs and completion is signaled (or signal completion directly before returning).

Suggested change
return length;
/*
* After dequeuing the mid, return a non-negative value so the
* demultiplex thread still runs the callback. Any operation error
* has already been recorded in qd_io->result for the callback path.
*/
return length < 0 ? 0 : length;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread fs/smb/client/transport.c
Comment on lines +1289 to +1321
/*
* Build combined header+data in qd_io->combined_iov.iov_base.
* Layout: [header][data] with no padding between them.
* We update OutputBufferOffset in the header to reflect the new layout.
*/
if (qd_io->iov[0].iov_len > 0 && qd_io->result == 0) {
size_t hdr_len = qd_io->iov[0].iov_len;
struct smb2_query_directory_rsp *combined_rsp;

/* Copy header to beginning of combined buffer */
memcpy(qd_io->combined_iov.iov_base, qd_io->iov[0].iov_base, hdr_len);

/* Read directory entries directly after the header, removing any padding */
if (data_len > 0) {
length = cifs_read_from_socket(server,
qd_io->combined_iov.iov_base + hdr_len,
data_len);
if (length < 0) {
qd_io->result = length;
goto discard_and_queue;
}
server->total_read += length;

/* Update OutputBufferOffset and OutputBufferLength to reflect
* the new compact layout (header followed immediately by data).
*/
combined_rsp = (struct smb2_query_directory_rsp *)qd_io->combined_iov.iov_base;
combined_rsp->OutputBufferOffset = cpu_to_le16(hdr_len);
combined_rsp->OutputBufferLength = cpu_to_le32(length);

/* Set up second iov pointing to the directory data within combined buffer */
qd_io->iov[1].iov_base = qd_io->combined_iov.iov_base + hdr_len;
qd_io->iov[1].iov_len = length;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The receive handler compacts the response by removing padding and then mutates OutputBufferOffset/OutputBufferLength in the copied header. This makes it impossible to verify SMB2 signing against the exact on-the-wire bytes, and also makes behavior diverge from the encrypted path which preserves the wire layout. Consider preserving the wire layout in the combined buffer (copy header+padding up to data_offset, read data at data_offset) and avoid modifying the header until after signature verification (or not at all).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread fs/smb/client/smb2pdu.c
Comment on lines +5525 to +5534
if (server->sign && !mid->decrypted) {
int rc;
struct smb_rqst rqst = { .rq_iov = &qd_io->iov[0], .rq_nvec = 1 };

rc = smb2_verify_signature(&rqst, server);
if (rc) {
cifs_tcon_dbg(VFS, "QueryDir signature verification returned error = %d\n",
rc);
qd_io->result = rc;
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smb2_query_dir_callback() verifies the signature using only qd_io->iov[0] with rq_nvec=1. For QueryDirectory responses with a data payload, SMB2 signing covers the full response (header + any padding + data), so verifying only the header will either fail incorrectly or provide incomplete integrity checking. Build the smb_rqst to cover the entire response (e.g., use both qd_io->iov[0..1] with rq_nvec=2, or an rq_iter like the READ path) and ensure those buffers reflect the original wire layout (including padding).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread fs/smb/client/smb2pdu.c
Comment on lines 5693 to 5740
@@ -5551,12 +5736,6 @@ int SMB2_query_directory_init(const unsigned int xid,
req->FileNameOffset =
cpu_to_le16(sizeof(struct smb2_query_directory_req));
req->FileNameLength = cpu_to_le16(len);
/*
* BB could be 30 bytes or so longer if we used SMB2 specific
* buffer lengths, but this is safe and close enough.
*/
output_size = min_t(unsigned int, output_size, server->maxBuf);
output_size = min_t(unsigned int, output_size, 2 << 15);
req->OutputBufferLength = cpu_to_le32(output_size);

Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SMB2_query_directory_init() no longer caps OutputBufferLength to server->maxBuf and the historical 64KiB limit (2<<15). With the new callers passing large output_size values, this can exceed what the server negotiated/supports and may lead to STATUS_INVALID_PARAMETER or other failures on servers that don't accept large QueryDirectory buffers. Consider restoring a cap (at least min(output_size, server->maxBuf)), and ensure credit charging/output sizing stays consistent with the negotiated maximums.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already handled in the callers. The cap uses the negotiated rsize.

Comment thread fs/smb/client/inode.c Outdated
if (!open_cached_dir_by_dentry(tcon, parent, &parent_cfid)) {
mutex_lock(&parent_cfid->dirents.de_mutex);
parent_cfid->dirents.is_valid = false;
parent_cfid->dirents.is_failed = true;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cifs_invalidate_cached_dir() sets parent_cfid->dirents.is_failed = true. In cached_dir.c, is_failed permanently disables caching for that directory (add_cached_dirent/update_cached_dirents_count/finished_cached_dirents_count all bail out when is_failed is set), and there is no reset path. If the intent is to invalidate and allow the cache to be rebuilt on the next scan, this should likely clear cached entries and set is_valid=false while leaving is_failed=false (or explicitly reset is_failed to false).

Suggested change
parent_cfid->dirents.is_failed = true;
parent_cfid->dirents.is_failed = false;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread fs/smb/client/smb2ops.c

cifs_dbg(FYI, "%s: processing encrypted QueryDirectory response\n", __func__);

if (shdr->Command != SMB2_QUERY_DIRECTORY) {
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle_query_dir_data() compares shdr->Command directly against SMB2_QUERY_DIRECTORY, but shdr->Command is little-endian on the wire (__le16). This will break on big-endian architectures. Use le16_to_cpu(shdr->Command) for the comparison (similar to the command checks added in receive_encrypted_read/smb2_decrypt_offload).

Suggested change
if (shdr->Command != SMB2_QUERY_DIRECTORY) {
if (le16_to_cpu(shdr->Command) != SMB2_QUERY_DIRECTORY) {

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a problem as SMB2_QUERY_DIRECTORY is of type le16

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread fs/smb/client/dir.c Outdated
Comment on lines +735 to +739
if (pTcon->nocase && !open_cached_dir_by_dentry(pTcon, direntry->d_parent, &cfid)) {
struct qstr qname = QSTR_INIT(direntry->d_name.name, direntry->d_name.len);
int rc_wait;

rc_wait = cifs_wait_for_pending_dcache(cfid, qname.name, qname.len);
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this new fast-path for negative dentries, cfid is used to decide whether the parent directory is fully cached. The subsequent check of cfid->dirents.is_valid should be done under cfid->dirents.de_mutex (or via READ_ONCE with a clear lockless contract), since is_valid is updated under de_mutex during cache population/invalidation. Otherwise this can race and incorrectly skip the on-the-wire lookup.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread fs/smb/client/cifsfs.c
Comment on lines +173 to +190
static unsigned long cifs_dircache_shrinker_scan(struct shrinker *shrink,
struct shrink_control *sc)
{
unsigned long freed_bytes;

(void)shrink;

if (!sc->nr_to_scan)
return 0;

if (!atomic64_read(&cifs_dircache_bytes_used))
return SHRINK_STOP;

freed_bytes = cifs_drop_all_dir_caches();
if (!freed_bytes)
return SHRINK_STOP;

return DIV_ROUND_UP_ULL(freed_bytes, PAGE_SIZE);
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cifs_dircache_shrinker_scan() calls cifs_drop_all_dir_caches(), which flushes cfid_put_wq via flush_workqueue(). Shrinker scan callbacks can run in memory reclaim paths, and synchronously flushing a workqueue here can cause long stalls and risks reclaim/worker deadlocks (especially if the work items themselves need memory). Consider making the shrinker scan non-blocking (e.g., queue invalidation and return, or reclaim a bounded amount of cache without waiting on workqueues).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread fs/smb/client/smb2ops.c Outdated
Comment on lines +2525 to +2529
rc = SMB2_query_directory_init(xid, tcon, server,
&rqst[1],
COMPOUND_FID, COMPOUND_FID,
0, srch_inf->info_level);
0, srch_inf->info_level,
SMB2_QD1_OUTPUT_SIZE(CIFSMaxBufSize));
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Create+QD1+QD2 compound path, QD1’s OutputBufferLength is computed from CIFSMaxBufSize (via SMB2_QD1_OUTPUT_SIZE) and passed into SMB2_query_directory_init(). Since SMB2_query_directory_init() no longer caps OutputBufferLength (and this compound path does not set CreditCharge / wait for multi-credit), a larger CIFSMaxBufSize (module param allows >64KiB) can produce an OutputBufferLength > SMB2_MAX_BUFFER_SIZE (64KiB) and may be rejected by servers due to insufficient CreditCharge. Consider clamping the QD1 output size to SMB2_MAX_BUFFER_SIZE (or server->maxBuf) for this compound, or explicitly set CreditCharge and ensure the compound credit reservation accounts for it.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Today we skip calling change_conf for negotiates and session setup
requests. This can be a problem for mchan as the immediate next call
after session setup could be due to an I/O that is made on the
mount point. For single channel, this is not a problem as
there will be several calls after setting up session.

This change enforces calling change_conf when the total credits contain
enough for reservations for echoes and oplocks. We expect this to happen
during the last session setup response. This way, echoes and oplocks are
not disabled before the first request to the server. So if that first
request is an open, it does not need to disable requesting leases.

Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
It is possible that SMB2_open_init may not set lease context based
on the requested oplock level. This can happen when leases have been
temporarily or permanently disabled. When this happens, we will have
open_cached_dir making an open without lease context and the response
will anyway be rejected by open_cached_dir (thereby forcing a close to
discard this open). That's unnecessary two round-trips to the server.

This change adds a check before making the open request to the server
to make sure that SMB2_open_init did add the expected lease context
to the open in open_cached_dir.

Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
@sprasad-microsoft sprasad-microsoft force-pushed the personal/sprasad/for-next branch from 9ef390d to 889e9ac Compare April 27, 2026 10:10
Today we do not invalidate the cached_dirent or the entire
parent cfid when a dentry in a dir has been removed/moved.

This change invalidates the parent cfid so that we don't serve
directory contents from the cache.

Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
QueryDirectory responses today are stored in one of two fixed
sized buffers: smallbuf (448 bytes) or bigbuf (16KB). These are
borrowed from server struct and are not sufficient for large-sized
query dir operations.

With this change we will now define a new buffer type specifically
for cifs_search_info to hold variable sized responses. These will
be allocated by kmalloc and freed by kfree.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
For small directories (where the entire directory contents could be
read in a single QueryDir request), we currently do an extra
round-trip just to get a STATUS_NO_MORE_FILES back from the server.

This change avoids doing that by adding another QueryDir to the first
compound to the server for readdir. i.e. first request to readdir
will correspond to a compound of (OPEN+QD1+QD2). QD2 will request
for a smaller size (in anticipation of STATUS_NO_MORE_FILES).

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today QueryDirectory uses the compound_send_recv infrastructure
which is limited to 16KB in size. As a result, readdir of large
directories generally take several round trips.

With this change, if the readdir needs a QueryDir after the first
round-trip (meaning that there are more dirents to read), then the
following QueryDirs will now switch to using larger buffers with
MTU credits.

Till now, the only command type that used this flow was SMB2_READ.
In case of encrypted response, it becomes challenging to decide if
the response is for SMB2_READ or SMB2_READDIR. This change reuses
receive_encrypted_read and after decrypting the response decides
the handling function based on the command in the resp header. That
way, care has been taken to ensure that the read code path
modifications on account of this change are kept to a minimum.

The change also renames the discard function since both read and
query_dir paths will now reuse the same function to discard remaining
data in the socket and dequeue the mid.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Currently, we have helper functions for cfid and dirent caching spread
across cached_dir.c and readdir.c, with de_mutex locking done inside
the calling functions. This change neatly wraps them in helper functions
and keeps all such functions in cached_dir.c.

This code also splits the logic of dirent emit into cache and to VFS into
different functions.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today all synchronization of cfid related data structures are done
using cfid_list_lock. This can serialize caching of different dirs
unnecessarily.

This change introduces two new locks to provide finer locking.
Every cfid will now have a cfid_lock. This is designed to protect
everything inside a cfid that is not related to list operations.

Every cfid will now also have a cfid_open_mutex. This is designed to
protect parallel open calls to the same dir.

Additionally, this change will now make accesses to cfid->dentries
more stringent using the de_mutex.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
When a cached_dirents population is underway but not yet fully populated,
cifs_readdir does not rely on the local cache, but makes a parallel
stream of QueryDir calls to the server. However, these calls are made
without lease key and that ends up breaking our dir lease.

This change will reuse the existing lease key for this scenario for the
parallel QueryDir calls that are made.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today the cached_dirents is a linked list with one entry per dentry.
This is very inefficient from the point of view of memory allocation
and memory management.

This change introduces a hybrid structure. cached_dirents will start
with maintaining a linked list for cached_dirents for small directories.
When the size of the directory (in terms of number of dirents) exceeds a
threshold (64), cached_dirents will now switch over to a folioq structure
to store the cached_dirents.

The idea is to reduce the number of memory allocations significantly for
large directories. Additionally, this change also tries to store short names
(names less than 64 bytes) in the folio itself, further reducing the
memory allocation calls. If the namelen is greater than 64 bytes or
if the folio does not have space to store more names, it falls back to kmalloc.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
When a directory lease is held, we do not need to invalidate the
dirent cache on the cfid when new dentries are added.

This change allows local adds to directory contents to be made
in the cached_dirents so that we don't need to fetch again from server.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Since the cached_dirents are now backed by folioq, we do not need
a timed cleanup of these cfids anymore. This change registers a
shrinker with the mm layer for the dircache. If mm needs to free up
memory or flush the cache, it can inform cifs about this using the
shrinker interface.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today there is no way to disable time-based eviction of dir cache.
dir_cache_timeout = 0 meant immediate free up of dir cache on next
laundromat scan. We already have nohandlecache to disable dir cache.

This changes the meaning of dir_cache_timeout = 0 to mean unlimited
timeout. Shrinker-based eviction is still possible.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today we can control the number of cached dirs on the system using
the mount option max_cached_dirs. The default value is 16.

This change allows setting this option to a special value of 0,
which means unlimited number of cached dirs. Shrinker can still
evict the cached dirs.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today cifs_readdir populates the dcache inline whenever it
emits dentries. This introduces a perf bottleneck in
readdir performance.

This change attempts to make dcache population asynchronous
and parallel by doing this inside workers. It introduces a new
flag that lets reval and lookup wait for dentry population instead.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
This change introduces ftrace tracepoints for cached_dir operations.
Intended to be used for debugging cached_dir.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
If discard functions for readv and query_dir return error, the
callback functions can be skipped. This can end up with hung
syscalls due to the completion functions not getting called.

This change ensures that both these discard functions call
the callback function when discard from socket returned error.
This ensures that at least after the mid for the response is found,
the callback doesn't get skipped, and we do not leave syscalls
waiting.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Today tcon->cfids are maintained as a linked list. When we do
not limit the number of cfids (limited by system memory), this
can be a problem if the mount has a large number of active dir
accesses. We soon start hitting perf bottlenecks.

This change stores cfids in a rbtree instead of a linked list.
The nodes of this tree are keyed based on the path inside the tcon.

Additionally, for faster lookups by dentry, introducing a hashtable
to allow lookups of cfids by dentry faster. This is important
since open_cached_dir_by_dentry is called quite frequently.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
In order to make sure that parallel readdirs do not populate
the cfid->cached_dirents, only the first readdir is given
"ownership" of populating cached_dirents. However, if the
next readdir on the same FD never arrives, we will always
miss the dirent cache.

This change introduces a 10-second timeout which will be used
by laundromat thread to check if the cached_dirents can be
invalidated. Ten seconds is a long enough interval between
successive readdir calls.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
@sprasad-microsoft sprasad-microsoft force-pushed the personal/sprasad/for-next branch from 889e9ac to 2a1f960 Compare May 1, 2026 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants