Skip to content

use notify prune from upstream#137

Closed
hbirth wants to merge 2 commits intoDDNStorage:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1from
hbirth:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1
Closed

use notify prune from upstream#137
hbirth wants to merge 2 commits intoDDNStorage:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1from
hbirth:redfs-ubuntu-hwe-6.17.0-16.16-24.04.1

Conversation

@hbirth
Copy link
Copy Markdown
Collaborator

@hbirth hbirth commented Mar 25, 2026

No description provided.

@hbirth hbirth requested a review from bsbernd March 25, 2026 12:39
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from 4748957 to d25cddd Compare March 25, 2026 12:46
bsbernd
bsbernd previously approved these changes Mar 25, 2026
@bsbernd
Copy link
Copy Markdown
Collaborator

bsbernd commented Mar 25, 2026

@yongzech Please also take a look.

@hbirth
Copy link
Copy Markdown
Collaborator Author

hbirth commented Mar 25, 2026

@yongzech your patch contained the explicit invalidation of the fuse cache ... Miklos left that out. He probably relies on some mechanism that the cache will be cleaned anyway. Was there a special reason why you invalidated the fuse cache explicitly? It actually just sets the time to 0.

@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from d25cddd to f297a15 Compare March 25, 2026 14:32
@hbirth hbirth requested a review from bsbernd March 25, 2026 14:33
@yongzech
Copy link
Copy Markdown
Collaborator

@yongzech your patch contained the explicit invalidation of the fuse cache ... Miklos left that out. He probably relies on some mechanism that the cache will be cleaned anyway. Was there a special reason why you invalidated the fuse cache explicitly? It actually just sets the time to 0.

d_invalidate basically calls shrink_dcache_parent, but also handle submounts

fs/fuse/dev.c Outdated
case FUSE_NOTIFY_INC_EPOCH:
return fuse_notify_inc_epoch(fc);

case FUSE_NOTIFY_PRUNE:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have libfuse interface for FUSE_NOTIFY_PRUNE ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no ... but we have to start at the kernel ... the libfuse code is in a PR

fs/fuse/inode.c Outdated
if (S_ISDIR(inode->i_mode)) {
dentry = d_find_alias(inode);
if (dentry) {
shrink_dcache_parent(dentry);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this only remove the children which with ref count is zero?

fs/fuse/inode.c Outdated
}
}

d_prune_aliases(inode);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, only when d_lockref.count is zero. But when lost a lock, we need to invalidate dcache entry

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do at the moment ... that's why I left that code in

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think issue with d_invalidate is if it is currently used for for something, which is why Miklos is suggesting shrink_dcache_parent()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the problem with shrink_dcache_parent and `d_prune_aliases is that, if another node delete a file whose dcache already cached in this node, then don't invalidate dcache.

@hbirth
Copy link
Copy Markdown
Collaborator Author

hbirth commented Mar 30, 2026

@yongzech your patch contained the explicit invalidation of the fuse cache ... Miklos left that out. He probably relies on some mechanism that the cache will be cleaned anyway. Was there a special reason why you invalidated the fuse cache explicitly? It actually just sets the time to 0.

d_invalidate basically calls shrink_dcache_parent, but also handle submounts

Miklos said that he does not want to dehash the dentry ... but I think WE want exactly that ... that's why I think at the moment we need both

@bsbernd
Copy link
Copy Markdown
Collaborator

bsbernd commented Mar 30, 2026

Well, let's say something is currently actively using the dentry while it gets invalidated - the racing process would then work with an unhashed dentry. I think the critical call is

void __d_drop(struct dentry *dentry)
{
	if (!d_unhashed(dentry)) {
		___d_drop(dentry);
		dentry->d_hash.pprev = NULL;
		write_seqcount_invalidate(&dentry->d_seq);
	}
}

And there the question is if this couldn't cause a null pointer deref at some point.

@hbirth
Copy link
Copy Markdown
Collaborator Author

hbirth commented Apr 7, 2026

@yongzech do we want FUSE_NOTIFY_PRUNE?
AFAICT we need both functionalities ... the 'nice' FUSE_NOTIFY_PRUNE, that will remove all unneeded entries, if possible and the one that dehashes the dentry and invalidates it reliably, because we lost a lock.

hbirth added 2 commits April 10, 2026 10:17
Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
(cherry picked from commit ad21e5a)
Fix uninterruptible sleep (D state) hangs during FUSE filesystem
teardown when using io_uring. The issue manifests as processes stuck
waiting for requests that are never completed, particularly affecting
force requests like FUSE_FLUSH or when requests are created after
fuse_abort_conn() already finished.

If on daemon exit
io_uring_try_cancel_requests() runs and  calls fuse_uring_cancel()
which will teardown the entries by calling fuse_uring_entry_teardown()
before fuse_abort_conn() then we end up in fuse_uring_abort with
queue_refs == 0 and the queues are never stopped.

If the queues are stopped all new requests will be rejected, but
that does not happen, so all new calls are stuck.

Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
(cherry picked from commit 9550b4d)
@hbirth hbirth force-pushed the redfs-ubuntu-hwe-6.17.0-16.16-24.04.1 branch from f297a15 to 48c9b7a Compare April 10, 2026 08:23
@hbirth hbirth closed this Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants