
NAS-140523 / 27.0.0-BETA.1 / NFSD: return ESTALE for snapdir entries when export lookup fails#243

Merged
ixhamza merged 1 commit into truenas/linux-6.18 from nfsd-snapdir-empty-readdir
Apr 3, 2026

Conversation


@ixhamza ixhamza commented Apr 2, 2026

When nfsd_cross_mnt() crosses into a mounted ZFS snapshot but rqst_exp_get_by_name() fails to resolve the sub-export (-ENOENT), the error is silently converted to success and the automount stub dentry is returned to the caller. The stub has simple_dir_operations and its file handle is encoded with gen=1 (snapshot is mounted).

This 44-byte gen=1 handle becomes a permanent trap: zfsctl_snapdir_vget() sees gen=1 matching d_mountpoint=true, returns the stub inode, and READDIR returns NFS4_OK with zero entries. The client caches this empty result indefinitely since there is no error signal to trigger re-resolution. The empty directory persists until change_info4 updates (e.g., manual snapshot creation on the server).

For zfs_snapdir exports, return -ESTALE instead of silently falling back to the automount stub. This causes the client to re-resolve via LOOKUP.

Testing

# Server: setup pool, snapshots, and export with zfs_snapdir
systemctl mask systemd-networkd-wait-online.service
zpool destroy tank 2>/dev/null; echo "" > /etc/exports; exportfs -r
zpool create -f tank -O mountpoint=/mnt/tank sda
zfs create tank/target
dd if=/dev/urandom of=/mnt/tank/target/bigfile bs=1M count=5
for i in 1 2 3 4 5; do echo "snap${i}" > /mnt/tank/target/file${i}.txt; zfs snapshot tank/target@snap${i}; done
echo 0 > /sys/module/zfs/parameters/zfs_expire_snapshot
echo '"/mnt/tank/target" *(rw,sync,no_subtree_check,no_root_squash,zfs_snapdir)' > /etc/exports
exportfs -ra
for i in 1 2 3 4 5; do ls /mnt/tank/target/.zfs/snapshot/snap${i}/ > /dev/null; done

# Client: mount and verify snapshots work
sudo umount /mnt/tank 2>/dev/null; sudo mkdir -p /mnt/tank
sudo mount -t nfs4 -o vers=4.2 192.168.18.9:/mnt/tank/target /mnt/tank
ls /mnt/tank/.zfs/snapshot/snap1/

# Server: inject a negative entry in the nfsd.export cache for snap1.
# This simulates what happens at the customer when zfs destroy triggers
# exportfs_flush (wiping the kernel export cache) and mountd temporarily
# fails to resolve the snapshot sub-export during cache repopulation —
# e.g., due to heavy concurrent activity (long-running zfs recv,
# dsl_process_async_destroys on a large pool).
echo "* /mnt/tank/target/.zfs/snapshot/snap1 $(($(date +%s) + 3600))" > /proc/net/rpc/nfsd.export/channel

# Client: drop cached dentries/inodes (echo 2 frees slab objects, not the
# page cache) and access snap1 again.
# Without fix: snap1 appears as empty directory (NFS4_OK, zero entries).
#   nfsd_cross_mnt silently falls back to the automount stub with gen=1.
#   The client caches this permanently — no error, no retry.
# With fix: snap1 returns ESTALE.
#   nfsd_cross_mnt returns -ESTALE instead of falling back.
#   The client knows the handle is stale and will re-resolve once
#   the negative cache entry expires or is flushed.
echo 2 | sudo tee /proc/sys/vm/drop_caches
ls /mnt/tank/.zfs/snapshot/snap1/


Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
@ixhamza ixhamza requested a review from anodos325 April 2, 2026 19:09
@ixhamza ixhamza added the jira label Apr 2, 2026
@bugclerk bugclerk changed the title NFSD: return ESTALE for snapdir entries when export lookup fails NAS-140523 / 27.0.0-BETA.1 / NFSD: return ESTALE for snapdir entries when export lookup fails Apr 2, 2026

bugclerk commented Apr 3, 2026

This PR has been merged and conversations have been locked.
If you would like to discuss more about this issue please use our forums or raise a Jira ticket.

@truenas truenas locked as resolved and limited conversation to collaborators Apr 3, 2026

ixhamza commented Apr 3, 2026

time 20:00


bugclerk commented Apr 3, 2026

Time tracking added.
