How to Crash ZFS on Linux: Glob Snapshots over NFS
I'm building an experimental online backup system on Linux, using
rsync and ZFS snapshots to provide a time-machine-like “past versions
of your files” service. It sucks up 15TB of home directories and shared
volumes, then exposes the read-only results over NFS. ZFS on Linux has only
just gained the ability to share snapshots via NFS, and we’ve found that
globbing across multiple snapshots over NFS reliably crashes the
online backup fileserver: nfsd threads go into D state and the NFS client
hangs, and over time more and more nfsd threads go into D state.
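One way to watch the symptom develop on the server is to count the nfsd kernel threads currently in D (uninterruptible sleep) state. The following is only a rough sketch, not part of the backup system itself: it reads the standard /proc/<pid>/stat files, and the function names parse_stat and blocked_nfsd_pids are made up for illustration.

```python
import os


def parse_stat(line):
    """Return (comm, state) from the contents of a /proc/<pid>/stat file."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    start = line.index("(")
    end = line.rindex(")")
    comm = line[start + 1:end]
    state = line[end + 1:].split()[0]
    return comm, state


def blocked_nfsd_pids(proc_root="/proc"):
    """List the PIDs of nfsd threads that are in D state."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process vanished or stat line was unreadable
        if comm == "nfsd" and state == "D":
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    stuck = blocked_nfsd_pids()
    print(f"{len(stuck)} nfsd thread(s) in D state: {stuck}")
```

Run periodically on the fileserver, a growing count would match the "more and more nfsd threads" behaviour described above.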
To work around this snafu, I’ve built a tiny RESTful API web service that
runs on the backup fileserver, receives a path such as ~dcw/blah or
/vol/blurgh, discovers all distinct versions (in all snapshots) of that
path by globbing across all snapshots locally on the server (so no glob
ever crosses NFS), and returns the distinct version paths as JSON.
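The server-side lookup might look something like the sketch below. It is an illustration only and rests on assumptions: ZFS exposes snapshots under <mountpoint>/.zfs/snapshot/<name>/, but the rule used here for "distinct" (same size and mtime means same version), the JSON field names, and the function names distinct_versions and report are all my own guesses, not the actual service.

```python
import glob
import json
import os


def distinct_versions(dataset_root, rel_path):
    """Return one path per distinct version of rel_path across all snapshots."""
    # Glob locally: <dataset_root>/.zfs/snapshot/*/<rel_path>
    pattern = os.path.join(
        glob.escape(dataset_root), ".zfs", "snapshot", "*", glob.escape(rel_path)
    )
    seen = {}
    for path in sorted(glob.glob(pattern)):
        st = os.lstat(path)
        # Assumption: two copies with identical size and mtime are the
        # same version. A real service might compare checksums instead.
        key = (st.st_size, st.st_mtime_ns)
        # Keep the first snapshot (in name order) that holds this version.
        seen.setdefault(key, path)
    return sorted(seen.values())


def report(dataset_root, rel_path):
    """Build the JSON body a web handler could return (hypothetical shape)."""
    versions = distinct_versions(dataset_root, rel_path)
    return json.dumps({"path": rel_path, "versions": versions})
```

A web handler would map the incoming path (~dcw/blah or /vol/blurgh) to a dataset root and a relative path, call report(), and send the result back. The client then opens each listed path over NFS one at a time, so it never needs to glob across snapshots itself.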
These things are sent to try us!