Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize recv_fix_encryption_hierarchy() #16929

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gamanakis
Copy link
Contributor

@gamanakis gamanakis commented Jan 5, 2025

Motivation and Context

Motivation: #16317
Closes: #16317

Description

recv_fix_encryption_hierarchy() in its present state goes through all
stream filesystems, and for each one traverses the snapshots in order to
find one that exists locally. This happens by calling guid_to_name() for
each snapshot, which iterates through all children of the filesystem.
This results in CPU utilization of 100% for several minutes (for ~1000
filesystems on a Ryzen 4350G) for 1 thread at the end of a raw receive
(-w, regardless whether encrypted or not, dryrun or not).

Fix this by following a different logic: using the top_fs name, call
gather_nvlist() to gather the nvlists for all local filesystems. For
each one filesystem, go through the snapshots to find the corresponding
stream's filesystem (since we know the snapshots guid and can search
with it in stream_avl for the stream's fs). Then go on to fix the
encryption roots and locations as in its present state.

Avoiding guid_to_name() iteratively makes
recv_fix_encryption_hierarchy() significantly faster (from several
minutes to seconds for ~1000 filesystems on a Ryzen 4350G).

How Has This Been Tested?

An existing test was expanded.
Perfomance testing was done locally using the script from #16317

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@github-actions github-actions bot added the Status: Work in Progress Not yet ready for general review label Jan 5, 2025
@gamanakis gamanakis force-pushed the enc_hierarchy branch 7 times, most recently from 0a8bf07 to 13ee306 Compare January 6, 2025 17:15
@gamanakis gamanakis changed the title [DRAFT] speed up recv_fix_encryption_hierarchy() [DRAFT] optimize recv_fix_encryption_hierarchy() Jan 6, 2025
@gamanakis gamanakis force-pushed the enc_hierarchy branch 4 times, most recently from c217393 to 3aa17e1 Compare January 7, 2025 08:57
@gamanakis gamanakis changed the title [DRAFT] optimize recv_fix_encryption_hierarchy() optimize recv_fix_encryption_hierarchy() Jan 7, 2025
@gamanakis gamanakis marked this pull request as ready for review January 7, 2025 09:00
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Jan 7, 2025
recv_fix_encryption_hierarchy() in its present state goes through all
stream filesystems, and for each one traverses the snapshots in order to
find one that exists locally. This happens by calling guid_to_name() for
each snapshot, which iterates through all children of the filesystem.
This results in CPU utilization of 100% for several minutes (for ~1000
filesystems on a Ryzen 4350G) for 1 thread at the end of a raw receive
(-w, regardless whether encrypted or not, dryrun or not).

Fix this by following a different logic: using the top_fs name, call
gather_nvlist() to gather the nvlists for all local filesystems. For
each one filesystem, go through the snapshots to find the corresponding
stream's filesystem (since we know the snapshots guid and can search
with it in stream_avl for the stream's fs). Then go on to fix the
encryption roots and locations as in its present state.

Avoiding guid_to_name() iteratively makes
recv_fix_encryption_hierarchy() significantly faster (from several
minutes to seconds for ~1000 filesystems on a Ryzen 4350G).

Signed-off-by: George Amanakis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
1 participant