Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zinject: count matches and injections for each handler #16938

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

robn
Copy link
Member

@robn robn commented Jan 9, 2025

[Sponsors: Klara, Inc., Wasabi Technology, Inc.]

Motivation and Context

When building tests with zinject, it can be quite difficult to work out if you're producing the right kind of IO to match the rules you've set up. Imagine if instead, when you run zinject and get the list of handlers, it also showed you how many times it had matched one, and how many times it actually injected one (can be different when using frequencies).

Imagine no longer!

 ID  POOL             GUID              TYPE   ERROR       FREQ       MATCH   INJECT
---  ---------------  ----------------  -----  ----------  ---------  ------  ------
  1  tank             3f9c3fe3bdb3e053  write  io          100.0000%       0       0
  2  tank             3f9c3fe3bdb3e053  write  nxio        100.0000%     305     305
  3  tank             3f9c3fe3bdb3e053  flush  io          100.0000%       9       9
  4  tank             3f9c3fe3bdb3e053  flush  nxio        100.0000%       0       0

Description

Extends zinject_record_t with two new fields, zi_match_count and zi_inject_count.

For each injection type, increments one or both of these as appropriate. For record types that don't have a notion of "frequency", they are necessarily incremented together.

Then, for the injection records that have appropriate display, show the counts. Device and object injections show both, while delay shows only the single count. The others don't have reasonable display methods and/or counts don't really make sense for them, so I haven't bothered showing them.

This is technically an ABI break: zinject_record_t is now 16 bytes longer than it was. Old userspace on new kernel won't care, but new userspace on old kernel may read garbage or segfault on those count fields. I don't really care; I consider the injection facility to be a swiss-army chainsaw tied deeply into the matching module version, and a crashing userspace program is the least of your concerns. If I'm wrong, say so, and I'll see what I can do to make the ABI a bit more stable.

REVIEW BONUS!

@amotin suggested that with a small adjustment, we could usefully count delay handler matches separately from injections. We can, and it's very nice, so that's added as a second commit.

 ID  POOL             GUID              DELAY (ms)  LANES  FREQ       MATCH   INJECT
---  ---------------  ----------------  ----------  -----  ---------  ------  ------
  1  tank             6889df1eaa5c864a          50      1  100.0000%      79      39
  2  tank             6889df1eaa5c864a         100      2  100.0000%      79      18
  3  tank             6889df1eaa5c864a         150      4  100.0000%      79      22
  4  tank             6889df1eaa5c864a         200      8  100.0000%      79       0

How Has This Been Tested?

Test included.

ZTS runs on a handful of zinject-using tests pass cleanly.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

module/zfs/zio_inject.c Outdated Show resolved Hide resolved
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Jan 9, 2025
@tonyhutter
Copy link
Contributor

This is a really nice quality of life feature. I will definitely be using it!

@robn
Copy link
Member Author

robn commented Jan 10, 2025

This is a really nice quality of life feature. I will definitely be using it!

For real! I've had a hacky version of this for about a year and I while I still don't love zinject for a bunch of reasons, I can't imagine how I ever used it without this!

robn added 2 commits January 10, 2025 12:30
When building tests with zinject, it can be quite difficult to work out
if you're producing the right kind of IO to match the rules you've set
up.

So, here we extend injection records to count the number of times a
handler matched the operation, and how often an error was actually
injected (ie after frequency and other exclusions are applied).

Then, display those counts in the `zinject` output.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <[email protected]>
It turns out we can count a matching delay injection handler earlier, by
counting it once we decide its worth considering if it has a spare lane
for this IO.

With that in hand, it becomes useful to show separate match and inject
counts, as well as the frequency, so update the output and the test to
show all that.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Suggested-by: Alexander Motin <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants