zap: add zap_cursor_init_by_dnode; cursor unit tests; mock dnode refcounts #18603
robn wants to merge 6 commits into
amotin
left a comment
Is there any real benefit to postponing the dnode hold in the case of zap_cursor_init()? Is there some code that calls it but then doesn't traverse, which would justify the complexity?
The only case I can think of is when the ZAP is empty, so there's nothing to traverse and no hold. But this doesn't seem like an optimization worth the complexity to me. Always taking the dhold in

I would rather do the dnode and the

I'll have a go at the first, see how it looks.

Ok, last push does all the setup in

I started converting callers, but most ZAP cursor loops look like this (from

    for (zap_cursor_init(&zc, mos,
        dsl_dataset_phys(ds)->ds_next_clones_obj);
        zap_cursor_retrieve(&zc, za) == 0;
        zap_cursor_advance(&zc)) {

Error checking the init separately ends up looking like:

    if (zap_cursor_init(&zc, mos,
        dsl_dataset_phys(ds)->ds_next_clones_obj) == 0) {
            for (; zap_cursor_retrieve(&zc, za) == 0;
                zap_cursor_advance(&zc)) {
                    ...
            }
            zap_cursor_fini(&zc);
    }

This is a lot harder to read, especially in the cases where the additional indent pushes the inner block beyond 80 columns. It's also kind of pointless because the loop already didn't care about cursor errors; treating (say)

So instead I've gone closer to the "defer error report" idea, but since most callers don't actually care about the specific error, instead I've made it so a failed init will zero enough of the

I think I'm happy with this. Let me know if it's cool, and I'll fold it back into the original commit and reorganise it all (no point cluttering up the commit history adding the dnode hold defer only to immediately remove it).
amotin
left a comment
I have no objections, looks cleaner to me. Just the zc_flags change is not needed any more, unless you have other plans for it, and CI seems unhappy about something.

Yeah, I'm pretty happy with how this turned out too. This does seem to have uncovered some issue though which the
If the cursor were ever to actively hold resources, not finalising it would mean leaking those resources whenever the scrub is paused. The cursor is already reinitialized from the stored serialized form if/when it is resumed, so there's nothing we need from the old one, just to release it.

Sponsored-by: TrueNAS
Signed-off-by: Rob Norris <rob.norris@truenas.com>

Last push rebases to master and takes care of the outstanding comments.
Commit stack reordered, everything related to the ZAP changes proper now in a single commit. It's pretty nice I think, good feedback everyone!

This commit adds zap_cursor_init_by_dnode() (and
zap_cursor_init_serialized_by_dnode()), which allow the target ZAP to
be provided via an existing dnode rather than the traditional
objset+object pair.

This requires some reorganisation of the way that zap_cursor_t is
initialised. Up until now, zap_cursor_init() has merely stored the
objset, object, serialized form and prefetch flag, and left it until
zap_cursor_retrieve() to actually call zap_lock(). This makes a
_by_dnode() form complicated, because it is a held resource that needs
to be released, but might not be used if zap_cursor_retrieve() is not
called. So there's a bunch of state tracking required.

However, all cursor users immediately follow zap_cursor_init() with
zap_cursor_retrieve(), so there's nothing gained by delaying holds. This
allows us to simplify things, by calling zap_lock() directly in
zap_cursor_init() and retaining it until zap_cursor_fini().

This does, however, mean the _init() functions are now fallible, and
can return an error. This adds complexity to most of the call sites,
which are typically in a for loop of the form:

    for (zap_cursor_init(...);
        zap_cursor_retrieve(...) == 0;
        zap_cursor_advance(...))

To avoid needing to make significant changes at every call site, a
failed _init() call will also zero the cursor struct. If the caller
doesn't check the return and continues to zap_cursor_retrieve(), they
will get an EIO return, and zap_cursor_fini() will just return.

The existing zc_objset and zc_zapobj fields are retained to support
source backcompat for Lustre, which inspects them directly.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@truenas.com>
Closes #18603
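
The "zero the cursor on failed init" behaviour described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_zap_t`, `toy_cursor_t` and every `toy_*` function are hypothetical stand-ins, not the real ZAP types or the code from this PR, and the choice of `ENOENT` for a failed init is arbitrary. What it tries to show is that the usual for-loop shape keeps working when the init return is ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real ZFS types. */
typedef struct toy_zap {
	int tz_holds;		/* outstanding holds on this toy ZAP */
	int tz_nentries;	/* number of entries it contains */
} toy_zap_t;

typedef struct toy_cursor {
	toy_zap_t *tc_zap;	/* NULL: init failed, or already finalized */
	int tc_pos;		/* index of the next entry to return */
} toy_cursor_t;

/* Fallible init: take the hold up front, or zero the cursor on failure. */
static int
toy_cursor_init(toy_cursor_t *tc, toy_zap_t *tz)
{
	if (tz == NULL) {
		memset(tc, 0, sizeof (*tc));
		return (ENOENT);	/* arbitrary error for the sketch */
	}
	tc->tc_zap = tz;
	tc->tc_pos = 0;
	tz->tz_holds++;
	return (0);
}

/* A zeroed cursor reports EIO, so an unchecked loop simply never runs. */
static int
toy_cursor_retrieve(toy_cursor_t *tc, int *out)
{
	if (tc->tc_zap == NULL)
		return (EIO);
	if (tc->tc_pos >= tc->tc_zap->tz_nentries)
		return (ENOENT);
	*out = tc->tc_pos;
	return (0);
}

static void
toy_cursor_advance(toy_cursor_t *tc)
{
	if (tc->tc_zap != NULL)
		tc->tc_pos++;
}

/* Finalizing a zeroed cursor is a no-op; otherwise drop the hold. */
static void
toy_cursor_fini(toy_cursor_t *tc)
{
	if (tc->tc_zap == NULL)
		return;
	tc->tc_zap->tz_holds--;
	memset(tc, 0, sizeof (*tc));
}

/* The unchanged call-site shape; returns the number of entries seen. */
static int
toy_walk(toy_zap_t *tz)
{
	toy_cursor_t tc;
	int val, seen = 0;

	for (toy_cursor_init(&tc, tz);
	    toy_cursor_retrieve(&tc, &val) == 0;
	    toy_cursor_advance(&tc))
		seen++;
	toy_cursor_fini(&tc);
	return (seen);
}
```

In this model, `toy_walk(NULL)` visits nothing and leaks nothing, while a walk over a valid toy ZAP leaves its hold count back at zero, which is the property the unchecked call sites rely on.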
The thing under test will be taking and releasing dnode refs/holds. By counting them and exposing the current count, we can assert in test cleanup that we haven't missed releasing any, especially in cases where the hold is held across multiple test steps.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@truenas.com>
Closes #18603
It should be back at 1, where it started.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@truenas.com>
Closes #18603
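
The refcount idea in the two commits above can be sketched roughly as follows. Everything here is an assumption made for illustration: `mock_dnode_t` and the `mock_dnode_*` helpers are hypothetical names, not the mock API in this PR. The only behaviour taken from the commit messages is that holds are counted, the count is exposed, and cleanup checks that it is back at 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical mock; names are illustrative, not the PR's. */
typedef struct mock_dnode {
	int md_holds;	/* current hold count; creation counts as one */
} mock_dnode_t;

static mock_dnode_t *
mock_dnode_create(void)
{
	mock_dnode_t *md = calloc(1, sizeof (*md));

	assert(md != NULL);
	md->md_holds = 1;	/* the creator's own reference */
	return (md);
}

static void
mock_dnode_hold(mock_dnode_t *md)
{
	md->md_holds++;
}

static void
mock_dnode_rele(mock_dnode_t *md)
{
	/* Releasing must never drop the creator's reference. */
	assert(md->md_holds > 1);
	md->md_holds--;
}

/* Expose the current count so tests can assert on it mid-scenario. */
static int
mock_dnode_holds(const mock_dnode_t *md)
{
	return (md->md_holds);
}

/*
 * Test cleanup: report whether every hold was released (the count is
 * back at 1, where it started), then free the mock.
 */
static bool
mock_dnode_destroy(mock_dnode_t *md)
{
	bool balanced = (md->md_holds == 1);

	free(md);
	return (balanced);
}
```

A test would fail itself whenever `mock_dnode_destroy()` returns false, which catches a hold that was taken in one test step and never released in a later one.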
These add a bunch of entries to the ZAP, and then ensure that a cursor walk over the ZAP sees them all once and once only, and no others. The serialization test takes it a bit further, by serializing and recreating the cursor halfway through and confirming it correctly picks up from the same spot, and then recreating the cursor from the serialized form again and confirming that it also sees only the second set of entries. This ensures that the serialized cursor state is fully self-contained and not reliant on anything left over in the ZAP itself at serialization time.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@truenas.com>
Closes #18603
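
The shape of the serialization test described above might look something like this toy version. It is a sketch only: the array-backed `toy_store_t`, the `toy_*` cursor functions and the use of an array index as the serialized form are all assumptions for illustration, and a real ZAP walks entries in its own internal order rather than array order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	TOY_NENTRIES	8

/* Hypothetical toy store and cursor; not the real ZAP structures. */
typedef struct toy_store {
	uint64_t ts_keys[TOY_NENTRIES];	/* keys are 0..TOY_NENTRIES-1 */
	int ts_count;
} toy_store_t;

typedef struct toy_cursor {
	const toy_store_t *tc_store;
	uint64_t tc_pos;	/* index of the next entry to return */
} toy_cursor_t;

/* Build a cursor purely from the store and a serialized position. */
static void
toy_cursor_init_serialized(toy_cursor_t *tc, const toy_store_t *ts,
    uint64_t serialized)
{
	tc->tc_store = ts;
	tc->tc_pos = serialized;
}

static bool
toy_cursor_retrieve(const toy_cursor_t *tc, uint64_t *key)
{
	if (tc->tc_pos >= (uint64_t)tc->tc_store->ts_count)
		return (false);
	*key = tc->tc_store->ts_keys[tc->tc_pos];
	return (true);
}

static void
toy_cursor_advance(toy_cursor_t *tc)
{
	tc->tc_pos++;
}

static uint64_t
toy_cursor_serialize(const toy_cursor_t *tc)
{
	return (tc->tc_pos);
}

/*
 * Walk at most `limit` entries starting from `serialized`, counting
 * each key visited in seen[]. Returns the serialized position at which
 * the walk stopped, so a later walk can resume from it.
 */
static uint64_t
toy_walk(const toy_store_t *ts, uint64_t serialized, int limit,
    int seen[TOY_NENTRIES])
{
	toy_cursor_t tc;
	uint64_t key;
	int n = 0;

	for (toy_cursor_init_serialized(&tc, ts, serialized);
	    n < limit && toy_cursor_retrieve(&tc, &key);
	    toy_cursor_advance(&tc), n++) {
		assert(key < TOY_NENTRIES);
		seen[key]++;
	}
	return (toy_cursor_serialize(&tc));
}
```

A test in this style walks half the entries, resumes from the returned position, and checks that every slot of `seen[]` is exactly 1; resuming a second time from the same saved position into a fresh array should then touch only the second half.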
Cursors defer taking holds until they're needed, so if a cursor is created but not used, it may still hold resources that it would have cleaned up along the way, but never got the chance to. (This really happened in the first version of zap_cursor_init_by_dnode(), so not a contrived case!)

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@truenas.com>
Closes #18603

Motivation and Context

Continuing quest towards full unit test coverage for ZAPs.

Description

The main thing is to add zap_cursor_init_by_dnode() (and its close friend, zap_cursor_init_serialized_by_dnode()).

This is not as simple as just renaming and adjusting the old code and making zap_cursor_init() wrap it, because cursors do not actually call zap_lock() and do stuff until the first call to zap_cursor_retrieve(), which may not be called at all. So instead we have to hold the dnode if provided until that call (if it ever comes), and for the objset+object version, take our own dnode hold there. And make sure we handle all the places where we don't end up using it. Not that complicated in the end, but was a bit squirrelly along the way.

Edit: following review, we ended up doing away with all this, and just taking the ZAP hold (zap_lock() etc) from the init funcs. See the comments for more details. I kept the mock dnode refcounting though; it's not wrong and will undoubtedly be useful.

To help with this, I've added light refcounting to mock dnodes. Nothing fancy, but enough that in test cleanup, we can now test if the refcount has returned to 1 and fail the test if not. This has been wired through to all tests (actually into mock_zap_destroy(), which they all call). Then I used them to add some tests to help make sure I got it correct in zap_cursor_init_by_dnode(). Which I didn't, the first time! Very useful indeed!

Finally, the actual thing: cursor tests. These are just adding a bunch of things, walking over them and making sure the right things come back via a few different scenarios.

(All this is described in the commit messages and comments.)

I've been holding back a couple of "cursor-related" test sets (zap_join and zap_value_search) until I had coverage for cursors proper. I was going to include them in this PR, but they have just enough of their own stuff going on that I think it would muddle review too much, so I'll get them in a future one.

I also note that zap_cursor_t and its setup is a little ugly now. I think it's straightforward to tidy (mostly, split the "init" parts from the "cursor" parts in some way) but for now I wanted to keep the code changes as small as I could, and not change the memory footprint at all. I'll get to that once all this has settled.

How Has This Been Tested?

It's all tests! But I've run a handful of sanity checks from ZTS as well, just to make sure that the normal zap_cursor_init() path isn't damaged, and those are fine too. I'll let CI alert to anything else.

Coverage before / after: [coverage screenshots not preserved]