-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Page leak in restore_savepoint() #832
Comments
Note that the example above was slightly complicated in order to work around #830. Now that that's fixed, here's a simpler repro:
|
Nice find! I think there may still be some bugs in the savepoint handling. But I believe I fixed this one and another one in the attached PR. And I'm modifying the fuzzer to detect leaked pages |
Yay, thanks for fixing this so quickly! You probably know this already, but just in case: I noticed that 819ccf9 removed the fast-path frees for uncommitted pages, saying "This can't happen, since we only restore at the beginning of a clean transaction". But I don't think that's true: redb does allow restore_savepoint() in a dirty transaction, and in fact both of those fast-path frees were reachable. I don't think there's any need to fix this! The current code looks correct to me, and restoring in a dirty transaction doesn't seem like an important use case; it certainly doesn't need to be optimized. But if you were relying on the transaction being clean for some other safety properties, that's not actually being enforced right now. |
Oh ya, oops. I forgot I had allowed that :) Fixed in #835 |
I discovered there are still some cases where pages can leak, and added some design documentation here: #840 |
Fixed another leak here: #841 Unfortunately that PR will make |
When restoring a savepoint, any pages which have been freed and then reallocated in the time since the savepoint was created are leaked permanently. Here's the sequence of events:
When tx 4 calls restore_savepoint(), it diffs the allocator state to find out which pages have been allocated since the savepoint was created -- but page A won't show up in the diff, because it was allocated at both times! The restored freed tree does have an entry saying that page A should be freed, but restore_savepoint() clears out the old freed-tree entries without processing them. At that point, the page is no longer referenced anywhere and has been leaked.
I believe restore_savepoint() shouldn't be throwing away the old freed-tree entries without looking at them. Instead, I think it needs to either look at each entry and free the page if it's currently allocated, or (maybe nicer) remove all those pages from the allocator snapshot before doing the diff, which will cause any reallocated pages to correctly show up in allocated_since_savepoint.
The text was updated successfully, but these errors were encountered: