You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In the following situation, the pages that tx 1 adds to the freed tree get processed by tx 2, as expected:
tx 1:
adds page A to the freed tree
commits durably
tx 2:
commits durably, freeing page A
And if there's a live read of tx 0, that correctly blocks the pages from being freed, because they're still reachable from tx 0:
read begins (reading tx 0)
tx 1:
adds page A to the freed tree
commits durably
tx 2:
commits durably, can't free page A
But a live read of tx 1 also blocks freeing, which I believe is incorrect:
tx 1:
adds page A to the freed tree
commits durably
read begins (reading tx 1)
tx 2:
commits durably, should free page A but doesn't!
Page A isn't reachable from tx 1 -- that's why tx 1 added it to the freed tree! The last transaction it was reachable from is tx 0. So I think it should get freed here, as long as there are no live reads of tx 0 or earlier.
In fact, redb already relies on the fact that it's safe to free page A in this situation. In the first example above (with no live reads), consider what happens if we crash partway through committing tx 2: we'll have to roll back to the last durable commit, which was tx 1. So tx 1 needs to remain valid, just as if there were a live read on it! Effectively, there's always a live read on the last durable commit, since we could crash at any time and need to roll back to it.
In other words, the first example and the last example are equivalent. So if it's safe to free page A in the former case (which redb already does), it should also be safe to free it in the latter case.
By itself, fixing this won't save a significant amount of disk space, because of #829. But if we can fix #829 as well, then this'll cut the disk usage of a sequence of large transactions in half, in the common case where there are concurrent reads.
I believe the fix is trivial, and I'm happy to write it up if you want -- let me know if that would be useful!
The text was updated successfully, but these errors were encountered:
In the following situation, the pages that tx 1 adds to the freed tree get processed by tx 2, as expected:
And if there's a live read of tx 0, that correctly blocks the pages from being freed, because they're still reachable from tx 0:
But a live read of tx 1 also blocks freeing, which I believe is incorrect:
Page A isn't reachable from tx 1 -- that's why tx 1 added it to the freed tree! The last transaction it was reachable from is tx 0. So I think it should get freed here, as long as there are no live reads of tx 0 or earlier.
In fact, redb already relies on the fact that it's safe to free page A in this situation. In the first example above (with no live reads), consider what happens if we crash partway through committing tx 2: we'll have to roll back to the last durable commit, which was tx 1. So tx 1 needs to remain valid, just as if there were a live read on it! Effectively, there's always a live read on the last durable commit, since we could crash at any time and need to roll back to it.
In other words, the first example and the last example are equivalent. So if it's safe to free page A in the former case (which redb already does), it should also be safe to free it in the latter case.
By itself, fixing this won't save a significant amount of disk space, because of #829. But if we can fix #829 as well, then this'll cut the disk usage of a sequence of large transactions in half, in the common case where there are concurrent reads.
I believe the fix is trivial, and I'm happy to write it up if you want -- let me know if that would be useful!
The text was updated successfully, but these errors were encountered: