-
Notifications
You must be signed in to change notification settings - Fork 358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop relying on ChannelMonitor persistence after manager read #3322
Open
TheBlueMatt
wants to merge
9
commits into
lightningdevkit:main
Choose a base branch
from
TheBlueMatt:2024-06-mpp-claim-without-man
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Stop relying on ChannelMonitor persistence after manager read #3322
TheBlueMatt
wants to merge
9
commits into
lightningdevkit:main
from
TheBlueMatt:2024-06-mpp-claim-without-man
+503
−198
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When we started tracking which channels had MPP parts claimed durably on-disk in their `ChannelMonitor`, we did so with a tuple. This was fine in that it was only ever accessed in two places, but as we will start tracking it through to the `ChannelMonitor`s themselves in the coming commit(s), it is useful to have it in a struct instead.
When we claim an MPP payment, then crash before persisting all the relevant `ChannelMonitor`s, we rely on the payment data being available in the `ChannelManager` on restart to re-claim any parts that haven't yet been claimed. This is fine as long as the `ChannelManager` was persisted before the `PaymentClaimable` event was processed, which is generally the case in our `lightning-background-processor`, but may not be in other cases or in a somewhat rare race. In order to fix this, we need to track where all the MPP parts of a payment are in the `ChannelMonitor`, allowing us to re-claim any missing pieces without reference to any `ChannelManager` data. Further, in order to properly generate a `PaymentClaimed` event against the re-started claim, we have to store various payment metadata with the HTLC list as well. Here we take the first step, building a list of MPP parts and metadata in `ChannelManager` and passing it through to `ChannelMonitor` in the `ChannelMonitorUpdate`s.
When we claim an MPP payment, then crash before persisting all the relevant `ChannelMonitor`s, we rely on the payment data being available in the `ChannelManager` on restart to re-claim any parts that haven't yet been claimed. This is fine as long as the `ChannelManager` was persisted before the `PaymentClaimable` event was processed, which is generally the case in our `lightning-background-processor`, but may not be in other cases or in a somewhat rare race. In order to fix this, we need to track where all the MPP parts of a payment are in the `ChannelMonitor`, allowing us to re-claim any missing pieces without reference to any `ChannelManager` data. Further, in order to properly generate a `PaymentClaimed` event against the re-started claim, we have to store various payment metadata with the HTLC list as well. Here we store the required MPP parts and metadata in `ChannelMonitor`s and make them available to `ChannelManager` on load.
In a coming commit we'll use the existing `ChannelManager` claim flow to claim HTLCs which we found partially claimed on startup, necessitating having a full `ChannelManager` when we go to do so. Here we move the re-claim logic down in the `ChannelManager`-read logic so that we have that.
Here we wrap the logic which moves claimable payments from `claimable_payments` to `pending_claiming_payments` to a new utility function on `ClaimablePayments`. This will allow us to call this new logic during `ChannelManager` deserialization in a few commits.
In the next commit we'll start using (much of) the normal HTLC claim pipeline to replay payment claims on startup. In order to do so, however, we have to properly handle cases where we get a `DuplicateClaim` back from the channel for an inbound-payment HTLC. Here we do so, handling the `MonitorUpdateCompletionAction` and allowing an already-completed RAA blocker.
When we claim an MPP payment, then crash before persisting all the relevant `ChannelMonitor`s, we rely on the payment data being available in the `ChannelManager` on restart to re-claim any parts that haven't yet been claimed. This is fine as long as the `ChannelManager` was persisted before the `PaymentClaimable` event was processed, which is generally the case in our `lightning-background-processor`, but may not be in other cases or in a somewhat rare race. In order to fix this, we need to track where all the MPP parts of a payment are in the `ChannelMonitor`, allowing us to re-claim any missing pieces without reference to any `ChannelManager` data. Further, in order to properly generate a `PaymentClaimed` event against the re-started claim, we have to store various payment metadata with the HTLC list as well. Here we finally implement claiming using the new MPP part list and metadata stored in `ChannelMonitor`s. In doing so, we use much more of the existing HTLC-claiming pipeline in `ChannelManager`, utilizing the on-startup background events flow as well as properly re-applying the RAA-blockers to ensure preimages cannot be lost.
When we discover we've only partially claimed an MPP HTLC during `ChannelManager` reading, we need to add the payment preimage to all other `ChannelMonitor`s that were a part of the payment. We previously did this with a direct call on the `ChannelMonitor`, requiring users write the full `ChannelMonitor` to disk to ensure that updated information made it. This adds quite a bit of delay during initial startup - fully resilvering each `ChannelMonitor` just to handle this one case is incredibly excessive. Over the past few commits we dropped the need to pass HTLCs directly to the `ChannelMonitor`s using the background events to provide `ChannelMonitorUpdate`s insetad. Thus, here we finally drop the requirement to resilver `ChannelMonitor`s on startup.
Because the new startup `ChannelMonitor` persistence semantics rely on new information stored in `ChannelMonitor` only for claims made in the upgraded code, users upgrading from previous version of LDK must apply the old `ChannelMonitor` persistence semantics at least once (as the old code will be used to handle partial claims).
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When we discover we've only partially claimed an MPP HTLC during
ChannelManager
reading, we need to add the payment preimage toall other
ChannelMonitor
s that were a part of the payment.We previously did this with a direct call on the
ChannelMonitor
,requiring users write the full
ChannelMonitor
to disk to ensurethat updated information made it.
This adds quite a bit of delay during initial startup - fully
resilvering each
ChannelMonitor
just to handle this one case isincredibly excessive.
Instead, we rewrite the MPP claim replay logic to use only (new) data included in
ChannelMonitor
s, which has a nice side-effect of teeing up futureChannelManager
-non-persistence features as well as makes ourPaymentClaimed
event generation much more robust.