-
Notifications
You must be signed in to change notification settings - Fork 942
Workaround for LND to cause a force-close on our channel #8213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
What will be the right behaviour in this case? wondering if we may report this to LND while keeping the workaround |
The correct behaviour should be LND responding with a unilateral close after receiving the error. This is a known behaviour, and LDK handles it in a similar manner. |
Are we reporting to LND this wrong behaviour? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was an interesting read. Old LND gossip bug leads to old CLN version to be too aggressive in sending an error, so LND treats all errors as soft rather than force-closing the channel, which requires current CLN to botch the channel reestablish to explicitly initiate a force-close. I guess it makes sense, but some of these old bugs were resolved 5 years ago and yet are still reverberating between implementations. It's probably time to revisit this in spec discussion.
In the mean time this is a pragmatic solution.
ACK 92fc59b
@vincenzopalazzo It's a known issue across implementations — I assume someone must've reported it already. I'll check and file an issue if I don't find one in their repository. |
I recently encountered a corollary to this. LND sends an error but then also broadcasts its own commitment to force-close the channel, as though it expects the error it sent not to trigger its peer to force-close the channel. |
Unfortunately LND does not force close the channels on receiving an error, they blame us for this behaviour (https://github.com/lightningnetwork/lnd/blob/abb1e3463f3a83bbb843d5c399869dbe930ad94f/htlcswitch/link.go#L2119) To fix this we will send them a Bogus Channel Reestablish with 0 commitment_number and invalid last_per_commit_secret. Key Changes: - In connect_activate_subd, if we detect a stub channel, we serialize and send a bogus channel_reestablish message. Changelog-Fixed: Fixing LND's non responsive behaviour on receiving an error.
92fc59b
to
951a473
Compare
Make sure we are sending bogus channel reestablish after recovering from emergency.recover file and peer storage backup. Key Changes: - Add wait_for_log() with appropriate debug statements Sort `listclosechannels` to maintain example sequence
951a473
to
871eead
Compare
l1.daemon.wait_for_log('peer_out WIRE_ERROR') | ||
|
||
l2.daemon.wait_for_log('bad reestablish commitment_number: 0') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is the warning sent to the peer (l2 back to l1) over the wire, but also l1 disconnects immediately, so there's no opportunity for l1 to catch it. That would mean we either need to separately log the bad reestablish in connectd and then this test would check for that log message in l2, or otherwise we assume we're sending the bogus revocation correctly, but can't confirm with this blackbox test beyond seeing that it transmitted.
lightningd-1 2025-04-30T20:47:27.026Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-chan#1: Sending a bogus channel_reestablish message to make the peer unilaterally close the channel.
...
lightningd-1 2025-04-30T20:47:27.029Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: peer_out WIRE_CHANNEL_REESTABLISH
lightningd-1 2025-04-30T20:47:27.029Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: peer_out WIRE_ERROR
lightningd-1 2025-04-30T20:47:27.029Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: disconnect_peer
...
0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Activating for message WIRE_CHANNEL_REESTABLISH
lightningd-2 2025-04-30T20:47:27.033Z DEBUG 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-lightningd: Telling connectd to send error 0011f7133e8b2a3e1241e601b48017f2020f1e456d1b37022912b89d61ee06508fbd008f6368616e6e656c643a207265636569766564204552524f52206368616e6e656c20663731333365386232613365313234316536303162343830313766323032306631653435366431623337303232393132623839643631656530363530386662643a20556e6b6e6f776e206368616e6e656c20666f7220574952455f4348414e4e454c5f524545535441424c495348
So I think we either need to remove this line from the test or add a separate logging event to check on the senders side (l2).
LND does not force close on us when we recover our node using emergency.recover or peer storage backups (https://github.com/lightningnetwork/lnd/blob/abb1e3463f3a83bbb843d5c399869dbe930ad94f/htlcswitch/link.go#L2119).
We will send a channel reestablish with bogus data to make them force close on us.