
re-org safety: watch asset transactions and re-create proofs when necessary #419

Merged · 11 commits · Aug 17, 2023

Conversation

guggero
Member

@guggero guggero commented Jul 26, 2023

Depends on #424; the first 5 commits aren't part of this PR anymore.

Fixes #317.

The PR is fully ready for review!

TODOs:

  • Add re-org watcher to custodian to watch received assets
  • Add re-org watcher to freighter to watch sent assets
  • Allow minting proofs to be updated in the universe/federation after re-org
  • Integration tests for above mentioned use cases
  • Unit test re-org watcher


Contributor

@ffranr ffranr left a comment
I haven't finished reviewing, but it looks good so far. I think you could roll the first two commits into a separate PR and get those merged as-is. I'll try out the new review tool with this PR.

@guggero
Member Author

guggero commented Jul 27, 2023

> I haven't finished reviewing, but it looks good so far. I think you could roll the first two commits into a separate PR and get those merged as-is. I'll try out the new review tool with this PR.

Good idea, done: #424

@guggero guggero force-pushed the reorg-safe branch 2 times, most recently from 027128f to 10c4486, on July 28, 2023 at 15:39
@guggero guggero marked this pull request as ready for review on July 28, 2023 at 15:40
@guggero guggero force-pushed the reorg-safe branch 2 times, most recently from d53d81e to 4118058, on July 28, 2023 at 16:58
tapgarden/interface.go (resolved)
tapcfg/config.go (resolved)
tapgarden/re-org_watcher.go (resolved)
tapgarden/re-org_watcher.go (resolved)
tapgarden/re-org_watcher.go (outdated, resolved)
proof/archive.go (outdated, resolved)
tapgarden/planter.go (outdated, resolved)
tapgarden/custodian.go (outdated, resolved)
tapdb/universe.go (outdated, resolved)
universe/interface.go (outdated, resolved)
@dstadulis dstadulis added the v0.3 label Jul 31, 2023
tapgarden/caretaker.go (outdated, resolved)
Collaborator

@jharveyb jharveyb left a comment
Looks good! The callback-in-each-subsystem pattern is really neat.

I don't think I saw re-broadcast logic for the actual anchor TX after a re-org; is that handled by the backing lnd, or did I just miss it?

Contributor

@ffranr ffranr left a comment

This is great. I've finished reviewing the whole PR and couldn't see any further issues that haven't already been commented on.

The only dynamics that concern me are those relating to multiple actions (mint, send, recv) occurring in close succession. But I think this PR in its current state is still a big improvement and should probably be merged regardless.

@guggero
Member Author

guggero commented Aug 4, 2023

I've updated the proof watching logic and also added a new itest to verify the new behavior.

Unfortunately, the way we store proofs is a bit weird, so this requires some explanation.
Assume we have a chain of proofs (A mints, sends the full amount to itself, then sends some to B, receiving change) of an asset with ID xyz:
ScriptKey_A_1 -> ScriptKey_A_2 -> ScriptKey_A_3/ScriptKey_B_3, where ScriptKey_x stands for a script key belonging to node x and _y stands for the block in which that proof was confirmed.

If node A minted the asset (ScriptKey_A_1) and transferred it to itself (ScriptKey_A_2), then sent part of it out to node B (ScriptKey_B_3, with ScriptKey_A_3 for the change), the two nodes have the following proofs on disk (and in the DB):

node A:
  proofs/xyz/ScriptKey_A_1.assetproof (num_proofs=1)
  proofs/xyz/ScriptKey_A_2.assetproof (num_proofs=2)
  proofs/xyz/ScriptKey_A_3.assetproof (num_proofs=3)

node B:
  proofs/xyz/ScriptKey_B_3.assetproof (num_proofs=3)

Now further assume that blocks 2 and 3 are not yet fully confirmed (<6 confs) and a re-org happens that removes both anchor transactions from the chain; then we need to update all proofs for blocks 2 and 3.

So for block 2 we would need to update the following files:

node A:
  proofs/xyz/ScriptKey_A_2.assetproof (replace the last proof)
  proofs/xyz/ScriptKey_A_3.assetproof (replace the second-to-last proof)

node B:
  proofs/xyz/ScriptKey_B_3.assetproof (replace the second-to-last proof)

And for block 3 we would need to update the following files:

node A:
  proofs/xyz/ScriptKey_A_3.assetproof (replace last proof)

node B:
  proofs/xyz/ScriptKey_B_3.assetproof (replace last proof)

The issue is that if the anchor transactions for states 2 and 3 both confirm in the same block after the re-org, we don't know in what order the confirmation notifications will arrive. So however we order things, the proof files in which we need to update more than one proof will be temporarily invalid until both proofs have been updated.

Currently we do this by just using a dummy header verifier (see TODO in ReplaceProofInBlob).

I think this is fine for now, as the proofs will eventually be valid again once all confirmation notifications have been acted upon. I don't think it's worth attempting a full dependency tracking and resolution mechanism to address this (it would require quite a bit of refactoring in the proof watcher).

Curious to hear your thoughts though.
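To make the replacement step above concrete, here is a minimal sketch. `ProofFile` and `ReplaceProofAt` are illustrative stand-ins, not the actual tapd API: the real proof.File carries full transition proofs and verification, whereas this model reduces each proof to its confirmation block height.

```go
package main

import "fmt"

// Proof is a simplified stand-in for a single transition proof in a
// proof file; here it only records the block the anchor TX confirmed in.
type Proof struct {
	Block uint32
}

// ProofFile mirrors the append-only chain of proofs stored per script key.
type ProofFile struct {
	Proofs []Proof
}

// ReplaceProofAt swaps out the proof at the given index, as the re-org
// watcher must do when an anchor TX re-confirms in a new block. Index
// len-1 is the "last proof", len-2 the "second-to-last", and so on.
func (f *ProofFile) ReplaceProofAt(idx int, p Proof) error {
	if idx < 0 || idx >= len(f.Proofs) {
		return fmt.Errorf("index %d out of range", idx)
	}
	f.Proofs[idx] = p
	return nil
}

func main() {
	// Node A's proof file for ScriptKey_A_3: three proofs confirmed
	// in blocks 1, 2 and 3 (see the chain described above).
	file := &ProofFile{
		Proofs: []Proof{{Block: 1}, {Block: 2}, {Block: 3}},
	}

	// After the re-org, both anchor TXs re-confirm in block 4. The
	// two confirmation notifications can arrive in either order, so
	// the file is temporarily inconsistent until both replacements
	// have landed, which is why a dummy header verifier is needed.
	file.ReplaceProofAt(len(file.Proofs)-1, Proof{Block: 4}) // state 3
	file.ReplaceProofAt(len(file.Proofs)-2, Proof{Block: 4}) // state 2

	fmt.Println(file.Proofs) // [{1} {4} {4}]
}
```

The out-of-order problem is visible in the sketch: after only the first `ReplaceProofAt` call, the chain reads blocks 1, 2, 4, which no verifier would accept.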

@guggero guggero force-pushed the reorg-safe branch 2 times, most recently from 15e2131 to 5ab7cc1 Compare August 4, 2023 13:36
@guggero
Member Author

guggero commented Aug 9, 2023

@jharveyb

> I'm a bit concerned about reading all proofs on startup, but AFAICT that will come from the DB and not files on disk?

The proofs are queried from both the file system and the DB, and the first backend that has a given proof returns it. It's a bit unfortunate that we have to do this. If this becomes a bottleneck, we could also store the proof's block height in the DB so we can filter more efficiently.

> Would be cool to also test the wallet re-broadcast in this situation, but not needed in this PR.

I checked again, and because we're using btcwallet's TX publish mechanism, each transaction we create in tapd is automatically re-broadcast by the wallet (now on every new incoming block, I think). Since that's part of lnd's responsibility, I don't think we need an extra test in this repo.

> Do we need to check that these blocks are actually connected?

That's also part of lnd's validation logic. All re-orgs are validated by the chain backend first before lnd sends out notifications. So again, I don't think we need to check this in tapd.

@ffranr

> Are the universe stats updated if an issuance is re-orged?

Yes, once the issuer detects a re-org, it will update the universe(s) it initially sent the issuance proof to. And the universe will do an upsertAssetGen() which includes the block height of the new proof.

> Also, do we restrict universe syncing to proofs which are at adequate depth?

No. But I added logic in the proof watcher to address this:
Say an asset is minted by Alice in block 100, Bob syncs his daemon with Alice's universe, and a re-org happens. Alice will update the universe and let it know the proof changed. Bob doesn't know that, and doesn't really need to care. But when Bob now receives an asset from Alice with the updated issuance proof, it is valid. And the next time Bob syncs with the universe, he notices the roots have changed and re-syncs the proof as well.
So that shouldn't really be an issue (one can always manually re-sync with the universe if necessary).
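The root-comparison step in Bob's next sync can be sketched as follows. This is a simplified illustration under stated assumptions: `Root` and `syncChanged` are hypothetical names, not the actual tapd universe API, and a real universe root is a Merkle commitment rather than a plain string.

```go
package main

import "fmt"

// Root is a simplified stand-in for a universe root: a commitment to
// all proofs a universe server holds for one asset ID.
type Root string

// syncChanged compares local and remote universe roots per asset ID
// and returns the IDs whose proofs need re-syncing, mirroring the
// "roots have changed, re-sync the proof" step described above.
func syncChanged(local, remote map[string]Root) []string {
	var changed []string
	for id, remoteRoot := range remote {
		if local[id] != remoteRoot {
			changed = append(changed, id)
		}
	}
	return changed
}

func main() {
	// Before the re-org, Bob's local roots match Alice's universe.
	local := map[string]Root{"xyz": "root-block-100"}

	// After Alice's issuance re-confirms in a different block, her
	// universe root for asset xyz changes (via upsertAssetGen).
	remote := map[string]Root{"xyz": "root-block-101"}

	fmt.Println(syncChanged(local, remote)) // [xyz]
}
```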

Collaborator

@jharveyb jharveyb left a comment
Reviewed all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @ffranr, @guggero, @positiveblue, and @Roasbeef)

@guggero thanks for double-checking on the btcwallet and lnd behavior in this situation!

Re: proof querying, yes tagging proofs with block height sounds helpful, since the current locator scheme provides no sense of time / ordering.

Also, for the archiver proof reading, would the file backend terminate early if the DB returns first, or complete with the result dropped? Wondering if slow proof reads would slow down DB reads for the same proof or set of proofs (can we short-circuit proof reads, essentially). No changes needed in this PR.

@guggero
Member Author

guggero commented Aug 10, 2023

> Also, for the archiver proof reading, would the file backend terminate early if the DB returns first, or complete with the result dropped? Wondering if slow proof reads would slow down DB reads for the same proof or set of proofs (can we short-circuit proof reads, essentially). No changes needed in this PR.

Reading a proof goes through each backend sequentially, and the first one that returns a non-error result is used; the rest are skipped. Since we add the DB first and then the file archive, the DB will always be asked first. If we feel that might slow things down, we could switch the order of the backends. But that might have other impacts, so I'd do it in a separate PR.
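The sequential lookup can be sketched like this. `MultiArchiver`, `Archiver` and `mapArchive` are illustrative names chosen for this example, not the actual tapd types, which live in the proof package.

```go
package main

import (
	"errors"
	"fmt"
)

// Archiver is a minimal stand-in for a proof backend (tapd registers a
// DB-backed archive and a file-backed archive).
type Archiver interface {
	FetchProof(locator string) ([]byte, error)
}

var ErrProofNotFound = errors.New("proof not found")

// MultiArchiver queries each backend in order and returns the first
// non-error result, skipping the remaining backends. This matches the
// sequential lookup described above: later backends are never asked
// once an earlier one answers.
type MultiArchiver struct {
	Backends []Archiver
}

func (m *MultiArchiver) FetchProof(locator string) ([]byte, error) {
	for _, b := range m.Backends {
		proof, err := b.FetchProof(locator)
		if err == nil {
			return proof, nil
		}
	}
	return nil, ErrProofNotFound
}

// mapArchive is a toy in-memory backend for demonstration.
type mapArchive map[string][]byte

func (m mapArchive) FetchProof(locator string) ([]byte, error) {
	if p, ok := m[locator]; ok {
		return p, nil
	}
	return nil, ErrProofNotFound
}

func main() {
	db := mapArchive{"xyz/ScriptKey_A_1": []byte("db-proof")}
	files := mapArchive{
		"xyz/ScriptKey_A_1": []byte("file-proof"),
		"xyz/ScriptKey_B_3": []byte("file-only"),
	}

	// DB registered first, so it wins when both backends have a proof.
	archive := &MultiArchiver{Backends: []Archiver{db, files}}

	p, _ := archive.FetchProof("xyz/ScriptKey_A_1")
	fmt.Println(string(p)) // db-proof: the DB answered first

	p, _ = archive.FetchProof("xyz/ScriptKey_B_3")
	fmt.Println(string(p)) // file-only: fell through to the file archive
}
```

Reordering the `Backends` slice is all it would take to prefer the file archive, which is why swapping the order is a one-line change with potentially wider impact.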

@guggero
Member Author

guggero commented Aug 10, 2023

The itest ran into:

test_harness.go:142: Error outside of test: *errors.errorString Error running server: received critical error from subsystem: received confirmation for unknown anchor TX 06d6e9abd34ef40a31c7b660e6b6112b44a3b1332a9b62ad4f40f0ff0d272a2b

Will take a look.

@guggero
Member Author

guggero commented Aug 10, 2023

> The itest ran into:
>
> test_harness.go:142: Error outside of test: *errors.errorString Error running server: received critical error from subsystem: received confirmation for unknown anchor TX 06d6e9abd34ef40a31c7b660e6b6112b44a3b1332a9b62ad4f40f0ff0d272a2b
>
> Will take a look.

Okay, it looks like this was a flake; I wasn't able to reproduce it locally. But I removed the error and turned it into a log statement, since this might happen in some edge cases around restarts, or when a block and the conf notification come in at the same time (if we process the block update first and the conf notification afterwards).

PR should be good to go now!

proof/file.go (outdated, resolved)
proof/file.go (resolved)
server.go (outdated, resolved)
return
}

err = w.MaybeWatch(f, w.DefaultUpdateCallback())
Member
Re the earlier suggestion: if we only pass the items that actually are in "danger" of a re-org, then we wouldn't need this check. We'd also be able to avoid reading all the proof data above (for all the open assets, that is).

Member Author
I think even with the optimized query, where we only fetch the assets that aren't yet buried, we'll want to inspect the full proof file, as there might be ancestor transactions that haven't fully confirmed yet.

"of the chain, will update proof "+
"with next confirmation", txHash)

// We continue to watch the transaction until
Member
Ahh ok, I think that's the behavior I missed before: we'll get another confirmation notification in the event of a re-org. Given that, why do we even need the re-org channel in the first place?

Member Author
For now, just for logging, which was helpful for debugging the itest. In the future, we could react to it.

@Roasbeef
Member

Needs a rebase after the utxo locking PR was merged!

Member

@Roasbeef Roasbeef left a comment

LGTM 🏒

@guggero guggero added this pull request to the merge queue Aug 17, 2023
Merged via the queue into main with commit 6412264 Aug 17, 2023
13 checks passed
@guggero guggero deleted the reorg-safe branch August 17, 2023 09:12

Successfully merging this pull request may close these issues.

Ensure daemon is robust to chain re-orgs
5 participants