-
-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Broken dedup of unsupported packets causes key update churn #198
Comments
By adding a couple of logging lines to
It is interesting to note however, that:
I found a small number of exceptions to (*) above, where a few keys were correctly deduped during a doubling cycle, and in all such cases the number of packets was not halved:
After further investigation it turns out that the
The uuid is created by this snippet in
As we can see, there are two packets in the offending key above that have exactly the same MD5 sum (9327fd41cd96ba66c0602ef1e714d933) but different uuid prefixes. These are therefore not being recognised as duplicates of each other. UUID population is identical across the non-primary packet parsers:
however Other packets see different
So let's log which branch is taken and what uuid gets passed to ParseOther. Note that the other debug outputs are the node uuids (actually
Here's an example of a correctly deduping merge:
Note that the badParent of both bad packets has been identified as the primary key - a primary key's uuid is always its rfingerprint. And here's a broken one:
The obvious difference with the properly deduping example is that the bad packet on the wire has been assigned to a non-primary-key parent packet (the badParent's UUID is not a fingerprint, so must itself be a child), while the one from disk has been assigned to the primary key. This would appear to indicate one of the following: a. the parent context of Other packets is being lost on (de)serialisation to disk The simple solution may be to simply drop all Other nodes. If we are incapable of handling them correctly, we shouldn't handle them at all. Note also that after a particularly heavy doubling session, a torrent of prefix tree errors appears in the logs (see #170 (comment)) |
This appears to be a duplicate of #9 😬 |
This also appears to be partially caused by the existence of V5 signatures in the dataset, which ProtonMail/go-crypto cannot (yet?) parse. |
It looks like #9 is a red herring - the duplicate packet handling code introduced there was refactored out ages ago. |
There's still something off with the parsing of unsupported algorithm types. The offending key in the OP is displayed as an unknown key type pub unk(#0)0/d3dcb78aa839005e510fa5d3864469ccea5b31e8 Is it worth fixing this? Or should we just drop these keys entirely? |
It seems this is either caused by, or very closely related to #199. Validating direct sigs made the churn orders of magnitude worse, but only after peering with a node that had freshly restored from dump (and thus had an accurate PTree). |
So. It looks like the issue may have been a subtle inconsistency between the ser/de behaviour of OpaquePacketReader and jsonhkp when encountering unexpected packets. Keys coming in on the wire from an SKS peer are parsed via OpaquePacketReader, but keys read from disk are parsed via jsonhkp, and the bad packets were being attached to the key in different places, leading to inconsistent context tagging, which broke the deduper. Dropping bad packets with extreme prejudice (in #135) appears to have cleaned up the vast majority of churn - there are still issues with misplaced good signatures but I will treat them separately in #278. (I have opened #282 to track the length fields mentioned in the OP) |
Update to v2.2; this version includes the following features: * Fully stable SKS recon using aggressive normalisation (#198) * Improved multithreading safety (#170) * Deletion of personal data from hard-revoked keys (#250) * Admin deletion of keys via signed submissions * Detached revocation certificate support (#281) And the following bugfixes: * Missing direct key signature validation (#199) * Missing subkeys with v3 sbinds (#205) * Missing CORS headers (#226) * HTTPS binding errors (#295) * Several cosmetic improvements (#257 #289 #291 ...) It also drops the following deprecated features: * SKS-keyserver recon compatibility * UAT image packets * User deletion and replacement of keys via /pks/delete and /pks/replace endpoints
I finally managed to catch specific examples of key-update churn in the process of being propagated across my test network. Here are the contents of a particular key on two different nodes, one just updated and one not yet updated:
We can see that the second node's copy has a duplicate packet in the
unsupported
(i.e.Others
) area. Widening our search we see all such churny keys (to a mild degree of confidence) are either v3 keys, v4 keys with elGamal-sig primary algos, or keys with no keyid in their self-sigs. This makes sense as these are the most likely sources of unsupported packets.It is not immediately clear why dedup isn't working; investigations continue.
(Note also that the
length
fields are complete fiction)The text was updated successfully, but these errors were encountered: