-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggest standard implementation for electronic share set generation #64
Comments
In simulations this parity interleaving doesn't perform optimally for 2 random bit errors which seems more likely than 2 bit burst errors since the bech32 charset was designed to differ in 1 bit between confused characters. Simulation of unique bit flip errors in the payload only: 128-bit payloads: ecc_padding() CRC-2 (x^2+x+1) For 256-bit payloads: ecc_padding() For 512-bit long payloads: ecc_padding() Unless you know of a reason why bit errors are more likely to arrive in odd amounts (after 1), CRC codes are far superior to last year's idea. This would be applied to the codex32 secret's padding. It could be applied to k - 1 shares but the bits need to be obfuscated by xor with a pubkey hash or it's trivial to tell they are the first k - 1 shares by verifying the CRC bits, denting n privacy. This allows using the CRC bits to detect errors and correct erasures on the "special" shares when BIP32 fingerprint is known, while hiding they are the first k - 1 shares from a snoop without the full fingerprint. |
This all sounds good to me. It feels a little overengineered for what it, but I've said that before :). And it would help list decoders zero in on the correct decoding in cases where there are more than 4 errors. So the more correction ability the better. |
OK, I built it. Prepending the fingerprint to the data before calculating the CRC doesn't seem to hurt the error detection, even when the ID is its first 20 bits. And does blind the CRC from an observer without the other 12 bits. I also wrapped the hrp into the calculation because why not. def bin_polymod(chk_len, data):
"""Internal function that computes the CRC checksum."""
# Define the CRC polynomial (x^chk_len + x + 1)
polynomial = (0b1 << chk_len) + 0b11
crc = 0
for bit in data:
crc = (crc << 1) | int(bit) # Shift in the next bit
# If the leftmost bit (MSB) is 1, XOR with the polynomial
if crc & (0b1 << chk_len):
crc ^= polynomial
# Return the last chk_len bits as the CRC
return crc & (2 ** chk_len - 1)
def bin_hrp_fp_expand(hrp, fp):
"""Expand HRP and fingerprint into values for crc computation."""
return convertbits(fp + bytes(hrp,'utf'), 8, 1)
def bin_create_crc(hrp, data, fp=b''):
"""Compute the CRC checksum values given data and fingerprint."""
chk_len = 5 - len(data) % 5
values = bin_hrp_fp_expand(hrp, fp) + data
polymod = bin_polymod(chk_len, values + [0] * chk_len)
return convertbits([polymod], chk_len, 1) |
Huge breakthrough realization: If share indexes are cryptographically secure shuffled, they contain nearly 5-bits entropy per share for low t or n. So let's generate the share index and padding from the payload bytes with a CRC-7 (no idea which is best yet, using x^7 + x^3 + 1 now. Any suggestions what I should optimize for is appreciated. This one detects 99.99% of random 2-bit errors, and misses more at 1/128 chance) at t=2, an attacker with k-1 shares has learned 0.09 bits ~= 5 - log2(30) about the next share's payload. This doesn't meaningfully impact security, the payload is over provisioned and contains 130 unknown bits tolerating some loss. This is NOT secure to apply to an existing 16-byte seed, since its codex32 secret has a known index and 0-bits of entropy in that position. The attacker knows the index is 's', getting a 5-bit head start on the unknown 130 bits, reducing overall security to 125-bits, insufficient for BIP32 requirements. For derived shares, this checksum will usually not verify (1/128 chance). But for n < 20 the generated shares (k or k-1 if starting with an existing master seed) can be re-rolled with Relevant polynomials: |
Verrry interesting. At first blush your analysis looks correct (it starts to breakdown for high t and n, as you say, but not catastrophically ... if you have 16 outstanding shares you lose less than a bit here). |
If attacker has 16 shares he already recovered. At Worst, k - 1 = 8 indexes eliminated tells him 0.48 bits about the final share payload. But payload is 130-bits so there's still 129.5 bits security. If you make indexes public, it reduces security to 125-bits at T -1. Insufficient for BIP32 on 16-byte seeds. The "header" in this design should be called the first 5 characters, not 6 as share index is secret data derived from its payload. It's safe for 17+ byte seeds to make the index public. You could KDF each payloads' bytes to choose its share index but that destroys error correction and ability to find valid derived shares. The best sub-sharing idea is encoding a share's 45 bech32 symbols as the codex32 secret payload for an outer codex32 backup. So that can encrypt the now secret index for sub-shares. |
Besides more error correction for codex32 shares, this spares 5 bits for compact QRs. They now can support all 9 thresholds and scanning one displays the threshold and first ID character immediately. There's more room but best Idea I've had is 3 more bits of the ID just for t=3 which could sometimes fill a second character of the ID after 2 scans. Could get it to 4 bits (~90%) if dropped t>3 from compact, and maybe 5 if I used numeric encoding for threshold 2 and alphanumeric for threshold 3. |
Closing this issue because it's a duplicate of #57 and I dislike like most of the code snippets. Supporting compact CodexQRs actually required things to be much simpler. Currently writing a Also axed the idea of a 25x25 Compact CodexQR for 256-bit seeds. Not enough space in v2, too many compromises. |
Summarizes Issue #57:
While codex32 is designed to be computed by hand, should it become popular, it's likely most backups will be made with electronics. This offers opportunities to increase error detection and presents liabilities from trusting the randomness of a device. This proposal seeks to maximize benefits of electronics while reducing their downside by being deterministic and easily auditable so the device's entropy is not a single point of failure.
Generating Shares
For a fresh master seed
Inputs:
Optional inputs:
Outputs:
For an existing master seed
- Inputs and outputs are the same except:
Inputs:
PROPOSAL
In order to prevent the order of indices used from leaking secret information, the available indices (after excluding 's' and any existing shares passed) are deterministically shuffled as follows:
In order to prevent padding bits from leaking 2-4 bits per generated share, these must also be set deterministically. While all zeros works, it's wasteful as some error detection is possible with an ECC code. (potentially bit error correction for high k values.) This ECC padding is guaranteed to detect up to 1 bit error or 2 sequential bit errors:
This may help if a share is damaged beyond the correction ability of the codex32 checksum by reducing correction candidates by 1/4 to 1/16. It also checksums string length that BCH checksums can't offer error detection guarantees about.
There are two good ideas for setting the default ID so that 1) users are not burdened with creating and entering it, 2) 20-bits of the secret aren't leaked, and 3) it's easy to identify which wallet or descriptor the shares belong to or vice versa.
unique_string
.unique_string
is remembered or stored.k
andlen(payload)
in KDF'ssalt
allowing detection of errors in these with known corresponding BIP32 fingerprint andunique_string
.I prefer using the encrypted fingerprint everywhere for more privacy and its ability to detect errors in k and string length which may help emergency recovery. It's also slower to compute due to KDF so harder to grind data into.
k
shares are generated (less any provided existing codex32 strings). Generating random payloads is the largest potential leak. My proposal has two steps:derived_key
.key=derived_key
and a unique descriptiveinfo
field as themsg=
to derive the "Index seed" for shuffling as well as each new share payload until a threshold of codex32 strings exist.master_seed
is recovered fromk
shares as per the BIP and the identifier is updated based on this fresh master seed's fingerprint.It is critical that the KDF used for the derived key and the identifier encryption be computationally expensive for the hardware implementing this standard. Time costs between 2 and 10 seconds with a significant memory cost (relative to available memory) will offer the best protection for low entropy user input against grinding attacks, in acceptable user time.
Scrypt with parameters n=2 ** 20, r=8, p=1 is used above. This requires 1 GB and a couple seconds on a modern laptop. If 1GB of memory is unavailable
pbkdf2_hmac('sha512', password, salt, iterations=210_000 * 64, dklen=128)
offers similar protection with slower runtime. Using Argon2id with a minimum configuration of 1024 MiB of memory, an iteration count of 4, and 4 degrees of parallelism can offer even better protection than Scrypt and may be implemented by wallets.I do not prescribe any particular KDF but if using a different KDF or parameters than scrypt with n=2 ** 20, r=8, p=1, compliant wallet implementations MUST:
They also must generally:
4. Display the
app_entropy
, if any, unconditionally.5. Allow the user to provide at least 128-bits of
user_entropy
(128+ character minimum, 1024 is a reasonable maximum.) AFTER displaying app entropy.I welcome everyone's feedback on this proposal. #59
The text was updated successfully, but these errors were encountered: