-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: ZK certificate verification #672
Comments
Our goal is to prove in zk that a hash pre-image is a DNS name of the end-entity certificate which links to a trusted root certificate without revealing the identity of the end-entity. Meta information like the lengths of the certificates in the chain or the lengths/offsets of certain fields in the certificates must not be revealed to the verifier, otherwise the leaked info could be used by a malicious verifier to establish the identity of the end-entity. Solutions which are incomplete or not optimal:Cert validation with Risc Zero (r0)Automata already has the needed circuits for r0 here https://github.com/automata-network/tdx-attestation-sdk/blob/af497429fb82effcc3d0ec83659a8b9d2e20aebb/zk/risc0/methods/guest/src/main.rs#L17 It takes 90M r0 cpu cycles to prove the chain https://discord.com/channels/974410303255232542/1278660294868074496/1280132404714930227 (via DMs melynx confirmed that 90M equals 30 mins on an M1 Pro machine. Although melynx said that he has some optimizations in mind which would bring down the cost to 10 mins). halo2-zkcert (https://github.com/zkCert/halo2-zkcert)Their circuits don't parse the certificates and don't chain them. They prove a statement "I know a public key which verifies a signature over a certificate". But what we need is to additionally prove that the public key was parsed out from a parent certificate and that the certificates are well-formed. Revealing fields at known offsets.This approach will not work since x509 uses variable-length encoding of its fields: ASN.1 DER. See https://datatracker.ietf.org/doc/html/rfc5280#section-4.1 It is not possible to reliably predict at which offset in the certificate a certain field will be located. Parsing with QuickSilver (QS).This approach will be too expensive both in terms of prover upload bandwidth and compute since the QS circuit has to pay the cost of multiplexing all branches. (See the circuits below) A promising solution: parsing (and hashing) with LogRobin++ (LR)(https://eprint.iacr.org/2024/1427) LR will not incur the cost of branch multiplexing like QS does. In LR, the prover still has to bear the computation cost of evaluating each branch but the branches of our circuit contain a miniscule amount of AND gates. See the circuits below. Circuits for zk x509 parsing.Below will try to use the same identifiers and terminology as in Below is the pseudocode for parsing a single field of x509. The prover's private input - the Parsing. Step 1. Parsing the serial number.The circuit below proves a statement "I know such /// Arguments are the private inputs to the circuit:
///
/// `tbsCertificate` - TBSCertificate.
/// `parsed_serial_number` - `tbsCertificate` with the `version and `serialNumber` fields
/// parsed and drained.
fn parse_serial_number(tbsCertificate: [u8; 4000], parsed_serial_number: [u8; 4000]) {
let c = tbsCertificate;
// An offset in the certificate.
let mut i = 0;
// TBS cert length.
let mut tbs_length = [0u8; 2];
tbs_length.copy_from_slice(&c[i..i + 2]);
assert!(u16::from_be_bytes(tbs_length) < 3999u16);
i += 2;
// Parse the version:
// tag 0
assert!(c[i] == 0);
i += 1;
// version length
assert!(c[i] == 1);
i += 1;
// only v3 is supported
assert!(c[i] == 2);
i += 1;
// Parse CertificateSerialNumber
// ASN.1 type: INTEGER
assert!(c[i] == 2);
i += 1;
// integer value length (let's assume it can never be < 5 or > 100)
let len = c[i];
i += 1;
let mut expected_parsed = [0u8; 4000];
for value in 5..100 {
if value == len {
expected_parsed.copy_from_slice(&c[i + value as usize..]);
}
}
assert!(expected_parsed == parsed_serial_number);
} Parsing. Step 2. Parsing the signature.We can reuse the fn parse_signature(parsed_serial_number: [u8; 4000], parsed_signature: [u8; 4000]) {
// Parse the signature in the same way we parsed the version and drain it.
// Return the drained certificate zero-padded to 4000 bytes.
unimplemented!()
} // ... And so on for each field in Parsing. Final Step.After all parsing steps are taken, the final drained certificate will contain all zeroes Unrolling with QuickSilver is expensive.Note that in order to unroll the fn unroll_with_quicksilver(c: [u8; 4000]) {
// assume the offset is 10
let i = 10;
// assume the parsed length of the field was 34
let len = 34;
// Expected value for each branch. A total of 95 branches.
let mut expected_parsed = [[0u8; 4000]; 95];
// Whether the branch is active. Will be set to `true` only for one branch.
let mut is_active = [false; 95];
if len == 5 {
expected_parsed[0].copy_from_slice(&c[i + 5..]);
is_active[0] = true;
} else if len == 6 {
expected_parsed[1].copy_from_slice(&c[i + 6..]);
is_active[1] = true;
}
// ... and so on for each iteration value
else if len == 100 {
expected_parsed[94].copy_from_slice(&c[i + 100..]);
is_active[94] = true;
}
// Multiplex all branches into a single output.
let mut multiplexed = [0u8; 4000];
for i in 0..95 {
// pseudocode: multiply each bit in `expected_parsed` by the `is_active` bit.
// If `is_active` if false, the `product` will be zero.
// Each bit multiplication will cost 1 AND gate.
let product = expected_parsed[i] * is_active;
multiplexed += product;
}
} The cost of parsing with QuickSilver.The total AND gate cost of the
Unrolling with LogRobin++ is cheap.When using LogRobin++ instead of QuickSilver, the multiplexing cost C) is not present. Additionally (this comment also applies to QS), we will combine all smaller parsing circuits into one large circuit, so that the final This will bring the A) cost to almost zero. Hashing the parsed certificate with LogRobin++ in the naive way.In order to verify the signature on the certificate, the cert needs to be hashed. Naively, we could have a circuit where in each branch a set amount of bytes get hashed. This will require a circuit with ~4000 branches and within each branch there would be up to 4000/64 (assuming a sha256 64-byte block size) = ~63 hash evaluations. In LogRobin++ the bandwidth cost is bounded by the most expensive branch. But the evaluation has to be done on ALL branches by the prover. Optimising the hashing with LogRobin++.We need to create a circuit which minimizes the amount of hashing within the branches. Instead, the branch should only compute the data to be hashed. Below is an example circuit which will bring the compute complexity to only 1.5 million AND gates. Note that in order to hide the length of the pre-image, we need to hash all 4000 bytes iof the certificate, including the padding bytes. The trick is that the padding bytes will not count towards the final digest even though they will be hashed. #[allow(clippy::too_many_arguments)]
pub fn hash_one_block(
// The ceritificate data that needs to be hashed.
cert: [u8; 4000],
// How many more bytes of the cert are left to be hashed.
// `len_remaining` is re-used from certificate parsing circuits.
// It is guaranteed to have a correct value.
len_remaining: u16,
// The hash state from the previous circuit invocation. On first invocation it will
// be set to `initial_sha256_state`.
hash_state: [u32; 32],
// args with the _out suffix can be considered as circuit "outputs" (that's how circom calls them)
// yet they are in reality private inputs used to make assertions on the state transition.
cert_out: [u8; 4000],
len_remaining_out: u16,
hash_state_out: [u32; 32],
// This is public inputs:
//
// The initial sha256 state.
initial_sha256_state: [u32; 32],
// Total data hashed so far, including the padding bytes.
total_hashed: u16,
) {
if total_hashed == 0 {
assert!(hash_state == initial_sha256_state);
};
let to_be_hashed = if len_remaining > 63 {
// We have a full block
let to_be_hashed: Vec<u8> = cert[0..64].to_vec();
let mut padded = cert[64..].to_vec();
padded.append(&mut vec![0u8; 64]);
assert_eq!(cert_out.to_vec(), padded);
assert!(len_remaining_out == len_remaining - 64);
to_be_hashed
} else {
// We have the last partial block.
//
// This is a rough approximations of sha256 padding the last block and appending the block
// length.
let mut to_be_hashed = [0u8; 64];
if len_remaining == 1 {
to_be_hashed[0..1].copy_from_slice(&cert[0..1]);
to_be_hashed[63] = 1;
} else if len_remaining == 2 {
to_be_hashed[0..2].copy_from_slice(&cert[0..2]);
to_be_hashed[63] = 2;
}
// .... And so on for each len from 1 to 63
assert!(cert_out == [0u8; 4000]);
assert!(len_remaining_out == 0);
to_be_hashed.to_vec()
};
// Permutes the state, returning a new state.
fn hash(data: Vec<u8>, state: [u32; 32]) -> [u32; 32] {
unimplemented!()
}
let new_state = hash(to_be_hashed, hash_state);
if len_remaining == 0 {
// we were hashing the padding bytes, don't change the state
assert!(hash_state == hash_state_out);
} else {
// we were hashing the actual cert, change the state
assert!(hash_state_out == new_state);
}
} This circuit can be run 63 times, each time increasing |
This paper https://eprint.iacr.org/2024/2010 has useful circuits and protocols which can be borrowed for ECDSA sig verification in zk using QuickSilver. |
We want to support having the Prover prove the certificate chain to the Notary without revealing the server identity but instead obtaining a commitment to the Server's DNS name.
The text was updated successfully, but these errors were encountered: