Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: hash: add Clone and CloneXOF #69521

Open
FiloSottile opened this issue Sep 18, 2024 · 4 comments
Open

proposal: hash: add Clone and CloneXOF #69521

FiloSottile opened this issue Sep 18, 2024 · 4 comments
Labels
Proposal Proposal-Crypto Proposal related to crypto packages or other security issues
Milestone

Comments

@FiloSottile
Copy link
Contributor

There isn't a general way to clone the state of a hash.Hash, but #20573 introduced the concept of hash.Hash implementations also implementing encoding.BinaryMarshaler and encoding.BinaryUnmarshaler, and the hash.Hash docs commit our implementations to doing that.

Hash implementations in the standard library (e.g. hash/crc32 and crypto/sha256) implement the encoding.BinaryMarshaler and encoding.BinaryUnmarshaler interfaces.

That allows cloning the hash state without recomputing it, as done in HMAC.

go/src/crypto/hmac/hmac.go

Lines 96 to 103 in db40d1a

marshalableInner, innerOK := h.inner.(marshalable)
if !innerOK {
return
}
marshalableOuter, outerOK := h.outer.(marshalable)
if !outerOK {
return
}

However, it's obscure and pretty clunky to use.

I propose we add a hash.Clone helper function.

package hash

// Clone returns a separate Hash instance with the same state as h.
//
// h must implement encoding.BinaryMarshaler and encoding.BinaryUnmarshaler,
// or be provided by the Go standard library. Otherwise, Clone returns an error.
func Clone(h Hash) (Hash, error)

In practice, we should only fallback to BinaryMarshaler + BinaryUnmarshaler for the general case, while for standard library implementations we can do an undocumented interface upgrade to interface { Clone() Hash }. In that sense, hash.Clone is a way to hide the interface upgrade as a more discoverable and easier to use function.

(Yet another example of why we should be returning concrete types everywhere rather than interfaces.)

CloneXOF

If #69518 is accepted, I propose we also add hash.CloneXOF.

package hash

// CloneXOF returns a separate XOF instance with the same state as h.
//
// h must implement encoding.BinaryMarshaler and encoding.BinaryUnmarshaler,
// or be provided by the Go standard library or by the golang.org/x/crypto module
// (starting at version v0.x.y). Otherwise, Clone returns an error.
func CloneXOF(h XOF) (XOF, error)

None of our XOFs actually implement BinaryMarshaler + BinaryUnmarshaler, but they have their own interface methods Clone() ShakeHash and Clone() XOF that each return an interface. I can't really think of a way to use them from CloneXOF, so instead we can add hidden methods CloneXOF() hash.XOF and interface upgrade to them.

As we look at moving packages from x/crypto to the standard library (#65269) we should switch x/crypto/sha3 and x/crypto/blake2[bs] from returning interfaces to returning concrete types, at least for XOFs. Then they can have a Clone() method that returns a concrete type, and a CloneXOF() method that returns a hash.XOF interface and enables hash.CloneXOF.

(If anyone has better ideas for how to make this less redundant, I would welcome them. I considered and rejected using reflect to call the existing Clone methods because hash is a pretty core package. This sort of interface-method-that-needs-to-return-a-value-implementing-said-interface scenarios are always annoying.)

/cc @golang/security @cpu @qmuntal (who filed something similar in #69293, as I found while searching refs for this)

@FiloSottile FiloSottile added Proposal Proposal-Crypto Proposal related to crypto packages or other security issues labels Sep 18, 2024
@FiloSottile FiloSottile added this to the Proposal milestone Sep 18, 2024
@gabyhelp
Copy link

@magical
Copy link
Contributor

magical commented Sep 18, 2024

How do you intend to implement the fallback path in Clone? MarshalBinary + UnmarshalBinary only gives you the ability to save and restore a Hash's state, not construct a new instance of it. You might be able to do it with reflect but it sounds like we're trying to avoid reflect.

@magical
Copy link
Contributor

magical commented Sep 19, 2024

I'll also point out that, when using Clone to optimize repeated hashes with the same prefix, you really want to pair it with a Set method which copies the hash state from one existing instance to another.

e.g.

h := newHash()
h.Write(prefix)
h0 := h.Clone() // allocates
for _, message := range messages {
   h.Set(h0) // doesn't allocate
   h.Write(message)
   fmt.Println(h.Sum(nil))
}

Otherwise, without Set, you end up creating unavoidable garbage on every message.

h0 := newHash()
h0.Write(prefix)
for _, message := range messages {
   h := h0.Clone() // allocates
   h.Write(message)
   fmt.Println(h.Sum(nil))
}

My initial stab at the hmac optimization (before hashes implemented BinaryMarshaler and BinaryUnmarshaler) combined Clone and Set into a single method:

type HashCloner interface {
    hash.Hash

    // Clone returns a copy of its reciever, reusing the provided Hash if possible
    Clone(hash.Hash) hash.Hash
}

It looks a little odd but ends up being fairly ergonomic. One advantage this definition has over separate Clone and Set methods is that there is a nice answer for what to do when the argument and receiver types do not match: just allocate a new value. Whereas Set would have to panic.

// Example implementation for sha1.digest
func (d0 *digest) Clone(h hash.Hash) hash.Hash {
	d, ok := h.(*digest)
	if !ok {
		d = new(digest)
	}
	*d = *d0
	return d
}
h0 := newHash()
h0.Write(prefix)
var h hash.Hash
for _, message := range messages {
   h = h0.Clone(h)  // allocates on first iteration, reuses h on subsequent iterations
   h.Write(message)
   fmt.Println(h.Sum(nil))
}

@qmuntal
Copy link
Contributor

qmuntal commented Sep 19, 2024

This proposal tackles the same problem that made me start drafting #69293: there is no clean way to clone hashes.

About the proposed hash.Clone function, I had the same concerns as @magical. How do you instantiate a new hash before calling UnmarshalBinary? This I why in #69293 I proposed to define a hash.Cloner interface.

My motivation to have a common way to clone hash objects is to improve the compatibility of several hash implementations that I've implemented using CNG and OpenSSL with those libraries that are currently using MarshalBinary + UnmarshalBinary to clone a hash. The issue is that CNG/OpenSSL don't provide an API to serialize the internal state of a given hash, but they do provide APIs to clone a hash object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Proposal Proposal-Crypto Proposal related to crypto packages or other security issues
Projects
Status: Incoming
Development

No branches or pull requests

4 participants