Add rationale re: requiring W <= Smin #11
So in working on a design for my Rust implementation of the spec (which is part of what sidetracked me from the tree construction PR -- sorry about that) I've run into some issues around this, and I want to suggest a different approach: remove the requirement that W <= Smin, and instead specify the initial state of the ring buffer.

The basic problem with trying to "protect" callers from depending on the initial state of the buffer is that you have to treat the beginning of each input specially. For example, if you're writing a primitive for pairing a sequence of bytes with the corresponding rolling hashes (this is a good building block for other, more intricate computations), it's not obvious what to do with the first W - 1 bytes, before the window has filled.

On the other hand, if the initial state of the buffer is deterministic then you can start emitting checksums right away, and all the bytes are handled uniformly. "All zeros" is an obvious canonical state, probably the default in most languages when allocating a new byte buffer, so it's hard to see this causing trouble for implementors. Rust doesn't even expose a way to create a buffer that isn't zeroed (or set to some other specific state) unless you use unsafe code.
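A minimal sketch of the primitive described above, assuming a zeroed ring buffer, so that a hash is emitted from the very first byte with no special case for a partially filled window. The adapter name `RollSums`, the window size `W`, and the plain byte-sum standing in for the real rolling hash are all placeholders, not anything from the spec or this thread:

```rust
const W: usize = 8; // placeholder window size

struct RollSums<I> {
    inner: I,
    buf: [u8; W], // ring buffer, deterministically all zeros at the start
    pos: usize,
    sum: u32, // sum of the bytes currently in `buf`
}

impl<I: Iterator<Item = u8>> RollSums<I> {
    fn new(inner: I) -> Self {
        RollSums { inner, buf: [0; W], pos: 0, sum: 0 }
    }
}

impl<I: Iterator<Item = u8>> Iterator for RollSums<I> {
    type Item = (u8, u32);

    fn next(&mut self) -> Option<(u8, u32)> {
        let b = self.inner.next()?;
        // Slide the window: drop the oldest byte (a zero, for the first
        // W bytes of input) and add the new one.
        self.sum = self.sum - self.buf[self.pos] as u32 + b as u32;
        self.buf[self.pos] = b;
        self.pos = (self.pos + 1) % W;
        // Because the buffer starts zeroed, every byte gets a hash;
        // the first W - 1 bytes need no special treatment.
        Some((b, self.sum))
    }
}
```

With this shape, `RollSums::new(bytes)` yields a `(byte, hash)` pair for every input byte, including the first `W - 1`.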
I'm not sure I see what the utility of this is? What would you build on top of it? I don't have a terribly strong objection to defining the initial state of the ring buffer as all zeros if it can be expressed without complicating the spec substantially, though this might be a little fiddly, since the declarative definitions don't actually talk about a ring buffer at all; that's an optimization (albeit an important one). So you'd have to define the hash of the early bytes as though the input were preceded by a window's worth of zeros.
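One way such a definition could go, as a sketch under the same stand-in byte-sum hash: the hash at position `i` is the hash of the `w` bytes ending at `i`, where `input[j]` is taken to be 0 for `j < 0`, so no ring buffer appears in the definition. `hash_at` and its parameters are hypothetical, not spec text:

```rust
// Declarative view: hash over a zero-padded window ending at `i`.
// Agrees with the ring-buffer version above for every position.
fn hash_at(input: &[u8], i: usize, w: usize) -> u32 {
    (0..w)
        .map(|k| match (i as isize) - (k as isize) {
            // Positions before the start of the input count as zero bytes.
            j if j < 0 => 0u32,
            j => input[j as usize] as u32,
        })
        .sum()
}
```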
That primitive appears in my code as an iterator adapter. Extending the declarative definitions to treat the input as if preceded by zeros, along the lines above, seems workable to me.
Per discussion on #23, I think we're going to revert to requiring this, so I'm re-opening this issue.
In the current version of the spec, we require the window size to be less than or equal to the minimum block size. The rationale for this isn't spelled out; we should work it into the spec. Brain-dumping it here for now:
The reason is that this avoids implementations needing to worry about the initial state of the ring buffer when computing a hash: it can just be zeroed, because by the time a chunk is large enough to satisfy the minimum-size requirement, those initial contents will have slid out of the window, so they can't affect the result. In typical usage the window size is much smaller than the minimum block size, so this doesn't seem to have real drawbacks. It also keeps the spec simpler.
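A sketch of why this works, using placeholder values throughout (`S_MIN`, `W`, and the arbitrary `sum % 61 == 0` split condition are stand-ins, not the spec's): the hash is only consulted once the current chunk has at least `S_MIN` bytes, and since `W <= S_MIN`, at least `W` real input bytes have passed through the buffer by then, so its initial contents are unobservable.

```rust
const S_MIN: usize = 64; // placeholder minimum block size
const W: usize = 8;      // placeholder window size; note W <= S_MIN

fn split_points(input: &[u8]) -> Vec<usize> {
    let mut splits = Vec::new();
    let mut buf = [0u8; W]; // zeroed; any initial contents would behave the same
    let (mut pos, mut sum, mut chunk_len) = (0usize, 0u32, 0usize);
    for (i, &b) in input.iter().enumerate() {
        sum = sum - buf[pos] as u32 + b as u32;
        buf[pos] = b;
        pos = (pos + 1) % W;
        chunk_len += 1;
        // The hash is only consulted once the chunk has at least S_MIN
        // bytes, and by then at least S_MIN >= W input bytes have passed
        // through the buffer, so the initial zeros have all slid out of
        // the window and cannot influence `sum`.
        if chunk_len >= S_MIN && sum % 61 == 0 {
            splits.push(i + 1);
            chunk_len = 0;
        }
    }
    splits
}
```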