Allow to truncate decorator data when serializing MAST #1580
Comments
With the advice map, calculating the decorator offset becomes a little more complicated. Potentially a SizeCalculator is warranted: you would first use SizeCalculator to dry-run all of the writing operations whose size you want to measure, then write the decorator data offset, then write the data:

```rust
struct SizeCalculator {
    total_size: usize,
}

impl SizeCalculator {
    fn new() -> Self {
        Self { total_size: 0 }
    }

    fn size(&self) -> usize {
        self.total_size
    }
}

impl ByteWriter for SizeCalculator {
    fn write_u8(&mut self, _value: u8) {
        // Increment size for a single byte
        self.total_size += 1;
    }

    fn write_bytes(&mut self, bytes: &[u8]) {
        // Increment size by the length of the byte slice
        self.total_size += bytes.len();
    }

    // All other methods from ByteWriter can remain unimplemented
    // because they eventually rely on write_bytes or write_u8.
}
```

I can see some brittleness in this approach if the concrete class of the writer changes. Using size_hints seems even more brittle, because then any change in structure/implementation may require modifying size_hint, which seems like a bigger mental burden compared to the dry-run approach. I'm also not sure what the performance implications would be if there is a large number of nodes. I would expect the dry run to be significantly faster because it is not actually writing to memory, but the CPU still has to cycle through all the nodes and the relevant serialization logic.
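To make the two-pass flow above concrete, here is a self-contained sketch; it is an assumption-laden illustration, not existing code. SizeCalculator is reproduced so the snippet compiles on its own, the local ByteWriter trait only mirrors the two required methods of the real one, and write_pre_decorator_sections / write_decorator_data are hypothetical placeholders for the actual serialization logic.

```rust
// Sketch only: measure with a dry run, then perform the real write.
trait ByteWriter {
    fn write_u8(&mut self, value: u8);
    fn write_bytes(&mut self, bytes: &[u8]);
}

struct SizeCalculator {
    total_size: usize,
}

impl ByteWriter for SizeCalculator {
    fn write_u8(&mut self, _value: u8) {
        self.total_size += 1;
    }
    fn write_bytes(&mut self, bytes: &[u8]) {
        self.total_size += bytes.len();
    }
}

impl ByteWriter for Vec<u8> {
    fn write_u8(&mut self, value: u8) {
        self.push(value);
    }
    fn write_bytes(&mut self, bytes: &[u8]) {
        self.extend_from_slice(bytes);
    }
}

// Hypothetical placeholder for everything that precedes the decorator data
// (node data, advice map, ...).
fn write_pre_decorator_sections<W: ByteWriter>(target: &mut W) {
    target.write_bytes(b"nodes and advice map go here");
}

// Hypothetical placeholder for the decorator data itself.
fn write_decorator_data<W: ByteWriter>(target: &mut W) {
    target.write_bytes(b"decorator data goes here");
}

fn main() {
    // Pass 1: dry run to learn where the decorator data will start
    // (measured from the end of the 8-byte header written below).
    let mut sizer = SizeCalculator { total_size: 0 };
    write_pre_decorator_sections(&mut sizer);
    let decorator_data_offset = sizer.total_size as u64;

    // Pass 2: real write -- the offset first, then the same sections.
    let mut out: Vec<u8> = Vec::new();
    out.write_bytes(&decorator_data_offset.to_le_bytes());
    write_pre_decorator_sections(&mut out);
    write_decorator_data(&mut out);

    println!("decorator data starts {decorator_data_offset} bytes after the header");
}
```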
Do you mean the advice map referred to here? And is the problem that we don't know the size of the advice map before serializing it? If so, we have two potential solutions (maybe more).
Yes, the original idea was to calculate the decorator data offset. But actually, I'm not sure this is needed. If the CLI gets a flag to truncate decorators, it could create a MAST forest without decorators and write that to the binary (no need for a decorator data offset, because no decorators are written). The only reason to actually store the "decorator data section offset" is if you wanted to take an existing binary and truncate it without first reading it, and I'm not sure that the "without first reading it" part is possible. So I'm seeing two potential workflows: truncate at build time (no offset needed), or truncate an existing binary after the fact (which is what would need the stored offset).
If we do need to store the offset, here are some trade-offs. Serializing into a vector, calculating the size, and then writing that into the binary means keeping an extra in-memory copy of the serialized data around, so the question becomes how big that data can get.
Shouldn't be too big for the use cases we are thinking about - probably less than a few dozen KB (and in most cases probably much less than that).
One possibility is to read the first few bytes of a MAST forest and then, based on that, read only the required data (e.g., don't read the decorator portion). But I'm not sure where we'd do this in practice.
Between these two options, I think serializing into a vector, calculating the size, and then writing the vector into the binary probably makes the most sense.
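For illustration, a minimal sketch of this buffer-first option, under the assumption that it is applied to some section whose serialized size is not known up front; write_section and write_section_contents are hypothetical placeholders, not real APIs.

```rust
// Sketch only: serialize a variable-size section into its own vector first,
// so its length (and therefore the offset of whatever follows it) is known
// before anything is written to the final binary.
fn write_section(out: &mut Vec<u8>) {
    // Serialize the section into a temporary buffer.
    let mut buf: Vec<u8> = Vec::new();
    write_section_contents(&mut buf);

    // The length is now known up front, so it can be written as a prefix
    // (or used to compute a section offset stored elsewhere).
    out.extend_from_slice(&(buf.len() as u64).to_le_bytes());
    out.extend_from_slice(&buf);
}

// Hypothetical placeholder for the real serialization logic.
fn write_section_contents(buf: &mut Vec<u8>) {
    buf.extend_from_slice(b"section contents go here");
}

fn main() {
    let mut out = Vec::new();
    write_section(&mut out);

    // A reader can use the 8-byte length prefix to skip the section entirely.
    let len = u64::from_le_bytes(out[..8].try_into().unwrap());
    assert_eq!(len as usize, out.len() - 8);
}
```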
Feature description
For the use cases described in this comment, we need the capability to truncate the decorator data when serializing MAST.
This comment describes one approach for implementing this feature.
We should probably introduce a CLI flag for the compile command to write the compiled program with truncated optional data.
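As a purely hypothetical sketch of the feature's shape (none of these names exist in the codebase, and the real implementation would operate on MastForest rather than a toy struct), one way such a flag could work is to have the serializer skip the optional sections:

```rust
// Hypothetical sketch only: a toy "forest" with required data and optional
// decorator data, serialized with or without the optional part.
struct ToyForest {
    node_data: Vec<u8>,
    decorator_data: Vec<u8>,
}

impl ToyForest {
    fn write_into(&self, out: &mut Vec<u8>, strip_decorators: bool) {
        // Required sections are always written, length-prefixed.
        out.extend_from_slice(&(self.node_data.len() as u64).to_le_bytes());
        out.extend_from_slice(&self.node_data);

        // Optional sections are written only when not stripped; a zero length
        // tells a reader that the section is absent.
        if strip_decorators {
            out.extend_from_slice(&0u64.to_le_bytes());
        } else {
            out.extend_from_slice(&(self.decorator_data.len() as u64).to_le_bytes());
            out.extend_from_slice(&self.decorator_data);
        }
    }
}

fn main() {
    let forest = ToyForest {
        node_data: b"nodes".to_vec(),
        decorator_data: b"decorators".to_vec(),
    };

    let mut full = Vec::new();
    forest.write_into(&mut full, false);

    let mut stripped = Vec::new();
    forest.write_into(&mut stripped, true);

    // The stripped binary carries only the required data.
    assert!(stripped.len() < full.len());
}
```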
Why is this feature needed?
Per the comment on Pull Request #1531:
... general idea is that producers, e.g. the compiler, will emit a fully-fleshed out package containing all the things that could be useful for interacting with that package (e.g. debug info, documentation, sources in some cases, etc.), and the MAST contained in the package would have all of the decorators (naturally required to support debugging, etc.). There are contexts where all of that stuff is simply dead weight though, or you want to optimize for binary size, in which case you can strip all of that, resulting in a minimal package that only contains exactly what is needed to execute what it contains. One such potential use case would be publishing packages on-chain, which would only be feasible if they are as small as possible.