Explain why this "Reed-Solomon" differs so much from storage ones #17

Open: burdges wants to merge 1 commit into master

burdges

@burdges burdges commented Jan 25, 2024

Although much faster for high shard counts, this crate provides rather different functionality from classical Reed-Solomon built upon syndrome decoding, etc. It's likely worth some explanation.

@AndersTrier
Owner

AndersTrier commented Jan 29, 2024

Hi Jeff

Thanks for the PR! You're right, that should be documented somewhere.

I think the explanation is a bit too verbose though.

How about just stating something like:

This crate does not detect or correct errors within a shard. I.e., if data corruption is a likely scenario, you should implement a hash check for each shard and not feed the invalid shards to the decoder. Some suggestions for very fast error detection hashes are CRC32c (4 bytes), HighwayHash (8, 16 or 32 bytes) or xxHash (4, 8 or 16 bytes).

That's also more aligned with what Klaus documents in his implementation:

The encoder does not know which parts are invalid, so if data corruption is a likely scenario, you need to implement a hash check for each shard.
If a byte has changed in your set, and you don't know which it is, there is no way to reconstruct the data set.
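
For illustration, the per-shard hash check suggested above could look roughly like the sketch below. It is not code from this crate: `TaggedShard`, `tag` and `valid_shards` are hypothetical names, and the bitwise CRC32C is a slow, dependency-free stand-in for whichever fast hash you actually pick. The idea is only that a corrupted shard gets dropped and treated as missing, so the decoder sees erasures rather than silent errors.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Dependency-free and slow; assumed here only to keep the sketch self-contained.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical wrapper: a shard stored or sent together with its checksum.
struct TaggedShard {
    index: usize,
    bytes: Vec<u8>,
    checksum: u32,
}

/// Compute the checksum once, when the shard is produced.
fn tag(index: usize, bytes: Vec<u8>) -> TaggedShard {
    let checksum = crc32c(&bytes);
    TaggedShard { index, bytes, checksum }
}

/// Keep only shards whose checksum still matches.
/// Anything filtered out is simply never handed to the decoder.
fn valid_shards(received: Vec<TaggedShard>) -> Vec<TaggedShard> {
    received
        .into_iter()
        .filter(|s| crc32c(&s.bytes) == s.checksum)
        .collect()
}

fn main() {
    let mut shards = vec![
        tag(0, vec![1u8; 64]),
        tag(1, vec![2u8; 64]),
        tag(2, vec![3u8; 64]),
    ];

    // Simulate corruption of one byte in shard 1.
    shards[1].bytes[10] ^= 0xFF;

    let ok = valid_shards(shards);
    let indices: Vec<usize> = ok.iter().map(|s| s.index).collect();
    println!("shards safe to pass to the decoder: {:?}", indices);
}
```

The surviving shards, with their indices, are what you would then pass to the decoder; recovery still succeeds as long as enough valid shards remain.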

@burdges
Author

burdges commented Jan 29, 2024

As you like, of course. I sent some contact details by email, so maybe it's worth a brief off-line chat about language.

I typically prefer brevity like Klaus's comment, but here I'm really addressing the concern raised in your PR to reed-solomon-16, by distancing this crate from matrix multiplication and error location, correction, and checking.

@AndersTrier
Owner

here I'm really addressing the concern raised in your PR to reed-solomon-16

This PR malaire#1? What concerns did I raise in that PR that you're addressing?

Note on hash now merged: #18
IMO that's sufficient. If someone is interested in the algorithm, there's a link to https://github.com/catid/leopard in the readme which references the relevant papers.

@FallingSnow

I think the long version would go well in the wiki, if that's a feature you want to enable for this repo.
