-
Notifications
You must be signed in to change notification settings - Fork 310
Uplift Utf8Bytes #783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Uplift Utf8Bytes #783
Conversation
5b5e760
to
9a523a7
Compare
FWIW I think this is a good idea. Other than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a small nitpick.
I'm not sure what the procedure is here, but would it be possible to get another review on this? |
I think it would be good for bytes to provide this, but we need to get the API perfectly right on first try when adding it to the bytes crate as breaking changes are impossible. It may be better to start out by placing this in a downstream "blessed" crate and moving it into bytes later. |
I do think that seems reasonable, but also, almost the whole API proposed here is basically a subset of that of |
I could buy that argument, but you are innovating in the name of the struct. ;) |
To add a counter point, I have "needed" such a struct in several crates, but it hasn't bothered me at all to include the tiny amount of code to make a private struct. So I haven't felt a need for a unified type in the ecosystem. |
I mean, I can see the point that downstream crates would like to avoid using unsafe, which you need to make a nice |
FWIW, I'd generally agree with this - I've not needed this functionality all that often, and when I have I've been content just slapping a dependency on
I'm running into those issues now in a project; I'm trying to convert between |
True 🙃 I'm just not sure how a semi-official crate will help bikeshed the name; some sort of RFC to the community that's posted in TWIR might be more likely to help work out kinks in the name/get opinions on it. Or, it'd be probably even better to just be a top-down decision made by the same people who finalized the API names for bytes 1.0. |
That's true, I don't think a downstream crate helps on the struct name issue. The name I like is |
What about the module? |
I guess it is worth to consider what they are going to do in rust-lang/rust#134915 (comment). |
This adds a Bytes wrapper with a UTF-8 validity invariant, uplifting the type found as:
bytes_utils::Str
(2.4M downloads/month)bytestring::ByteString
(1.7M downloads/month)tungstenite::Utf8Bytes
pub(crate) http::ByteStr
axum::extract::ws::Utf8Bytes
(itself a wrapper aroundtungstenite::Utf8Bytes
, so as to not depend on tungstenite in the public API)pub(crate) h2::hpack::ByteStr
async_nats::Subject
My approach here was to just copy all the API surface from
Bytes
that could be adapted (as well as a function based onString::from_utf8()
), but if this is too big to merge all at once, I could take out some of the methods and add them in a followup PR. I didn't directly copy anything from any of the aforementioned crates, in case that matters wrt licensing, though I did look a little bit at howbytestring
phrased their docs.The
Hash
impl delegates tostr as Hash
rather thanBytes as Hash
, meaning it's not hash-compatible withBytes
. This differs fromhttp::ByteStr
andtungstenite::Utf8Bytes
, but it's required, sinceUtf8Bytes
implementsBorrow<str>
(and so doestungstenite::Utf8Bytes
, which means that was probably an oversight).wrto the name:
Utf8Bytes
seemed to me the most straightforward description of what it is, and the axum case is what spurred me to open this.ByteString
seems like it could imply thatstd::string::String
isn't composed of bytes (or that it's something like bstr; not necessarily utf8), andBytesString
is a bit clunky with the geminated s.Str
is simple likeBytes
is, but doesn't match up withOsStr
orCStr
, which are slices and not owned buffers. (Edit: another option is justString
, with the idea that it would primarily be referred to asbytes::String
. That's probably a bad idea though). However, I'm not dead set on anything, and would be totally fine with renaming it if that's desired.