Type char does not implement StableAbi #110

Open
gschulze opened this issue Aug 11, 2023 · 4 comments

Comments

@gschulze

gschulze commented Aug 11, 2023

When trying to use char as an FFI type, the compiler reports the error below. According to the documentation, a char is always four bytes in size, so this should not be hard to support.

the trait bound `char: abi_stable::StableAbi` is not satisfied
the following other types implement trait `abi_stable::StableAbi`:
  &'a T
  &'a mut T
  ()
  *const T
  *mut T
  [T; N]
  abi_stable::DynTrait<'borr, P, I, EV>
  abi_stable::RMut<'a, T>
and 315 others
required for `[char; 2]` to implement `abi_stable::StableAbi`
2 redundant requirements hidden
required for `abi_stable::std_types::ROption<abi_stable::std_types::RVec<[char; 2]>>` to implement `abi_stable::StableAbi`
@rodrimati1992
Owner

rodrimati1992 commented Aug 14, 2023

The reason I haven't implemented StableAbi for char is that it's not clear to me that its range of valid values is guaranteed to stay the same forever.

If char was extended to support values over '\u{10FFFF}', then passing it to dynamic libraries compiled in Rust versions prior to that range extension would potentially be Undefined Behavior.
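
Until that question is settled, one way to sidestep it is to send the scalar value across the boundary as a plain u32 and re-validate it on the receiving side. The block below is only a sketch: `FfiChar` and its methods are hypothetical names, not part of abi_stable, and the `StableAbi` derive that a real wrapper would need is left out so the example depends on std alone.

```rust
/// Hypothetical FFI-friendly stand-in for `char` (not an abi_stable type).
/// It stores the scalar value as a plain `u32`, for which every bit
/// pattern is valid, so no validity assumption crosses the boundary.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FfiChar(u32);

impl FfiChar {
    /// Wraps a `char` produced on the sending side.
    pub fn new(c: char) -> Self {
        FfiChar(c as u32)
    }

    /// Wraps a raw value as it might arrive from another library.
    pub fn from_raw(raw: u32) -> Self {
        FfiChar(raw)
    }

    /// Re-validates on the receiving side. `char::from_u32` returns
    /// `None` for surrogates and for anything above U+10FFFF.
    pub fn to_char(self) -> Option<char> {
        char::from_u32(self.0)
    }
}
```

The receiver decides what to do with a `None`, so a value that a newer compiler might accept would be rejected by an older library rather than causing Undefined Behavior.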

@eggyal

eggyal commented Aug 14, 2023

Reminds me of this IRLO discussion in which Josh Triplett said:

Unicode has pretty extensively committed to never expanding past U+10FFFF; that seems exceedingly unlikely to change. And if it did, we can expand Rust's Unicode support at that time.

Later in the discussion, scottmcm made the following points:

This would also require expanding the definition of safe strs to include not-valid-in-today's UTF-8 byte sequences. I don't think that's acceptable, as it'd immediately result in everyone trying to accept only Unicode-valid UTF-8 needing to run extra checks.

Not to mention that it's a breaking change to make char bigger anyway, since you can exhaustively match on char: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1d747466baf240a19236e4cdf8ba2168

pub fn foo(c: char) {
    let ('\0'..='\u{10FFFF}') = c;
}

So this isn't happening in Rust in the foreseeable future.

If Unicode gets to the point where they're talking about potentially needing something bigger, then we'll add new types for whatever that is.

Finally, for a more authoritative take, The Rust Reference states:

A value of type char is a Unicode scalar value (i.e. a code point that is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to 0xD7FF or 0xE000 to 0x10FFFF range. It is immediate Undefined Behavior to create a char that falls outside this range.
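
The range quoted from the Reference can be written out as a small check and compared against what the standard library accepts. This is an illustrative sketch; `is_scalar_value` is a hypothetical helper name, while `char::from_u32` and `char::MAX` are the std items it is checked against.

```rust
/// Hypothetical helper: true if `v` lies in the two ranges the
/// Reference gives for `char` (everything up to U+10FFFF except the
/// surrogate block U+D800..=U+DFFF).
pub fn is_scalar_value(v: u32) -> bool {
    matches!(v, 0x0000..=0xD7FF | 0xE000..=0x10FFFF)
}

/// Checks that the helper agrees with `char::from_u32` at the
/// boundaries of each range.
pub fn agrees_with_std() -> bool {
    let boundaries = [0x0000, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000];
    boundaries
        .iter()
        .all(|&v| is_scalar_value(v) == char::from_u32(v).is_some())
}
```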

@gschulze
Author

@rodrimati1992 hmm, I see. My particular use case was to represent ISO 639 language codes as [char; 2] - I can just use regular strings instead. Anyway, do you think there is a chance this might be implemented given the arguments pointed out by @eggyal?

@eggyal

eggyal commented Aug 14, 2023

My gut would be to use an enum for that, which is in fact what the isolang crate does.
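
As a rough sketch of the enum approach for two-letter codes: the `Lang` type below is hypothetical, covers only two codes for brevity, and does not reproduce isolang's API. The `#[repr(u8)]` discriminant is what would travel across the FFI boundary, and it is validated again on the way back in.

```rust
/// Hypothetical two-variant subset; a real table would list every code.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    En = 0,
    De = 1,
}

impl Lang {
    /// Parses a two-letter code, rejecting anything not in the table.
    pub fn from_code(code: &str) -> Option<Lang> {
        match code {
            "en" => Some(Lang::En),
            "de" => Some(Lang::De),
            _ => None,
        }
    }

    /// Returns the two-letter code for this variant.
    pub fn code(self) -> &'static str {
        match self {
            Lang::En => "en",
            Lang::De => "de",
        }
    }

    /// Validates a raw discriminant received from another library.
    pub fn from_u8(raw: u8) -> Option<Lang> {
        match raw {
            0 => Some(Lang::En),
            1 => Some(Lang::De),
            _ => None,
        }
    }
}
```

Compared with [char; 2], an enum rules out invalid codes at the type level and occupies a single byte here.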
