From fd4ad1a1fac301bc79b6de5572a3d69e22d32d8d Mon Sep 17 00:00:00 2001
From: Epicat Supercell
Date: Fri, 23 Jun 2017 16:33:38 +0300
Subject: [PATCH 1/4] RFC - Zero-Sized references - first commit

---
 text/0000-zero-sized-references.md | 142 +++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 text/0000-zero-sized-references.md

diff --git a/text/0000-zero-sized-references.md b/text/0000-zero-sized-references.md
new file mode 100644
index 00000000000..6fa5d3b8024
--- /dev/null
+++ b/text/0000-zero-sized-references.md
@@ -0,0 +1,142 @@
+- Feature Name: zero_sized_references
+- Start Date: 2017-06-23
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+[summary]: #summary
+
+References to Zero-Sized Types (both shared and mutable) have been historically the size of `usize`.
+The proposed change is to make them also ZST.
+
+# Motivation
+[motivation]: #motivation
+
+References to any type in rust are represented as a pointer. Usually the pointer is smaller and faster to move around.
+However for Zero-Sized Types that only have a single value (for example `()` ) moving around is a no-op, and can be optimized away.
+Reading and writing the value is a no-op since it has only a single value anyway and therefore it carries no extra information.
+However, currently the compiler can't optimize away the pointer from data structures.
+
+Zero-Sized Types are useful for functions, lifetime guarantees and destructors,
+and references to them can be used to show these types "exist" (for references) or "you are the only one using it" (for mutable references).
+The actual value is meaningless and the representation should be optimized to be of size 0 as well.
+
+In addition, references to Zero-Sized Types often appear in polymorphic code, where they handle non-ZST as well.
+
+In both of these cases, it will be an advantage for the references to be Zero-Sized.
+
+# Detailed design
+[design]: #detailed-design
+
+## Calculating size
+[calculating-size]: #calculating-size
+
+Disclaimer: The writer of this RFC is not familiar with the inworkings of the compiler.
+
+
+Finding if a reference points to a ZST or not may not always be trivial.
+A struct with a reference to itself will not know it's size until after it knows the size to the reference.
+
+However, a couple of notes:
+
+```rust
+struct A<'a> (&'a A<'a>);
+```
+This struct cannot be instantiated, because the very first instance of it requires an instance of it to already exist.
+
+```rust
+enum A<'a> {
+    ZeroSized,
+    SelfRef (&'a A<'a>),
+}
+```
+The moment an enum has more than a single value, it cannot be Zero-Sized. Otherwise it isn't different than a struct.
+
+Therefore, I propose to assume that whenever you find a self reference (or multiple types referencing in a loop),
+decide the reference is not Zero-Sized, since there most likely WILL be other data somewhere in the chain.
+This is most relevant to unions, which could have self-references and be instantiated at the same time.
+
+## *const and *mut pointers
+[pointers]: #pointers
+
+It is possible to convert a reference to a pointer. Currently, a reference to ZST points to an arbitrary location,
+and when converting to a pointer the pointer recieves that arbitrary location.
+
+After this change, the reference will not hold any data. I propose that whenever a ZST reference is converted to a pointer,
+a warning/error be issued ("Warning: taking the address of a Zero-Sized Type is meaningless") and the pointer will recieve an address
+with the same algorithm that assigned an address to the reference in our current implementation.
+
+The purpose is to not break current code that might do this. We probably don't want to assign Null since for pointers it has
+a meaning that the value doesn't exist, which is different than "exist but no data".
+
+Converting in the other direction, the value of the pointer will be silently dropped - that value never had a meaning in the first place.
+
+# How We Teach This
+[how-we-teach-this]: #how-we-teach-this
+
+For most rust users, this change will be invisible. Thier code will just become a tiny bit smaller.
+
+Users of unsafe rust might encounter this case. Therefore there should probably be a note in the nomicon that references might be optimized away for ZST.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+## Breaking code
+
+Any place that assumed a reference holds a pointer might introduce bugs.
+
+FFI might behave differently than now, breaking code (see Unresolved questions).
+
+# Alternatives
+[alternatives]: #alternatives
+
+The system that is now works well, and does not have to be changed.
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+## FFI + Escape mechanism
+
+Do we need a way to remove the optimization? For example, if we have
+```rust
+struct ExternalStruct;
+
+let x: &mut ExternalStruct = ffi_function();
+other_ffi_function(x);
+```
+We might want to represent a pointer we got from FFI and checked it's not null as a reference to an object.
+Since that object is not accessable directly we represent it as an empty struct.
+However then the references to it are optimized away and we lose the pointer and then can't call the second function.
+
+Note: This break will not be silent. The safe wrapper for other_ffi_function will convert the reference back to pointer before
+using it, raising the warning/error of converting references to pointers.
+
+For that case, we might want to mark `ExternalStruct` as Non-Zero-Sized. Possibly
+```rust
+impl !Sized for ExternalStruct;
+```
+and maybe it should require `unsafe`.
+
+## Errors of conversion
+
+Do we want to give an error or a warning for converting references to pointers?
+
+The code above is an example where we break working code, so an error might be needed to show the significance.
+
+However, some conversions might be meaningless and wouldn't affect the execution, so the programmer might allow the conversion.
+
+## Mitigating breakage
+
+Safe rust code should be affected positively by this change. However unsafe code might break.
+
+FFI is especially vulnurable to this change, as shown above.
+Are there better ways to deal with these errors without user involvement?
+
+## Specific examples - Pro
+
+The RFC isn't well-justified until it has at least one detailed use case where it helps.
+Please share specific examples of code where Zero-Sized references are useful.
+
+## Specific examples against
+
+If you have specific examples where this change is detrimental, please share them.

From fe66f6d910715f452b6271ef4b39b9f9f1ea4e44 Mon Sep 17 00:00:00 2001
From: Epicat Supercell
Date: Fri, 23 Jun 2017 16:37:36 +0300
Subject: [PATCH 2/4] RFC - Zero-Sized references - formatting

---
 text/0000-zero-sized-references.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/text/0000-zero-sized-references.md b/text/0000-zero-sized-references.md
index 6fa5d3b8024..a133004a353 100644
--- a/text/0000-zero-sized-references.md
+++ b/text/0000-zero-sized-references.md
@@ -28,7 +28,7 @@ In both of these cases, it will be an advantage for the references to be Zero-Si
 # Detailed design
 [design]: #detailed-design
 
-## Calculating size
+### Calculating size
 [calculating-size]: #calculating-size
 
 Disclaimer: The writer of this RFC is not familiar with the inworkings of the compiler.
@@ -56,7 +56,7 @@ Therefore, I propose to assume that whenever you find a self reference (or multi
 decide the reference is not Zero-Sized, since there most likely WILL be other data somewhere in the chain.
 This is most relevant to unions, which could have self-references and be instantiated at the same time.
 
-## *const and *mut pointers
+### *const and *mut pointers
 [pointers]: #pointers
 
 It is possible to convert a reference to a pointer.
Currently, a reference to ZST points to an arbitrary location,
@@ -81,7 +81,7 @@ Users of unsafe rust might encounter this case. Therefore there should probably
 # Drawbacks
 [drawbacks]: #drawbacks
 
-## Breaking code
+### Breaking code
 
 Any place that assumed a reference holds a pointer might introduce bugs.
 
@@ -95,7 +95,7 @@ The system that is now works well, and does not have to be changed.
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
-## FFI + Escape mechanism
+### FFI + Escape mechanism
 
 Do we need a way to remove the optimization? For example, if we have
 ```rust
@@ -117,7 +117,7 @@ impl !Sized for ExternalStruct;
 ```
 and maybe it should require `unsafe`.
 
-## Errors of conversion
+### Errors of conversion
 
 Do we want to give an error or a warning for converting references to pointers?
 
@@ -125,18 +125,18 @@ The code above is an example where we break working code, so an error might be n
 
 However, some conversions might be meaningless and wouldn't affect the execution, so the programmer might allow the conversion.
 
-## Mitigating breakage
+### Mitigating breakage
 
 Safe rust code should be affected positively by this change. However unsafe code might break.
 
 FFI is especially vulnurable to this change, as shown above.
 Are there better ways to deal with these errors without user involvement?
 
-## Specific examples - Pro
+### Specific examples - Pro
 
 The RFC isn't well-justified until it has at least one detailed use case where it helps.
 Please share specific examples of code where Zero-Sized references are useful.
 
-## Specific examples against
+### Specific examples against
 
 If you have specific examples where this change is detrimental, please share them.
From 7b75968615c475d67a66d3127da42c0a19e4d6f4 Mon Sep 17 00:00:00 2001
From: Epicat Supercell
Date: Fri, 23 Jun 2017 16:42:58 +0300
Subject: [PATCH 3/4] RFC - Zero-Sized references - formatting pointers

---
 text/0000-zero-sized-references.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-zero-sized-references.md b/text/0000-zero-sized-references.md
index a133004a353..f146d22786d 100644
--- a/text/0000-zero-sized-references.md
+++ b/text/0000-zero-sized-references.md
@@ -56,7 +56,7 @@ Therefore, I propose to assume that whenever you find a self reference (or multi
 decide the reference is not Zero-Sized, since there most likely WILL be other data somewhere in the chain.
 This is most relevant to unions, which could have self-references and be instantiated at the same time.
 
-### *const and *mut pointers
+### `*const` and `*mut` pointers
 [pointers]: #pointers
 
 It is possible to convert a reference to a pointer. Currently, a reference to ZST points to an arbitrary location,

From e6e223923c7aefa07d6fdd8b12c13e20d46d3ac2 Mon Sep 17 00:00:00 2001
From: Epicat Supercell
Date: Fri, 23 Jun 2017 23:01:49 +0300
Subject: [PATCH 4/4] RFC - Zero-Sized references - updating concerns

---
 text/0000-zero-sized-references.md | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/text/0000-zero-sized-references.md b/text/0000-zero-sized-references.md
index f146d22786d..94e1181a127 100644
--- a/text/0000-zero-sized-references.md
+++ b/text/0000-zero-sized-references.md
@@ -1,6 +1,6 @@
 - Feature Name: zero_sized_references
 - Start Date: 2017-06-23
-- RFC PR: (leave this empty)
+- RFC PR: [#2040](https://github.com/rust-lang/rfcs/pull/2040)
 - Rust Issue: (leave this empty)
 
 # Summary
@@ -74,7 +74,7 @@ Converting in the other direction, the value of the pointer will be silently dro
 # How We Teach This
 [how-we-teach-this]: #how-we-teach-this
 
-For most rust users, this change will be invisible. Thier code will just become a tiny bit smaller.
+For most rust users, this change will be invisible. Their code will just become a tiny bit smaller.
 
 Users of unsafe rust might encounter this case. Therefore there should probably be a note in the nomicon that references might be optimized away for ZST.
 
@@ -95,6 +95,10 @@ The system that is now works well, and does not have to be changed.
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
+### Definition of `&`
+
+Is the definition of `&` state that we guarantee it's a pointer? Or do we only promise you can use it to access the data?
+
 ### FFI + Escape mechanism
 
 Do we need a way to remove the optimization? For example, if we have
@@ -132,6 +136,9 @@ Safe rust code should be affected positively by this change. However unsafe code
 FFI is especially vulnurable to this change, as shown above.
 Are there better ways to deal with these errors without user involvement?
 
+(Unsafe code might break - are there examples of VALID code that breaks? Or does only invalid uses of references break?
+And what is our stance on breaking invalid code, assuming there is a large amount of it?)
+
 ### Specific examples - Pro
 
 The RFC isn't well-justified until it has at least one detailed use case where it helps.
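
Review note on the request for specific examples: the sketch below records the status quo that this series proposes to change, namely that a reference to a Zero-Sized Type is currently pointer-sized and that its address survives a round trip through a raw pointer (the property the FFI concern depends on). It is not part of the patch. The names `Unit`, `zst_ref_size`, and `address_round_trips` are hypothetical and chosen only for illustration.

```rust
use std::mem::size_of;

/// Hypothetical zero-sized marker type, standing in for the RFC's `ExternalStruct`.
struct Unit;

/// Size in bytes of a shared reference to the zero-sized `Unit`.
fn zst_ref_size() -> usize {
    size_of::<&Unit>()
}

/// Converts a reference to a raw pointer and back, then reports whether the
/// address is unchanged. Under current Rust it is; under the proposal the
/// reference would carry no address at all.
fn address_round_trips(r: &Unit) -> bool {
    let before = r as *const Unit;
    // SAFETY: `before` comes from a live reference, so it is non-null and
    // aligned, and a zero-sized read touches no memory.
    let back: &Unit = unsafe { &*before };
    std::ptr::eq(before, back as *const Unit)
}

fn main() {
    let unit = Unit;
    // The type itself occupies no space...
    assert_eq!(size_of::<Unit>(), 0);
    // ...but references to it are as wide as `usize` today.
    assert_eq!(zst_ref_size(), size_of::<usize>());
    assert_eq!(size_of::<&mut Unit>(), size_of::<usize>());
    // The address is preserved across a reference -> pointer -> reference trip.
    assert!(address_round_trips(&unit));
    println!("&Unit occupies {} bytes", zst_ref_size());
}
```

Code that relies on either assertion about references, for example a safe wrapper that passes `&mut ExternalStruct` back to a foreign function, is the kind of code the "Mitigating breakage" question would need to account for.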