[SUGGESTION] Named return values can possibly make out parameters redundant
#626
Replies: 32 comments 25 replies
-
Cpp2 already has what you're suggesting:
So what would be left of your suggestion is: "remove EDIT: Actually, the above can be done with |
-
Actually, I don't think the current implementation of named return values can guarantee copy elision. Herb has previously stated his intention to support named parameters. |
-
Yeah, the big difference between this proposal and the current semantics for named return values in cpp2 is that in the latter case, the return variables are not interpreted as references to a memory location provided by the caller. This is a missed opportunity, IMO. But once you make that adjustment, |
-
Thanks! Note that Cpp2 does have two different features here (both of which Cpp1 also has, but here they're generalized with more language support and the ability to declare intent):
They are similar but do have different use cases. I agree that more often you would just use return values, but Doing more for copy elision in the implementation of multiple return values is interesting though, good suggestion. |
-
@hsutter Thank you for your reply! Your comment seems to mostly focus on describing how Do you not believe it is possible to merge the two constructs into one, as I have proposed? It looks very possible to me. |
-
I don't see either what the difference in use cases is. In the caller I can assign the return value of a function to either an initialized variable or an uninitialized one, same as I can pass either as an out argument. In the callee, the body has to produce a new value for each out parameter, same as for the returned one. |
-
I believe the important aspect is that named return values have a fixed spot in memory, their scope starts and ends within the scope of the function call.
Out parameters may be declared in another scope, and initialised with a function call within a smaller scope.
On 14 July 2023 00:54:50 Jorge Canizales ***@***.***> wrote:
I don't see either what the difference in use cases is.
In the caller I can assign the return value of a function to either an initialized variable or an uninitialized one, same as I can pass either as an out argument.
In the callee, the body has to produce a new value for each out parameter, same as for the returned one.
-
I think you're mixing up the notions of variables and memory locations:
So there is no distinction there. |
-
Surely the stack location where the out parameter is declared can be in a longer-lived scope than the lvalue created by the return. Won't this affect where the stack portion of the value's memory lives, and can't this prevent potential move or copy assignments?
On 14 July 2023 08:28:09 Nick Smith ***@***.***> wrote:
named return values [...] their scope starts and ends within the scope of the function call. Out parameters may be declared in another scope.
I think you're mixing up the notions of variables and memory locations:
* The scope of both NRVs and out parameters (the variables) is that of the callee's function body.
* With guaranteed NRVO, the memory locations that NRVs and out parameters refer to are provided by the caller.
So there is no distinction there.
-
I'm not sure exactly what you're asking, but I can give you a blanket answer: a function that declares a named return value will behave exactly how a function that declares an In my original post, I mentioned that what I am proposing is essentially just to adjust the syntax of So by definition, the two features will behave the same. |
-
How about interoperability with c++1? Suppose I need to call some API function with |
-
I'm not sure what you mean. C++1 doesn't have |
-
outerScopeString: string; // uninitialised
{
stringInitFunc(outerScopeString); // via out param
}
//now have an initialised string in another scope with no move or copy
Vs
{
lValueString = stringInitFunc()
// must move or copy this string elsewhere or it will go out of scope and destruct
}
These are different capabilities that may be important in non-trivial use cases.
On 14 July 2023 08:45:51 Nick Smith ***@***.***> wrote:
I'm not sure exactly what you're asking, but I can give you a blanket answer: a function that declares a named return value will behave exactly how a function that declares an out parameter will behave.
In my original post, I mentioned that what I am proposing is essentially just to adjust the syntax of out parameters such that they look like (and compose like) return values.
So by definition, the two features will behave the same.
-
@SebastianTroy Irrespective of whether the function accepts an
With NRVO, this doesn't require any copying or moving. So it behaves exactly the same as an |
-
It seems like #123 touches upon this, too:
-
@hsutter Yes, I think the code transformation would work roughly as you describe. But you wouldn't need to apply it to every cpp2 function definition and call. You'd only need to apply it to definitions that use named return values:
And you're right: this implies that the cpp2 compiler needs to be able to tell which calls are to NRV functions. That said, GCC and Clang have recently implemented guaranteed NRVO (following P2025), so if you're compiling your C++ code using those compilers, you can do a much simpler code transformation: declare the return variable in the first line of the function body, rather than in the signature:
This transformation is easier to implement, because it means that call sites don't need to be altered. Of course, if you need this transformation to guarantee that no copies/moves are performed, P2025 (or something similar) would need to be standardized. Until then, the only transformation that is portable is the one you described. |
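As a rough illustration of the portable lowering discussed in this comment, here is a minimal C++ sketch in which the caller reserves uninitialized storage and the callee constructs its result directly into it, so the result is never copied or moved. The names `Widget`, `make_widget_into`, `Counts`, and `demo_out_storage` are hypothetical, and this is not what cppfront actually emits.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical type that counts how it gets constructed.
struct Widget {
    static inline int constructed = 0;
    static inline int copied_or_moved = 0;
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) { ++constructed; }
    Widget(const Widget& o) : name(o.name) { ++copied_or_moved; }
    Widget(Widget&& o) noexcept : name(std::move(o.name)) { ++copied_or_moved; }
};

// "out parameter" form: the caller passes the address of raw storage,
// and the callee constructs the Widget in place with placement new.
Widget* make_widget_into(void* storage, std::string name) {
    return ::new (storage) Widget(std::move(name));
}

struct Counts {
    int constructed;
    int copied_or_moved;
    std::string name;
};

Counts demo_out_storage() {
    Widget::constructed = 0;
    Widget::copied_or_moved = 0;
    // Uninitialized storage that lives in the outer scope.
    alignas(Widget) unsigned char storage[sizeof(Widget)];
    Widget* w = nullptr;
    {
        // Initialization happens in an inner scope, yet the object outlives it.
        w = make_widget_into(storage, "built in inner scope");
    }
    Counts c{Widget::constructed, Widget::copied_or_moved, w->name};
    w->~Widget();  // the caller owns the storage, so it also ends the lifetime
    return c;
}
```

In this sketch the only `Widget` constructor that runs is the one taking a `std::string`; the copy and move counters stay at zero because the object never changes location.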
-
I think this is a more correct translation:
// translated into cpp1, assuming NRVO is available
struct cpp2_func__ret {
widget w;
};
auto cpp2_func() -> cpp2_func__ret {
auto res = cpp2_func__ret{foo()};
return res; // We also need to make sure that returns are explicit
}
Although, to not require that all data members are initialized at the same time:
// translated into cpp1, assuming NRVO is available
struct cpp2_func__ret {
union { widget w; };
union { widget v; };
};
auto cpp2_func(bool const& b) -> cpp2_func__ret {
auto res = cpp2_func__ret{};
res.w = foo();
stuff();
if (b) {
res.v = bar();
} else {
res.v = baz();
}
return res; // We also need to make sure that returns are explicit
}
The difference from today is that those initializations are not runtime checked.
-
FWIW, this works (https://compiler-explorer.com/z/oKEd86qfY):
#include "https://raw.githubusercontent.com/hsutter/cppfront/main/include/cpp2util.h"
#include <cassert>
namespace cpp2 {
template<typename T>
class out_member {
T* t;
// Each out in a chain contains its own uncaught_count.
int uncaught_count = Uncaught_exceptions();
bool called_construct = false;
public:
out_member(T* t_) noexcept : t{t_} { Default.expects(t); }
// In the case of an exception, if the parameter was uninitialized
// then leave it in the same state on exit (strong guarantee)
~out_member() {
if (called_construct && uncaught_count != Uncaught_exceptions()) {
std::destroy_at(t);
}
}
auto construct(auto&& ...args) -> void {
if constexpr (requires { std::construct_at(t, CPP2_FORWARD(args)...); }) {
Default.expects( t );
std::construct_at(t, CPP2_FORWARD(args)...);
}
else {
Default.expects(!"attempted to copy assign, but copy assignment is not available");
}
}
auto construct_list(auto&& ...args) -> void {
if constexpr (requires { std::construct_at(t, T{CPP2_FORWARD(args)...}); }) {
Default.expects( t );
std::construct_at(t, T{CPP2_FORWARD(args)...});
}
else {
Default.expects(!"attempted to copy assign, but copy assignment is not available");
}
}
auto value() noexcept -> T& {
Default.expects( t );
return *t;
}
};
}
struct widget {
std::string x;
~widget() { }
};
widget foo() { return {"1"}; }
widget bar() { return {"2"}; }
widget baz() { return {"3"}; }
void stuff() { }
struct cpp2_func__ret;
// Reenable structured bindings support.
#include <tuple>
template<> struct std::tuple_size<cpp2_func__ret> : std::integral_constant<int, 2> { };
template<> struct std::tuple_element<0, cpp2_func__ret> : std::type_identity<widget> { };
template<> struct std::tuple_element<1, cpp2_func__ret> : std::type_identity<widget> { };
struct cpp2_func__ret { // No longer an aggregate.
union { widget w; }; // Now an anonymous union member.
union { widget v; };
private: // For `cpp2_func` access.
friend auto cpp2_func(bool const& b) -> cpp2_func__ret;
cpp2_func__ret() { }
private:
template<class T> void move_construct(T&& that) noexcept(std::is_rvalue_reference_v<T&&>) {
auto _w = cpp2::out_member<widget>{&w};
auto _v = cpp2::out_member<widget>{&v};
_w.construct(std::move(that).w);
_v.construct(std::move(that).v);
}
template<class T> cpp2_func__ret& move_assign(T&& that) noexcept(std::is_rvalue_reference_v<T&&>) {
w = std::move(that).w;
v = std::move(that).v;
return *this;
}
public:
cpp2_func__ret(const cpp2_func__ret& that) { move_construct(that); }
cpp2_func__ret(cpp2_func__ret&& that) noexcept { move_construct(std::move(that)); }
cpp2_func__ret& operator=(const cpp2_func__ret& that) { return move_assign(that); }
cpp2_func__ret& operator=(cpp2_func__ret&& that) noexcept { return move_assign(std::move(that)); }
~cpp2_func__ret() {
std::destroy_at(&v);
std::destroy_at(&w);
}
void f() {
auto[a,b] = *this; // Test that structured bindings works at type scope.
}
// Reenable structured bindings support.
template<int I, class T> requires (I==0) friend auto get(T&& x) -> decltype((CPP2_FORWARD(x).w)) { return CPP2_FORWARD(x).w; }
template<int I, class T> requires (I==1) friend auto get(T&& x) -> decltype((CPP2_FORWARD(x).v)) { return CPP2_FORWARD(x).v; }
};
auto cpp2_func(bool const& b) -> cpp2_func__ret {
auto __res = cpp2_func__ret{};
auto w = cpp2::out_member<widget>{&__res.w};
auto v = cpp2::out_member<widget>{&__res.v};
w.construct(foo());
stuff();
if (b) {
v.construct(bar());
} else {
v.construct(baz());
}
return __res;
}
int main() {
auto [w, v] = cpp2_func(true);
assert(w.x == "1");
assert(v.x == "2");
}
I don't know what else could be broken by switching to anonymous union members.
-
Summarizing Herb's #540 (comment):
Although the issue author's reply #540 (comment) is:
Because (N)RVO happens via the return value,
Whereas on the
-
A follow-up that's pertinent to the multiple named return values case, is that it opens the language syntax to the possibility of splatting when chaining function calls. This is something other languages have had for a while (e.g. in Julia: |
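For comparison, today's C++ can approximate this kind of splatting with `std::apply`, which unpacks a tuple into a function's parameter list. The sketch below is only an analogy to the idea in this comment, not proposed cpp2 syntax; `div_mod`, `combine`, and `chained` are hypothetical names.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical function with two results, modeled as a tuple.
std::tuple<int, int> div_mod(int a, int b) {
    return {a / b, a % b};
}

// Hypothetical consumer that takes the two results as separate arguments.
int combine(int quotient, int remainder) {
    return quotient * 10 + remainder;
}

// "Splat" the tuple returned by div_mod into combine's parameter list.
int chained(int a, int b) {
    return std::apply(combine, div_mod(a, b));
}
```

With multiple named return values as a first-class concept, a language could in principle allow the direct form `combine(div_mod(a, b))` without the explicit `std::apply` step.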
-
I've been having similar thoughts recently about out parameters vs. return values. I don't expect this idea to be popular, but I'm leaning towards what Herb thought about earlier, that the way forward is no return values, just out parameters. I think it appropriately reduces complexity, unifies existing concepts, and does a better job of expressing the inputs vs. the outputs of a function by defining them in the same space. By only using
// Starting point:
func: (in a: int, out b: int) = { b = a * 5; }
// Don't need to name the return type, omit it:
func: (in a: int, out _: int) = { return a * 5; }
// Don't need the statement, we're only returning one expression:
func: (in a: int, out _: int) = a * 5;
// We're going to write many functions where we don't care to name the return value,
// why force ourselves to write a discard identifier every time? Omit it:
func: (in a: int, out: int) = a * 5;
// Return type can be deduced, omit it:
func: (in a: int, out) = a * 5;
To handle calling syntax, let's invent Universal Function Return Syntax (UFRS) (no idea if this has already been thought of, props to whoever did), which says that
I think for larger function calls, it does a great job of laying out what the inputs and outputs of the function are. It's nice that
As for cpp1 generation, there are a few different ways I could imagine this working, either emitting out parameters as return values and structs instead of out parameters, or making a
-
Ultimately though, I don't think it matters whether we toss out The most pressing issue that makes this worth talking about is the fact that we have two very different language constructs that express the exact same intent (output of a function) for both the function author and function caller, and the differences between them are usage-specific micro optimizations that I think cppfront can find ways of detecting and fixing automatically without additional cognitive overhead. |
-
Good point. Do you mean we may put
cls1: type = {
// declarations...
// : (inout this, in a, in b) -> (r)
fnc1: (inout this, out r, in a, in b) = {
r = a + b;
}
// declarations...
}
var1: cls1 = ();
var2: = var1.fnc1(0, 1);
// Parameters:
// this = var1
// r = [temporary object r]
// a = 0
// b = 1
// Assignment:
// var2 = r
var1.fnc1(out var2, 0, 1);
// Parameters:
// this = var1
// r = var2
// a = 0
// b = 1
So we can think in this way, if
cls1: type = {
// Multiple `out` parameters (return values)
operator=: (out this, out a, out b, in x, in y, in z) = {
// statements...
}
// `out m` cannot be implicitly a return value!
fnc1: (inout this, out a, in x, out m) = {
// statements...
}
}
(var1, a1, b1): cls1 = (0, 1, 2);
// Parameters:
// this = [temporary object this]
// a = [temporary object a]
// b = [temporary object b]
// x = 0
// y = 1
// z = 2
// Assignment:
// var1 = this
// a1 = a
// b1 = b
m1: i32;
var2: = var1.fnc1(0, out m1);
// Parameters:
// this = var1
// a = [temporary object a]
// x = 0
// m = m1
// Assignment:
// var2 = a
In the last line, we have to write
-
I think your idea (having Also it easily solves the surprising fact that why UFCS on What if we remove return values in favor of
NAME: (<this-param>, <out-params>..., <non-out-param>, <rest-params>...)
= {
/* statements... */
}
In which we have the following parameters:
By making Now let's explore what can be done in this way:
example: type = {
// It's the constructor with multiple `out` parameters (return values).
operator=: (out this, out a, in x, out b) = {
// statements...
}
}
b1: i32;
// Here, we don't set `a` parameter explicitly.
(var1, a1): example = (0, out b1);
// Parameters:
// this = [temporary object]
// a = [temporary object]
// x = 0
// b = b1
// Assignment:
// var1 = this
// a1 = a
// Here, we set `a` parameter explicitly.
var2: example = (out a1, 0, out b1);
// Parameters:
// this = [temporary object]
// a = a1
// x = 0
// b = b1
// Assignment:
// var2 = this
For example, the signature of
cls1: type = {
// declarations...
// : (inout this, in that) -> (r: cls1)
operator+: (inout this, out r: cls1, in that) = {
r = cls1();
r.value = this.value + that.value;
}
// declarations...
}
a: cls1 = ();
b: cls1 = ();
c: _ = a + b;
IMO this change is worth it.
-
Return values can be either
fnc1: () -> ( r: i32) = { ... }
fnc2: () -> (move r: i32) = { ... }
fnc3: () -> (forward r: i32) = { ... }
We may make Hence
// : () -> (r: i32) = { ... }
fnc1: (out r: i32) = { ... }
// : () -> (move r: i32) = { ... }
fnc2: (out r: i32) = { ... }
But we still need
// : () -> (forward r: i32) = { ... }
fnc3: (forwardout r: i32) = { ... }
IMO the name of
-
But from the readability point of view, IMO return types are better than Considering
cls1: type = {
operator=: (out this, out a, out b, x, y) = {
// statements
}
fnc1: (inout this, out a, out b, x, y) = {
// statements
}
}
These equivalent return types are more expressive:
cls1: type = {
// `out this` is changed to be a return value.
// It reduces concept count. It's the point of this topic here.
operator=: (x, y) -> (this, a, b) = {
// statements
}
fnc1: (inout this, x, y) -> (a, b) = {
// statements
}
}
So Although we write them as return values, the concept can still be the same as Because we can explicitly pass
(var1, a1, b1): cls1 = (0, 1);
// Parameters:
// this = [temporary object]
// a = [temporary object]
// b = [temporary object]
// x = 0
// y = 1
// Assignment:
// var1 = this
// a1 = a
// b1 = b
(var1, a1): cls1 = (out b1, 0, 1);
// Parameters:
// this = [temporary object]
// a = [temporary object]
// b = b1
// x = 0
// y = 1
// Assignment:
// var1 = this
// a1 = a
var1: cls1 = (out a1, out b1, 0, 1);
// Parameters:
// this = [temporary object]
// a = a1
// b = b1
// x = 0
// y = 1
// Assignment:
// var1 = this
cls1(out var1, out a1, out b1, 0, 1);
// Parameters:
// this = var1
// a = a1
// b = b1
// x = 0
// y = 1
(a1, b1): = var1.fnc1(0, 1);
// Parameters:
// this = var1
// a = [temporary object]
// b = [temporary object]
// x = 0
// y = 1
// Assignment:
// a1 = a
// b1 = b
a1: = var1.fnc1(out b1, 0, 1);
// Parameters:
// this = var1
// a = [temporary object]
// b = b1
// x = 0
// y = 1
// Assignment:
// a1 = a
var1.fnc1(out a1, out b1, 0, 1);
// Parameters:
// this = var1
// a = a1
// b = b1
// x = 0
// y = 1
It's suggested by @nmsmith but in a way that we can optionally pass them as
-
Implementation
There may be many ways to achieve this unification. Here's my 2 cents. Return Types are
-
Alternative syntax instead of
-
In C++, a semantic difference between out parameters and return values that I hadn't considered before is that the types of arguments are an input to overload resolution, while the return type is an output. I haven't thought about how that could affect cpp2.
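To make that difference concrete, here is a small C++ sketch: two overloads that differ only in the type of their out parameter can coexist, because the argument type takes part in overload resolution, whereas two functions that differ only in return type cannot. The function name `parse` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Overloads distinguished only by the type of the out parameter.
// The type of the variable passed by the caller selects the overload.
void parse(const std::string& text, int& out)    { out = std::stoi(text); }
void parse(const std::string& text, double& out) { out = std::stod(text); }

// By contrast, these two declarations could not coexist, because they
// differ only in return type:
//   int    parse_value(const std::string& text);
//   double parse_value(const std::string& text);  // error: cannot overload
```

If out parameters were rewritten as return values, a call like `parse("42", i)` would need some other way to pick the `int` version, for example by deducing from the declared type of the destination variable.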
-
Alternative Syntax
It has fewer parentheses. It removes the need of
fnc1: (a: i32, b: i32 -> x: i32, y: i32) = { x = a + b; y = 0; }
// It's equal to:
// fnc1: (a: i32, b: i32) -> (x: i32, y: i32) = { x = a + b; y = 0; }
fnc2: (a: i32, b: i32 -> x: i32) = { x = a + b; }
// It's equal to:
// fnc2: (a: i32, b: i32) -> (x: i32) = { x = a + b; }
fnc3: (a: i32, b: i32) = {}
// It's equal to:
// fnc3: (a: i32, b: i32) = {}
When we call functions:
(x, y): = fnc1(0, 1);
fnc1(0, 1 -> x, y); // similar to declaration
// It's equal to:
// fnc1(0, 1, out x, out y);
x: = fnc2(0, 1);
fnc2(0, 1 -> x); // similar to declaration
// It's equal to:
// fnc2(0, 1, out x);
fnc3(0, 1);
Also declaration syntax is consistent with use syntax.
-
Motivation
An out parameter works as follows:
This is a very useful pattern, but notably, this is very similar to how return values already work in most ABIs. (In most ABIs, whenever a return type requires more than a few registers, the caller must provide a pointer to the memory location that the return value should be written to.)
Consequently, I propose making a slight tweak to out parameters to unify them with the notion of return values. The end result would be that cpp2 has one fewer concept, without sacrificing expressive power or performance. In fact, it will likely make cpp2 programs more performant, because the proposed solution implies NRVO.
The proposal
In today's design, out parameters look like this:
I am proposing to write them like this instead:
The differences are:
out parameters are moved to the right side of the ->. In other words, they are treated as part of the return type.
s1, s2 = foo(), instead of foo(s1, s2).
But notably, the compilation strategy remains the same:
Basically, what we end up with is an intuitive syntax for guaranteed NRVO. Accordingly, it should be possible to return immovable types (e.g. std::atomic) via this approach. (Because in the style of C++17, we're not "optimizing away" a move. Instead, we're saying that no moves are required.) Ultimately, this is equivalent to out parameters: all we've done is change the syntax.
The likely benefits of this approach include:
g(h(...), j(...))), but with out parameters (where functions return void) it is not.
out parameters. Instead, we would just need to allow return values to be given names. (Indeed, cpp2 already has syntax for this.)
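The point about immovable types can be illustrated with what C++17 already guarantees for unnamed return values. In this minimal sketch, `Pinned`, `make_pinned`, and `read_pinned` are hypothetical names; the type deletes both its copy and move constructors, yet it can still be returned by value, because the object is constructed directly in the caller's storage.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical immovable type: copy and move are both deleted.
struct Pinned {
    std::atomic<int> value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Returning an unnamed temporary (a prvalue): since C++17 no copy or move
// is required, so this compiles even though Pinned is immovable.
Pinned make_pinned(int v) {
    return Pinned{v};
}

int read_pinned() {
    Pinned p = make_pinned(7);  // p is initialized in place
    return p.value.load();
}
```

The guarantee stops at named variables: writing `Pinned p{v}; return p;` inside `make_pinned` would not compile, because standard C++ still requires an accessible copy or move constructor there. That gap is what guaranteed NRVO, and the proposal above, would close.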