This issue was moved to a discussion.

To decide: Should calls to evaluate also return bindings? #234

Closed
OAGr opened this issue Apr 12, 2022 · 11 comments
Labels
Language Regarding Squiggle language semantics, distributions and function registry

Comments

@OAGr
Contributor

OAGr commented Apr 12, 2022

On this PR, @Hazelfire mentioned:

"I think I'd like to see a couple of critical features that might require you to give up on the concept of lisp like purity within the reducer"

I think it's critical that calls to evaluate also return the bindings that were created during that expression. This is because I want to share these variables between different executions. I think evalWBindings will be the main part of the API. But to get the bindings object from previous executions, it needs to return the bindings at some point.

On the other hand, this would break some purity. Arguably, these functions could just return records with all the intended bindings, like so:
{ x=1; y=2; %{x: 1, y: 2}}
or
{ ... {firstDistribution: firstDistribution, secondDistribution: secondDistribution} }
(Examples from @umuro)

@umuro
Contributor

umuro commented Apr 12, 2022

For a language to be consistent, it has to have semantics that do not depend on context. As far as I can see from the PR comments, you are not designing a functional language but a script evaluator that changes the values of some global variables ad hoc. So I will suggest some functional semantics to understand what you want.

Case of returning nothing
{ x=1; y=2 }
Accepting this and hunting for variables in the bindings is the wrong approach, because the bindings expose all the internals of the reducer. And we have lazy evaluation in the mix: internal variables are not always reduced to values!
The right approach would be to define a semantic rule. And we can easily define a semantic rule that does not violate purity.
Let the above block be semantically equivalent to
{ x=1; y=2; %{x: 1, y: 2}}
And you can explain in the user manual that returning nothing is actually constructing a record of all global variables.
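
As a rough illustration of that rule, here is a Python sketch (not the reducer's actual code). The `eval_block` function and the tuple-based statement format are assumptions invented for this example. A block that ends with an assignment evaluates to a record of its variables; a block that ends with an expression evaluates to that expression.

```python
# Hypothetical mini-evaluator, for illustration only.
# A block is a list of statements:
#   ("let", name, fn)  binds name to fn(env)
#   ("expr", fn)       is a bare expression evaluated against env
def eval_block(statements):
    env = {}
    last_value, ends_with_expr = None, False
    for stmt in statements:
        if stmt[0] == "let":
            _, name, fn = stmt
            env[name] = fn(env)
            ends_with_expr = False
        else:
            _, fn = stmt
            last_value, ends_with_expr = fn(env), True
    # Semantic rule: a block that does not end with an expression
    # is equivalent to one that returns a record of its variables.
    return last_value if ends_with_expr else dict(env)


no_return = [("let", "x", lambda env: 1), ("let", "y", lambda env: 2)]
with_return = no_return + [("expr", lambda env: env["x"] + env["y"])]

print(eval_block(no_return))    # {'x': 1, 'y': 2}
print(eval_block(with_return))  # 3
```

The caller never sees the evaluator's internal state, only a value that the rule defines.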

Expressions that are not assigned

{ doSth; 
  2;  }

The right semantics for this:

{ _ = doSth; 
  2; }

This is possible. Because even in functional languages we need to create side effects sometimes. But that reads as "I want to create a side-effect". Is it really what you want? Side effects that are changing things that are not visible in the script! There is something that smells there. Think twice. Think quadruple... Maybe there is another way that can do it explicitly.

An example of getting rid of no assignment statements and making them explicit
If you allow side effects here and there, it can easily get out of control. I even do expression logging and debugging within the functional realm.

 x= 1 + 2
 log("x"++x) // bad bad bad

x = inspect(1+2, ~label="x: ") // Good boy
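
A minimal Python sketch of the idea behind `inspect`, assuming it behaves as an identity function that also reports its argument (the signature here is a guess based on the call above, not the reducer's definition):

```python
def inspect(value, label=""):
    """Report the value with a label, then return the same value unchanged."""
    print(f"{label}{value}")
    return value


# The logging is part of the expression that defines x,
# rather than a separate statement whose result is thrown away.
x = inspect(1 + 2, label="x: ")   # prints "x: 3"
y = x * 2
```

Because `inspect` returns its argument, removing the call does not change any value in the script.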

Returning multiple results
Well, we have arrays and records

{ ...
[firstDistribution, secondDistribution]
}

or

{ ...
{firstDistribution: firstDistribution, secondDistribution: secondDistribution}
} 

A note about semantic rules.
In doSth vs _ = doSth, I am not saying you have to write it that way; the semantic rule can convert it along the way.
Instead of hard-coding meaning into the parser, I defend external semantic rules so that we can judge their consistency. Otherwise you get an ultra-fat, undebuggable parser.

A typical semantic rule example
[1, 2, 3] is (cons 1 (cons 2 ( cons 3 emptyList)))
See, nobody is changing the actual interpreter code to add array construction to their languages
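
To make the rule concrete, here is a small Python sketch of such a rewrite. The tuple encoding of `cons` and `emptyList` is an assumption chosen for illustration; the point is only that the array literal is translated before evaluation and the evaluator itself never learns about arrays.

```python
from functools import reduce

EMPTY_LIST = ("emptyList",)


def cons(head, tail):
    return ("cons", head, tail)


def desugar_list(items):
    """Rewrite [a, b, c] as (cons a (cons b (cons c emptyList)))."""
    # Fold from the right: start with emptyList and wrap it in cons cells.
    return reduce(lambda tail, head: cons(head, tail), reversed(items), EMPTY_LIST)


print(desugar_list([1, 2, 3]))
# ('cons', 1, ('cons', 2, ('cons', 3, ('emptyList',))))
```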

And with explicit semantic rules, we can explain to the user how the language works... instead of saying "this is what you write, but this is what we do behind the scenes".

An actual good case of returning nothing in a functional language
I love TradingView. It allows me to write consistent trading strategies using PineScript. PineScript is a functional scripting language (read: there are no side effects). They allow you to write scripts without any return value; such a script actually constructs a record of all the variables in it. And that is very useful, because in strategy scripts it is normal to construct and return 10-20 different things, and it would be too much hassle to return them explicitly.
In their UI, they simply execute aRecord = yourscript; aRecord.keys->map { eachKey | aRecord[eachKey]->display}
Functional languages have syntactic sugar and semantic rules for the convenience of the user.
If there is a feature for which you cannot find a semantic rule, and you have to redesign the parser, that means the language is no longer functional. And you will be lost in the hell of side effects.
A good case for auto constructing record of variables: Making function modules
@Sam Nolan reminded me: "What if we define a script of functions only?" That's a script of only assignments without record.
{ fn1 = ...; fn2=... }
Adding the semantic rule to construct a record of variables solves this case.

However, if you go this way, nobody should map the function internals to TypeScript or try to access them!!! Because they contain wild unreduced code.

There is always a solution without breaking purity.
Always keep in mind, code with side effects is unrepeatable, untestable code.

Let's see what you need.

@Hazelfire Hazelfire mentioned this issue Apr 12, 2022
@Hazelfire
Collaborator

Just some context about why I think this is the most important feature after #226. I'm personally very excited about reducer, but I'm also excited that we might be able to get back to Squiggle Notebooks again. The reason why I am talking to you about returning bindings from calls is because it would be great if something like below (where donation_size is referenced in a succeeding cell) could be done with reducer.

image

I am extremely stupid, and I usually think of the dumbest and simplest way to do things, so in my mind the easiest way would be to return the bindings at the end of the execution call. If that was done, could I get reducer-style Squiggle notebooks?

@Hazelfire
Collaborator

Hazelfire commented Apr 12, 2022

I mean like, internally, I could imagine that every code block returns something like:

x = 2
y = 3
x + y
x - y

Actually ends up returning:

x = 2
y = 3
%{exports: [x + y, x - y], bindings: %{x: x, y: y}}

Together with a way to pass the bindings back in as context, that would be fine. I don't think that breaks purity? If that would be an issue, could you tell me what type of semantics the above notebook ends up having?
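
A Python sketch of what this proposal could look like from the caller's side. The `eval_cell` name and the statement format are hypothetical, and this is not how the reducer is implemented. Each cell takes the previous bindings as an explicit input and returns both its unnamed results and the updated bindings.

```python
def eval_cell(statements, context=None):
    """Evaluate one cell; return unnamed results plus the updated bindings."""
    bindings = dict(context or {})   # copy, so the caller's context is not mutated
    exports = []
    for stmt in statements:
        if stmt[0] == "let":
            _, name, fn = stmt
            bindings[name] = fn(bindings)
        else:
            _, fn = stmt
            exports.append(fn(bindings))
    return {"exports": exports, "bindings": bindings}


cell1 = [
    ("let", "x", lambda b: 2),
    ("let", "y", lambda b: 3),
    ("expr", lambda b: b["x"] + b["y"]),
    ("expr", lambda b: b["x"] - b["y"]),
]
cell2 = [("expr", lambda b: b["x"] * 10)]   # refers to x from the previous cell

first = eval_cell(cell1)
second = eval_cell(cell2, context=first["bindings"])
print(first)              # {'exports': [5, -1], 'bindings': {'x': 2, 'y': 3}}
print(second["exports"])  # [20]
```

In this sketch `eval_cell` stays a pure function of its two arguments. Whether the same holds once the bindings come from inside the reducer is the question under discussion here.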

Also now you are enabling such stupid scripts with nameless array of exports.
{
1; 4; 5; 1999; "hello"; 1267
}
%{exports: [1, 4, 5, 1999, "hello", 1267], bindings: %{}}

And you are burdening the runtime memory usage, because people can just write long scripts even though they won't use the unassigned values there. Memory nightmare.

Hmm one moment. There is no need to give users an array constructor. They can always write a block. Shouldn't all languages be doing that? :)

@umuro
Contributor

umuro commented Apr 12, 2022

I think just a record of all variables is just what you need. (PineScript does it).
You want your variables. There they are. It would be doing what you want from the actual bindings.

{
x = 2
y = 3
} == {x: 2, y: 3}

And just name x+y and x-y so that they are there also. Otherwise this practice leads to many bugs... Nameless things lead you to make indexing mistakes. Returning an array of exported values is asking for trouble.

@umuro
Contributor

umuro commented Apr 12, 2022

Returning the variable record and actually returning bindings differ hugely, because with bindings every expression has to return a tuple that includes the bindings. x = 1 + 2 becomes (x, bindingsNow) = (bindings, 1)->{ prevBindings | (prevBindings, 2)->add }
And we end up writing all functions to accept values paired with bindings, etc.
Then, to prevent that, we write 2 parsers: one that executes inner blocks, and a special one just for the outer block. Etc.
Don't insist on the bindings, and be OK with a record of globals. Otherwise:

  • inner blocks will not be possible easily
  • runtime code size will grow too much
  • It will even be more unreadable
  • You will have debugging difficulties on your long scripts. If there won't be long scripts, then why are we making a language? We are making a language, not a statement list executor.

Because in the future people will be able to write

{
x = { y=1;  y+1}
y = inject [1,2,3], { (acc, elem) = acc + elem }
}

Instead of getting a record of variables then the code above becomes

{
(x, _bindings1) = { y=1;  y+1}
(y, _bindings2) = inject [1,2,3], { (acc, elem) = {(n, bindings) = acc ;  acc + elem} }
}

To be able to return actual bindings from blocks, we end up making every expression return bindings.

Therefore a block is a scope that returns one thing. And returning a record of all variables falls into that category without stretching.

This way, there are no exceptional cases in the language and I can just write

{
r = {x = 1; y = 2}
q = r.x + r.y
}

GET A RECORD OF VARIABLES.

That said, now I am thinking about removing the record constructor {a: 1, b: 2} altogether from the next parser. There is no need. Every block can return a record anyway...

AND
I don't want to see code like this in the interpreter:
(answer, bindings, exports) = aBlock->reduce( (void, emptyBindings, emptyExports), statement => { (newAnswer, newBinding, newExport) = exec statement; (newAnswer, newBinding++bindings, newExport++exports) })
You are really arguing for turning the interpreter into this via exports and bindings. Therefore, once again: get a record of globals, not the bindings.

Anyway, I will return a record of globals and start calling it bindings to appease you :)
To avoid writing:

  • Separate outer and inner block interpreters in addition to an expression interpreter.

And the actual bindings will have extra variables you never declared, because of future optimization. Example: {y = x+1+2; z=x+1+3}. The optimizer can turn this into {tmp1 = x+1; y=tmp1+2; z = tmp1+3}. The order of execution of statements will also change for the sake of optimization. You will get your anonymous exports in a random order!
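
A small Python sketch of that difference. The `run` helper and the hand-written "optimized" program are assumptions for illustration; no real optimizer is involved. The raw environment picks up the temporary, while a record built from the declared names is the same before and after the rewrite.

```python
def run(statements, x):
    """Evaluate (name, fn) statements in order and return the raw environment."""
    env = {"x": x}
    for name, fn in statements:
        env[name] = fn(env)
    return env


original = [
    ("y", lambda e: e["x"] + 1 + 2),
    ("z", lambda e: e["x"] + 1 + 3),
]
# What a hypothetical optimizer might produce by sharing x + 1:
optimized = [
    ("tmp1", lambda e: e["x"] + 1),
    ("y", lambda e: e["tmp1"] + 2),
    ("z", lambda e: e["tmp1"] + 3),
]
declared = ["y", "z"]   # the names the user actually wrote

raw = run(optimized, x=10)
record = {name: raw[name] for name in declared}

print(sorted(raw))  # ['tmp1', 'x', 'y', 'z']
print(record)       # {'y': 13, 'z': 14}
```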

A language where changing the order of statements changes the results is not a functional language anymore! It is as good as trashing it and returning to genuine JavaScript.

From now on I am calling a record of global variables, bindings!

@umuro
Contributor

umuro commented Apr 12, 2022

Now I will propose a new perspective to test what you want.
To achieve what you want, there is another way. Just use JavaScript itself.
Only a simple text processor is enough to make it into what you describe.
{ x=1; y=2; x-y; x+y}
can be converted by a text processor into
{ exports = []; x=1; addToBindings 'x', x; y=2; addToBindings 'y', y; addToExports x-y; addToExports x+y}
Sprinkle it with a few extra functions (map, reduce) and it looks quite functional.

I keep asking: why do you need a language interpreter other than JavaScript?

@umuro
Contributor

umuro commented Apr 13, 2022

OPTIMIZATION ASPECT
In the future, an optimizer might

  • Change the execution order of statements
  • Add new variables to share execution
  • Remove variables to prevent unnecessary execution steps

Change the execution order of statements: an exported array of expressions is no longer reliable.
Add new variables to share execution: the bindings will be cluttered.
Remove variables to prevent unnecessary execution steps: variables that you depend on will disappear. Why the hell would the optimizer remove variables? Because the combined expression, without the variable assignment, can sometimes be simplified. If I force the variable to exist, I prevent that simplification too.

So relying on exports and bindings says "don't optimize". We should look for something more explicit.

proportion_funding_available = 0.7 to 0.8
total_funding_available = donation_size * proportion_funding_available
export proportion_funding_available, total_funding_available

This is even more proper

import donation_size
proportion_funding_available = 0.7 to 0.8
total_funding_available = donation_size * proportion_funding_available
export proportion_funding_available, total_funding_available

Maybe import/export are not the right keywords. Think

need donation_size
proportion_funding_available = 0.7 to 0.8
total_funding_available = donation_size * proportion_funding_available
show  proportion_funding_available, total_funding_available
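
A Python sketch of how a cell with explicit inputs and outputs might be evaluated. The `eval_cell` function, its parameters, and the use of plain numbers in place of distributions are all assumptions for illustration; `0.75` stands in for `0.7 to 0.8`.

```python
def eval_cell(needs, shows, statements, provided):
    """Evaluate a cell that declares its inputs (needs) and outputs (shows)."""
    missing = [name for name in needs if name not in provided]
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    # Only the declared inputs are visible inside the cell.
    env = {name: provided[name] for name in needs}
    for name, fn in statements:
        env[name] = fn(env)
    # Only the declared outputs leave the cell.
    return {name: env[name] for name in shows}


statements = [
    ("proportion_funding_available", lambda e: 0.75),
    ("total_funding_available",
     lambda e: e["donation_size"] * e["proportion_funding_available"]),
]

result = eval_cell(
    needs=["donation_size"],
    shows=["proportion_funding_available", "total_funding_available"],
    statements=statements,
    provided={"donation_size": 1000, "unrelated": 42},
)
print(result)
# {'proportion_funding_available': 0.75, 'total_funding_available': 750.0}
```

With the interface declared, anything not listed under `shows` is free to be reordered, shared or removed without the caller noticing.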

@Hazelfire

Also, imagine I add type checking to this language. Blocks that do not return or export anything are a type-checking headache.

@quinn-dougherty quinn-dougherty added the Language Regarding Squiggle language semantics, distributions and function registry label Apr 13, 2022
@umuro
Contributor

umuro commented Apr 13, 2022

proportion_funding_available = 0.7 to 0.8
total_funding_available = donation_size * proportion_funding_available

Implicit import/export will be available. While discussing with Sam, I found a way to solve both within the design of the language.

@OAGr
Contributor Author

OAGr commented Apr 13, 2022

Great to hear! Looking forward to the future steps for that.

@umuro
Contributor

umuro commented Apr 14, 2022

@Hazelfire, @OAGr A future optimization is cancelled.
When a variable was no longer used in the following expressions, we could discard it so that it could be garbage collected ASAP. That is a usual assumption for a functional language. The new approach is to keep everything until the end.

Also, when a variable was not used, it would not be evaluated. That is cancelled as well...
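
A Python sketch of the trade-off, using explicit thunks to stand in for lazy evaluation. The `Lazy` class and `expensive` helper are hypothetical, not part of the reducer. An unused binding is never evaluated until something asks for every binding as a value.

```python
class Lazy:
    """A binding whose value is computed at most once, on first use."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


calls = []


def expensive(name, value):
    calls.append(name)   # record that this binding was actually evaluated
    return value


bindings = {
    "used": Lazy(lambda: expensive("used", 1)),
    "unused": Lazy(lambda: expensive("unused", 2)),
}

result = bindings["used"].force() + 1
print(calls)   # ['used'], because "unused" was never evaluated

# Returning every binding as a value forces all of them:
everything = {name: b.force() for name, b in bindings.items()}
print(calls)   # ['used', 'unused']
```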

@OAGr
Contributor Author

OAGr commented Apr 16, 2022

This would be nice to eventually add, but would need something like this.
#301

umuro pushed a commit to umuro/squiggle that referenced this issue Apr 17, 2022
@quantified-uncertainty quantified-uncertainty locked and limited conversation to collaborators Apr 18, 2022
@quinn-dougherty quinn-dougherty converted this issue into discussion #310 Apr 18, 2022


4 participants