Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add the spread operator: function calls #78

Open
drsm opened this issue Dec 20, 2018 · 31 comments
Open

add the spread operator: function calls #78

drsm opened this issue Dec 20, 2018 · 31 comments

Comments

@drsm
Copy link
Contributor

drsm commented Dec 20, 2018

console.log(1,...[2,3,4], 5, 6)

http://www.ecma-international.org/ecma-262/6.0/#sec-argument-lists-runtime-semantics-argumentlistevaluation
http://www.ecma-international.org/ecma-262/9.0/#sec-argument-lists

@277hz
Copy link

277hz commented Jun 3, 2021

@xeioex As everything needed to make it work already is implemented, I managed to come up with a patch for njs_parser.c: https://gist.github.com/277hz/b68936f0b845d96a3dda2bfaf025cb69

>> ((a, b, c, d) => console.log(a, b, c, d))(...[1, 3], ...[2, 4]);
1 3 2 4

>> ((a, b, c, d) => console.log(a, b, c, d))(...[1, 3], [2, 4]);
1 3 [2,4] undefined

>> ((a, b, c, d) => console.log(a, b, c, d))(...[1, 3], [2], 4]);
1 3 [2] 4

Edit: For arrays it is also already implemented halfway:
https://github.com/nginx/njs/blob/master/src/njs_parser.c#L1621-L1627

Edit2: Sorry, found a bug, will rework it. Applying the same logic to array does however seem to work, too.

@xeioex
Copy link
Contributor

xeioex commented Jun 3, 2021

Hi @277hz ,

Thanks for the patch

Before submitting next time, please consider running make unit_test and test262 test suite and reporting the test report diff.

how_to.txt
test262.patch

@277hz
Copy link

277hz commented Jun 4, 2021

I spent some more time trying to solve this and think I am very close now to a working solution for functions and arrays.
It is a little bit more complex than I anticipated. :)
@xeioex The patch was mainly ment to be a placeholder for my thoughts and to let you know that I am working on it. Unit testing makes sense of course.

Edit:
While still having some minor issues.

>> JSON.stringify(((a,b,c,d,e) => [a,b,c,d,e])(...[...[[1],...[[[2,...[...[3,4]]],...[3]]],3],2,3]))
'[[1],[[2,3,4],3],3,2,3]'

>> JSON.stringify([...[1,...[2],[[3,...[4]]]],3,...[5,...[6]],...[[[8,...[9]],...[10]],...[11]]])
'[1,2,[[3,4]],3,5,6,[[8,9],10],11]'

Getting the same results in e.g. Chrome:

'[[1],[[2,3,4],3],3,2,3]' == JSON.stringify(((a,b,c,d,e) => [a,b,c,d,e])(...[...[[1],...[[[2,...[...[3,4]]],...[3]]],3],2,3]))
true

'[1,2,[[3,4]],3,5,6,[[8,9],10],11]' == JSON.stringify([...[1,...[2],[[3,...[4]]]],3,...[5,...[6]],...[[[8,...[9]],...[10]],...[11]]])
true

Edit2:
make unit_test is looking good so far:

njs("[...]")
expected: "SyntaxError: Unexpected token "]" in 1"
     got: "SyntaxError: Missing "[" in spread expression in 1"
script tests: FAILED [5402/5403]
safe script tests: PASSED [5/5]
denormals tests: PASSED [9/9]
disabled denormals tests: PASSED [7/7]
module tests: PASSED [5/5]
externals tests: PASSED [84/84]
shared tests: PASSED [26/26]
interactive tests: PASSED [57/57]
timezone tests: PASSED [28/28]
regexp tests: PASSED [14/14]
vm_json tests: PASSED [6/6]
vm_value tests: PASSED [8/8]
vm_internal_api tests: PASSED [67/67]
TOTAL: FAILED [5718/5719]

test262 is running...

Edit3:
Some spread tests are still failing unforunately, but at all it seems very promising to me:

 === Summary ===
  - Ran 34658 tests
- - Passed 13411 tests (38.7%)
- - Failed 21247 tests (61.3%)
+ - Passed 13423 tests (38.7%)
+ - Failed 21235 tests (61.3%)

Edit4:

# grep -E "^(\+|\-).*(failed|passed).*" feature.diff
-=== language/expressions/array/spread-err-mult-err-unresolvable failed in strict mode ===
+language/expressions/array/spread-err-mult-err-unresolvable passed in strict mode
-=== language/expressions/array/spread-err-sngl-err-unresolvable failed in strict mode ===
+language/expressions/array/spread-err-sngl-err-unresolvable passed in strict mode
-=== language/expressions/array/spread-mult-iter failed in strict mode ===
-=== language/expressions/array/spread-mult-literal failed in strict mode ===
+=== language/expressions/array/spread-mult-iter failed in strict mode ===
+language/expressions/array/spread-mult-literal passed in strict mode
-=== language/expressions/array/spread-sngl-literal failed in strict mode ===
+language/expressions/array/spread-sngl-literal passed in strict mode
-=== language/expressions/call/spread-mult-empty failed in strict mode ===
+language/expressions/call/spread-mult-empty passed in strict mode
-=== language/expressions/call/spread-mult-literal failed in strict mode ===
+language/expressions/call/spread-mult-literal passed in strict mode
-=== language/expressions/call/spread-sngl-empty failed in strict mode ===
+language/expressions/call/spread-sngl-empty passed in strict mode
-=== language/expressions/call/spread-sngl-literal failed in strict mode ===
+language/expressions/call/spread-sngl-literal passed in strict mode
-=== language/expressions/new/spread-mult-empty failed in strict mode ===
+language/expressions/new/spread-mult-empty passed in strict mode
-=== language/expressions/new/spread-mult-literal failed in strict mode ===
+language/expressions/new/spread-mult-literal passed in strict mode
-=== language/expressions/new/spread-sngl-empty failed in strict mode ===
+language/expressions/new/spread-sngl-empty passed in strict mode
-=== language/expressions/new/spread-sngl-literal failed in strict mode ===
+language/expressions/new/spread-sngl-literal passed in strict mode

Failing happens for iterable objects as far as I can tell.

@277hz
Copy link

277hz commented Jun 4, 2021

Not all aspects of spreading will be solveable in the parser, but it should reduce the complexity without being much of a performance penalty.
Going further will require me to dig a bit deeper into the codebase.

@277hz
Copy link

277hz commented Jun 5, 2021

Getting closer :)

>> var a = [1,2,3], b = [0,...a,[4,...a]]; console.log(a,b);
[1,2,3] [0,1,2,3,[4,1,2,3]]

@277hz
Copy link

277hz commented Jun 6, 2021

@drsm @xeioex @lexborisov My idea is to add an operation if a name to be spread is found. Then copy the spread variable into the target array during, or more precise just before, NJS_VMCODE_PROPERTY_INIT and shift the following indexes accordingly. For this I would like to overwrite property of njs_vmcode_prop_set_t. Is there any way to make sure the link is not used anywhere else, or would it be better to just alloc a new njs_value_t and set the property operand to that link?

@drsm
Copy link
Contributor Author

drsm commented Jun 6, 2021

@277hz

shift the following indexes accordingly.

I don't think this is possible, because the shift index is dynamic.

Maybe the same way we do in template literals will work:

>> var a = [1,2,3]
undefined
>> var t0 = Array.prototype.concat.call([], 4, a)
undefined
>> t0[Symbol.isConcatSpreadable] = false
false
>> var b = Array.prototype.concat.call([], 0, a, t0)
undefined
>> b
[
 0,
 1,
 2,
 3,
 [
  4,
  1,
  2,
  3
 ]
]

@277hz
Copy link

277hz commented Jun 6, 2021

@drsm

I don't think this is possible, because the shift index is dynamic.

Using concat was on my mind, too, but I am looking for a way to avoid the overhead of allocating and copying over and over.
It does not occur to me that the shifting is dynamic when doing it at the NJS_VMCODE_PROPERTY_INIT step, as we are exactly at the position in the array where we want to be. From there we can add the elements from the spread (which needs to be iterable anyway), skip the next njs_vmcode_prop_set_t (that would add the name to the array otherwise) and let the vm continue as usual. As the next code step can be anything of course, there needs to be some logic to make sure all njs_vmcode_prop_set_t related to the initial array are shifted accordingly. Maybe I am wrong, can not know it better than you do :).

Edit:
I think I have a basic working concept for what I described:

# ./njs -c "var a = [0,1,...[2]]; var b = [...a,[...a,[-1,[-2,[-3,[[...a]],...a],-4],-5],-6],-7]; console.log(JSON.stringify(b));"
[0,1,2,[0,1,2,[-1,[-2,[-3,[[0,1,2]],0,1,2],-4],-5],-6],-7]

Same with node:

# node -e "var a = [0,1,...[2]]; var b = [...a,[...a,[-1,[-2,[-3,[[...a]],...a],-4],-5],-6],-7]; console.log(JSON.stringify(b));"
[0,1,2,[0,1,2,[-1,[-2,[-3,[[0,1,2]],0,1,2],-4],-5],-6],-7]

Edit2:
Most cases should be covered now and work as expected. As I was curious how it performs against node I ran a little test, showing my approach performs faster when spreading arrays. I also added support for Symbol.iterator, which is a little slower than node for now.

# make unit_test
...
TOTAL: PASSED [5719/5719]
 === Summary ===
  - Ran 34658 tests
- - Passed 13411 tests (38.7%)
- - Failed 21247 tests (61.3%)
+ - Passed 13431 tests (38.8%)
+ - Failed 21227 tests (61.2%)

@277hz
Copy link

277hz commented Jun 11, 2021

Small comparison between output of node and my evolving patch, first line corresponds to node, second to njs:

> var x = [0,1]; console.log(x.length, JSON.stringify(x = [,,...x,,...x,-1,-2,,[...x,,],,]), x.length);
2 [null,null,0,1,null,0,1,-1,-2,null,[0,1,null],null] 12
2 [null,null,0,1,null,0,1,-1,-2,null,[0,1,null],null] 12
> var x = [0,1]; console.log(x.length, JSON.stringify(x = [,...x,...x,-1,-2,,[...x,,]]), x.length);
2 [null,0,1,0,1,-1,-2,null,[0,1,null]] 9
2 [null,0,1,0,1,-1,-2,null,[0,1,null]] 9
> var x = [0,1]; console.log(x.length, JSON.stringify(x = [...x,...x,-1,-2,,[...x,,]]), x.length);
2 [0,1,0,1,-1,-2,null,[0,1,null]] 8
2 [0,1,0,1,-1,-2,null,[0,1,null]] 8
> var x = [0,1]; console.log(x.length, JSON.stringify(x = [...x,...x,-1,-2,,[...x,,-3],-4,...x]), x.length);
2 [0,1,0,1,-1,-2,null,[0,1,null,-3],-4,0,1] 11
2 [0,1,0,1,-1,-2,null,[0,1,null,-3],-4,0,1] 11
> var x = [0,1]; console.log(x.length, JSON.stringify(x = [[[...x],...x],-1,-2,,[...x,,-3],-4,...x,...[...x],-5]), x.length);
2 [[[0,1],0,1],-1,-2,null,[0,1,null,-3],-4,0,1,0,1,-5] 11
2 [[[0,1],0,1],-1,-2,null,[0,1,null,-3],-4,0,1,0,1,-5] 11
> var x = [,,0,1,,]; console.log(x.length, JSON.stringify(x = [,[...x],...[[...x]],-1,-2,,...[[...x],,-3],-4,...x]), x.length);
5 [null,[null,null,0,1,null],[null,null,0,1,null],-1,-2,null,[null,null,0,1,null],null,-3,-4,null,null,0,1,null] 15
5 [null,[null,null,0,1,null],[null,null,0,1,null],-1,-2,null,[null,null,0,1,null],null,-3,-4,null,null,0,1,null] 15
> for(let jj = 0; jj < 2; jj++) { let x = []; for(let i = 0; i < 3; i++) console.log(x.length, JSON.stringify(x = [...x, i]), x.length); }
0 [0] 1#1 [0,1] 2#2 [0,1,2] 3#0 [0] 1#1 [0,1] 2#2 [0,1,2] 3#
0 [0] 1#1 [0,1] 2#2 [0,1,2] 3#0 [0] 1#1 [0,1] 2#2 [0,1,2] 3#
> for(let jj = 0; jj < 2; jj++) { let x = [,]; for(let i = 0; i < 3; i++) console.log(x.length, JSON.stringify(x = [...x, i]), x.length); }
1 [null,0] 2#2 [null,0,1] 3#3 [null,0,1,2] 4#1 [null,0] 2#2 [null,0,1] 3#3 [null,0,1,2] 4#
1 [null,0] 2#2 [null,0,1] 3#3 [null,0,1,2] 4#1 [null,0] 2#2 [null,0,1] 3#3 [null,0,1,2] 4#
> for(let jj = 0; jj < 2; jj++) { let x = []; for(let i = 0; i < 3; i++) console.log(x.length, JSON.stringify(x = [, ...x, i]), x.length); }
0 [null,0] 2#2 [null,null,0,1] 4#4 [null,null,null,0,1,2] 6#0 [null,0] 2#2 [null,null,0,1] 4#4 [null,null,null,0,1,2] 6#
0 [null,0] 2#2 [null,null,0,1] 4#4 [null,null,null,0,1,2] 6#0 [null,0] 2#2 [null,null,0,1] 4#4 [null,null,null,0,1,2] 6#
> for(let jj = 0; jj < 2; jj++) { let x = [,]; for(let i = 0; i < 2; i++) console.log(x.length, JSON.stringify(x = [, ...x, i,,]), x.length); }
1 [null,null,0,null] 4#4 [null,null,null,0,null,1,null] 7#1 [null,null,0,null] 4#4 [null,null,null,0,null,1,null] 7#
1 [null,null,0,null] 4#4 [null,null,null,0,null,1,null] 7#1 [null,null,0,null] 4#4 [null,null,null,0,null,1,null] 7#

Did I miss any other weird combinations that are possible in javascript?

Performance wise it looks like this:

njs 0.6.0
x: let x = []; / o: let o = x.length; / a: for(let i = 0; i < j; i++) x = [...x, i + o]; / b: x = x.reduce((x, y) => [...x, y], []); / c: for(let i = 0; i < 1000; i++) x = [...x];
...
3.  run:   time:                               j / x.length:               x:
11   a:     2 ms    b:     1 ms    c:     2 ms  1000 /   1000  [0,1].......[998,999]
12   a:     8 ms    b:    10 ms    c:     5 ms  2000 /   3000  [0,1].....[2998,2999]
13   a:    22 ms    b:    34 ms    c:     9 ms  3000 /   6000  [0,1].....[5998,5999]
14   a:    59 ms    b:    89 ms    c:    17 ms  4000 /  10000  [0,1].....[9998,9999]
15   a:   232 ms    b:   229 ms    c:    30 ms  5000 /  15000  [0,1]...[14998,14999]
 
node v16.2.0
x: let x = []; / o: let o = x.length; / a: for(let i = 0; i < j; i++) x = [...x, i + o]; / b: x = x.reduce((x, y) => [...x, y], []); / c: for(let i = 0; i < 1000; i++) x = [...x];
...
3.  run:   time:                               j / x.length:               x:
11   a:     2 ms    b:     3 ms    c:     2 ms  1000 /   1000  [0,1].......[998,999]
12   a:    20 ms    b:    23 ms    c:     7 ms  2000 /   3000  [0,1].....[2998,2999]
13   a:    72 ms    b:    95 ms    c:    13 ms  3000 /   6000  [0,1].....[5998,5999]
14   a:   172 ms    b:   264 ms    c:    21 ms  4000 /  10000  [0,1].....[9998,9999]
15   a:   937 ms    b:  1.20  s    c:    31 ms  5000 /  15000  [0,1]...[14998,14999]

@drsm
Copy link
Contributor Author

drsm commented Jun 11, 2021

Did I miss any other weird combinations that are possible in javascript?

How about Symbol.iterator hijacking?

> var x = []; x[Symbol.iterator] = () => [1,2,3].values(); [...x]
[ 1, 2, 3 ]

Also, just to be sure:

> var x = [,]; Object.keys([...x,,...x])
[ '0', '2' ]
> var x = [,]; [...x].every((x) => typeof x == 'undefined')
true

@lexborisov
Copy link
Contributor

Hi @277hz

If your patch is ready I can review it on Monday.

@277hz
Copy link

277hz commented Jun 11, 2021

@drsm Thank you, just the examples I was looking for.

@lexborisov: Whats mostly missing still, is support for spreading function arguments, as I have not decided on how to manipulate the list of arguments passed to the function yet. While I think I am getting closer, Monday will not be the finish line I suppose :).

Edit:

> var x = [,]; console.log(JSON.stringify(Object.keys([...x,,...x])));
["0","2"]
["0","2"]
> var x = []; x[Symbol.iterator] = () => [1, 2, 3].values(); console.log(JSON.stringify([...x, 4]));
[1,2,3,4]
[1,2,3,4]
> var x = [,]; console.log(JSON.stringify([...x].every((x) => typeof x == 'undefined')));
true
true

@277hz
Copy link

277hz commented Jun 13, 2021

Almost there:

 === Summary ===
  - Ran 34658 tests
- - Passed 13513 tests (39.0%)
- - Failed 21145 tests (61.0%)
+ - Passed 13573 tests (39.2%)
+ - Failed 21085 tests (60.8%)

script tests: PASSED [5426/5426]
safe script tests: PASSED [5/5]
denormals tests: PASSED [9/9]
disabled denormals tests: PASSED [7/7]
module tests: PASSED [5/5]
externals tests: PASSED [84/84]
shared tests: PASSED [26/26]
interactive tests: PASSED [57/57]
timezone tests: PASSED [28/28]
regexp tests: PASSED [14/14]
vm_json tests: PASSED [6/6]
vm_value tests: PASSED [8/8]
vm_internal_api tests: PASSED [67/67]
TOTAL: PASSED [5742/5742]

feature.diff

@277hz
Copy link

277hz commented Jun 13, 2021

@lexborisov https://gist.github.com/277hz/b68936f0b845d96a3dda2bfaf025cb69
Thank you for taking the time. I really wanted to leave some architectural decisions up to you, like adding a new vmcode_t, moving everything into its own function in njs_vmcode.c, or where to throw what kind of error messages and when to silently pass on.
Other than that I took great caution to follow on what your code is already doing and commenting on what my intentions are.

I just ran a final test262 with no difference to my previous post and did not happen to find any non working circumstances whatsoever.

Edit:
The one or other token still needs to be handled additionally as I noticed while performing some more testing. Anyway not related to the general logic at all.

@drsm
Copy link
Contributor Author

drsm commented Jun 14, 2021

@277hz

>> var a = [], b = () => [...a,...a]; b();
Segmentation fault (core dumped)
> [...([1,2,3])]
[ 1, 2, 3 ]
> [...(void 0, [1,2,3])]
[ 1, 2, 3 ]
> var a = [1,2,3]; [...(a)]
[ 1, 2, 3 ]
> [...{ [Symbol.iterator]: () => [1,2,3].values() }]
[ 1, 2, 3 ]
> [...{}]
Uncaught TypeError: object is not iterable (cannot read property Symbol(Symbol.iterator))

@277hz
Copy link

277hz commented Jun 14, 2021

@drsm Working on it and still confident that the idea of shifting the njs_vmcode_prop_set_t before adding them to the array is a good way to go. To me it looks like spreading of objects could be taken care of in a similar fashion.

Edit:

>> console.log([...([1,2,3])]);
console.log([...(void 0, [1,2,3])]);
var a = [1,2,3]; console.log([...(a)]);
console.log([...{ [Symbol.iterator]: () => [1,2,3].values() }]);
console.log([...{}]);
[1,2,3]
[1,2,3]
[1,2,3]
[1,2,3]
Thrown:
TypeError: object is not iterable
    at console.log (native)
    at main (shell:5)

@lexborisov
Copy link
Contributor

Hi @277hz

Thanks for the work you've done.
The patch above is the final version?

@277hz
Copy link

277hz commented Jun 15, 2021

@lexborisov Not yet, still have some work to do. After fixing what @drsm described, I think there are some situations left that do not run smoothly.

Edit:
As we can leak large amounts of memory with spreading easily, what i would love to have is cleaning up stuff like this:

>> let x = [1,2,3]; for(let i = 0; i < 3; i++) console.log(x = [...x]);
>> let x = [1,2,3]; for(let i = 0; i < 3; i++) console.log(x = [...x, i]);
>> let x = [1,2,3]; for(let i = 0; i < 3; i++) console.log(x = x.reduce((a, b) => [...a, b], []));

I guess it is time for garbage collecting 😉, do you have any plans for implementing?

Edit2:
I took the time to restructure everything a little bit so it fits better into the existing codebase, while doing that I started working on spreading objects 😄:

>> var a = {...{a:1},...{b:2}}; console.log(a);
{a:1,b:2}

>> console.log({...{a:1,b:2},...{c:3}})
{a:1,b:2,c:3}

>> console.log({...{a:1,b:2},...{b:3}})
{a:1,b:3}

@277hz
Copy link

277hz commented Jun 17, 2021

@lexborisov Final patch https://gist.github.com/277hz/b68936f0b845d96a3dda2bfaf025cb69
Adds support for spreading of arrays, objects and function arguments, solving #78 #79 #92.

Passing all spread tests in test262, except those using a generator function:

 === Summary ===
  - Ran 34658 tests
- - Passed 13513 tests (39.0%)
- - Failed 21145 tests (61.0%)
+ - Passed 13627 tests (39.3%)
+ - Failed 21031 tests (60.7%)

feature.diff

Edit:
While having some spare time left, I am already getting ideas where to reuse the Symbol.iterator parts, e.g. with for...of. From my point of view it seems to be favorable to first merge what is there so far and avoid resolving too many conflicts later. Please let me know if you even consider doing that.

Edit2:
Found some more minor things to fix, will provide an updated patch soon.

@lexborisov
Copy link
Contributor

@277hz

I looked at your patch, and before doing anything with spread, we need to implement iterators and generators. This is what I am busy with now. After that, I'll go back to your patch.

@277hz
Copy link

277hz commented Jun 23, 2021

@lexborisov Thank you for taking the time.
I was busy doing a lot of fixing and refactoring, which should help significantly with merging everything later.
What I have been achieving so far is:

  • Spreading into arrays:
    • From arrays
    • From strings
    • From Symbol.iterator
    • Expanding target array as needed
  • Spreading into objects:
    • From objects
    • From arrays
    • From strings
    • Resolving properties as needed
  • Spreading into function arguments:
    • From arrays
    • From strings
    • From Symbol.iterator
    • Expanding frame as needed
  • Added custom token and vmcode
  • Added code to disassembler logic
  • Added simple benchmarks and unit tests

Additionaly I moved most of the spread logic into its own file, except for what is needed in the parser and generator.
I am pretty confident that those aspects of spreading that I can influence without touching any basics in njs are covered now and working. From my obversations no overhead is generated to existing code and everything, apart from that node sadly seems to run some things a lot faster than njs, is performing well.

To make sure results are as expected, I conducted a little bash script for comparing output between njs and node, currently containing 115 expressions and 10 benchmarks, that you are free to look up here compare_njs_node.sh. If you come up with more ideas for what to test, just let me know.

@xeioex
Copy link
Contributor

xeioex commented Jun 23, 2021

@277hz

apart from that node sadly seems to run some things a lot faster than njs, is performing well.

node uses V8, the state-of-the-art JIT compiler for JavaScript (developed for more than 10 year by hundreds of people), it is super fast but uses some low-level magic for each supported architecture (x86, ARM, MIPS), it is also very large (28 mb executable).

njs is interpreter, it is better to compare it to other interpreters (ducktape, quickjs, xs etc.).

But we plan to work on execution speed also:

@277hz
Copy link

277hz commented Jun 23, 2021

@xeioex Don't get me wrong 😄. I really like how you designed njs, and of course nginx, too...

I stumbled upon that during testing of spreading objects, order can really be important when accessing properties.

Other than that what really stood out to me is the execution time of for and while loops.
Sample output of the bash script i linked above:

9 > var d = Date.now(); for (let i = 0; i < 100000000; i++); d = Date.now() - d; console.log(d);
njs:   6247     m:     13.20kb  u:  6.24s
node:   271     m:    136.67kb  u:  0.33s
10 > var d = Date.now(); var i = 0; while (i < 100000000) i++; d = Date.now() - d; console.log(d);
njs:   6291     m:     13.09kb  u:  6.28s
node:   483     m:    136.59kb  u:  0.55s

@xeioex
Copy link
Contributor

xeioex commented Jun 23, 2021

@277hz

var d = Date.now(); for (let i = 0; i < 100000000; i++); d = Date.now() - d; console.log(d);

./build/njs -d
interactive njs 0.6.1

v.<Tab> -> the properties and prototype methods of v.

>> for (let i = 0; i < 100; i++);
shell:main
    1 | 00000 LET               0221
    1 | 00016 MOVE              0221 0233
    1 | 00040 JUMP              48
@ 1 | 00056 POST INC          0043 0221 0221 
@ 1 | 00088 LESS              0043 0221 0433
@ 1 | 00120 JUMP IF TRUE      0043 -64
    1 | 00144 STOP              0033

undefined

Here, basically we do the large switch over and over again for each instruction marked with '@', JIT compiler eliminates the switch, and execute a cycle with almost native speed. It also may have code-flow analyser and which may eliminate the loop completely as not having any side-effects.

@xeioex
Copy link
Contributor

xeioex commented Jun 23, 2021

@277hz

We greatly appreciate your effort and enthusiasm.
Here are the some principles we adhere in this project:

  1. before introducing some architectural changes, or introducing new mechanisms we discuss it before coding
  2. we are trying to make the code homogenous (using not only the same coding style, but the same primitives and principles everywhere)
  3. we are in the process of moving toward the ecma262 specification (no some special hacks for strings, object, arrays as it used to be). this makes the performance somewhat slower, but we are planing to regain speed after the refactoring.
  4. the code review is strict and may require many rounds of iteratively improving the patch. The best way to succeed (in getting your patch committed) is to split a large patch into a series of small orthogonal changes.

The best way to succeed is to start small, with small changes, gradually moving toward more advanced features (spanning many subsystems).

Please also note that: njs is not an experimental code, it is used in production. That is why we are hesitant about quickly accepting large code chunks.

@277hz
Copy link

277hz commented Jun 23, 2021

@xeioex I am trying to obey. At first I thought it is going to be smaller changes only, the biggest part acutally consists of the routine for shifting array indexes. While spending more time on the topic, there was just no reason to not add support for objects and functions, too. Ending up in restructuring my code into a new file, making it a lot cleaner and easier to reverse if required, I avoided touching anything existing besides adding some code to the parser for putting the required token into place and generator for consuming it. While trying to mimic your style from what I see, it is always harder being outside of the team and guess what the preferred way is. One thing I can assure though is my commitment to writing code that works, commenting it extensively and testing thoroughly.

In case you are curious, this is how native for and while perform:

# gcc -o testloop testloop.c && ./testloop
0.46892
0.41777
# gcc -march=native -o testloop testloop.c && ./testloop
0.46842
0.41437
# gcc -O3 -o testloop testloop.c && ./testloop
0.00000
0.00000

-O3 is surely outstanding, as all three yield the same result.

@277hz
Copy link

277hz commented Jun 25, 2021

Patch updated: njs_spread.diff
Please feel free to play around with: compare_njs_node.sh

Attached the latest feature.diff:

 === Summary ===
  - Ran 34658 tests
- - Passed 13514 tests (39.0%)
- - Failed 21144 tests (61.0%)
+ - Passed 13627 tests (39.3%)
+ - Failed 21031 tests (60.7%)

Performance is like this:

# ./compare_njs_node.sh -b 0
...
1 > var d = Date.now(); var a = [...a=[...new Array(1000000).fill(2)], ...a=[...a], ...a=[...a], ...a=[...a], ...a=[...a]]; d = Date.now() - d; console.log(d);
njs:    220     m:    684.17kb  u:  0.10s
node:   477     m:    819.12kb  u:  0.45s
2 > var d = Date.now(); var a; (() => {})(...a=new Array(25000).fill(2), ...a=[...a], ...a=[...a], ...a=[...a], ...a=[...a]); d = Date.now() - d; console.log(d);
njs:      2     m:     21.18kb  u:  0.00s
node:    28     m:    164.54kb  u:  0.09s
3 > var d = Date.now(); var a; (function() {})(...a = new Array(5000).fill(2), ...a=[...a], ...a=[...a], ...a=[...a], ...a=[...a]); d = Date.now() - d; console.log(d);
njs:      3     m:     16.93kb  u:  0.00s
node:     5     m:    131.57kb  u:  0.07s
4 > var d = Date.now(); var a = {...a={...new Array(100000).fill(2)}, ...a={...a}, ...a={...a}, ...a={...a}, ...a={...a}}; d = Date.now() - d; console.log(d);
njs:    365     m:    438.48kb  u:  0.29s
node:   628     m:    312.37kb  u:  0.65s
5 > var d = Date.now(); var a; (() => {})(...a={...new Array(25000).fill(2), [Symbol.iterator]:()=>Object.values(a).values()}, ...a={...a}, ...a={...a}, ...a={...a}, ...a={...a}); d = Date.now() - d; console.log(d);
njs:     61     m:    111.26kb  u:  0.04s
node:    63     m:    177.04kb  u:  0.11s
6 > var d = Date.now(); var a; (function() {})(...a={...new Array(5000).fill(2), [Symbol.iterator]:()=>Object.values(a).values()}, ...a={...a}, ...a={...a}, ...a={...a}, ...a={...a}); d = Date.now() - d; console.log(d);
njs:     16     m:     35.67kb  u:  0.01s
node:    13     m:    131.87kb  u:  0.05s

Edit: Made AST work, ast_disassembled.txt.

@277hz
Copy link

277hz commented Jun 28, 2021

While providing the latest update for njs_spread.diff and hoping to come nearer to an end 😄, I wanted to shed some insight on what the patch modifies technically. It would be nice if someone finds the time for reviewing, I greatly appreciate it.

Handled in parser without additional vmcode:
[ ...[1,2,3], ...[1,2,3], 4, 5, 6 ]
{ ...{a:1,b:2,c:3}, ...{a:1,b:2,c:3}, d:4, e:5, f:6 }
(()=>{})( ...[1,2,3], ...[1,2,3], 4, 5, 6 )

Handled in vm after resolving parenthesis:
[ ...([1,2,3]), ...([1,2,3]), 4, 5, 6 ]
 ^             ^                     !
{ ...({a:1,b:2,c:3}), ...({a:1,b:2,c:3}), d:4, e:5, f:6 }
 ^                   ^
(()=>{})( ...([1,2,3]), ...([1,2,3]), 4, 5, 6 )
         ^             ^                     !

Handled in vm after resolving expression:
var a = [1,2,3]; [ ...a, ...a, 4, 5, 6 ]
                  ^     ^             !
var a = {a:1,b:2,c:3}; { ...a, ...a, d:4, e:5, f:6 }
                        ^     ^
var a = [1,2,3]; (()=>{})( ...a, ...a, 4, 5, 6 )
                          ^     ^             !

For "^" NJS_TOKEN_NAME_SPREAD with a reference to spread source (right) and spread target (left) is injected during parsing.
For "!" NJS_TOKEN_NAME_SPREAD end indicator with a reference to spread target (right) and node/target left (left) is appended as left of current node/target.

For every injected NJS_TOKEN_NAME_SPREAD not an end indicator one njs_vmcode_name_spread_t with three operands (spread target type, spread target, spread source) is inserted after generating code for the spread source (right) first.
For NJS_TOKEN_NAME_SPREAD end indicators one njs_vmcode_name_spread_t with two operands (NJS_NULL, spread target) is inserted, if set code for left is generated first.

Copying from spread target into spread source is handled in vm. When NJS_VMCODE_NAME_SPREAD is run vmcode is iterated until end indicator is found, setting the newly introduced shifted_property operand of related njs_vmcode_prop_set_t in between. Finally sum of sizeof one njs_vmcode_name_spread_t and one njs_vmcode_prop_set_t is returned as njs_jump_off_t. The shifted_property operand is then preferred during upcoming NJS_VMCODE_PROP_INIT. Resizing of spread targets and filling in invalid values if applicable is performed accordingly.

This can also be visualized in AST and disassembled code neatly after applying the patch.

While I see room for improvements, this approach has the advantage of NJS_VMCODE_NAME_SPREAD always being at the desired position within the target when run in vm and having the possibility to respect Symbol.iterator. Additionaly changes to the code base are minimized, except for adding one function to the generator and the modifications to the parser all other code is contained in its own source files. Moreover one might argue that it is performing fast enough already.

The only case inoperable for now is copying to and from large not flat arrays.
After generator functions are implemented they may need to be reflected somehow.

@drsm
Copy link
Contributor Author

drsm commented Jun 28, 2021

@277hz
Hi!
One thing about these optimizations:

//Handled in parser without additional vmcode:
[ ...[1,2,3], ...[1,2,3], 4, 5, 6 ]
(()=>{})( ...[1,2,3], ...[1,2,3], 4, 5, 6 )

there is a weird case that can't be handled in compile time:

> var x = Array.prototype[Symbol.iterator]; Array.prototype[Symbol.iterator] = () => [1,2,3].values(); var y = [...[]]; Array.prototype[Symbol.iterator] = x; y;
[ 1, 2, 3 ]

@lexborisov
Copy link
Contributor

@277hz

Thank you for your big work!

Before starting a joint review of your patch, I suggest implementing generators (no asynchrony). Spread is uses with iterators and generators.
After that we will split your spread patch into several smaller ones.

@lexborisov
Copy link
Contributor

Hi @277hz

How are you? Are you still interested in this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants