Feature: grammar helper #85

mindplay-dk · 2023-08-12T16:02:38Z

What problem does this feature solve?

This is mainly for convenience - but it does solve the problem with the existing defer function, which relies initialization separate from creation. My proposed grammar helper would be statically type-checked, and wouldn't need an error-handler.

Describe the solution

If we look at the example for defer:

sigma/docs/docs/content/parsers/defer.md

Lines 33 to 64 in 0d96737

    
           interface NumberNode { 
        
             type: 'number' 
        
             span: Span 
        
             value: number 
        
           } 
        
           interface ListNode { 
        
             type: 'list' 
        
             span: Span 
        
             value: Array<NumberNode | ListNode> 
        
           } 
        
           const TupleList = defer<ListNode>() 
        
           const TupleNumber = defer<NumberNode>() 
        
           TupleNumber.with( 
        
             map( 
        
               integer(), 
        
               (value, span) => ({ type: 'number', span, value }) 
        
             ) 
        
           ) 
        
           TupleList.with( 
        
             map( 
        
               takeMid( 
        
                 string('('), 
        
                 sepBy(choice(TupleList, TupleNumber), string(',')), 
        
                 string(')') 
        
               ), 
        
               (value, span) => ({ type: 'list', span, value }) 
        
             ) 
        
           )

Here is that example implemented with the grammar helper:

  const tupleGrammar = grammar({
    tupleNumber(): Parser<NumberNode> {
      return map(
        integer(),
        (value, span) => ({ type: 'number', span, value })
      )
    },
    tupleList(): Parser<ListNode> {
      return map(
        takeMid(
          string('('),
          sepBy(choice(this.tupleList, this.tupleNumber), string(',')),
          string(')')
        ),
        (value, span) => ({ type: 'list', span, value })
      )
    },
  });

Here is a test demonstrating how to use the resulting grammar:

  is.equal(
    run(tupleGrammar.tupleList).with('(1,2,(3,(4,5)))'),
    {
      isOk: true,
      span: [ 0, 15 ],
      pos: 15,
      value: {
        type: 'list',
        span: [ 0, 15 ],
        value: [
          { type: 'number', span: [ 1, 2 ], value: 1 },
          { type: 'number', span: [ 3, 4 ], value: 2 },
          {
            type: 'list',
            span: [ 5, 14 ],
            value: [
              { type: 'number', span: [ 6, 7 ], value: 3 },
              {
                type: 'list',
                span: [ 8, 13 ],
                value: [
                  { type: 'number', span: [ 9, 10 ], value: 4 },
                  { type: 'number', span: [ 11, 12 ], value: 5 }
                ]
              }
            ]
          }
        ]
      }
    }
  );

And here is my preliminary implementation:

import { Parser } from "@nrsk/sigma";

type Grammar<T> = {
  [P in keyof T]: T[P] extends () => any ? ReturnType<T[P]> : never;
};

type GrammarInit<T> = T & ThisType<Grammar<T>>;

type GrammarType = {
  [name: string]: () => Parser<any>;
};

export function grammar<T extends GrammarType>(init: GrammarInit<T>): Grammar<T> {
  const grammar = {} as { [key: string]: Parser<any> };

  const initialized = {} as { [key: string]: true };
  
  for (const key in init) {
    grammar[key] = {
      parse(input, pos) {
        if (! initialized[key]) {
          initialized[key] = true;

          grammar[key] = (init[key] as any).apply(grammar);
        }
        
        return grammar[key].parse(input, pos);
      },
    } as Parser<any>;
  }

  return grammar as Grammar<T>;
}

Here is a screenshot demonstrating IDE support:

As you can see, this works with the circular references, which is possible with the magical ThisType in TS.

It's doing basically the same thing as defer for each member, so of course this works with circular references at run-time as well.

I didn't benchmark it against defer, and it might need some optimization, and the types could probably use a little work.

But what do you think, would you welcome a PR for this feature? 🙂

The text was updated successfully, but these errors were encountered:

mindplay-dk · 2023-08-14T11:21:29Z

I wanted to know if this approach was faster or slower - it is slower, and I did attempt some optimizations, but I'm not an expert, and the benchmark is really unreliable on my system for some reason... I'm seeing anywhere from 1% to 10% overhead compared with defer, so I really have no idea if any of the optimizations I tried were making things faster or slower - so I left it the way it is for now.

On this branch, you can also see what the JSON parser would look like in this format.

mindplay-dk/sigma@add-core-folder...mindplay-dk:sigma:grammar-helper

I quite like the encapsulation, the type-safety, and removing the potential for uninitialized parsers - but I'm not sure what the actual overhead is, and so I can't really say if I think it's worth while, or whether it could be optimized.

There might be some inherent overhead due to the use of this in the JSON parser implementation itself, not sure... 🤔

mindplay-dk · 2023-08-14T12:27:35Z

Here's a simplified version:

https://github.com/mindplay-dk/sigma/blob/605780ab4ce6eb7d8bc335c62b7c2b7acdcd88c1/src/core/grammar.ts#L13-L29

export function grammar<T extends GrammarType>(init: GrammarInit<T>): Grammar<T> {
  const grammar = {} as { [key: string]: Parser<unknown> }

  for (const key in init) {
    grammar[key] = {
      parse() {
        throw new Error()
      }
    }
  }

  for (const key in init) {
    grammar[key].parse = init[key].apply(grammar).parse
  }

  return grammar as Grammar<T>
}

Just putting in "dummy" parsers for everything in the first loop, so that we get objects with a parse property that other objects can refer to, just to establish the circular references - then replace the parse property in a separate loop, "patching" the parsers.

Logically, this should be faster, because we're no longer doing anything dynamic - but it's quite possible that replacing a function like this breaks some sort of optimization in V8, I really can't say without a benchmark.

What's really suspicious, is if I swap the order of the benchmarks, so that grammar goes before defer rather than after, grammar consistently wins out by the same 10-15% margin - so there is something wrong with the benchmark, I think. 🤨

mindplay-dk · 2023-08-14T12:41:47Z

For the record, opened an issue here to find out if this benchmark is still valid.

mindplay-dk · 2023-08-15T10:38:19Z

So yeah, I have to conclude the benchmark package (used by benny) probably isn't reliable anymore.

caderek/benny#57 (comment)

Not sure how to address this.

Either way, having run the benchmarks in isolation, I'm now certain the grammar version of the JSON parser is in fact slightly faster than the defer version.

So I think it makes sense to add this feature?

If you agree, I'll try to get it wrapped up with tests and docs soon 🙂

norskeld · 2023-08-15T17:08:46Z

Yeah, benchmarking in Node/JS is super flaky, and benchmark.js seems to add headache in some cases. Not sure what we can do here (roll our own benchmarking library, perhaphs? 😁)

As for the feature itself, it does looks interesting and the implementation is kinda clever! Let's give it a go.

mindplay-dk · 2023-08-16T09:32:11Z

Yeah, benchmarking in Node/JS is super flaky, and benchmark.js seems to add headache in some cases. Not sure what we can do here (roll our own benchmarking library, perhaphs? 😁)

Yeah, I was tempted already and start mucking about with some maths here:

https://github.com/mindplay-dk/sigma/blob/crazy-math/benchmarks/src/json/index.ts

There isn't really anything maths can do to fix this problem though - after all the testing I did over the last couple of days, I'm fairly confident there is some sort of V8/JS engine optimization that applies only to the first module that loads.

Not sure where I'm going with this - if I decide to do something about this, it would need to have process isolation somehow, which probably means it'll have limitations on what you can test (e.g. Workers don't expose the DOM) or it'll work on Node only. (forking or launching a dedicated process for each test.)

I might do another round at some point and scout for an existing benchmark library that works...

As for the feature itself, it does looks interesting and the implementation is kinda clever! Let's give it a go.

I will try to get this wrapped up soon. 🙂👍

mindplay-dk · 2023-08-16T09:36:03Z

... aha! 🤔

mindplay-dk · 2023-08-16T11:02:41Z

Nope, couldn't get that to work.

Llorx/iso-bench#1

We'll see if the author responds - it's a very new project, but it does sound really promising. 🙂

Llorx · 2023-08-16T22:00:21Z

Updated https://github.com/Llorx/iso-bench with an API easier to implement 🙂 Just commented on the issue there with your code @mindplay-dk working with the old API and also it working with the new API.

mindplay-dk · 2023-08-17T09:56:26Z

Thanks @Llorx 🙂👍

@norskeld lets consider the benchmark issue as a separate issue from the grammar helper, yeah?

norskeld · 2023-08-17T11:09:36Z

Sure, okay.

norskeld self-assigned this Aug 12, 2023

norskeld added C-feature-request Category: A feature request, i.e. not implemented A-misc Area: Issues that don't fint into specific category labels Aug 12, 2023

norskeld added C-feature-accepted Category: A feature request that has been accepted and awaiting implementation and removed C-feature-request Category: A feature request, i.e. not implemented labels Aug 15, 2023

norskeld assigned mindplay-dk and unassigned norskeld Aug 15, 2023

mindplay-dk mentioned this issue Aug 16, 2023

Grammar helper #93

Merged

norskeld linked a pull request Aug 19, 2023 that will close this issue

Grammar helper #93

Merged

norskeld closed this as completed in #93 Aug 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature: grammar helper #85

Feature: grammar helper #85

mindplay-dk commented Aug 12, 2023

mindplay-dk commented Aug 14, 2023

mindplay-dk commented Aug 14, 2023

mindplay-dk commented Aug 14, 2023

mindplay-dk commented Aug 15, 2023

norskeld commented Aug 15, 2023

mindplay-dk commented Aug 16, 2023

mindplay-dk commented Aug 16, 2023

mindplay-dk commented Aug 16, 2023

Llorx commented Aug 16, 2023 •

edited

Loading

mindplay-dk commented Aug 17, 2023

norskeld commented Aug 17, 2023

Feature: grammar helper #85

Feature: grammar helper #85

Comments

mindplay-dk commented Aug 12, 2023

What problem does this feature solve?

Describe the solution

mindplay-dk commented Aug 14, 2023

mindplay-dk commented Aug 14, 2023

mindplay-dk commented Aug 14, 2023

mindplay-dk commented Aug 15, 2023

norskeld commented Aug 15, 2023

mindplay-dk commented Aug 16, 2023

mindplay-dk commented Aug 16, 2023

mindplay-dk commented Aug 16, 2023

Llorx commented Aug 16, 2023 • edited Loading

mindplay-dk commented Aug 17, 2023

norskeld commented Aug 17, 2023

Llorx commented Aug 16, 2023 •

edited

Loading