
Type.match and Type.instantiate for NonTerminalType #1844

Open
wants to merge 3 commits into
base: main

Conversation

jurgenvinju
Member

@jurgenvinju jurgenvinju commented Jul 14, 2023

This PR tries to implement Type.match and Type.instantiate for NonTerminalType, so that pattern matching works for parameterized syntax types such as {&T &E}* and &O? (where the trailing ? is the optional operator).

  • If that works, then generic library functions like sort, size, zip and reverse can be written for syntax lists.
  • In general, concrete syntax analysis becomes simpler, because things that are currently "given" for abstract syntax and abstract lists become available for concrete lists too.
  • It's a big step towards finishing the concrete syntax features.
  • Most of the complex code written for this works for both the interpreter and the compiled run-time, because it is part of the run-time type system of Rascal parse trees.
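
To make the idea of matching and instantiating parameterized syntax types concrete, here is a minimal, self-contained Java sketch. It is a toy model, not the code in this PR: the names MatchSketch, Sym, Sort, Param, Opt and IterStarSeps are hypothetical, and the sketch uses plain structural equality where a real type system would also check subtyping and parameter bounds.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical names): matching a parameterized symbol against a
// concrete one, then substituting the resulting bindings back into the pattern.
class MatchSketch {
    interface Sym {}
    record Sort(String name) implements Sym {}                // a named non-terminal, e.g. Id
    record Param(String name) implements Sym {}               // a type parameter, e.g. &T
    record Opt(Sym sym) implements Sym {}                     // sym?
    record IterStarSeps(Sym elem, Sym sep) implements Sym {}  // {elem sep}*

    // Returns true if subject has the shape of pattern; every parameter in
    // pattern is bound in bindings, and repeated parameters must bind equally.
    static boolean match(Sym pattern, Sym subject, Map<String, Sym> bindings) {
        if (pattern instanceof Param p) {
            Sym bound = bindings.putIfAbsent(p.name(), subject);
            return bound == null || bound.equals(subject);
        }
        if (pattern instanceof Opt po && subject instanceof Opt so) {
            return match(po.sym(), so.sym(), bindings);
        }
        if (pattern instanceof IterStarSeps pl && subject instanceof IterStarSeps sl) {
            return match(pl.elem(), sl.elem(), bindings)
                && match(pl.sep(), sl.sep(), bindings);
        }
        return pattern.equals(subject); // sorts only match themselves
    }

    // Replaces every bound parameter in pattern; unbound parameters stay as-is.
    static Sym instantiate(Sym pattern, Map<String, Sym> bindings) {
        if (pattern instanceof Param p) {
            return bindings.getOrDefault(p.name(), pattern);
        }
        if (pattern instanceof Opt o) {
            return new Opt(instantiate(o.sym(), bindings));
        }
        if (pattern instanceof IterStarSeps l) {
            return new IterStarSeps(instantiate(l.elem(), bindings),
                                    instantiate(l.sep(), bindings));
        }
        return pattern;
    }

    public static void main(String[] args) {
        Map<String, Sym> bindings = new HashMap<>();
        Sym pattern = new IterStarSeps(new Param("T"), new Param("E"));
        Sym subject = new IterStarSeps(new Sort("Id"), new Sort("Comma"));
        System.out.println(match(pattern, subject, bindings));
        System.out.println(instantiate(pattern, bindings));
    }
}
```

In this model, matching {&T &E}* against a concrete separated list binds &T to the element symbol and &E to the separator symbol, and instantiating the pattern with those bindings gives back the concrete list type.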

@jurgenvinju jurgenvinju changed the title from "trying to implement Type.match and Type.instantiate for NonTerminalType such that pattern matching works for types like {&T &E}* and &O?, such that generic functions like sort, size, zip and reverse can be written for syntax lists, such that some things become simpler for concrete syntax analysis" to "Type.match and Type.instantiate for NonTerminalType" on Jul 14, 2023
@jurgenvinju
Member Author

Currently "everything" is implemented, but I managed to break the parser generator. In particular, the code that uses abstract patterns to match against concrete trees is now broken.

@jurgenvinju
Member Author

jurgenvinju commented Jul 14, 2023

The current milestones for this branch are:

  • do not break existing code
  • make this work: int size({&Elem &Sep}* l) = (0 | it + 1 | _ <- l);
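
What the generic size function has to do on the parse-tree level can be sketched in Java as follows. This is only an illustration under stated assumptions: ListSizeSketch, stride and size are hypothetical names, and the child layout (elements interleaved with separators, and with layout nodes when the list is defined under syntax rather than lexical) is assumed here rather than taken from this PR.

```java
import java.util.List;

// Toy sketch (hypothetical names): counting the elements of a concrete list
// by stepping over the separator and layout children between elements.
class ListSizeSketch {
    // Number of child slots from one element to the next (an assumption).
    static int stride(boolean separated, boolean layoutInserted) {
        int s = 1;                    // the element itself
        if (separated) {
            s += 1;                   // one separator between two elements
        }
        if (layoutInserted) {
            s += separated ? 2 : 1;   // layout around the separator, or between elements
        }
        return s;
    }

    // Counts the elements, mirroring (0 | it + 1 | _ <- l).
    static int size(List<String> children, int stride) {
        int count = 0;
        for (int i = 0; i < children.size(); i += stride) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Children of a three-element list separated by "," with layout inserted.
        List<String> children = List.of("a", " ", ",", " ", "b", " ", ",", " ", "c");
        System.out.println(size(children, stride(true, true)));
    }
}
```

The point of the generic version is that one definition covers every element and separator type, instead of one hand-written function per list type.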

…ns regularly when matching empty lists of parse trees
@codecov

codecov bot commented Jul 14, 2023

Codecov Report

Merging #1844 (47fb001) into main (22a2833) will decrease coverage by 1%.
The diff coverage is 19%.

@@           Coverage Diff            @@
##              main   #1844    +/-   ##
========================================
- Coverage       49%     48%    -1%     
- Complexity    6091    6138    +47     
========================================
  Files          670     670            
  Lines        58698   58881   +183     
  Branches      8544    8603    +59     
========================================
+ Hits         28792   28825    +33     
- Misses       27729   27828    +99     
- Partials      2177    2228    +51     
Impacted Files                                          Coverage Δ
src/org/rascalmpl/types/RascalType.java                 20% <ø> (ø)
...org/rascalmpl/values/parsetrees/SymbolAdapter.java   25% <17%> (-3%) ⬇️
src/org/rascalmpl/types/NonTerminalType.java            52% <62%> (+<1%) ⬆️

... and 9 files with indirect coverage changes

@jurgenvinju jurgenvinju marked this pull request as ready for review July 14, 2023 17:53
@jurgenvinju
Member Author

Next steps before this can be merged:

  • write some tests
  • add several useful generic syntax list functions to List: size, zip, reverse, unzip

@jurgenvinju
Member Author

jurgenvinju commented Jul 17, 2023

#1835 is now needed to finish this work correctly. Consider:

syntax X = A*;
lexical B = A*;
lexical A = "a";
layout W = [ ]*;

Now we have two A* types in the grammar, one of which becomes a separated list: {A W}* and the other remains A*.

Then:

A* f(A* x) is ambiguous. It could either be the lexical or the syntax list!

  • It can't be both, because of the Liskov substitution principle: if f accepted both, either kind could come out again, which would be type-incorrect.
  • We need to know which it is.

In #1835 we propose syntax[...] and lexical[...] "syntax role modifiers", which would be exactly right in this case. We would keep syntax as the default role and then write:

A* f(A* l) = ... ; // for the syntax case, which will match on the separated layout list
lexical[A*] f(lexical[A*] l) = ... ; // for the lexical case

More importantly, generic functions can also be disambiguated that way:

int size(&T* l) = ... ;
int size(lexical[&T*] l) = ... ;
lexical[{&Elem &Sep}+] reverse(lexical[{&Elem &Sep}+] l) = ... ;

Albeit a bit of a mouthful, I don't see another way. We need to specify which syntax role lists and generic lists have if we want some level of type correctness.
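
The ambiguity and its resolution by a role can be illustrated with a small Java sketch. It is a toy model of the example grammar above, not a proposal for the implementation: RoleSketch, Role, ListType, GRAMMAR and resolve are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch (hypothetical names): the surface notation A* stands for two
// different list types, and a syntax role selects exactly one of them.
class RoleSketch {
    enum Role { SYNTAX, LEXICAL }

    // A list type after layout insertion; sep is null for a plain list.
    record ListType(Role role, String elem, String sep) {
        @Override
        public String toString() {
            return sep == null ? elem + "*" : "{" + elem + " " + sep + "}*";
        }
    }

    // The two list types behind A* in the example grammar.
    static final List<ListType> GRAMMAR = List.of(
        new ListType(Role.SYNTAX, "A", "W"),   // from: syntax X = A*;
        new ListType(Role.LEXICAL, "A", null)  // from: lexical B = A*;
    );

    // With role == null every list over elem is a candidate;
    // with a role, only the list defined under that role remains.
    static List<ListType> resolve(String elem, Role role) {
        List<ListType> result = new ArrayList<>();
        for (ListType t : GRAMMAR) {
            if (t.elem().equals(elem) && (role == null || t.role() == role)) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(resolve("A", null));          // two candidates: ambiguous
        System.out.println(resolve("A", Role.SYNTAX));   // the separated layout list
        System.out.println(resolve("A", Role.LEXICAL));  // the plain lexical list
    }
}
```

Defaulting the role to syntax, as suggested above, corresponds to calling resolve with Role.SYNTAX whenever no modifier is written.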

@jurgenvinju
Member Author

@PaulKlint I will try to finish this in the coming days. I don't think it's relevant for you in the short run (unless I break things), but it would be good if you read along in order to predict necessary fixes for the checker or compiler later. This used to be a blind spot in our typing rules: i.e. what about {&T &U}* ?

This stuff is necessary to make the code around concrete syntax for external parsers work smoothly. Otherwise, we'll have to write a lot more code in Java, which I'm trying to avoid.
