Fixing issue #2009 #2010

Open. Wants to merge 8 commits into base: main. Showing changes from 3 commits.
35 changes: 35 additions & 0 deletions src/org/rascalmpl/library/lang/rascal/tests/concrete/Character.rsc
@@ -49,6 +49,41 @@ test bool charClassOrderedRanges() = (#[a-z A-Z]).symbol == \char-class([range(6
test bool charClassMergedRanges() = (#[A-Z F-G]).symbol == \char-class([range(65,90)]);
test bool charClassExtendedRanges() = (#[A-M N-Z]).symbol == \char-class([range(65,90)]);
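
For readers unfamiliar with the context tests above: `[A-Z F-G]` and `[A-M N-Z]` are both expected to normalize to the single range 65..90. A minimal sketch of that kind of normalization, assuming a plain sort-and-merge over inclusive ranges; `RangeMergeSketch` and `normalize` are hypothetical names and this is not the Rascal implementation:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not code from this PR.
class RangeMergeSketch {
    /** Sorts inclusive [start, end] ranges and merges those that overlap or are adjacent. */
    public static List<int[]> normalize(List<int[]> ranges) {
        List<int[]> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingInt(r -> r[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] r : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && r[0] <= last[1] + 1) {
                // Overlapping or touching: extend the previous range.
                last[1] = Math.max(last[1], r[1]);
            } else {
                merged.add(new int[] { r[0], r[1] });
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // [A-M N-Z]: 65..77 and 78..90 touch, so they collapse into one range.
        List<int[]> merged = normalize(List.of(new int[] { 'A', 'M' }, new int[] { 'N', 'Z' }));
        System.out.println(merged.get(0)[0] + "-" + merged.get(0)[1]); // prints 65-90
    }
}
```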

test bool asciiEscape() = \char-class([range(0,127)]) == #[\a00-\a7F].symbol;
test bool utf16Escape() = \char-class([range(0,65535)]) == #[\u0000-\uFFFF].symbol;
test bool utf24Escape() = \char-class([range(0,1114111)]) == #[\U000000-\U10FFFF].symbol;
test bool highLowSurrogateRange1() = \char-class([range(9312,12991)]) == #[①-㊿].symbol;
test bool highLowSurrogateRange2() = \char-class([range(127829,127829)]) == #[🍕].symbol;
test bool differentEscapesSameResult1() = #[\a00-\a7F] == #[\u0000-\u007F];
test bool differentEscapesSameResult2() = #[\a00-\a7F] == #[\U000000-\U00007F];
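
The numeric bounds in the escape tests above are the hexadecimal escape bodies read as integers. A small sketch to check them, assuming only `Integer.parseInt` with radix 16 (the same call the Java change in this PR uses); `EscapeBoundsSketch` and `hexValue` are hypothetical names:

```java
// Hypothetical illustration only; not code from this PR.
class EscapeBoundsSketch {
    /** Parses the hex digits of an escape body, e.g. "7F" from \a7F. */
    public static int hexValue(String hexDigits) {
        return Integer.parseInt(hexDigits, 16);
    }

    public static void main(String[] args) {
        System.out.println(hexValue("7F"));      // prints 127, upper bound in asciiEscape
        System.out.println(hexValue("FFFF"));    // prints 65535, upper bound in utf16Escape
        System.out.println(hexValue("10FFFF"));  // prints 1114111, upper bound in utf24Escape
        // The \U upper bound coincides with the highest Unicode code point.
        System.out.println(hexValue("10FFFF") == Character.MAX_CODE_POINT); // prints true
    }
}
```

This is also why `\a7F`, `\u007F` and `\U00007F` are expected to compare equal: leading zeros do not change the parsed value.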

/* to avoid a known ambiguity */
alias NotAZ = ![A-Z];

test bool unicodeCharacterClassSubtype1() {
    Tree t = char(charAt("⑭", 0));

    if ([①-㊿] circled := t) {
        assert [⑭] _ := circled;
        assert NotAZ _ := circled;
        return true;
    }

    return false;
}

test bool unicodeCharacterClassSubtype2() {
    Tree t = char(charAt("🍕", 0));

    if ([🍕] pizza := t) {
        assert [\a00-🍕] _ := pizza;
        assert NotAZ _ := pizza;
        return true;
    }

    return false;
}
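
The pizza test matters on the Java side because 🍕 lies outside the Basic Multilingual Plane, so a Java `String` stores it as two `char` values. A minimal sketch of the difference between `charAt` and `codePointAt`, using only standard `java.lang.String` methods; `CodePointSketch` is a hypothetical name:

```java
// Hypothetical illustration only; not code from this PR.
class CodePointSketch {
    /** Full Unicode code point of the first character, surrogate pairs included. */
    public static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    /** First UTF-16 code unit only; for a surrogate pair this is just the high surrogate. */
    public static int firstCodeUnit(String s) {
        return s.charAt(0);
    }

    public static void main(String[] args) {
        String pizza = "\uD83C\uDF55"; // U+1F355, written as its surrogate pair
        System.out.println(pizza.length());        // prints 2
        System.out.println(firstCodePoint(pizza)); // prints 127829
        System.out.println(firstCodeUnit(pizza));  // prints 55356 (0xD83C, the high surrogate)

        String circled14 = "\u246D"; // ⑭, inside the BMP
        System.out.println(firstCodePoint(circled14)); // prints 9325
    }
}
```

The value 127829 is what `highLowSurrogateRange2` expects, and 9325 falls inside the 9312..12991 range asserted for `[①-㊿]`.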

// ambiguity in this syntax must be resolved first
//test bool differenceCC() = (#[a-zA-Z] - [A-Z]).symbol == (#[a-z]).symbol;
//test bool unionCC() = (#[a-z] || [A-Z]).symbol == (#[A-Za-z]).symbol;
43 changes: 21 additions & 22 deletions src/org/rascalmpl/values/parsetrees/SymbolFactory.java
@@ -337,31 +337,30 @@

private static IValue char2int(Char character) {
String s = ((Char.Lexical) character).getString();
if (s.startsWith("\\")) {
if (s.length() > 1 && java.lang.Character.isDigit(s.charAt(1))) { // octal escape
// TODO
throw new NotYetImplemented("octal escape sequence in character class types");
}
if (s.length() > 1 && s.charAt(1) == 'u') { // octal escape
// TODO
throw new NotYetImplemented("unicode escape sequence in character class types");
}
char cha = s.charAt(1);
if (s.matches("\\\\[auU][0-9A-F]+")) {
// ascii escape (a), utf16 escape (u) or utf24 escape (U)
return factory.integer(Integer.parseInt(s.substring(2), 16));
}
else if (s.startsWith("\\")) {
// builtin escape
int cha = s.codePointAt(1);

Check warning on line 346 in src/org/rascalmpl/values/parsetrees/SymbolFactory.java
Codecov / codecov/patch: added line #L346 was not covered by tests.

Review comment (Member): yes 👍

switch (cha) {
case 't': return factory.integer('\t');
case 'n': return factory.integer('\n');
case 'r': return factory.integer('\r');
case '\"' : return factory.integer('\"');
case '\'' : return factory.integer('\'');
case '-' : return factory.integer('-');
case '<' : return factory.integer('<');
case '>' : return factory.integer('>');
case '\\' : return factory.integer('\\');
case 't': return factory.integer('\t');
case 'n': return factory.integer('\n');
case 'r': return factory.integer('\r');
case '\"' : return factory.integer('\"');
case '\'' : return factory.integer('\'');
case '-' : return factory.integer('-');
case '<' : return factory.integer('<');
case '>' : return factory.integer('>');
case '\\' : return factory.integer('\\');
default: return factory.integer(cha);

Check warning on line 357 in src/org/rascalmpl/values/parsetrees/SymbolFactory.java
Codecov / codecov/patch: added lines #L348 - L357 were not covered by tests.
}
s = s.substring(1);
}
char cha = s.charAt(0);
return factory.integer(cha);
else {
// just a single character (but possibly two char's)
return factory.integer(s.codePointAt(0));
}
}
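
Because this page shows removed and added lines interleaved without +/- markers, here is a standalone sketch of how the added lines of `char2int` appear to fit together. It assumes a plain `String` in and `int` out instead of `Char` and `factory.integer(...)`; `Char2IntSketch` and `toCodePoint` are hypothetical names, and the self-denoting escapes are folded into the `default` branch for brevity:

```java
// Hypothetical illustration only; a simplified reading of the added lines, not the PR code itself.
class Char2IntSketch {
    public static int toCodePoint(String s) {
        if (s.matches("\\\\[auU][0-9A-F]+")) {
            // ascii escape (a), utf16 escape (u) or utf24 escape (U): parse the hex body
            return Integer.parseInt(s.substring(2), 16);
        }
        else if (s.startsWith("\\")) {
            // builtin escape: inspect the character after the backslash
            int cha = s.codePointAt(1);
            switch (cha) {
                case 't': return '\t';
                case 'n': return '\n';
                case 'r': return '\r';
                // \" \' \- \< \> \\ denote the escaped character itself
                default: return cha;
            }
        }
        else {
            // a single character, possibly stored as two Java chars
            return s.codePointAt(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(toCodePoint("\\a7F"));        // prints 127
        System.out.println(toCodePoint("\\U10FFFF"));    // prints 1114111
        System.out.println(toCodePoint("\\n"));          // prints 10
        System.out.println(toCodePoint("\uD83C\uDF55")); // prints 127829
    }
}
```

One thing the sketch makes visible: the pattern accepts only upper-case hex digits, so an input such as `\u00ff` would fall through to the builtin-escape branch and yield the code point of `u`. Whether lower-case digits can reach this method depends on the grammar, which is not shown in this diff. The Codecov annotations also suggest the builtin-escape branch is not exercised by the new tests.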

public static IConstructor charClass(int ch) {