Initial unicode character support for identifiers and whitespace
Summary:
Test Plan: Added a test
Reviewers:
Subscribers:
Tasks:
Tags:
Showing 7 changed files with 25,375 additions and 5 deletions.
@@ -1,4 +1,4 @@
 # Concatenate all the fragments into a .jj file.
 gendir='../target/generated-sources/javacc'
 mkdir -p $gendir
-cat javacc-options-java.txt nonreservedwords.txt reservedwords.txt sql-spec.txt presto-extensions.txt lexical-elements.txt > $gendir/parser_tmp.jjt
+cat javacc-options-java.txt nonreservedwords.txt reservedwords.txt sql-spec.txt presto-extensions.txt unicode-identifier-start.txt unicode-identifier-extend.txt ws.txt lexical-elements.txt > $gendir/parser_tmp.jjt
@@ -0,0 +1,32 @@
+TOKEN:
+{
+<#Zl: [
+"\u2028" //LINE SEPARATOR;Zl;0;WS;;;;;N;;;;;
+]>
+
+| <#Zp: [
+"\u2029" //PARAGRAPH SEPARATOR;Zp;0;B;;;;;N;;;;;
+]>
+
+| <#Zs: [
+"\u0020" //SPACE;Zs;0;WS;;;;;N;;;;;
+, "\u00A0" //NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
+, "\u1680" //OGHAM SPACE MARK;Zs;0;WS;;;;;N;;;;;
+, "\u2000" //EN QUAD;Zs;0;WS;2002;;;;N;;;;;
+, "\u2001" //EM QUAD;Zs;0;WS;2003;;;;N;;;;;
+, "\u2002" //EN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u2003" //EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u2004" //THREE-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u2005" //FOUR-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u2006" //SIX-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u2007" //FIGURE SPACE;Zs;0;WS;<noBreak> 0020;;;;N;;;;;
+, "\u2008" //PUNCTUATION SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u2009" //THIN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u200A" //HAIR SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u202F" //NARROW NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;;;;;
+, "\u205F" //MEDIUM MATHEMATICAL SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
+, "\u3000" //IDEOGRAPHIC SPACE;Zs;0;WS;<wide> 0020;;;;N;;;;;
+]>
+
+| <#UnicodeWhiteSpace: (<Zl> | <Zp> | <Zs>)>
+}
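
The three private token classes above mirror the Unicode general categories Zl, Zp and Zs. As a rough cross-check that is not part of this commit, the Java sketch below hard-codes the same code points and compares them with what `Character.getType` reports in the JDK. The class name `UnicodeWhiteSpaceSketch` and its method names are hypothetical and exist only for illustration.

```java
import java.util.Arrays;

// Illustrative only: a hand-written mirror of the Zl, Zp and Zs token classes.
// The class and method names are assumptions, not names from the commit.
final class UnicodeWhiteSpaceSketch {

    // Code points listed in the Zs class, kept sorted for binary search.
    static final int[] ZS = {
        0x0020, 0x00A0, 0x1680,
        0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
        0x2006, 0x2007, 0x2008, 0x2009, 0x200A,
        0x202F, 0x205F, 0x3000
    };

    static final int ZL = 0x2028; // LINE SEPARATOR
    static final int ZP = 0x2029; // PARAGRAPH SEPARATOR

    // True when cp is one of the code points covered by UnicodeWhiteSpace.
    static boolean isUnicodeWhiteSpace(int cp) {
        return cp == ZL || cp == ZP || Arrays.binarySearch(ZS, cp) >= 0;
    }

    // True when the JDK classifies cp as a separator (Zs, Zl or Zp).
    static boolean jdkIsSeparator(int cp) {
        int type = Character.getType(cp);
        return type == Character.SPACE_SEPARATOR
            || type == Character.LINE_SEPARATOR
            || type == Character.PARAGRAPH_SEPARATOR;
    }

    private UnicodeWhiteSpaceSketch() {}
}
```

Tab, line feed and carriage return belong to category Cc rather than a separator category, so `isUnicodeWhiteSpace` returns false for them, and the grammar presumably handles them outside these three classes.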