[mysql/oracle] Add remaining ports to support all targets. #4345
…, so Test.* can be generated/removed. Add Dart port.
…lasses where it belongs.
…xer and parser in TypeScript port.
…n a class, so enums have to be global! What a screwed up OO language.
…rewrite. Fix Go target (not working yet).
It should probably not be done this way, but this does work, and the properties for indices, line and column, and text are now all correct. *Original source code by Mike has bugs.*
You put a lot of work into this. Are you planning to do the same for all grammars? That could be a job that takes several years :-)
Very nice to see the performance charts! Interesting numbers, and interesting to see that Dart does better than C++ (although only by a small margin). About the TypeScript code: this was meant to be test/example code. The users of the grammar have to implement the base classes on their own anyway, to match their environment. I don't think it makes much sense to port this demo code to all supported targets. And to serve as a model/template, a single implementation is all that's needed. In any case, the effort to make this work cross-target is enormous, and I want to get rid of this kind of target-specific action altogether in my TS port of ANTLR4. From today's perspective it was a bad decision to allow native code to go into a grammar, but hey, this is how things go sometimes.
Yes, there's a lot of work to do here to make ports for every split grammar. But I think time would be better spent on understanding how to detect and remove grammar ambiguity and fallbacks, and applying that to the TIOBE-rated programming language grammars. Antlr is wonderful in accepting all sorts of bad grammars, but that is also why people tend to give it a bad rap. (You should see the trash-talk about Antlr on Reddit.) Looking forward to Antlrng picking up where Antlr4 stops.
@kaby76 thanks!
This PR implements the rest of the ports of the mysql/oracle grammar. This is a first step in generating each of the ports automatically from the official source (here).
These ports are necessary because people want to work with a grammar with a minimum of effort. Asking them to implement a port of the grammar is time-consuming and difficult.
In fact, this PR was difficult. Implementing the two tokens emitted by the DOT_IDENTIFIER rule was outrageously hard because every target has its idiosyncrasies. (It probably would have been easier to have "emitDot()" emit both the DOT_SYMBOL and IDENTIFIER. I think I've seen this issue in other grammars.) The use of base class methods in the Antlr4ng source is inconsistent. (Actions need to be implemented as base class methods to be target-agnostic.)
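To illustrate the two-token problem, here is a minimal, self-contained sketch of the "emitDot() emits both tokens" idea using a plain FIFO queue. It is not the code in this PR and does not use the antlr4ng runtime: `Token`, `TwoTokenLexerBase`, `matchNext`, `DemoLexer`, and `collectTokens` are hypothetical names, and the token types are plain strings for readability.

```typescript
// Hypothetical sketch: none of these types come from the antlr4ng runtime
// or from the grammar's actual base classes.
interface Token {
    type: string;
    text: string;
    start: number; // index of the first character (inclusive)
    stop: number;  // index of the last character (inclusive)
}

abstract class TwoTokenLexerBase {
    // A plain array used as a FIFO queue of tokens waiting to be returned.
    private pending: Token[] = [];

    // Stand-in for the generated lexer's own matching logic.
    protected abstract matchNext(): Token | undefined;

    // Drain queued tokens first, then ask the matcher for a new one.
    public nextToken(): Token | undefined {
        const queued = this.pending.shift();
        if (queued !== undefined) {
            return queued;
        }
        const matched = this.matchNext();
        if (matched !== undefined && matched.type === "DOT_IDENTIFIER") {
            this.emitDot(matched);
            return this.pending.shift();
        }
        return matched;
    }

    // Split ".name" into a DOT_SYMBOL and an IDENTIFIER, keeping indices correct.
    protected emitDot(t: Token): void {
        this.pending.push({ type: "DOT_SYMBOL", text: ".", start: t.start, stop: t.start });
        this.pending.push({
            type: "IDENTIFIER",
            text: t.text.substring(1),
            start: t.start + 1,
            stop: t.stop,
        });
    }
}

// Toy lexer that produces a single DOT_IDENTIFIER match for ".col".
class DemoLexer extends TwoTokenLexerBase {
    private done = false;

    protected matchNext(): Token | undefined {
        if (this.done) {
            return undefined;
        }
        this.done = true;
        return { type: "DOT_IDENTIFIER", text: ".col", start: 0, stop: 3 };
    }
}

// Pull tokens until the lexer reports end of input.
function collectTokens(lexer: TwoTokenLexerBase): Token[] {
    const out: Token[] = [];
    for (let t = lexer.nextToken(); t !== undefined; t = lexer.nextToken()) {
        out.push(t);
    }
    return out;
}
```

Calling `collectTokens(new DemoLexer())` yields the two split tokens in order. Keeping the split inside one base class method means the grammar action itself can stay a single target-agnostic call.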
Previously, in the original Antlr4ng port, initialization was done in the driver:
grammars-v4/sql/mysql/Oracle/original/TypeScript/demo.ts
Lines 23 to 27 in 00a96c9
This is actually a bad place to put initialization because it requires the user to read the driver code or the readme. Most people who use this repo don't do either. Driver code is not part of the grammar. Initialization should be done in a constructor. NB: The Go port required a workaround.
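As a rough sketch of the difference, the class below sets its own defaults in the constructor, so a driver that never touches the settings still gets a usable lexer. The class name, field names, and default values are all hypothetical placeholders, not the ones used by the ports in this PR.

```typescript
// Hypothetical sketch: names and default values are placeholders only.
class LexerBaseSketch {
    public serverVersion: number;
    public sqlModes: Set<string>;

    constructor() {
        // Defaults live with the class, not in a separate driver file.
        this.serverVersion = 80200;
        this.sqlModes = new Set<string>(["ANSI_QUOTES"]);
    }

    // Grammar predicates can rely on the fields being set.
    public isSqlModeActive(mode: string): boolean {
        return this.sqlModes.has(mode);
    }
}

// A driver now only constructs the lexer; overriding the defaults is optional.
const sketchLexer = new LexerBaseSketch();
```

With this shape, someone who copies only the grammar and its base class, and never reads the driver or readme, still ends up with initialized state.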
What was done?
- init() defined to initialize the static var.
- Test.* files were removed.
- StackQueue<> was removed and all ports now use a standard queue data structure.

Performance
As I mentioned in a discussion, I have been writing scripts to output a graphical comparison of the relative speed across targets. Here is the graph for this grammar.
data.zip
As mentioned previously, the large variance in the times for Java is due to anti-virus scanning and disk caching during the first run of the parser app. (There are over 800 .class files generated by the Java compiler!)
To do
- charSets is still not implemented.

@mike-lischke This is a good first step in getting the ports generated via a script from the source. This grammar in target-agnostic format works fine. As with all grammars in this repo, Python3 is pretty slow. But, it does work.