-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add test automation #44
Conversation
3c71e71
to
48e6304
Compare
48e6304
to
80c53bc
Compare
61325c3
to
5ae26d8
Compare
6b61599
to
04904c4
Compare
I pulled your branch to investigate the error, but I ran into this:
which is caused by ant giving a directory name to Is that branch fully updated? |
Indeed, EncodingTester hasn't been adapted yet to take a directory, though the others have ( see 62baf8a ) |
For the record, I was able to execute the tests without the |
Yes — I changed the version locally, and removed the
OK, went ahead and made that change too. Pushed the updates to this branch. Thanks again very much. |
Well I just now see that dropping https://github.com/validator/htmlparser/pull/44/checks?check_run_id=1015950576 Maybe I should just drop Java 8 from the test matrix? Or do we want to be testing Java 8? |
If you test with Java 8, ant is going to look for Testing with version 11 and then the latest JDK (14 currently) makes sense, Java 8 is possibly unnecessary. |
Incidentally, in css4j I'm compiling my tests for That bug is caused by using test classes in one module from the test classes in another module (yes the |
1b971e8
to
016b9ce
Compare
So as y’all may have noticed, I changed the code in this PR branch to remove the kludgey AntRun additions I’d made, and instead use @anthonyvdotbe’s That approach currently requires adding an extra config setting, to teach Surefire where the That config setting would be unnecessary if/when we end up moving to a new directory layout that follows Maven conventions. But after we do ever get there, the POM can be updated to drop the extra config setting. @anthonyvdotbe Can you confirm you’re agreeing to license the If so, do you want to have a Or are you instead OK with the license header having only the |
Yes, I confirm that I agree.
I confirm that I'm waiving my right to have my name included in the copyright statement. |
The conversion of all the tests (including |
016b9ce
to
2317de9
Compare
f6a787f
to
d84dbbb
Compare
This change causes com.sun.tools to be included as a dependency only if the JDK version is 1.8. Otherwise, without this change, the build fails when run under any Java version later than Java 8.
This change causes the Maven AntRun plugin to invoke the javac and java commands with fork=true when run in a Java 8 environment. Otherwise, without this change, the build fails when run under Java 8.
This change updates the TokenizerTester code to expect its input test data to have the string "Data state" to identify PCDATA tests — rather than the string "DATA state". The test data in the html5lib-tests suite uses "Data state", so without this change, running TokenizerTester against html5lib-tests causes TokenizerTester to fail with a “Broken test data” harness failure.
This change makes TokenizerTester, TreeTester, and EncodingTester exit with status code 1 if any test cases fail. Otherwise, without this change, we won’t catch the test failures when running tests under CI.
This change makes TokenizerTester, TreeTester, and EncodingTester stop emitting the word “Success” to standard error every time a test passes. Otherwise, without this change, in test output, we end up with so many “Success” lines that the actual test failures are effectively obscured (you have to scroll back through the output/log to hunt for failures).
This change makes the sniffing limit in HtmlInputStreamReader settable. Without this change, the HtmlInputStreamReader sniffing limit is hardcoded to 1024 — and in the context of testing, that has the effect of limiting HtmlInputStreamReader to only being useful for testing expected output of the meta prescan. So this change makes it possible for HtmlInputStreamReader to also be used for testing the results for the state where the expected character encoding is not limited to what can be determined by checking the first 1024 bytes of the input stream.
This change updates EncodingTester to make it test the result for cases when the expected character encoding is not limited to what can be determined by checking only the first 1024 bytes of the input stream. Otherwise, without this change, EncodingTester is limited to only being useful for testing the output of the meta prescan.
This change makes the TokenizerTester(InputStream stream) constructor public — as the corresponding constructors for TreeTester and EncodingTester already are.
This change refines how Html5libTest handles filenames; it adds a mechanism to allow a required/expected file extension to be specified for each test type, and uses the mechanism to specify that ".test" is the required/expected extension for tokenizer tests, and that ".dat" is the required/expected extension for tree-construction test files and for encoding test files. Without this change, Html5libTest only deals correctly with the ".test" extension for tokenizer test files — but not with the ".dat" extension for tree-construction test files and encoding test files.
This change makes Html5libTest correctly handle tests in the html5lib-tests suite which have cases with so-called “double-escaped” “input” and “output” values — for example, values that contain the literals “\\u0000” and “\\uFFFD" rather than “\u0000” and “\uFFFD”.
This change replaces Path.of() calls in Html5libTest with Paths.get(). Per https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/file/Path.html#of(java.net.URI) Path.of() was introduced in Java 11. So Java 8 has no Path.of(); see also https://docs.oracle.com/javase/8/docs/api/java/nio/file/Path.html We need to continue to support Java 8 for the time being. It seems Paths.get() will eventually end up being formally deprecated; by the time it finally is, we may also be able to quit supporting Java 8 — and so we can just switch to Path.of() then.
d84dbbb
to
390d9a9
Compare
I see that you re-enabled testing for Java 8, which is a good thing (commits 05b9024 and a27c7b6), but using Java 8 is unlikely to get along very well with the current modularization patches (as long as there is the requirement that the minimum JDK to process the POM is 11). And running automated testing with Java 11 or higher is going to be a quite different game once the project is modularized. It's not that you cannot do either of those things (you can even use Java 8 if the POM is complicated enough) but modularization is highly coupled with testing, and setting up an automated testing infrastructure without accounting for modules has the potential of being a waste of time. IMHO either this PR or the modularization PR should be merged as soon as possible (BTW after making any necessary changes), and then the other PR can rebase on top of it. Yes I assume that you can't do a lot either way, just making sure that you are aware of this. |
@carlosame Thanks for the heads-up about the issue dealing with Java 8 in the face of the modularization patches. It’d be pretty disappointing if Maven, in all its baroque complexity, didn’t provide a way to build a modularized jar for current Java versions while also remaining able to ensure the code at least continues to compile under Java 8. I personally am not willing to cavalierly ignore all the users who possibly might be constrained to needing to use the parser in a Java 8 environment. I don’t have a way to measure how many such users the parser might have. For all I know, it could be there are so few users of the parser that need Java 8 compatibility that we don’t need to worry about it. Or it could be that there are some important users for whom we’d be abandoning when we rightly shouldn’t be. But if we can’t even ourselves continue to have an automated way to test that the code continues compiles under Java 8, then we have no way to know when some code change we made has broken compatibility with Java 8. |
On further reflection, it occurs to me to ask: As far as building with Maven goes, couldn’t we just have one pom.xml for building/testing the “normal” build for Java 11+, and then a second pom.xml for building/testing the Java 8 build? |
Hmm that's not what I was meaning. The modularization patches in PR #43 generate bytecode for Java 8 (IMHO should be 7) except for the Actually, your current patches in this PR are always testing Java 8 bytecode, you are just producing and testing that level-8 bytecode with different JDKs. And source-level compatibility is set to 8 as well. |
Additional clarification to my previous post: the idea is that Java 8 users will be able to use the compiled jar files, but won't be able to build them. A modular JDK is required to build. If you still want to test with Java 8 and want to avoid undesirable POM hacks, you could set up a separate test project that would depend on a level-8 test-jar produced by the main build. But I see no real reason to do that. |
I believe everything in this PR branch is now also in the main branch — so I’m going ahead and closing this. |
The changes in this pull-request branch enable us, for all pull requests and pushes, to automate testing of the parser against the html5lib-tests suite — including CI execution (using GitHub Actions) of the tests .
The changes include the following commits:
…along with some related minor changes to get things working smoothly.