I972: Clearer logging and reporting of judging results #990
Merged
Modify scripts to generate information that helps a human judge figure out why a run failed. Better determination of which CPU to run a submission on in a sandbox. Add some basic Web page scripts to make judge results web pages on each judge machine. Add a configurable execute folder that allows substitute variables, so that, for example, a separate execute folder can be retained for each run by putting {:runnumber} in the folder name. CI: toString method of ExecutionData was not correct.
Ongoing work in making human judging of results easier.
More updates to make judging easier. Add judge number support {:clientid} to execute folder in scripts
Also added Warning icon
Allow display of sandbox CPU time. Allow display of reports/testcase_xxx.log by link. Make all links open in a new tab instead of overwriting the current one. Add a memory used column, since it is now supported in the latest kernel.
Interactive sandbox script needed cleaning up and updating. Fix some bash 'if' statements in case arg is empty string.
Remove extra space when generating the execute data file. Correct the "whats this" message for the judge execute folder pattern. Add jquery for sorting tables.
The main web server machine will now pull all submissions from each judge and show them on one page. This is done using a Python script that reads a JSON result from each judge. Added a link to display compiler results on compile error. Added a link to the source code of the submission. Changed "Run Out" and "Run err" to "Run stdout" and "Run stderr".
Rename judgesExecuteFolder to just ExecuteFolder, since it applies to anyone who can judge. Read the execute-folder property from the system.pc2.yaml file for CDP configuration. Export the execute folder when the contest is exported. Add error logging to indicate why a compile failed. Prevent null references (these were harmless, but caused exception traces in the log files).
If the SubmitSampleRunsPane was never used and a run changed, an NPE would result when the IRunListener methods were called. The fix simply checks the offending member, submissionList, before using it.
CI: The substituteAllStrings method could possibly return an empty string in the future. Always copy the original string into the returned string before doing anything, instead of relying on the first substitute to do it.
Cleaned up interactive script. Made interactive script configurable so we can change the way it judges RTE/WA/AC to match DOMjudge, if necessary. Fixed combined runs webpage to sort runs from all judges.
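The combined runs page described in these commits (one JSON result per judge, merged and sorted on the main web server) can be sketched roughly as follows. This is not the script from support/judge_webcgi: the function name merge_judge_results, the field names runnumber and verdict, the sort order (ascending run number) and the sample payloads are all assumptions made for illustration, and the network fetch is left out so the example runs on its own.

```python
import json


def merge_judge_results(payloads):
    """Combine per-judge JSON run lists into one list sorted by run number.

    payloads maps a judge host name to the JSON text that judge returned.
    Each payload is assumed (hypothetically) to be a JSON array of objects
    that carry at least a "runnumber" field.
    """
    combined = []
    for judge, text in payloads.items():
        try:
            runs = json.loads(text)
        except json.JSONDecodeError:
            # A judge that returned something unparsable is skipped
            # so that one bad machine does not break the whole page.
            continue
        for run in runs:
            # Remember which judge produced this run.
            combined.append({**run, "judge": judge})
    combined.sort(key=lambda run: run["runnumber"])
    return combined


# Hypothetical responses from three judge machines.
payloads = {
    "judge1": '[{"runnumber": 12, "verdict": "AC"}, {"runnumber": 7, "verdict": "WA"}]',
    "judge3": '[{"runnumber": 9, "verdict": "RTE"}]',
    "judge4": "not json",
}

merged = merge_judge_results(payloads)
for run in merged:
    print(run["runnumber"], run["judge"], run["verdict"])
```

Tagging each run with its judge before sorting keeps the origin visible once runs from all machines are interleaved.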
This script file was inadvertently left in the previous commit. It is not used.
To clean up the code in Executable, make use of the CommandVariableReplacer class to do substitutions for executeFolder. Fix "Whats this" for the execute folder so it has all the available substitute vars for execute folder. Add JUnit to test new execute folder substitute method.
Added ability to control whether all test cases are checked, or just the most recent. The theory is it will be faster to check just the most recent. This may be the case with large numbers of test cases.
kkarakas
approved these changes
Aug 6, 2024
Went through the code; looks good to me.
Added a sample Apache site config file. Added a README to explain how to set up Apache. Changed the way it looks up the autojudges: it now uses /etc/hosts.
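A rough sketch of looking up autojudge machines in hosts-file text is shown here. How the real script recognizes a judge entry is not stated in this PR, so the "judge" name prefix, the function name find_judge_hosts and the sample addresses are purely hypothetical. The sample text is passed in as a string so nothing depends on the local /etc/hosts.

```python
def find_judge_hosts(hosts_text, prefix="judge"):
    """Return (name, address) pairs for host names starting with prefix.

    hosts_text uses the usual hosts-file layout: an address followed by
    one or more names, with '#' starting a comment.
    """
    found = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            if name.startswith(prefix):
                found.append((name, address))
    return found


# Hypothetical hosts-file content.
sample = """\
127.0.0.1   localhost
10.0.0.11   judge1      # first autojudge
10.0.0.13   judge3 aj3
# 10.0.0.14 judge4 (disabled)
"""

judges = find_judge_hosts(sample)
print(judges)
```

Against a real machine the same function could be fed the contents of /etc/hosts read with open().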
The message that is put in the log file was not accurate. It was missing an argument so the user could duplicate the results of the sandbox run. This is an echo message only change for the log.
With lots of submissions, the web page refresh was taking on the order of 6 seconds per judge machine. Add caching. It now takes about 2-3 seconds total. Only computes judgment values if the execute folder has changed. Remembers last time it changed in CACHEDIR.
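The caching idea (recompute judgment values only when the execute folder has changed, remembering the last change time in a cache directory) might look roughly like this. It is a sketch under assumptions, not the code in this PR: the function name cached_judgment, the one-file-per-folder cache layout and the use of the folder's modification time as the change marker are all hypothetical.

```python
import os
import tempfile


def cached_judgment(execute_folder, cache_dir, compute):
    """Return compute(execute_folder), reusing a cached value when possible.

    The cache file stores the folder's modification time on the first line
    and the computed value after it. The value is recomputed only when the
    stored time no longer matches the folder.
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = os.path.basename(os.path.normpath(execute_folder))
    cache_file = os.path.join(cache_dir, key + ".cache")
    mtime = str(os.stat(execute_folder).st_mtime_ns)

    if os.path.exists(cache_file):
        with open(cache_file, encoding="utf-8") as f:
            saved_mtime, _, saved_value = f.read().partition("\n")
        if saved_mtime == mtime:
            return saved_value  # folder unchanged: skip the expensive work

    value = compute(execute_folder)
    with open(cache_file, "w", encoding="utf-8") as f:
        f.write(mtime + "\n" + value)
    return value


# Demonstration with a stand-in for the expensive judgment computation.
calls = []


def fake_compute(folder):
    calls.append(folder)
    return "AC"


with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, "ex_990")
    os.mkdir(folder)
    cache = os.path.join(root, "cache")
    first = cached_judgment(folder, cache, fake_compute)
    second = cached_judgment(folder, cache, fake_compute)

print(first, second, len(calls))  # AC AC 1
```

One caveat with this particular marker: a directory's modification time changes when entries are added, removed or renamed, not when an existing file inside it is rewritten, so a real implementation may need a different change signal.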
The message telling the judge how to run a judgment by hand specified the absolute path of the autojudge, not the current user. This has been fixed.
Description of what the PR does
While it may seem like this PR is huge, it really is not. The idea was to make the job of the Shadow CCS team easier by exposing more details about why interactive problem judgments differ from a primary. This mostly involves changes to things external to PC^2 proper. That is, very little Java code changed, it's mostly the sandbox scripts (which is where the main shortcomings were) and a bunch of NEW web scripts that were added to support shadow, and, consequently, real judging as well.
The only PC^2 Java code changes are:
- ContestInformation: adds a configurable setting for the execute folder instead of the hardcoded executesiteXjudgeY. The new code allows substitute strings to be used (:runnumber, :problemletter, :problemshort, etc.); see the diffs in Executable.java.
- SnakeContestSnakeYAMLLoader: to allow configuration of the execute folder.
- ContestInformationPane (Settings tab): to allow GUI configuration of the execute folder. The pattern executesite{:clientsite}{:clientname} reproduces the previous naming, e.g. executesite1judge3.
- Executable: now creates a new file in the execute folder of the form executedata.X.txt, which contains details about each test case's disposition. This was added to support the Web page generation. An example of one such file is shown below:

A bunch of scripts and files were added to the (new) support/judge_webcgi folder. The scripts (in the cgi-bin sub-folder) are intended to be run as CGI from an OTS Webserver (such as Apache). The scripts will generate web pages that allow a CCS Shadow team member to easily review runs and individual test cases for a run. Detailed logs are also available by clicking a link on a web page. Setting up Apache is beyond the scope of this blurb, but the structure under support/judge_webcgi can be used as a document root, and support/judge_webcgi/cgi-bin can be used as the cgi-bin folder after enabling the CGI module in Apache.

For approving this PR, it is not necessary to understand or look at the files in support/judge_webcgi. Interested readers may do so (and probably should) and make comments, but any issues or changes desired in these scripts should NOT preclude approval of this PR. The important things for the PR are the things in the scripts folder as well as the Java modules that changed to support a configurable execute folder.

Issue which the PR addresses
Fixes #973
Fixes #965
Environment in which the PR was developed (OS,IDE, Java version, etc.)
Windows 11
Ubuntu 22.04 (with Apache2 V2.4.52)
Java 17.0.5
Java version "1.8.0_321"
Several contests were replayed through a version of PC2 with this PR and they all worked flawlessly.
PacNWSpring 2024, GNY2023, WF47 Finals
Precise steps for testing the PR (i.e., how to demonstrate that it works correctly)
Set the execute-folder: property, for example to:
ex_{:runnumber}_{:problemletter}_{:problemshort}_{:teamid}_{:languageid}_{:clientname}
This will create a unique execute folder for each run, with detailed information in the folder name, e.g.
ex_990_F_itemselection-1_110_cpp_judge3
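To make the expansion concrete, here is a small Python sketch of how such a pattern turns into a folder name. It is only an illustration: PC^2 does this in Java (the commits mention CommandVariableReplacer and substituteAllStrings), and the function name substitute_execute_folder and its leave-unknown-placeholders-alone behavior are assumptions, not the project's API. The values are the ones from the example folder name above.

```python
import re


def substitute_execute_folder(pattern, values):
    """Replace each {:name} placeholder in pattern with values[name].

    Placeholders with no matching value are left as they are, so a
    misspelled variable stays visible in the resulting folder name.
    """
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return re.sub(r"\{:([A-Za-z]+)\}", replace, pattern)


pattern = ("ex_{:runnumber}_{:problemletter}_{:problemshort}"
           "_{:teamid}_{:languageid}_{:clientname}")
run = {
    "runnumber": "990",
    "problemletter": "F",
    "problemshort": "itemselection-1",
    "teamid": "110",
    "languageid": "cpp",
    "clientname": "judge3",
}

folder = substitute_execute_folder(pattern, run)
print(folder)  # ex_990_F_itemselection-1_110_cpp_judge3
```

The same function applied to executesite{:clientsite}{:clientname} with clientsite 1 and clientname judge3 gives executesite1judge3.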