I972: Clearer logging and reporting of judging results #990

Merged: 59 commits merged into pc2ccs:develop on Aug 27, 2024

Conversation

johnbrvc (Collaborator)

Description of what the PR does

While this PR may look huge, it really is not. The idea was to make the job of the Shadow CCS team easier by exposing more details about why interactive-problem judgments differ from the primary's. This mostly involves changes external to PC^2 proper; very little Java code changed. The bulk of the changes are in the sandbox scripts (which is where the main shortcomings were) and a set of NEW web scripts added to support shadow judging and, consequently, real judging as well.

The only PC^2 Java code changes are:

  1. spacing/formatting changes introduced by Eclipse (functional no-ops) - by far, these are the majority of the code changes.
  2. a new configurable string in ContestInformation for the execute folder, replacing the hardcoded executesiteXjudgeY. The new code allows substitute strings to be used - see also Executable.java.
  3. a few new substitute strings to support the configurable execute folder ({:runnumber}, {:problemletter}, {:problemshort}, etc. - see the diffs in Executable.java).
  4. a couple of CIs to prevent future issues and fix old ones.
  5. update ContestSnakeYAMLLoader to allow configuration of the execute folder.
  6. update ContestInformationPane (the Settings tab) to allow GUI configuration of the execute folder.
  7. an empty (undefined) execute folder reverts to the old default of executesite{:clientsite}{:clientname}, e.g. executesite1judge3.
  8. Executable now creates a new file in the execute folder of the form executedata.X.txt, which contains details about each test case's disposition. This was added to support the Web page generation. An example of one such file is shown below:
executeDateTime='2024-07-18T12:20:22.943-04'
compileExeFileName='a.out'
compileSuccess='true'
compileResultCode='0'
executeExitValue='0'
executeSuccess='true'
validationReturnCode='42'
validationSuccess='true'
validationResults='accepted'
compileTimeMS='1290'
executeTimeMS='91'
validateTimeMS='31'
executionException=''
runTimeLimitExceeded='false'
additionalInformation=''
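
Because the file consists of shell-style key='value' lines, a script can load it simply by sourcing it. The following is a minimal, hypothetical bash sketch (the file name executedata.1.txt is illustrative; the actual CGI scripts in support/judge_webcgi may read the file differently):

#!/bin/bash
# Hypothetical example: summarize one test case's disposition
# by sourcing an executedata file generated by Executable.
. ./executedata.1.txt
if [ "$compileSuccess" != "true" ]; then
    echo "Compile failed (result code ${compileResultCode})"
elif [ "$validationSuccess" = "true" ]; then
    echo "Judgment: ${validationResults} (execute ${executeTimeMS}ms, validate ${validateTimeMS}ms)"
else
    echo "Validation failed; exception: ${executionException}"
fi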

A number of scripts and files were added to the (new) support/judge_webcgi folder. The scripts (in the cgi-bin sub-folder) are intended to be run as CGI from an off-the-shelf (OTS) web server such as Apache. The scripts generate web pages that let a CCS Shadow team member easily review runs and the individual test cases for a run. Detailed logs are also available by clicking a link on a web page. Setting up Apache is beyond the scope of this blurb, but the structure under support/judge_webcgi can be used as a document root, and support/judge_webcgi/cgi-bin can be used as the cgi-bin folder after enabling the CGI module in Apache (a sketch follows).
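
For illustration only, a minimal Apache site configuration along those lines might look like the sketch below. The paths are placeholders, not the sample config shipped with this PR; point them at your checkout of support/judge_webcgi.

<VirtualHost *:80>
    # Placeholder paths - adjust to where support/judge_webcgi lives
    DocumentRoot /home/judge/pc2/support/judge_webcgi
    ScriptAlias /cgi-bin/ /home/judge/pc2/support/judge_webcgi/cgi-bin/
    <Directory /home/judge/pc2/support/judge_webcgi/cgi-bin>
        Options +ExecCGI
        Require all granted
    </Directory>
</VirtualHost>

On Ubuntu, the CGI module can be enabled with "sudo a2enmod cgi" followed by an Apache restart.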

For approving this PR, it is not necessary to understand or look at the files in the support/judge_webcgi folder. Interested readers may do so (and probably should) and make comments, but any issues or changes desired in those scripts should NOT preclude approval of this PR. The important things for this PR are the contents of the scripts folder and the Java modules that changed to support a configurable execute folder.

Issue which the PR addresses

Fixes #973
Fixes #965

Environment in which the PR was developed (OS, IDE, Java version, etc.)

Windows 11
Ubuntu 22.04 (with Apache2 V2.4.52)
Java 17.0.5
Java version "1.8.0_321"

Several contests were replayed through a version of PC^2 containing this PR, and all worked flawlessly:
PacNW Spring 2024, GNY2023, WF47 Finals

Precise steps for testing the PR (i.e., how to demonstrate that it works correctly)

  1. Start any contest
  2. Open an administrator
  3. Go to the Settings tab (which is really the ContestInformation pane)
  4. Note the place where you can set the execute folder. There is also a "What's this" button there.
  5. Change the execute folder and press Update.
  6. Judge a run
  7. Note that the execute folder used is the one that was just configured.
  8. Configuration can also be done in the CDP using system.pc2.yaml and the execute-folder: property (see the fragment after this list).
  9. An interesting execute folder name is: ex_{:runnumber}_{:problemletter}_{:problemshort}_{:teamid}_{:languageid}_{:clientname}
    This creates a unique execute folder for each run, with detailed information in the folder name, e.g. ex_990_F_itemselection-1_110_cpp_judge3
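
For step 8 above, a minimal system.pc2.yaml fragment would look something like this (shown for illustration, using the pattern from step 9 as the value):

# system.pc2.yaml - CDP configuration (illustrative fragment)
execute-folder: ex_{:runnumber}_{:problemletter}_{:problemshort}_{:teamid}_{:languageid}_{:clientname}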

johnbrvc added 24 commits July 5, 2024 08:43
Modify scripts to generate information to help a human judge figure out why a run failed.
Better determination of which CPU to run a submission on in a sandbox.
Add some basic Web page scripts to make judge results web pages on each judge machine.
Add a configurable execute folder that allows substitute variables.  This is good so that the execute folders may be retained for each run {:runnumber} in the execute folder, for example.
CI: toString method of ExecutionData was not correct.
Ongoing work in making human judging of results easier.
More updates to make judging easier.
Add judge number support {:clientid} to execute folder in scripts
Also added Warning icon
Allow display of sandbox cpu time.
Allow display of reports/testcase_xxx.log by link
Make all links open a new tab as opposed to overwriting the current one.
Add memory used column since it is now supported in latest kernel.
Interactive sandbox script needed cleaning up and updating.
Fix some bash 'if' statements in case arg is empty string.
Remove extra space when generating the execute data file.
Correct the "whats this" message for the judge execute folder pattern.
Add jquery for sorting tables.
The main web server machine will now pull all submissions from each judge and show them on one page.  This is done using a python script that reads a json result from each judge.
Added link to display compiler results on compile error
Added link for source code of submission.
Change Run Out and Run err to Run stdout and Run stderr
Refactor name of judgesExecuteFolder to be just ExecuteFolder since it applies to anyone who can judge.
Read the execute-folder property from the system.pc2.yaml file for CDP configuration.
Export the execute folder when the contest is exported.
Add error logging to indicate why a compile failed
Prevent null references (these were harmless, but caused exception traces in the log files)
If the SubmitSampleRunsPane was never used, and a run changed, an NPE would result when the IRunListener methods were called.
Simply check the value of offending member, submissionList.
CI: The substituteAllStrings method could possibly return an empty string in the future.  Always copy the original string into the returned string before doing anyway instead of relying on the first substitute to do it.
Cleaned up interactive script.
Made interactive script configurable so we can change the way it judges RTE/WA/AC to match DOMjudge, if necessary.
Fixed combined runs webpage to sort runs from all judges.
This script file was inadvertently left in the previous commit.  It is not used.
@johnbrvc johnbrvc added this to the 9.11.0 milestone Jul 18, 2024
@johnbrvc johnbrvc self-assigned this Jul 18, 2024
johnbrvc added 24 commits August 5, 2024 14:29
To clean up the code in Executable, make use of the CommandVariableReplacer class to do substitutions for executeFolder.
Fix "Whats this" for the execute folder so it has all the available substitute vars for execute folder.
Add JUnit to test new execute folder substitute method.
Added ability to control whether all test cases are checked, or just the most recent.  The theory is it will be faster to check just the most recent.  This may be the case with large numbers of test cases.
@kkarakas kkarakas (Collaborator) left a comment
Went through the code; looks good to me.

Added apache sample site config file.
Added README to explain how to setup apache
Changed the way it looks up the autojudges - now uses /etc/hosts
The message that is put in the log file was not accurate.  It was missing an argument so the user could duplicate the results of the sandbox run.  This is an echo message only change for the log.
With lots of submissions, the web page refresh was taking on the order of 6 seconds per judge machine.  Add caching.  It now takes about 2-3 seconds total.  Only computes judgment values if the execute folder has changed.  Remembers last time it changed in CACHEDIR.
The message telling the judge how to run a judgment by hand specified the absolute path of the autojudge, not the current user.  This has been fixed.
@johnbrvc johnbrvc merged commit 9bbe5a3 into pc2ccs:develop Aug 27, 2024
3 checks passed