From d78522fe229f5815c175db183b4df92e01c6f5ba Mon Sep 17 00:00:00 2001
From: evouga
Date: Wed, 18 Sep 2024 17:04:49 -0500
Subject: [PATCH 1/8] Split score.txt into bounded and unbounded versions

---
 spec/2023-07-draft.md | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index 55c5b0c..cd9ef80 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1119,28 +1119,30 @@ It is a judge error if `S > M`. This formula evenly distributes a group's leftov
 The score of a failed test case is always 0.
 By default, the score of an accepted test case is its maximum score, computed as described above.
-A custom output validator may produce a `score.txt` file for a test case:
+A custom output validator may produce a `score.txt` or `unbounded_score.txt` file for a test case:
 - for test cases in a group with bounded maximum score, `score.txt` must contain a single floating-point number in the range `[0,1]`.
   The score of the test case is this number _multiplied_ by the test case maximum score.
-
-- for test cases in unbounded groups, `score.txt` must contain a non-negative floating-point number.
+- for test cases in unbounded groups, `unbounded_score.txt` must contain a non-negative floating-point number.
   The score of the test case is that number.
-It is a judge error if an output validator accepts a test case in an unbounded group and does not produce a `score.txt`.
-It is also a judge error if an output validator produces a `score.txt` for a test case in a group with `passs-fail` aggregation.
+It is a judge error if:
+- an output validator accepts a test case in an unbounded group and does not produce an `unbounded_score.txt`;
+- an output validator produces a `score.txt` for a test case in a group with `passs-fail` aggregation or with unbounded maximum score;
+- an output validator produces a `unbounded_score.txt` for a test case in a group with bounded maximum score.
 ### Scoring Test Groups
 The score of a test group is determined by its subgroups and test cases.
 If it has no subgroups or test cases, then its score is 0.
 Otherwise, the score depends on the aggregation mode, which is either `pass-fail`, `sum`, or `min`.
-If a group uses `pass-fail` aggregation, the group must have bounded maximum score and all subgroups must also use pass-fail aggregation.
+
+- If a group uses `pass-fail` aggregation, the group must have bounded maximum score and all subgroups must also use pass-fail aggregation.
 If the submission receives an accept verdict for all test cases in the group and its subgroups, the score of the group is equal to its maximum possible score.
 Otherwise the group score is 0.
-If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases and subgroups.
-If a group uses `min` aggregation, then the group score is the minimum of these scores.
+- If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases and subgroups.
+- If a group uses `min` aggregation, then the group score is the minimum of these scores.
 The submission score is the score of the `secret` group.
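The three aggregation modes described in this patch can be sketched in Python as follows. This is only an illustration of the rules as worded above, not code from the spec or from any judge; the names `Child` and `group_score` and their parameters are hypothetical.

```python
from typing import NamedTuple, Sequence


class Child(NamedTuple):
    """Result of one test case or subgroup (hypothetical record type)."""
    accepted: bool  # True if everything underneath received an accept verdict
    score: float    # only consulted for `sum` and `min` aggregation


def group_score(aggregation: str, children: Sequence[Child], max_score: float) -> float:
    """Score a test group from the results of its test cases and subgroups."""
    if not children:
        # A group with no subgroups or test cases always scores 0.
        return 0.0
    if aggregation == "pass-fail":
        # All or nothing: the full maximum only if everything was accepted.
        return max_score if all(c.accepted for c in children) else 0.0
    if aggregation == "sum":
        return float(sum(c.score for c in children))
    if aggregation == "min":
        return float(min(c.score for c in children))
    raise ValueError(f"unknown aggregation mode: {aggregation!r}")
```

For example, `group_score("min", [Child(True, 10.0), Child(False, 0.0)], 20.0)` evaluates to `0.0`, while the same children under `sum` give `10.0`.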
From 138b80fd4663b377b5f28fc0da6d298c70ccc687 Mon Sep 17 00:00:00 2001
From: evouga
Date: Wed, 18 Sep 2024 17:15:37 -0500
Subject: [PATCH 2/8] Fix errors in maximum score inference table

---
 spec/2023-07-draft.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index cd9ef80..2500bce 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1109,11 +1109,11 @@ as is the maximum score of each test case in the group:
 Group Maximum Score | Aggregation Type     | Maximum Score of Test Case / Subgroup
 ------------------- | -------------------- | -------------------------------------
 `unbounded`         | any                  | `unbounded`
-bounded value `M`   | `sum` or `pass-fail` | `(M - S)/(A + T)`
-bounded value `M`   | `min`                | `M - S`
+bounded value `M`   | `pass-fail`          | 1
+bounded value `M`   | `sum`                | `(M - S)/(A + T)`
+bounded value `M`   | `min`                | `M`
-where the group has `T` test cases, `A` subgroups without a provided `score`, and whose other subgroups have maximum scores that sum to `S`.
-It is a judge error if `S > M`. This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score.
+where the group has `T` test cases, `A` subgroups without a provided `score`, and whose other subgroups have maximum scores that sum to `S`. This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score. It is a judge error if `S > M` for a group with bounded maximum score and `sum` aggregation.
 ### Scoring Test Cases

From 6b17c2edf73c397a8f0ff44e4c628f0c7d07057d Mon Sep 17 00:00:00 2001
From: evouga
Date: Fri, 20 Sep 2024 20:26:42 -0500
Subject: [PATCH 3/8] Update spec/2023-07-draft.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Fredrik Niemelä

---
 spec/2023-07-draft.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index 2500bce..5ccb265 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1113,7 +1113,9 @@ bounded value `M` | `pass-fail` | 1
 bounded value `M`   | `sum`                | `(M - S)/(A + T)`
 bounded value `M`   | `min`                | `M`
-where the group has `T` test cases, `A` subgroups without a provided `score`, and whose other subgroups have maximum scores that sum to `S`. This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score. It is a judge error if `S > M` for a group with bounded maximum score and `sum` aggregation.
+where the group has `T` test cases, `A` subgroups without a provided `score`, and whose other subgroups have maximum scores that sum to `S`.
+This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score.
+It is a judge error if `S > M` for a group with bounded maximum score and `sum` aggregation.
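The inference rule above can be illustrated with a small Python sketch. It covers only the `unbounded`, `sum`, and `min` rows of the table; the function name, its parameters, and the use of `math.inf` to stand for `unbounded` are assumptions made for this example, not part of the spec.

```python
import math
from typing import Sequence

UNBOUNDED = math.inf  # hypothetical stand-in for the spec's `unbounded`


def default_child_max_score(
    group_max: float,                   # M, or UNBOUNDED
    aggregation: str,                   # "sum" or "min"
    num_test_cases: int,                # T
    scored_subgroups: Sequence[float],  # explicit subgroup scores; they sum to S
    num_unscored_subgroups: int,        # A
) -> float:
    """Default maximum score of a test case or subgroup without a `score`."""
    if math.isinf(group_max):
        return UNBOUNDED
    if aggregation == "min":
        return group_max
    if aggregation == "sum":
        s = sum(scored_subgroups)
        if s > group_max:
            raise ValueError("judge error: S > M")
        share_count = num_unscored_subgroups + num_test_cases
        if share_count == 0:
            raise ValueError("no test case or subgroup needs an inferred score")
        # Split the leftover points evenly among the children without a score.
        return (group_max - s) / share_count
    raise ValueError(f"aggregation not covered by this sketch: {aggregation!r}")
```

For a `sum` group with `M = 100`, one subgroup with an explicit score of 40, one subgroup without a score, and three test cases, each unscored child gets `(100 - 40) / (1 + 3) = 15` points.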
 ### Scoring Test Cases

From 35ccf3e0abd165ffe72ad5efeee72b3d70aa92dc Mon Sep 17 00:00:00 2001
From: evouga
Date: Fri, 20 Sep 2024 20:42:04 -0500
Subject: [PATCH 4/8] Default maximum score inference now only applies to sum or min aggregation

Error if score exceeds max score

---
 spec/2023-07-draft.md | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index 678ecda..1de9034 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1103,18 +1103,17 @@ The default value of `aggregation` is `sum` for the `secret` group and `pass-fai
 #### Maximum Score Inference
-The `secret` group, its subgroups, and every test case in these groups have a maximum possible score.
+Groups and subgroups with `sum` or `min` aggregation and every test case in these groups have a maximum possible score.
 The `secret` group's score may be any positive integer or `unbounded`.
 Subgroups of `secret` may only have `unbounded` maximum score if `secret` is unbounded.
 The default value of `score` for the `secret` group is 100.
-The default `score` for other test data groups is inferred from the `score` value of its parent and siblings,
+The default `score` for other test data groups with `sum` or `min` aggregation is inferred from the `score` value of its parent and siblings,
 as is the maximum score of each test case in the group:
-Group Maximum Score | Aggregation Type     | Maximum Score of Test Case / Subgroup
+Group Maximum Score | Aggregation Type     | Default Maximum Score of Test Case / Subgroup
 ------------------- | -------------------- | -------------------------------------
-`unbounded`         | any                  | `unbounded`
-bounded value `M`   | `pass-fail`          | 1
+`unbounded`         | `sum` or `min`       | `unbounded`
 bounded value `M`   | `sum`                | `(M - S)/(A + T)`
 bounded value `M`   | `min`                | `M`
@@ -1151,7 +1150,7 @@ Otherwise the group score is 0.
 - If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases and subgroups.
 - If a group uses `min` aggregation, then the group score is the minimum of these scores.
-The submission score is the score of the `secret` group.
+The submission score is the score of the `secret` group. It is a judge error if the score of any group or subgroup exceeds its maximum score.
 ### Required Dependent Groups

From 5109ad02294a45e5d5c91c8a6b8d386679ba6d0e Mon Sep 17 00:00:00 2001
From: evouga
Date: Fri, 20 Sep 2024 20:45:13 -0500
Subject: [PATCH 5/8] Also an error if test case score exceeds max score

---
 spec/2023-07-draft.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index 1de9034..e9bc65d 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1135,7 +1135,8 @@ A custom output validator may produce a `score.txt` or `unbounded_score.txt` fil
 It is a judge error if:
 - an output validator accepts a test case in an unbounded group and does not produce an `unbounded_score.txt`;
 - an output validator produces a `score.txt` for a test case in a group with `passs-fail` aggregation or with unbounded maximum score;
-- an output validator produces a `unbounded_score.txt` for a test case in a group with bounded maximum score.
+- an output validator produces a `unbounded_score.txt` for a test case in a group with bounded maximum score;
+- the score of a test case exceeds its maximum score.
 ### Scoring Test Groups
@@ -1150,7 +1151,9 @@ Otherwise the group score is 0.
 - If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases and subgroups.
 - If a group uses `min` aggregation, then the group score is the minimum of these scores.
-The submission score is the score of the `secret` group. It is a judge error if the score of any group or subgroup exceeds its maximum score.
+The submission score is the score of the `secret` group.
+
+It is a judge error if the score of any group or subgroup exceeds its maximum score.
 ### Required Dependent Groups

From e03092d288d8297e0aff3ed3878bd97fdfa89ca0 Mon Sep 17 00:00:00 2001
From: evouga
Date: Fri, 20 Sep 2024 21:05:52 -0500
Subject: [PATCH 6/8] Now uses score.txt and score_multiplier.txt

---
 spec/2023-07-draft.md | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index e9bc65d..667ec5b 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1123,20 +1123,23 @@ It is a judge error if `S > M` for a group with bounded maximum score and `sum`
 ### Scoring Test Cases
+Only test cases in test case group with `sum` or `min` aggregation receive a score.
+
 The score of a failed test case is always 0.
-By default, the score of an accepted test case is its maximum score, computed as described above.
-A custom output validator may produce a `score.txt` or `unbounded_score.txt` file for a test case:
-- for test cases in a group with bounded maximum score, `score.txt` must contain a single floating-point number in the range `[0,1]`.
-  The score of the test case is this number _multiplied_ by the test case maximum score.
-- for test cases in unbounded groups, `unbounded_score.txt` must contain a non-negative floating-point number.
+A custom output validator may produce a `score.txt` or `score_multiplier.txt` file for an accepted test case:
+
+- for test cases with bounded maximum score, `score_multiplier.txt`, if produced, must contain a single floating-point number in the range `[0,1]`.
+  The score of the test case is this number _multiplied_ by the test case maximum score. If no `score_multiplier.txt` is produced, the test case score is its maximum score.
+- for test cases with unbounded maximum score, `score.txt` must be produced and must contain a non-negative floating-point number.
   The score of the test case is that number.
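The per-test-case rules introduced by this patch can be sketched as below. This is a hedged illustration, not a reference implementation: `JudgeError`, `score_test_case`, and the parameter names are hypothetical, the two files are assumed to have been parsed into optional floats already, and `None` as a maximum score is an assumed encoding of `unbounded`.

```python
from typing import Optional


class JudgeError(Exception):
    """Hypothetical exception for conditions the spec calls a judge error."""


def score_test_case(
    accepted: bool,
    max_score: Optional[float],               # None stands for `unbounded`
    score_multiplier: Optional[float] = None,  # parsed score_multiplier.txt
    score: Optional[float] = None,             # parsed score.txt
) -> float:
    """Score one test case in a group with `sum` or `min` aggregation."""
    if not accepted:
        # A failed test case always scores 0.
        return 0.0
    if max_score is not None:
        # Bounded: the multiplier is optional and defaults to full marks.
        if score_multiplier is None:
            return max_score
        if not 0.0 <= score_multiplier <= 1.0:
            raise JudgeError("score_multiplier.txt must be in [0,1]")
        return score_multiplier * max_score
    # Unbounded: score.txt is mandatory and must be non-negative.
    if score is None:
        raise JudgeError("accepted unbounded test case without score.txt")
    if not score >= 0.0:
        raise JudgeError("score.txt must be non-negative")
    return score
```

With a bounded maximum of 10, `score_test_case(True, 10.0, score_multiplier=0.25)` gives `2.5`, and omitting the multiplier gives the full `10.0`.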
 It is a judge error if:
-- an output validator accepts a test case in an unbounded group and does not produce an `unbounded_score.txt`;
-- an output validator produces a `score.txt` for a test case in a group with `passs-fail` aggregation or with unbounded maximum score;
-- an output validator produces a `unbounded_score.txt` for a test case in a group with bounded maximum score;
-- the score of a test case exceeds its maximum score.
+- an output validator accepts a test case in an unbounded group and does not produce a `score.txt`;
+- an output validator produces a `score_multiplier.txt` for a test case with unbounded maximum score;
+- an output validator produces a `score.txt` for a test case with bounded maximum score;
+- an output validator produces a `score.txt` or `score_multiplier.txt` for a test case in a group with `pass-fail` aggregation;
+- an output valiadtor produces a `score.txt` or `score_multiplier.txt` with invalid contents.
 ### Scoring Test Groups

From 8d50d14d2bbd61511668186a35ff9e9264f3f941 Mon Sep 17 00:00:00 2001
From: evouga
Date: Fri, 20 Sep 2024 21:07:14 -0500
Subject: [PATCH 7/8] Typo

---
 spec/2023-07-draft.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index 667ec5b..2e7627f 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1123,7 +1123,7 @@ It is a judge error if `S > M` for a group with bounded maximum score and `sum`
 ### Scoring Test Cases
-Only test cases in test case group with `sum` or `min` aggregation receive a score.
+Only test cases in test case groups with `sum` or `min` aggregation receive a score.
 The score of a failed test case is always 0.

From 6b3e88a461e36ae72af5065b2fbbac22d25e7863 Mon Sep 17 00:00:00 2001
From: evouga
Date: Fri, 20 Sep 2024 21:21:53 -0500
Subject: [PATCH 8/8] Clean up description of which groups/test cases have maximum scores.
---
 spec/2023-07-draft.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/spec/2023-07-draft.md b/spec/2023-07-draft.md
index 2e7627f..97219f7 100644
--- a/spec/2023-07-draft.md
+++ b/spec/2023-07-draft.md
@@ -1103,13 +1103,12 @@ The default value of `aggregation` is `sum` for the `secret` group and `pass-fai
 #### Maximum Score Inference
-Groups and subgroups with `sum` or `min` aggregation and every test case in these groups have a maximum possible score.
+The `secret` group, and every subgroup and test case in a group with `sum` or `min` aggregation, have a maximum possible score.
 The `secret` group's score may be any positive integer or `unbounded`.
 Subgroups of `secret` may only have `unbounded` maximum score if `secret` is unbounded.
 The default value of `score` for the `secret` group is 100.
-The default `score` for other test data groups with `sum` or `min` aggregation is inferred from the `score` value of its parent and siblings,
-as is the maximum score of each test case in the group:
+The default `score` for subgroups and test cases of groups with `sum` or `min` aggregation is inferred from the `score` value of that group and its children:
 Group Maximum Score | Aggregation Type     | Default Maximum Score of Test Case / Subgroup
 ------------------- | -------------------- | -------------------------------------