Fairness, validity, and legal interpretation #36
adamavenir announced in Announcements
What constitutes “fairness” in Code4rena?
I’ve seen this question come up often enough recently that I’d like to lay out a few thoughts on the topic.
Fundamental principles
These are the fundamental principles that I propose should underlie how we look at the question of “fairness” in Code4rena.
Expectations of participants
Role of hired staff (Code4 Corporation)
The role of staff is regulatory, supportive, and administrative:
Ambiguity and evolution
Wherever ambiguity exists in the rules or in how those rules are applied (including the tools provided), judges have been and will be called upon to make judgment calls that some part of the community may disagree with.
Making these decisions is the most important and difficult part of their role.
There are generally two poles of legal interpretation: strict construction (“letter of the law”) and liberal construction (“spirit of the law”).
Both apply interpretation of “intent” and “meaning”. People have their preferences and are entitled to those, but both ends of the spectrum are logically valid.
Given the nascence of Code4rena as a completely new idea, judges have been encouraged to take a liberal construction approach, since so much of the ‘law’ of C4 is not documented and must be interpreted based on context and intent.
Early on, we decided that judges would be given absolute discretion to allocate awards based on their view of wardens’ comparative performance using the tools, documentation, and methodology provided.
At the same time, Code4 (staff) are constantly trying to make improvements to tools, documentation, and process which allow for the continued scaling of the community and platform.
As Code4rena scales and more wardens join who do not have the full historical context, it has become clear that relying on undocumented context and intent is insufficient for purposes of fairness.
It is increasingly important to codify consensus on how decisions are and will be made so that wardens have the ability to make their best decisions.
What constitutes ‘valid’ and ‘consistent’?
The validity of an audit report submission is not based on whether it is ‘true’ or not. A report may contain a finding which is factually ‘true’ (the most literal interpretation of ‘valid’), but if it does not add value, or if it is not presented in a way that adds value to a sponsor, it may be deemed invalid by a judge.
This may seem harsh and exclusive, but it is essential to consider that Code4rena runs audit contests, not gotcha-hunts, and Code4rena offers guaranteed payout for valid submissions. This means that wardens are providing a service to sponsors and the product of those services should meet what judges feel is a minimum standard in order to be deemed of value.
Auditing is serious, disciplined work that should provide high-value consultative expertise to the people paying for the work.
In that light, judges are right to have high standards. Some judges have always had higher standards than others, and some judges have applied higher standards in later contests than they did in earlier ones.
While this may be seen as ‘inconsistent’, it is also true that standards within a specific contest will always be informed by the overall quality of a contest’s submissions, and that the standard in a judge’s mind is always going to be evolving based on the aggregate quality of submissions that judge has been exposed to and the decisions other judges have made.
The correct assessment when this happens is not that a judge is being inconsistent. It is that they have observed that the quality of competition has increased, and that observation shapes their view of the whole set of submissions. They are consistent in valuing submissions in the context of each other, which is a central way that performance in a competition is measured.
Continued evolution of rules
Rubric
Because wardens should be able to have clear rule expectations of contests they contribute to, and because newer wardens do not have historical context on the intent of various rules, it is important that we continue to document a rubric of what constitutes the subjective threshold of validity.
An initial rubric has been outlined here and a finalized version of this rubric will soon be added to formal documentation and judging procedure.
Note well:
Limiting the scope of judges’ role
Since day 1 of C4, the website and docs have described the judge's role as:
Given this description, combined with judges’ say being final, it is not unreasonable to read the judge's role as carrying the burden of making a fair judgment on a contest's award allocation. That has at times caused judges to question the tooling and process, and whether they are correctly interpreting the spirit of the law.
This language puts too much burden on judges and it's going to be eliminated in favor of the more explicit:
This ensures that award amounts themselves cannot be brought into the discussion, and that judges do not feel compelled by circumstances to have an opinion on whether the award allocation itself is fair; they are simply making a determination and rating wardens’ submissions based on severity, validity, and quality.
We will henceforth make it clear that when judges encounter a nuanced case where the letter of the law is being followed by a warden’s submission but seems to be gamed against the spirit of the law, judges have a right to indicate that in their assessment of a warden’s submission, but will be bound by the letter of the law.
At the same time, wardens must note that codifying the above-mentioned rubric as a part of our rules and documentation will significantly expand judges’ ability to assess findings as invalid.
We will also ensure that any changes to rules and documentation made by Code4 take effect only for contests that have not yet started. However, note well that neither announcements clarifying how rules are interpreted nor the application of those rules will ever be considered 'changes' to rules. If a rule or guideline already exists but has been only weakly suggested and not strictly enforced, it may be enforced more strictly in any contest with zero notice, since the rule was already fully communicated.
As a result, wardens should be able to increasingly rely on judges taking a “strict construction” (“letter of the law”) method of interpretation of the rules.
Further clarification notes
We will be working to continue to document how wardens can expect judges will interpret the meanings of “severity”, “validity”, and “quality” and we encourage wardens to raise issues where they feel the rubric on these is unclear.
The QA award curve’s asymptotic floor could seemingly be interpreted to pay regardless of content (and in fact, it seems the coded implementation of the curve did just that), but the intent was not for invalid, low-quality, and/or solely non-critical submissions to receive guaranteed pay. Judges may have applied this inconsistently, either through a lack of consistency in their process or a lack of understanding of the tool, but going forward these types of submissions will not be awarded.
If you benefited from this oversight in the past, count yourself lucky! But when future outcomes deviate from past precedent on this, understand that awarding invalids at the asymptotic floor was not the intent of the curve.
Finally, wardens should also take note of recent announcements regarding:
(These two items are in the #wardens channel.)