Prerelease Automated Validation Pipeline for Jargon Artifacts #248
Comments
Nice write-up, Ash. After a deeper look below (though I'll preface with: I'm new to a lot of this, so still learning context), my summary is: yes, having such a test to ensure we don't publish invalid LDs or schemas is of course needed, and the outline you've done is mostly fine, but it won't catch valid yet meaningless or confusing output that may still cause problems. My main comment is that I do not think we should be publishing artifacts directly from Jargon automatically, even if your new test passes. The pipeline for publishing these generated artifacts should always require a human to review the changes between the previous version and the to-be-published version - not only when the test fails - and should store the approved version in git. This is for two reasons:
That's what I've come away with from trying to understand the problem more deeply. I don't know the Jargon tool well, so please correct any mistakes below :) Now for the long version...

The problem - UX != code

For me, before defining a solution, I really want to properly understand the problem. You mention that
but what exactly is the issue? @PatStLouis mentions the fact that all the core properties are repeated on the DCC extended entity, but I'm not sure that (alone) is itself causing the doc to be invalid (identical redefinitions are not invalid), or that it's the only source of confusion (not only are the properties repeated, but many whole untp-core entities are redefined identically in the DCC context as well).

Possible non-issue: identical redefinitions of properties

Using the example above, if we look at the following definition:

"ConformityAttestation": {
"@protected": true,
"@id": "untp-dcc:ConformityAttestation",
"@context": {
"assessorLevel": "untp-core:assessorLevel",
"assessmentLevel": "untp-core:assessmentLevel",
"attestationType": "untp-core:attestationType",
"issuedToParty": {
"@id": "untp-dcc:issuedToParty",
"@type": "@id"
},
"authorisation": "untp-core:authorisation",
"conformityCertificate": "untp-core:conformityCertificate",
"auditableEvidence": "untp-core:auditableEvidence",
"scope": {
"@id": "untp-dcc:scope",
"@type": "@id"
},
"assessment": {
"@id": "untp-dcc:assessment",
"@type": "@id"
}
}
},

we see that it is mostly redefining all the untp-core properties identically.
So although it may not be best practice, that part of the output may not be an issue even if it is not ideal (though testing with the tools you mentioned will verify, I guess).

Confusing non-issue: identical redefinitions of whole entities

Another oddity (to my unfamiliar context) is that many of the actual core entities are redefined (identically) in the DCC LD (and probably all the others too, haven't checked). If you search through the 0.5.0 DCC LD you'll find the core entities defined again. Even though this may not be an issue (see identical redefinition above), I think it is indirectly causing the actual issues, such as...

Issue: non-identical redefinition

What is an issue is that one of these terms is redefined non-identically. Do we know why it is redefined differently here? Looking at the data model in Jargon makes me wonder whether it's either...
Similarly, I'm not sure why... What is, I suspect, an issue (though we may not have seen it yet) is that... Anyway, once the issue is understood more deeply (not sure that I have yet), I think it's easier to focus just on what needs to change to be valid. If it were me, I'd play with the tools you've suggested and test out the above to see exactly what is valid and invalid. But importantly, even if an automated test finds the Jargon-generated artifacts valid, the workflow should produce a diff of changes for a human to verify they are the expected ones, rather than publishing without a human checking that the output doesn't include unnecessary redefinitions or similar output that may still be valid but harder to use.
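As a quick illustration of the identical vs. non-identical distinction above (a toy sketch only; the term name and IRIs are made up, not the real UNTP vocabulary), jsonld.js already separates the two cases during expansion: an identical redefinition of a protected term is accepted, while a conflicting one is rejected with the spec's "protected term redefinition" error.

```typescript
// Toy illustration with jsonld.js: "assessorLevel" and the example.org IRIs are
// placeholders, not the actual UNTP terms.
import * as jsonld from "jsonld";

const core = {
  "@version": 1.1,
  "@protected": true,
  "assessorLevel": "https://example.org/core#assessorLevel"
};

// Identical redefinition of a protected term: allowed, expansion succeeds.
const identical = {
  "@context": [core, { "assessorLevel": "https://example.org/core#assessorLevel" }],
  "assessorLevel": "GovtApproval"
};

// Non-identical redefinition: expansion rejects it with a
// "protected term redefinition" error.
const conflicting = {
  "@context": [core, { "assessorLevel": "https://example.org/dcc#assessorLevel" }],
  "assessorLevel": "GovtApproval"
};

async function demo(): Promise<void> {
  console.log(await jsonld.expand(identical)); // works
  try {
    await jsonld.expand(conflicting);
  } catch (err) {
    console.error("rejected:", (err as Error).message);
  }
}

demo();
```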
Regarding the invalid redefinition: as a suggestion (only), I've created a Jargon suggestion which updates the properties in the DCC that refer to the sub-classed entities: https://jargon.sh/user/unece/ConformityCredential/suggestion/1/?view=changes
Currently, we're encountering issues where invalid artifacts (JSON-LD context files) are released from Jargon without proper validation. These issues include conflicting protected terms that are only discovered after release. This leads to implementation problems downstream and requires additional releases to fix them. To prevent these problems and improve the quality of our releases, we need a systematic approach to catching these validation errors before artifacts are released.
An example of this can be found here, where we are redefining protected terms within the Digital Conformity Credential.
Thank you @PatStLouis for picking up on this.
After a few conversations, I've been investigating ways to validate the artifacts produced by Jargon (JSON-LD context files, schemas, and sample credentials) and to integrate this validation into our prerelease workflow. This proposal aims to prevent the release of invalid artifacts by running these checks before anything is published.
Based on my initial analysis, we can catch most of the errors by implementing a GitHub Actions pipeline that executes a series of validation test cases using libraries like Ajv and jsonld.js. This pipeline would be triggered via a GitHub webhook when a snapshot is created in Jargon, running the validation checks against the generated artifacts.
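As a rough sketch of the schema-validation step such a job could run (file names are placeholders for wherever the webhook-triggered job drops the snapshot artifacts, and using Ajv2020 assumes the generated schemas target JSON Schema draft 2020-12; the default Ajv export would be used for draft-07 schemas):

```typescript
// Sketch of the schema check only; file paths are placeholders for wherever the
// webhook-triggered job drops the Jargon snapshot artifacts.
import { readFileSync } from "node:fs";
import Ajv2020 from "ajv/dist/2020";
import addFormats from "ajv-formats";

const schema = JSON.parse(readFileSync("artifacts/untp-dcc-schema.json", "utf8"));
const sample = JSON.parse(readFileSync("artifacts/untp-dcc-instance.json", "utf8"));

const ajv = new Ajv2020({ allErrors: true, strict: false });
addFormats(ajv);

const validate = ajv.compile(schema);
if (!validate(sample)) {
  // Fail the pipeline with a readable report of every schema violation.
  console.error(JSON.stringify(validate.errors, null, 2));
  process.exit(1);
}
console.log("sample credential conforms to the generated schema");
```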
Proposed Artifact Testing Pipeline:
What would be tested:
Context
Schema
Sample Credential
To catch the current errors, particularly redefined protected terms, testing the context file alone is insufficient. These errors often emerge when multiple types are assigned and those types have conflicting terms (see @PatStLouis's example above). Therefore, we need to test the sample credentials produced by Jargon alongside the context files it generates.
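As a sketch of what that combined check might look like (the context URL and file paths below are placeholders for the snapshot's actual layout), the sample credential can be expanded with a documentLoader that serves the not-yet-published context from the snapshot, so any protected-term conflict between its contexts surfaces before release:

```typescript
// Sketch of the context + sample-credential check; the context URL and file
// paths are placeholders for the snapshot's actual layout.
import { readFileSync } from "node:fs";
import * as jsonld from "jsonld";

const sample = JSON.parse(readFileSync("artifacts/untp-dcc-instance.json", "utf8"));
const draftContext = JSON.parse(readFileSync("artifacts/untp-dcc-context.jsonld", "utf8"));

// Serve the not-yet-published context from the snapshot; anything else the
// credential references (e.g. the W3C VC context) is fetched normally.
const nodeLoader = jsonld.documentLoaders.node();
const documentLoader = async (url: string) => {
  if (url === "https://test.uncefact.org/vocabulary/untp/dcc/") {
    return { contextUrl: null, document: draftContext, documentUrl: url };
  }
  return nodeLoader(url);
};

jsonld.expand(sample, { documentLoader }).then(
  () => console.log("contexts expand cleanly"),
  (err) => {
    // A "protected term redefinition" error here is exactly the class of
    // problem this issue describes.
    console.error(err);
    process.exit(1);
  }
);
```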
Additionally, we need to ensure that implementers can extend a given UNTP data model while maintaining conformance to the core specification. Based on initial analysis, automating this validation is challenging without prior knowledge of the credentials and updated test cases. For now, this validation will likely need to be a manual pre-release task. However, I welcome suggestions for potential automation approaches.
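One possible avenue (only a sketch, and the file path and IRI below are illustrative rather than the real UNTP vocabulary) would be to expand an implementer's extended sample credential and assert that the core properties still resolve to their untp-core IRIs, i.e. that the extension has not remapped anything the core specification defines:

```typescript
// Possible partial automation of the extension-conformance check: expand the
// extended sample credential and verify the core term IRIs survive expansion.
// The file path and IRI list are illustrative, not the real UNTP vocabulary.
import { readFileSync } from "node:fs";
import * as jsonld from "jsonld";

const extendedSample = JSON.parse(
  readFileSync("artifacts/extension-instance.json", "utf8")
);

// A (made-up) subset of core term IRIs that any conformant extension must preserve.
const requiredCoreIris = [
  "https://test.uncefact.org/vocabulary/untp/core/assessorLevel",
];

async function checkCoreTermsPreserved(): Promise<void> {
  const expanded = await jsonld.expand(extendedSample);
  const serialised = JSON.stringify(expanded);
  for (const iri of requiredCoreIris) {
    if (!serialised.includes(iri)) {
      throw new Error(`core term missing or remapped after expansion: ${iri}`);
    }
  }
  console.log("extension preserves the core term IRIs");
}

checkCoreTermsPreserved().catch((err) => {
  console.error(err);
  process.exit(1);
});
```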
Moving forward, we also plan to incorporate many of these test cases into the UNTP Playground, where implementers can validate their credentials upon upload.