-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce new query language (Approach 1) #220
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not clear to me that this is an improvement over presentation exchange.
are presentation submissions from PE still being used? just a different presentation definition format defined here?
I would recommend not merging this as is. My recommendation is to create a new dependent spec that fully defines a query language.
] | ||
}, | ||
"credentials": { | ||
"pid": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would someone specify claims from different formats?
if that is allowed, we should have an example.
if not allowed, then it seems the format
property can be extracted to a higher level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, an example should be added. The other formats would just be separate credential request objects.
@@ -542,6 +545,266 @@ The Wallet then validates the request as specified in OAuth 2.0 [@RFC6749]. | |||
|
|||
If the Verifier responds with any HTTP error response, the Wallet MUST terminate the process. | |||
|
|||
# Verifiable Presentation Query Language {#vp_query} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see this as a separate specification. It can be fully tested outside of OID4VP flows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think the intention of this new query language has been to be protocol independent, so not sure what would be the benefit of "fully testing outside of OID4VP flows" unless I am misinterpreting it..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought is that a software implementation of a query language is distinct from an implementation for OID4VP. It could have its own conformance suite and open source tooling, and potentially, utility outside of this specification.
If those broader ambitions are shared amongst the group then breaking it out could be the right path.
optionality rule. The entries are listed in order of preference, with the | ||
first entry being the most preferred. | ||
|
||
More than two nested levels of optionality rules MUST NOT be used by a Verifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if credentials have >2 layers of nesting what's the solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for any depth of nesting; the optionality rules are just capped to enable easier processing, but this doesn't reduce expressiveness of the language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "two nested levels of optionality rules" mean that there is a total of 2 or 3 instances of "required" in a single "claim" element?
|
||
### JSON-based Claims Structures {#json_based_claims} | ||
|
||
For credentials containing JSON-based claims structures, the following |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should clearly define what credential formats use this section rather than saying credentials using JSON-based claims structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would prefer using path for mdocs too to solve this weird deviation between the formats for this top-level parameter layer
A purpose description array is an object that describes the purpose for which | ||
the Verifier is requesting the Verifiable Credential. It can contain multiple | ||
objects, each describing the purpose in a specific language. Each object | ||
contains the following properties: | ||
|
||
`lang:` | ||
: REQUIRED. A string that specifies the language of the purpose | ||
description. The value of this property MUST be a valid language tag as defined | ||
in [@!BCP47]. | ||
|
||
`text`: | ||
: REQUIRED. A string that contains the purpose description in the | ||
language specified by the `lang` property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, free text is not a good idea for a lot of reasons, e.g., disambiguation between different translations of the the same reason and actually providing different reasons -> prone to errors; privacy; i18n/l18n which is hard to scale with this approach. A more appropriate approach is to define reason codes. There was a lot of work in other standards orgs on that, e.g., Kantara.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with the comment on not having free text because of the UI/UX and privacy concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please propose a change introducing reason codes so that we can discuss on the calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to provide reason codes that can help users guide in how to respond in a meaningful way, we need to find discuss the proper places where to define such reason codes. These reason codes are likely to be use case and/or document type specific. Therefore one approach could be to define reason codes in such a way that they can be scoped/namespaced according to the entity defining them.
Separate to that we should also decide whether we can define or reference a set of such codes that could apply more generically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit that I lack the imagination how a set of codes could ever be complete. "Complete" here meaning in particular that I can use them to refer to all possible reasons for obtaining data through a wallet. Just as a few examples:
- "Required for account opening according to [law XYZ]" — we cannot enumerate all possible laws; also many laws and their applicability will be dependent on the region.
- "Age verification according to [regulatory body XYZ]"
- "Required only for obtaining [product XYZ]"
- "Alternative when not in possession of [credential XYZ]"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that it's difficult / unlikely to have reasons codes that are both very specific and can address every possible situation. However, allowing free text is not a solution to achieve that as it introduces privacy and security problems that now need to be addressed. it's a lot more difficult for the wallet to meaningfully use a free form text field to help guide the user compared to pre-defined reason codes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, providing a meaningful purpose to a user requires customization of the respective text to the particular RP and process, even the specific transaction in some cases. Yes, free text introduces security risks, but those are not unsolvable, developers cope with them every day when building applications with dynamic input. The multi language aspect is a true challenge, it would most likely require the RP sending the purpose string in multiple languages.
Pre-defined reasons code on the other hand posse their own challenges:
- There needs to be registries for those codes. Which parties operate these registries?
- What does a RP need to do in order to be able to send the first presentation request if this RP wants to implement a use case no one has implemented/standardized before? I think requiring pre-defined reason codes would create a huge administrative burden to RPs and delay deployment of new integrations in an inacceptable manner.
- How are those code distributed along with the respective text? What is the update policy?
- In the simplest case, there is no link possible to an actual transaction performed by a RP.
: OPTIONAL. A purpose description array as defined in | ||
(#purpose_description). | ||
|
||
`claims`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether it is a good idea to have claims
being a polymorphic object with a format-dependent object type, or if it would be better to have everything nested inside a format identifier object, e.g., sd-jwt-vc: { claims ... }
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not put it into the format-specific part, as claims
applies to either all JSON-based structures (which can be multiple formats) or all ISO-based structures (mdoc). We could rename to claims
and claims_iso
maybe.
Co-authored-by: Ted Thibodeau Jr <[email protected]> Co-authored-by: Oliver Terbu <[email protected]> Co-authored-by: Christian Bormann <[email protected]>
This reverts commit 7d71960.
@@ -0,0 +1,43 @@ | |||
{ | |||
"rules": { | |||
"required": 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed on editors' call: add mapping to "and" and "or"
|
||
## Credential Rules {#credential_rules} | ||
|
||
Just as for claims, the Verifier can define rules that specify which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using the PE submission_requirements feature. I see the intent here is to replace submission_requirements
with rules
. The features that we in HR use cases that are missing from rules are
- ability to require
all
credentials of a particular credential type (eg. all transcripts) vs one of a particular type (eg mDL). From my understanding both would expressed as{ "required": 1, "from": [{"ref": "transcript"}]}
and{ "required": 1, "from": [{"ref": "mDL"}]}
- in HR use cases the user has more flexibility in what to share. therefore the ability to also require at least one is useful
{"min": 1, "from": [{"ref": "email"}]}
. - explaining the purpose of a
rule
(e.g. the RP requires a mDL or national id card to get the applicants name) - should there be a maximum depth to rules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ability to require all credentials of a particular credential type (eg. all transcripts) vs one of a particular type (eg mDL). From my understanding both would expressed as { "required": 1, "from": [{"ref": "transcript"}]} and { "required": 1, "from": [{"ref": "mDL"}]}
in HR use cases the user has more flexibility in what to share. therefore the ability to also require at least one is useful {"min": 1, "from": [{"ref": "email"}]}.
As per a simiilar suggestion above required
should be all
, optional
, one
, at_least_one
with the default either being all
or optional
. An alternative if a number is required for certain use cases would be to replace one
with count
which would then have an additional property with the expected count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The at_least_one
case is not currently captured, but we should consider introducing it and therefore moving back to keywords instead of numbers. Do you have a concrete use case in mind for at_least_one
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would a purpose for the rule be exposed to the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielfett the current use case we have is in the application for a job. In the recruitment use case the holder can share whatever information they feel is relevant. For example, a person may have had studied and achieved a number of diplomas, but one or more maybe irrelevant and/or perhaps detrimental to them being successful in the application. The recruiter therefore will request at least one degree but leave it up to the holder to select the relevant one or two.
Again the recruitment use case means that certain types of credentials are needed for a purpose. We have a UI in the reference wallet in the HR use case that takes the user through a wizard-like process to meet each rule requirement. A purpose for that rule is required for this UX to make sense
A valid VP Query is defined as a JSON-encoded object with the following | ||
properties: | ||
|
||
`credentials`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is mentioned that the key of the credential object is a string. Since these strings are not meant for human consumption, it's cleaner and easier to use integers for these since no unintended meaning can de derived from these strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it is pretty common to use strings even when they are not meant for human consumption. it has also been a convention throughout openid4vc specs: credential_configuration_id and credential_identifier in VCI just being some of the examples. I do not see why deviate from that here and would prefer to keep it a string.
Appendix A of [@!OpenID.VCI]. The value of this property MUST be one of the | ||
supported Credential formats as defined in the Wallet's metadata. | ||
|
||
Format-specific properties (using the format value as a key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This states that the format-specific properties use the format value as key, but in the #format_specific_properties section, keys are defined that don't have the format value as a key.
- Claim query: An object describing a single claim that is to be returned, as | ||
specified in (#claim_query). | ||
- An optionality rule: An object containing the following properties: | ||
- `required` (REQUIRED): defines a number of claims that must be returned from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there are any use cases where required is something else than either "1" or "all". Allowing other options adds complexity because it means more testing and more edge cases.
Propose to change the structure to only allow 1 or all.
- An optionality rule: An object containing the following properties: | ||
- `required` (REQUIRED): defines a number of claims that must be returned from | ||
the `from` set of claims, as specified next. | ||
- `from` (REQUIRED): An array; each entry is either a claim query or another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The content of the 'from' element is polymorphic, because it can be either a claim query or another optionality rule. That in itself can make implementations a bit more error-prone, since they need to check the contents of this field and check it against the possible fields to determine how it needs to process it. But in this case the claim query itself is also format specific, leading to relatively complex code.
One potential solution is to make the content of the structure explicit instead of polymorphic e.g. by having separate elements for claim query or optionality rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I fully understand the proposal - could you please give an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hereby some examples on what it could look like. Option 1 is the current proposal, option 2 and 3 are alternatives.
; option (1) What we have in the current proposal
claims = [+claim] ; you can ask for multiple "claim", each "claim" MUST be satisfied
claim = claim_query / optionality_rule ; claim_query OR optionality_rule
optionality_rule = {
"required": uint, ; number of claims to be returned from the 'from' set
"from": [+claim] ; recursively contains the claim structure
}
claim_query = claim_query_json / claim_query_mdoc ; claim_query_json OR claim_query_mdoc
claim_query_json = {
"path": tstr
}
claim_query_mdoc = {
"namespace": tstr,
"claim_name": tstr
}
; option (2) For each claim, define the main element, then define alternates for that claim
claims = [+claim] ; you can ask for multiple "claim", each "claim" MUST be satisfied
claim = {
"claim_query" = claim_query, ; claim_query as defined in (1)
"alternate_claims" = [+ claim_query] ; Alternate claims, in order of preference
}
; option (3) For each claim, define an array
claims = [+claim] ; you can ask for multiple "claim", each "claim" MUST be satisfied
claim = [+claim_query] ; to satisfy the "claim" array, one element from the array (e.g. one claim_query) must be returned
Note, what is the best solution also depends on the outcome of the other discussions. Specifically the ones regarding intent_to_retain, if_present and whether to have the claim_query be polymorphic or to have different versions of the "claims" element, e.g. claims_json and claims_mdoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed yesterday, hereby some examples for these structures. Note that other solutions besides option 2 or option 3 may exist that are more appropriate, especially when we consider the other comments on this PR.
Example 3: given_name, age_over_18 OR birth_date OR age_birth_year
option(1)
claims = [
{ "path": ["given_name"] },
{
"required": 1,
"from": [
{ "path": ["age_over_18"] },
{ "path": ["birth_date"] },
{ "path": ["age_birth_year"] }
]
}
]
option(2)
claims = [
{
"claim_query": { "path": ["given_name"] }
},
{
"claim_query": { "path": ["age_over_18"] },
"alternate_claims": [
{ "path": ["birth_date"] },
{ "path": ["age_birth_year"] }
]
},
]
option(3)
claims = [
[ { "path": ["given_name"] } ]
[
{ "path": ["age_over_18"] },
{ "path": ["birth_date"] },
{ "path": ["age_birth_year"] }
]
]
Example 4: given_name, zip_code OR (city AND state)
option(1)
claims = [
{ "path": ["given_name"] },
{
"required": 2,
"from": [
{ "path": ["zip_code"] },
{
"required": 1,
"from": [
{ "path": ["city"] },
{ "path": ["state"] },
]
}
]
}
]
option(2)
claims = [
{
"claim_query": { "path": ["given_name"] }
},
{
"claim_query": { "path": ["zip_code"] },
"alternate_claims": [
{ "path": ["city"] },
]
},
{
"claim_query": { "path": ["zip_code"] },
"alternate_claims": [
{ "path": ["state"] },
]
}
]
option(3)
claims = [
[ { "path": ["given_name"] } ]
[
{ "path": ["zip_code"] },
{ "path": ["city"] },
],
[
{ "path": ["zip_code"] },
{ "path": ["state"] },
]
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does ,
in the above mean AND/required? I think the description text should be more specific and informative.
Also, does the Example 4
mean this given_name AND (zip_code OR (city AND state))
seems like in option 3 zip_code OR city
and zip_code OR state
could result in zip_code and zip_code. Resulting in not returning city or state.
Can we redefine option 3 by swapping the meaning of things in the 1st and 2nd level [ ]
?
E.g. given_name AND (zip_code OR (city AND state))
redefining option 3
claims = [
[
{ "path": ["given_name"] },
{ "path": ["zip_code"] },
]
[
{ "path": ["given_name"] },
{ "path": ["state"] },
{ "path": ["city"] },
],
]
This would make it simpler as entries in the root level [ ]
would mean OR and the 2nd level will mean MUST.
In the above it translates to (given_name and zipcode) OR (given_name and zip_code and city)
Similarly for Example 3: given_name and (age_over_18 OR birth_date OR age_birth_year)
redefining option 3
claims = [
[
{ "path": ["given_name"] },
{ "path": ["age_over_18"] },
],
[
{ "path": ["given_name"] },
{ "path": ["birth_date"] },
],
[
{ "path": ["given_name"] },
{ "path": ["age_birth_year"] },
],
This also corresponds to all the potential credentials User can consent to send. So it won't result in cases where zip_code and zip_code
are sent.
I recognize that this could cause the query to be longer than other methods in cases where there are a large number of OR's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more clear to use a notation (req) instead of and
as in boolean "zip_code" and "zip_code" translates to just 1 "zip_code"
https://grace.bluegrass.kctcs.edu/~kdunn0001/files/Simplification/4_Simplification_print.html
Where as in (req)zip_code (req)zip_code it is sent twice?
optionality rule. The entries are listed in order of preference, with the | ||
first entry being the most preferred. | ||
|
||
More than two nested levels of optionality rules MUST NOT be used by a Verifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are limiting these two two levels, it can be easier to not have the structure reference itself, but define the different levels explicitly. While this makes the definition slightly bigger, it results in a lot less edge cases that we need to consider.
|
||
### Format `mso_mdoc` {#format_mso_mdoc} | ||
|
||
TBD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things that need to be added.
- IssuerIdentifiers (using the AKI of one of the certificates in the MSO header)
- Conditional elements
- Intent to retain
Which of these elements should go in the format specific section and which of these in the general section? Once agreed on that, I can provide exact definitions of these elements.
A purpose description array is an object that describes the purpose for which | ||
the Verifier is requesting the Verifiable Credential. It can contain multiple | ||
objects, each describing the purpose in a specific language. Each object | ||
contains the following properties: | ||
|
||
`lang:` | ||
: REQUIRED. A string that specifies the language of the purpose | ||
description. The value of this property MUST be a valid language tag as defined | ||
in [@!BCP47]. | ||
|
||
`text`: | ||
: REQUIRED. A string that contains the purpose description in the | ||
language specified by the `lang` property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with the comment on not having free text because of the UI/UX and privacy concerns.
in the top-level property `rules`. The values is a credential rule object, | ||
defined as having the following properties: | ||
|
||
- `required`: An integer specifying the number of credentials that MUST be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments made on the rules for data elements also apply to the rules for documents. Likely the same solution can apply.
- a credential query reference, i.e., an object with a single key `ref` that | ||
references a credential query defined in the `credentials` object. | ||
|
||
Within `from`, the credentials are listed in order of preference, with the first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While for data elements having the elements listed in order of preference is a the most common scenario, for documents this is not the case in most scenarios. While there are situations where a document will be requested in order of preference, there are more situations where there is no such order of preference. Forcing this to be the case is not ideal.
I would propose to remove the statement about this implying an order of preference. While we could add structures that allow to indicate whether there is an order or not, the extra complexity that comes from that is likely not worth the benefits in use cases.
credentials MAY be returned by the Wallet, as defined in (#credential_rules). If | ||
omitted, all credentials listed in `credentials` are requested for presentation. | ||
|
||
## Credential Query {#credential_query} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The credential query structure is defined as a polymorphic structure that contains both elements that always have the same content (purpose), elements that are partly format specific (claims) and elements that are fully format specific (everything else).
Since there is only a single element that is not currently format specific (purpose), we should consider making the request fully profile specific to simplify implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming by profile specific, you meant format specific? I do not understand how making everything format specific would simplify things... I think the feedback we have been getting is actually in the opposite direction of removing "partly format specific" part and use path
for mdocs too, instead of namespaces
and data_elements
claims. the rest of the structure of defining the framework in a format agnostic way, while allowing format specific content for certain claims (like for the algorithms supported) makes sense to me. path vs namespace/data_elements feels pretty unnatural.
Co-authored-by: Michael B. Jones <[email protected]>
|
||
`intend_to_retain`: TBD | ||
|
||
`if_present`: TBD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the claim that is intended to solve the following requirement?
- When Verifier wants to express that it wants to receive a credential only when a specific requirement is met (e.g. read_id claim is present)
|
||
TBD | ||
|
||
### Format `jwt_vp*` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
### Format `jwt_vp*` | |
### Format `jwt_vp` |
`lang:` | ||
: REQUIRED. A string that specifies the language of the purpose | ||
description. The value of this property MUST be a valid language tag as defined | ||
in [@!BCP47]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest using the identifier locale
as is done in OpenID Connect. The description there is
End-User's locale, represented as a BCP47 [RFC5646] language tag. This is typically an ISO 639 Alpha-2 [ISO639] language code in lowercase and an ISO 3166-1 Alpha-2 [ISO3166‑1] country code in uppercase, separated by a dash. For example, en-US or fr-CA. As a compatibility note, some implementations have used an underscore as the separator rather than a dash, for example, en_US
Please review this new approach: #260 |
I'll close this PR as it seems that there is consensus that #266 is the better approach. |
Fixes #178
Todo: