-
Notifications
You must be signed in to change notification settings - Fork 40
Add type instructions and encodings as a separate schema #686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ific encoding instead of pre-defining an encoding in the yaml file. Signed-off-by: Afonso Oliveira <[email protected]>
…fic encoding instead of pre-defining an encoding in the instruction yaml file. Signed-off-by: Afonso Oliveira <[email protected]>
Signed-off-by: Afonso Oliveira <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need sub_type?
I though you commented against sub_type in #655 (comment). However, now I see @ThinkOpenly proposed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General question: is the purpose of this PR to define a schema just for subtypes, or for both instruction types and subtypes?
I think we need to distinctly define the top-level types that are well-defined in the spec, and additionally define subtypes (that aren't well-defined in the spec).
schemas/inst_schema.json
Outdated
"$ref": { | ||
"type": "string", | ||
"format": "uri-reference", | ||
"pattern": "^inst_type/[A-Z]/[a-zA-Z0-9_]+\\.yaml#.*$", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "pattern" won't accommodate "R4-type" as a top-level format, and we could consider a more restrictive naming scheme for the "$ref", such as "at least starts with a capital letter":
"pattern": "^inst_type/[A-Z]+[a-zA-Z0-9_]*]/[A-Z]+[a-zA-Z0-9_]*\\.yaml#.*$",
(But, we're inventing that convention.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the last thing to do, do we really want it to start with inst_type? where will this files stay within UDB?
schemas/inst_schema.json
Outdated
@@ -190,6 +203,25 @@ | |||
"$ref": "schema_defs.json#/$defs/requires_entry", | |||
"description": "Extension(s) that defines the instruction" | |||
}, | |||
"type": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing my type/subtype thought, this field would be for the top-level "type", and we'd add an additional field, "subtype", which defines the exact type of the instruction encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then type wouldn't be any reference, but just a string? (i.e. "I")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be a reference to generic definitions of the top-level instruction formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to explicitly mention "type", given that subtype presumably points to its parent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have resolved this too early. It would be convenient to have the top-level type here. One-stop shopping for anyone looking at the instruction, rather than having to look at every subtype to find the type and make sure they all agree. We could add validation that everything matches, too.
There currently isn't a pointer from subtype to parent. That's probably a good add, and offers the ability to do additional validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that having a parent in the sub-type if we did not describe the main type in the instructions, otherwise how can it not be redundant information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that having a parent in the sub-type if we did not describe the main type in the instructions, otherwise how can it not be redundant information?
I'm not following. I propose having some indication of the top-level type in the instruction and in each subtype. (They do not necessarily need to be the same thing, although they could be.) In this way, both have "full information" at hand to facilitate validation, rather than needing an additional level of indirection in order to determine the top-level format.
Is that redundant? Yes, but it allows each to be standalone (for some definition of that term).
If you remove the top-level type from the instruction, you'd need to look at every subtype (usually one, but sometimes 2 [RV32, RV64]) to find the top-level type, and then make sure they all agree.
If you remove the top-level type from the subtype, you'd need to go find all of the instructions that use the subtype to find the top-level type, and then make sure they all agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But isn't what you are proposing precisely the opposite of what was decided on the definedBy
field? In which we removed the B extension that implied Zbx
extensions? I notice that the relationship between extensions and types is not the same, but they close enough to me in this terms. I don't have any strong opinion on this, it just seems counter-intuitive and against the previous line of thought
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see definedBy
differently. That attribute a fairly static description of precisely which extension(s) actually define the instruction, rather than just include it -- which would be somewhat dynamic. It would be an ongoing maintenance effort (nightmare?) as new umbrella extensions are added to go through the entire extensions hierarchy and add that new extension to the definedBy
for newly included instructions. I was trying to avoid that error-prone update requirement.
For types/subtypes, that is absolutely static. So, the convenience/cost ratio of any redundancy is pretty high. I described the pros/cons above.
Is your preference to only define the top-level type in subtypes, and not in the instruction itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with that distinction between definedBy
and type
— the comparison makes total sense. I also don’t think the choice is hard to reverse, so let’s go with your approach for now, and if it makes more sense to switch later, that’s totally fine.
….json since now it is common to multiple files Signed-off-by: Afonso Oliveira <[email protected]>
…l parameter of inst_data Signed-off-by: Afonso Oliveira <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we move this along. Seems like an awesome start.
After "field" moves from "inst_schema" to "schema_defs", of course. |
I know I am delaying @Unique-Usman work, I'm just super busy, I'll try to either fix this today or monday |
@AFOliveira, if you OK with I can finish this one up. |
Be my guest :) Thanks! |
I still need to update the Ruby classes, but I pushed the schema updates for review. |
arch/inst/Zba/add.uw.yaml
Outdated
format: | ||
type: { "$ref": "inst_type/R.yaml#" } | ||
subtype: { "$ref": "inst_subtype/R/R-x.yaml#" } | ||
encoding: 0000100----------000-----0111011 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep this? This might be a longer discussion.
Is there a way to parameterize the type and/or subtype such that the "opcodes" that are known to the instruction can be associated with the opcode fields in the type/subtype, rather than by extraction from this "encoding" bit string? Put differently, can we put here "this is an R-type, R-x-subtype instruction where funct7=0b0000100, funct3=0b000, opcode=0b0111011"?
And eventually, perhaps: "funct7=ADD.UW, funct3=ADD.UW, opcode=OP-32", or something like that, to match the spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Now that we are enumerating all the opcodes, we can assign them individually. One question is whether we want to go the $inherits
route or keep it as a $ref
for the subtype. $inherits
is probably a better human description, and lets us use JSON Schema validation on the resolved file. The downside is that we are only using $inherits
right now for profiles, so this would make that custom-to-UDB operator pervasive, which some might not like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here's what it could look like with $inherits
:
format:
$inherits: "inst_subtype/R/R-x.yaml#"
opcodes:
funct7:
display_name: ADD.UW
value: 0b0000100
funct3:
display_name: ADD.UW
value: 0b000
opcode:
display_name: OP-32
value: 0b0111011
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Significantly better.
However, for at least the "opcode" values, these are shared among potentially many instructions. Should we / can we create a database for them? I guess that would be realized with another $ref
?
opcode: { "$ref": "opcodes/OP-32.yaml#" }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support multiple inheritance, so it could look like this:
format:
$inherits:
- inst_subtype/R/R-x.yaml#
- inst_opcodes/OP-32.yaml#
opcodes:
funct7:
display_name: ADD.UW
value: 0b0000100
funct3:
display_name: ADD.UW
value: 0b000
data: | ||
location: 14-12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortuntely, "funct3" is not always at this location. It depends on the format. For example, CI, CSS, CIW, CL, CS all have a "funct3" at 15-13.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you prefer we just drop the indirection or call it something like R-funct3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about something like this?
arch/inst_type/R.yaml
(unchanged)arch/inst_subtype/R-x.yaml
:data: type: { "$ref": "inst_type/R.yaml#" } subtype: { "$ref": "inst_subtype/R/R-x.yaml#" } $inherits: - inst_type/R.yaml#/opcodes - inst_var/xs2.yaml#/data - inst_var/xs1.yaml#/data - inst_var/xd.yaml#/data
arch/inst/Zba/add.uw.yaml
:format: $inherits: inst_subtype/R/R-x.yaml# opcodes: funct7: display_name: ADD.UW value: 0b0000100 funct3: display_name: ADD.UW value: 0b000 opcode: { $inherits: inst_opcode/OP-32.yaml#/data }
- arch/inst_var/xs2.yaml:
data: $inherits: inst_type/R.yaml#/variables/rs2 type: { "$ref": "inst_var_type/x_src_reg.yaml#" }
etc.
So, R-x subtype inherits opcodes from the base type, R.
And xs2
inherits attributes (just "location") from the base type R's rs2
.
(I removed the top-level "inherits: OP-32" from "add.uw.yaml". Was that needed?)
Add instruction “type” schema & link sub‐types (closes #676)
type
object underencoding
with a$ref
to externalinst_type/...#.yaml
files