Add type instructions and encodings as a separate schema #686

AFOliveira · 2025-04-29T08:20:30Z

Add instruction “type” schema & link sub‐types (closes #676)

inst_schema.json:
- Introduce type object under encoding with a $ref to external inst_type/...#.yaml files
inst_type_schema.json:
- Import of encoding schema for defining RV32/RV64 encodings for instruction types
Lays groundwork for sub‐types in R/I/B/S/U formats as discussed in Add sub-types to existing instruction formats #655

…ific encoding instead of pre-defining an encoding in the yaml file. Signed-off-by: Afonso Oliveira <[email protected]>

…fic encoding instead of pre-defining an encoding in the instruction yaml file. Signed-off-by: Afonso Oliveira <[email protected]>

Signed-off-by: Afonso Oliveira <[email protected]>

dhower-qc

Do we also need sub_type?

schemas/inst_type_schema.json

AFOliveira · 2025-04-29T13:31:34Z

Do we also need sub_type?

I though you commented against sub_type in #655 (comment). However, now I see @ThinkOpenly proposed format and sub-type. However, with the URI path of /type/sub-type, isn't it redundant to have both?

ThinkOpenly

General question: is the purpose of this PR to define a schema just for subtypes, or for both instruction types and subtypes?

I think we need to distinctly define the top-level types that are well-defined in the spec, and additionally define subtypes (that aren't well-defined in the spec).

ThinkOpenly · 2025-04-29T13:42:42Z

schemas/inst_schema.json

+        "$ref": {
+          "type": "string",
+          "format": "uri-reference",
+          "pattern": "^inst_type/[A-Z]/[a-zA-Z0-9_]+\\.yaml#.*$",


The "pattern" won't accommodate "R4-type" as a top-level format, and we could consider a more restrictive naming scheme for the "$ref", such as "at least starts with a capital letter":

"pattern": "^inst_type/[A-Z]+[a-zA-Z0-9_]*]/[A-Z]+[a-zA-Z0-9_]*\\.yaml#.*$",

(But, we're inventing that convention.)

This is the last thing to do, do we really want it to start with inst_type? where will this files stay within UDB?

ThinkOpenly · 2025-04-29T13:50:20Z

schemas/inst_schema.json

@@ -190,6 +203,25 @@
          "$ref": "schema_defs.json#/$defs/requires_entry",
          "description": "Extension(s) that defines the instruction"
        },
+        "type": {


Continuing my type/subtype thought, this field would be for the top-level "type", and we'd add an additional field, "subtype", which defines the exact type of the instruction encoding.

Then type wouldn't be any reference, but just a string? (i.e. "I")

I think it would be a reference to generic definitions of the top-level instruction formats.

Do we need to explicitly mention "type", given that subtype presumably points to its parent?

I may have resolved this too early. It would be convenient to have the top-level type here. One-stop shopping for anyone looking at the instruction, rather than having to look at every subtype to find the type and make sure they all agree. We could add validation that everything matches, too.

There currently isn't a pointer from subtype to parent. That's probably a good add, and offers the ability to do additional validation.

It seems to me that having a parent in the sub-type if we did not describe the main type in the instructions, otherwise how can it not be redundant information?

It seems to me that having a parent in the sub-type if we did not describe the main type in the instructions, otherwise how can it not be redundant information?

I'm not following. I propose having some indication of the top-level type in the instruction and in each subtype. (They do not necessarily need to be the same thing, although they could be.) In this way, both have "full information" at hand to facilitate validation, rather than needing an additional level of indirection in order to determine the top-level format.

Is that redundant? Yes, but it allows each to be standalone (for some definition of that term).

If you remove the top-level type from the instruction, you'd need to look at every subtype (usually one, but sometimes 2 [RV32, RV64]) to find the top-level type, and then make sure they all agree.

If you remove the top-level type from the subtype, you'd need to go find all of the instructions that use the subtype to find the top-level type, and then make sure they all agree.

But isn't what you are proposing precisely the opposite of what was decided on the definedBy field? In which we removed the B extension that implied Zbx extensions? I notice that the relationship between extensions and types is not the same, but they close enough to me in this terms. I don't have any strong opinion on this, it just seems counter-intuitive and against the previous line of thought

I see definedBy differently. That attribute a fairly static description of precisely which extension(s) actually define the instruction, rather than just include it -- which would be somewhat dynamic. It would be an ongoing maintenance effort (nightmare?) as new umbrella extensions are added to go through the entire extensions hierarchy and add that new extension to the definedBy for newly included instructions. I was trying to avoid that error-prone update requirement.

For types/subtypes, that is absolutely static. So, the convenience/cost ratio of any redundancy is pretty high. I described the pros/cons above.

Is your preference to only define the top-level type in subtypes, and not in the instruction itself?

I agree with that distinction between definedBy and type — the comparison makes total sense. I also don’t think the choice is hard to reverse, so let’s go with your approach for now, and if it makes more sense to switch later, that’s totally fine.

….json since now it is common to multiple files Signed-off-by: Afonso Oliveira <[email protected]>

…l parameter of inst_data Signed-off-by: Afonso Oliveira <[email protected]>

ThinkOpenly

I suggest we move this along. Seems like an awesome start.

ThinkOpenly · 2025-05-02T14:06:32Z

I suggest we move this along. Seems like an awesome start.

After "field" moves from "inst_schema" to "schema_defs", of course.

AFOliveira · 2025-05-02T14:19:20Z

I know I am delaying @Unique-Usman work, I'm just super busy, I'll try to either fix this today or monday

dhower-qc · 2025-05-02T14:41:58Z

@AFOliveira, if you OK with I can finish this one up.

AFOliveira · 2025-05-02T14:43:09Z

@AFOliveira, if you OK with I can finish this one up.

Be my guest :) Thanks!

…Type

dhower-qc · 2025-05-02T20:15:26Z

I still need to update the Ruby classes, but I pushed the schema updates for review.

ThinkOpenly · 2025-05-05T12:14:34Z

arch/inst/Zba/add.uw.yaml

+format:
+  type: { "$ref": "inst_type/R.yaml#" }
+  subtype: { "$ref": "inst_subtype/R/R-x.yaml#" }
+  encoding: 0000100----------000-----0111011


Do we want to keep this? This might be a longer discussion.
Is there a way to parameterize the type and/or subtype such that the "opcodes" that are known to the instruction can be associated with the opcode fields in the type/subtype, rather than by extraction from this "encoding" bit string? Put differently, can we put here "this is an R-type, R-x-subtype instruction where funct7=0b0000100, funct3=0b000, opcode=0b0111011"?

And eventually, perhaps: "funct7=ADD.UW, funct3=ADD.UW, opcode=OP-32", or something like that, to match the spec?

That's a good point. Now that we are enumerating all the opcodes, we can assign them individually. One question is whether we want to go the $inherits route or keep it as a $ref for the subtype. $inherits is probably a better human description, and lets us use JSON Schema validation on the resolved file. The downside is that we are only using $inherits right now for profiles, so this would make that custom-to-UDB operator pervasive, which some might not like.

here's what it could look like with $inherits:

format: $inherits: "inst_subtype/R/R-x.yaml#" opcodes: funct7: display_name: ADD.UW value: 0b0000100 funct3: display_name: ADD.UW value: 0b000 opcode: display_name: OP-32 value: 0b0111011

Significantly better.

However, for at least the "opcode" values, these are shared among potentially many instructions. Should we / can we create a database for them? I guess that would be realized with another $ref?

opcode: { "$ref": "opcodes/OP-32.yaml#" }

We support multiple inheritance, so it could look like this:

format: $inherits: - inst_subtype/R/R-x.yaml# - inst_opcodes/OP-32.yaml# opcodes: funct7: display_name: ADD.UW value: 0b0000100 funct3: display_name: ADD.UW value: 0b000

arch/inst_type/R.yaml

schemas/inst_var_type_schema.json

ThinkOpenly · 2025-05-06T16:57:44Z

arch/inst_opcode/funct3.yaml

+data:
+  location: 14-12


Unfortuntely, "funct3" is not always at this location. It depends on the format. For example, CI, CSS, CIW, CL, CS all have a "funct3" at 15-13.

Would you prefer we just drop the indirection or call it something like R-funct3?

What do you think about something like this?

arch/inst_type/R.yaml (unchanged)

arch/inst_subtype/R-x.yaml:
data: type: { "$ref": "inst_type/R.yaml#" } subtype: { "$ref": "inst_subtype/R/R-x.yaml#" } $inherits: - inst_type/R.yaml#/opcodes - inst_var/xs2.yaml#/data - inst_var/xs1.yaml#/data - inst_var/xd.yaml#/data

arch/inst/Zba/add.uw.yaml:
format: $inherits: inst_subtype/R/R-x.yaml# opcodes: funct7: display_name: ADD.UW value: 0b0000100 funct3: display_name: ADD.UW value: 0b000 opcode: { $inherits: inst_opcode/OP-32.yaml#/data }

arch/inst_var/xs2.yaml:
data: $inherits: inst_type/R.yaml#/variables/rs2 type: { "$ref": "inst_var_type/x_src_reg.yaml#" }

etc.

So, R-x subtype inherits opcodes from the base type, R.
And xs2 inherits attributes (just "location") from the base type R's rs2.
(I removed the top-level "inherits: OP-32" from "add.uw.yaml". Was that needed?)

AFOliveira added 2 commits April 28, 2025 11:20

feat(type):Add type to instruction schema. Allows referencing to spec…

4f102d1

…ific encoding instead of pre-defining an encoding in the yaml file. Signed-off-by: Afonso Oliveira <[email protected]>

feat(type):Add instruction_type schema. Source of references to speci…

1fda012

…fic encoding instead of pre-defining an encoding in the instruction yaml file. Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira requested review from dhower-qc and ThinkOpenly as code owners April 29, 2025 08:20

AFOliveira added 2 commits April 29, 2025 09:24

Merge branch 'main' into AFOliveira/jsonSchemaType

0e9d619

Fix pre-commit error on json schemas.

2209122

Signed-off-by: Afonso Oliveira <[email protected]>

dhower-qc requested changes Apr 29, 2025

View reviewed changes

schemas/inst_type_schema.json Outdated Show resolved Hide resolved

ThinkOpenly requested changes Apr 29, 2025

View reviewed changes

AFOliveira added 2 commits April 29, 2025 19:18

refactor(encoding schema): move encoding schema to shared schema_defs…

fc0d5b4

….json since now it is common to multiple files Signed-off-by: Afonso Oliveira <[email protected]>

feat(schemas) : add subtype to the inst_schema and make it an optiona…

b98ba93

…l parameter of inst_data Signed-off-by: Afonso Oliveira <[email protected]>

ThinkOpenly approved these changes May 2, 2025

View reviewed changes

dhower-qc added 2 commits May 2, 2025 13:12

Merge remote-tracking branch 'origin/main' into AFOliveira/jsonSchema…

f2310a8

…Type

feat(schema): wip on instruction type/subtype

501a67b

ThinkOpenly reviewed May 5, 2025

View reviewed changes

arch/inst_type/R.yaml Show resolved Hide resolved

ThinkOpenly reviewed May 5, 2025

View reviewed changes

schemas/inst_var_type_schema.json Outdated Show resolved Hide resolved

dhower-qc added 3 commits May 5, 2025 06:54

feat(ruby): understand instruction types/subtypes in Ruby

b7aa7e8

refactor(schema): change 'direction' to 'access' in inst var type

d43e0f0

refactor(schema): changing to inheritance for instruction types

bc2bf84

ThinkOpenly reviewed May 6, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add type instructions and encodings as a separate schema #686

Add type instructions and encodings as a separate schema #686

AFOliveira commented Apr 29, 2025

dhower-qc left a comment

AFOliveira commented Apr 29, 2025

ThinkOpenly left a comment

ThinkOpenly Apr 29, 2025

AFOliveira Apr 29, 2025

ThinkOpenly Apr 29, 2025

AFOliveira Apr 29, 2025

ThinkOpenly Apr 29, 2025

dhower-qc Apr 29, 2025

ThinkOpenly May 2, 2025

AFOliveira May 2, 2025

ThinkOpenly May 2, 2025 •

edited

Loading

AFOliveira May 2, 2025

ThinkOpenly May 2, 2025

AFOliveira May 2, 2025

ThinkOpenly left a comment

ThinkOpenly commented May 2, 2025

AFOliveira commented May 2, 2025

dhower-qc commented May 2, 2025

AFOliveira commented May 2, 2025

dhower-qc commented May 2, 2025

ThinkOpenly May 5, 2025

dhower-qc May 5, 2025

dhower-qc May 5, 2025

ThinkOpenly May 5, 2025

dhower-qc May 5, 2025

ThinkOpenly May 6, 2025

dhower-qc May 6, 2025

ThinkOpenly May 6, 2025

Add type instructions and encodings as a separate schema #686

Are you sure you want to change the base?

Add type instructions and encodings as a separate schema #686

Conversation

AFOliveira commented Apr 29, 2025

Add instruction “type” schema & link sub‐types (closes #676)

dhower-qc left a comment

Choose a reason for hiding this comment

AFOliveira commented Apr 29, 2025

ThinkOpenly left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ThinkOpenly May 2, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ThinkOpenly left a comment

Choose a reason for hiding this comment

ThinkOpenly commented May 2, 2025

AFOliveira commented May 2, 2025

dhower-qc commented May 2, 2025

AFOliveira commented May 2, 2025

dhower-qc commented May 2, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ThinkOpenly May 2, 2025 •

edited

Loading