riscv-software-src · dhower-qc · May 9, 2025 · Apr 30, 2025
@@ -29,6 +29,7 @@ RUN \
     libyaml-dev \
     nodejs \
     npm \
+    parallel \
     python3 \
     python3-pip \
     python3.12-venv \

@@ -62,8 +62,10 @@ jobs:
         uses: actions/checkout@v4
       - name: singularity setup
         uses: ./.github/actions/singularity-setup
-      - name: Generate instruction appendix
-        run: ./do gen:instruction_appendix
+      - name: Generate instruction appendix asciidoc
+        run: ./do gen:instruction_appendix_adoc
+      - name: Check instruction appendix result
+        run: ./do test:instruction_appendix
   regress-cfg-manual:
     runs-on: ubuntu-latest
     env:

@@ -19,6 +19,7 @@ repos:
         stages: [pre-commit]
       - id: end-of-file-fixer
         stages: [pre-commit]
+        exclude: \.golden\.adoc$
       - id: trailing-whitespace
         stages: [pre-commit]
         args: [--markdown-linebreak-ext=md]

@@ -5,40 +5,37 @@ kind: instruction
 name: wrs.nto
 long_name: Wait-on-Reservation-Set-with-No-Timeout
 description: |
-  -id: inst-wrs.nto-behaviour
-  -normative: false
-  -text: |
-    To mitigate the wasteful looping in such usages, a `wrs.nto` (WRS-with-no-timeout) instruction is provided.
-    Instead of polling for a store to a specific memory location, software registers a reservation set that
-    includes all the bytes of the memory location using the LR instruction. Then a subsequent `wrs.nto`
-    instruction would cause the hart to temporarily stall execution in a low-power state until a store
-    occurs to the reservation set or an interrupt is observed.
+  To mitigate the wasteful looping in such usages, a `wrs.nto` (WRS-with-no-timeout) instruction is provided.
+  Instead of polling for a store to a specific memory location, software registers a reservation set that
+  includes all the bytes of the memory location using the LR instruction. Then a subsequent `wrs.nto`
+  instruction would cause the hart to temporarily stall execution in a low-power state until a store
+  occurs to the reservation set or an interrupt is observed.
 
-    This instruction is not supported in a constrained LR/SC loop.
-    While stalled, an implementation is permitted to occasionally terminate the stall and complete
-    execution for any reason.
+  This instruction is not supported in a constrained LR/SC loop.
+  While stalled, an implementation is permitted to occasionally terminate the stall and complete
+  execution for any reason.
 
-    `wrs.nto` follows the rules of the WFI instruction for resuming execution
-    on a pending interrupt.
+  `wrs.nto` follows the rules of the WFI instruction for resuming execution
+  on a pending interrupt.
 
-    When the TW (Timeout Wait) bit in `mstatus` is set and `wrs.nto` is executed
-    in any privilege mode otherthan M mode, and it does not complete within an implementation-specific
-    bounded time limit, the `wrs.nto` instruction will cause an illegal instruction exception.
+  When the TW (Timeout Wait) bit in `mstatus` is set and `wrs.nto` is executed
+  in any privilege mode other than M mode, and it does not complete within an implementation-specific
+  bounded time limit, the `wrs.nto` instruction will cause an illegal instruction exception.
 
-    When executing in VS or VU mode, if the VTW bit is set in `hstatus`, the TW bit in `mstatus` is clear,
-    and the `wrs.nto` does not complete within an implementation-specific bounded time limit,
-    the `wrs.nto` instruction will cause a virtual instruction exception.
+  When executing in VS or VU mode, if the VTW bit is set in `hstatus`, the TW bit in `mstatus` is clear,
+  and the `wrs.nto` does not complete within an implementation-specific bounded time limit,
+  the `wrs.nto` instruction will cause a virtual instruction exception.
 
-    [Note]
-    Since `wrs.nto` can complete execution for reasons other than stores to the reservation set,
-    software will likely need a means of looping until the required stores have occurred.
+  [Note]
+  Since `wrs.nto` can complete execution for reasons other than stores to the reservation set,
+  software will likely need a means of looping until the required stores have occurred.
 
-    [Note]
-    `wrs.nto`, unlike WFI, is not specified to cause an illegal instruction exception if executed in U-mode
-    when the governing TW bit is 0. WFI is typically not expected to be used in U-mode and on many systems
-    may promptly cause an illegal instruction exception if used at U-mode.
-    Unlike WFI, `wrs.nto` is expected to be used by software in U-mode when waiting on memory but without
-    a deadline for that wait.
+  [Note]
+  `wrs.nto`, unlike WFI, is not specified to cause an illegal instruction exception if executed in U-mode
+  when the governing TW bit is 0. WFI is typically not expected to be used in U-mode and on many systems
+  may promptly cause an illegal instruction exception if used at U-mode.
+  Unlike WFI, `wrs.nto` is expected to be used by software in U-mode when waiting on memory but without
+  a deadline for that wait.
 definedBy: Zawrs
 assembly: ""
 encoding:
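
The trap rules in the `wrs.nto` description above can be sketched as a small Python model. This is an illustrative sketch, not part of the specification or of this repository: the `Mode` enum, the function name, and the string return values are assumptions, and the function only models the case where the instruction has already failed to complete within the implementation-specific bounded time limit.

```python
from enum import Enum


class Mode(Enum):
    """Privilege modes used by this sketch (names are assumptions)."""
    M = "M"
    S = "S"
    U = "U"
    VS = "VS"
    VU = "VU"


def wrs_nto_timeout_exception(mode, mstatus_tw, hstatus_vtw):
    """Exception raised once wrs.nto exceeds the bounded time limit, or None.

    Follows the two rules quoted in the description:
    - mstatus.TW set, any mode other than M  -> illegal instruction
    - VS/VU mode, hstatus.VTW set, TW clear  -> virtual instruction
    """
    if mode is Mode.M:
        return None
    if mstatus_tw:
        return "illegal-instruction"
    if mode in (Mode.VS, Mode.VU) and hstatus_vtw:
        return "virtual-instruction"
    return None
```

Checking `mstatus.TW` first mirrors the wording of the description, where the virtual instruction case only applies when TW is clear.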

@@ -5,36 +5,33 @@ kind: instruction
 name: wrs.sto
 long_name: Wait-on-Reservation-Set-with-Short-Timeout
 description: |
-  -id: inst-wrs.sto-behaviour
-  -normative: false
-  -text: |
-    Instead of polling for a store to a specific memory location, software registers a
-    reservation set that includes all the bytes of the memory location using the LR instruction.
-    A subsequent `wrs.sto` instruction would cause the hart to temporarily stall execution in a
-    low-power state until a store occurs to the reservation set or an interrupt is observed.
-    Sometimes the program waiting on a memory update may also need to carry out a task at a future time
-    or otherwise place an upper bound on the wait. To support such use cases, `wrs.sto` bounds the
-    stall duration to an implementation-define short timeout such that the stall is terminated on the
-    timeout if no other conditions have occurred to terminate the stall. The program using this instruction
-    may then determine if its deadline has been reached.
-    `wrs.sto` causes the hart to temporarily stall execution in a low-power state as long as the reservation
-    set is valid and no pending interrupts, even if disabled, are observed. For `wrs.sto` the stall duration
-    is bounded by an implementation defined short timeout. These instructions are not supported in a
-    constrained LR/SC loop.
-    Hart execution may be stalled while the following conditions are all satisfied:
-    a. The reservation set is valid
-    b. If `wrs.sto`, a "short" duration since start of stall has not elapsed
-    c. No pending interrupt is observed (see the rules below)
+  Instead of polling for a store to a specific memory location, software registers a
+  reservation set that includes all the bytes of the memory location using the LR instruction.
+  A subsequent `wrs.sto` instruction would cause the hart to temporarily stall execution in a
+  low-power state until a store occurs to the reservation set or an interrupt is observed.
+  Sometimes the program waiting on a memory update may also need to carry out a task at a future time
+  or otherwise place an upper bound on the wait. To support such use cases, `wrs.sto` bounds the
+  stall duration to an implementation-defined short timeout such that the stall is terminated on the
+  timeout if no other conditions have occurred to terminate the stall. The program using this instruction
+  may then determine if its deadline has been reached.
+  `wrs.sto` causes the hart to temporarily stall execution in a low-power state as long as the reservation
+  set is valid and no pending interrupts, even if disabled, are observed. For `wrs.sto` the stall duration
+  is bounded by an implementation-defined short timeout. These instructions are not supported in a
+  constrained LR/SC loop.
+  Hart execution may be stalled while the following conditions are all satisfied:
+  a. The reservation set is valid
+  b. If `wrs.sto`, a "short" duration since start of stall has not elapsed
+  c. No pending interrupt is observed (see the rules below)
 
-    While stalled, an implementation is permitted to occasionally terminate the stall and complete
-    execution for any reason. `wrs.sto` follows the rules of the WFI instruction for resuming execution
-    on a pending interrupt. Since `wrs.sto` can complete execution for reasons other than stores to
-    the reservation set, software will likely need a means of looping until the required stores have occurred.
+  While stalled, an implementation is permitted to occasionally terminate the stall and complete
+  execution for any reason. `wrs.sto` follows the rules of the WFI instruction for resuming execution
+  on a pending interrupt. Since `wrs.sto` can complete execution for reasons other than stores to
+  the reservation set, software will likely need a means of looping until the required stores have occurred.
 
-    [Note]
-    The duration of a `wrs.sto` instruction's timeout may vary significantly within and among implementations.
-    In typical implementations this duration should be roughly in the range of 10 to 100 times an on-chip
-    cache miss latency or a cacheless access to main memory.
+  [Note]
+  The duration of a `wrs.sto` instruction's timeout may vary significantly within and among implementations.
+  In typical implementations this duration should be roughly in the range of 10 to 100 times an on-chip
+  cache miss latency or a cacheless access to main memory.
 definedBy: Zawrs
 assembly: ""
 encoding:
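
The stall conditions (a) to (c) and the software retry loop mentioned in the `wrs.sto` description can be sketched in Python as follows. This is a minimal model under stated assumptions: both function names and all parameters are hypothetical, and `stall_once` merely stands in for an LR followed by `wrs.sto`, which may return for any reason.

```python
def may_remain_stalled(reservation_valid, short_timeout_elapsed, interrupt_pending):
    """True only while stall conditions (a), (b) and (c) all hold."""
    return reservation_valid and not short_timeout_elapsed and not interrupt_pending


def wait_until(flag_is_set, stall_once, deadline_reached):
    """Software-side loop: re-check the awaited condition after every wake-up.

    All three arguments are hypothetical callables supplied by the caller.
    Returns True if the flag was observed, False if the deadline passed first.
    """
    while not flag_is_set():
        if deadline_reached():
            return False
        stall_once()  # stands in for LR + wrs.sto; may wake spuriously
    return True
```

Because the stall can end without a store to the reservation set, the loop treats every wake-up as a hint and re-reads the flag before deciding anything.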

@@ -5,21 +5,18 @@ kind: instruction
 name: csrrc
 long_name: Atomic Read and Clear Bits in CSR
 description: |
-  -id: inst-csrrc-behavior
-  -normative: false
-  -text: |
-    The CSRRC (Atomic Read and Clear Bits in CSR) instruction reads the value of the CSR, zero-extends
-    the value to XLEN bits, and writes it to integer register `rd`. The initial value in integer register `rs1` is
-    treated as a bit mask that specifies bit positions to be cleared in the CSR. Any bit that is high in `rs1` will
-    cause the corresponding bit to be cleared in the CSR, if that CSR bit is writable.
+  The CSRRC (Atomic Read and Clear Bits in CSR) instruction reads the value of the CSR, zero-extends
+  the value to XLEN bits, and writes it to integer register `rd`. The initial value in integer register `rs1` is
+  treated as a bit mask that specifies bit positions to be cleared in the CSR. Any bit that is high in `rs1` will
+  cause the corresponding bit to be cleared in the CSR, if that CSR bit is writable.
 
-    For CSRRC, if `rs1=x0`, then the instruction will not write to the CSR at all, and so shall
-    not cause any of the side effects that might otherwise occur on a CSR write, nor raise illegal-
-    instruction exceptions on accesses to read-only CSRs. CSRRC always reads the addressed CSR and
-    cause any read side effects regardless of `rs1` and `rd` fields.
-    Note that if `rs1` specifies a register other than `x0`, and that register holds a zero value,
-    the instruction will not action any attendant per-field side effects, but will action any
-    side effects caused by writing to the entire CSR.
+  For CSRRC, if `rs1=x0`, then the instruction will not write to the CSR at all, and so shall
+  not cause any of the side effects that might otherwise occur on a CSR write, nor raise
+  illegal-instruction exceptions on accesses to read-only CSRs. CSRRC always reads the addressed CSR and
+  causes any read side effects regardless of `rs1` and `rd` fields.
+  Note that if `rs1` specifies a register other than `x0`, and that register holds a zero value,
+  the instruction will not action any attendant per-field side effects, but will action any
+  side effects caused by writing to the entire CSR.
 definedBy: Zicsr
 assembly: xd, csr, xs1
 encoding:
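
The CSRRC behaviour described above can be modelled in a few lines of Python. This is a sketch for illustration only: the function name, the `writable_mask` parameter, and the tuple return value are assumptions, and side effects are reduced to a single "was the CSR written" flag.

```python
def csrrc(csr_value, rs1_index, rs1_value, writable_mask, xlen=64):
    """Model CSRRC; returns (rd_value, new_csr_value, csr_written).

    rs1_index is the register number in the rs1 field (0 means x0),
    rs1_value is the value held in that register.
    """
    xlen_mask = (1 << xlen) - 1
    rd_value = csr_value & xlen_mask  # the CSR is always read, zero-extended to XLEN
    if rs1_index == 0:
        # rs1 = x0: no CSR write at all, so no write side effects
        return rd_value, csr_value, False
    # Only writable bits that are high in rs1 are cleared
    clear_bits = rs1_value & writable_mask
    return rd_value, csr_value & ~clear_bits, True
```

Note the third case from the description: a non-`x0` register holding zero still counts as a write (`csr_written` is `True`) even though no bit changes.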

@@ -5,15 +5,12 @@ kind: instruction
 name: csrrci
 long_name: Atomic Read and Clear Bits in CSR with Immediate
 description: |
-  -id: inst-csrrci-behavior
-  -normative: false
-  -text: |
-    The CSRRCI variant is similar to CSRRC, except this updates the CSR using an XLEN-bit value obtained
-    by zero-extending a 5-bit unsigned immediate (uimm[4:0]) field encoded in the `rs1` field instead of a
-    value from an integer register. For CSRRCI, if the `uimm[4:0]` field is zero, then this instruction
-    will not write to the CSR, and shall not cause any of the side effects that might otherwise occur on
-    a CSR write, nor raise illegal-instruction exceptions on accesses to read-only CSRs. The CSRRCI will
-    always read the CSR and cause any read side effects regardless of `rd` and `rs1` fields.
+  The CSRRCI variant is similar to CSRRC, except this updates the CSR using an XLEN-bit value obtained
+  by zero-extending a 5-bit unsigned immediate (`uimm[4:0]`) field encoded in the `rs1` field instead of a
+  value from an integer register. For CSRRCI, if the `uimm[4:0]` field is zero, then this instruction
+  will not write to the CSR, and shall not cause any of the side effects that might otherwise occur on
+  a CSR write, nor raise illegal-instruction exceptions on accesses to read-only CSRs. The CSRRCI will
+  always read the CSR and cause any read side effects regardless of `rd` and `rs1` fields.
 definedBy: Zicsr
 assembly: xd, csr, uimm
 encoding:

@@ -5,15 +5,12 @@ kind: instruction
 name: csrrsi
 long_name: Atomic Read and Set Bits in CSR with Immediate
 description: |
-  -id: inst-csrrsi-behavior
-  -normative: false
-  -text: |
-    The CSRRSI variant is similar to CSRRS, except this updates the CSR using an XLEN-bit value obtained
-    by zero-extending a 5-bit unsigned immediate (uimm[4:0]) field encoded in the `rs1` field instead of a
-    value from an integer register. For CSRRSI, if the `uimm[4:0]` field is zero, then this instruction
-    will not write to the CSR, and shall not cause any of the side effects that might otherwise occur on
-    a CSR write, nor raise illegal-instruction exceptions on accesses to read-only CSRs. The CSRRSI will
-    always read the CSR and cause any read side effects regardless of `rd` and `rs1` fields.
+  The CSRRSI variant is similar to CSRRS, except this updates the CSR using an XLEN-bit value obtained
+  by zero-extending a 5-bit unsigned immediate (`uimm[4:0]`) field encoded in the `rs1` field instead of a
+  value from an integer register. For CSRRSI, if the `uimm[4:0]` field is zero, then this instruction
+  will not write to the CSR, and shall not cause any of the side effects that might otherwise occur on
+  a CSR write, nor raise illegal-instruction exceptions on accesses to read-only CSRs. The CSRRSI will
+  always read the CSR and cause any read side effects regardless of `rd` and `rs1` fields.
 definedBy: Zicsr
 assembly: xd, csr, uimm
 encoding:
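
The immediate variants CSRRSI and CSRRCI differ from the register forms only in where the mask comes from and in the "zero immediate means no write" rule. A hedged Python sketch, with a hypothetical function name, `op` strings, and `writable_mask` parameter:

```python
def csr_imm_op(op, csr_value, uimm, writable_mask):
    """Model CSRRSI / CSRRCI; returns (rd_value, new_csr_value, csr_written)."""
    if not 0 <= uimm < 32:
        raise ValueError("uimm is a 5-bit unsigned immediate")
    if op not in ("csrrsi", "csrrci"):
        raise ValueError("op must be 'csrrsi' or 'csrrci'")
    rd_value = csr_value  # the CSR is always read
    if uimm == 0:
        # Zero immediate: the CSR is not written, so no write side effects
        return rd_value, csr_value, False
    bits = uimm & writable_mask  # zero-extended immediate, limited to writable bits
    if op == "csrrsi":
        return rd_value, csr_value | bits, True
    return rd_value, csr_value & ~bits, True
```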

@@ -4,14 +4,16 @@ $schema: "inst_schema.json#"
 kind: instruction
 name: aes64ks1i
 long_name: AES Key Schedule Instruction 1
-description: |
-  -id: inst-aes64ks1i-behavior
-  -normative: true
-  -text: |
-    This instruction implements the rotation, SubBytes and Round Constant addition steps of the AES
-    block cipher Key Schedule. This instruction must _always_ be implemented such that its execution
-    latency does not depend on the data being operated on. Note that `rnum` must be in the range
-    `0x0..0xA`. The values `0xB..0xF` are reserved.
+description:
+  - id: inst-aes64ks1i-behavior
+    normative: true
+    text: |
+      This instruction implements the rotation, SubBytes and Round Constant addition steps of the AES
+      block cipher Key Schedule.
+  - id: inst-aes64ks1i-range
+    normative: true
+    text: |
+      `rnum` must be in the range `0x0..0xA`. The values `0xB..0xF` are reserved.
 definedBy:
   anyOf: [Zknd, Zkne]
 base: 64
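
The new `inst-aes64ks1i-range` statement is easy to express as a check. The sketch below is illustrative only; the function name is an assumption, and it treats `rnum` as a 4-bit field, which is what the reserved range `0xB..0xF` implies.

```python
def rnum_is_valid(rnum):
    """True if rnum is in the defined range 0x0..0xA, False if reserved."""
    if not 0x0 <= rnum <= 0xF:
        raise ValueError("rnum must fit in 4 bits")
    return rnum <= 0xA
```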

@@ -4,13 +4,12 @@ $schema: "inst_schema.json#"
 kind: instruction
 name: aes64ks2
 long_name: AES Key Schedule Instruction 2
-description: |
-  -id: instr-aes64ks2-behavior
-  -normative: true
-  -text: |
-    This instruction implements the additional XOR'ing of key words as part of the AES block cipher
-    Key Schedule. This instruction must _always_ be implemented such that its execution latency does
-    not depend on the data being operated on.
+description:
+  - id: instr-aes64ks2-behavior
+    normative: true
+    text: |
+      This instruction implements the additional XOR'ing of key words as part of the AES block cipher
+      Key Schedule.
 definedBy:
   anyOf: [Zknd, Zkne]
 base: 64
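
After this change `description` can be either a plain block string (as in the `wrs.nto` and CSR files) or a list of `id` / `normative` / `text` entries (as in the AES files). A consumer that has already parsed the YAML into Python objects might handle both shapes roughly like this; the function name and return shape are assumptions, not the repository's actual API.

```python
def split_description(description):
    """Return (full_text, normative_ids) for either description shape."""
    if isinstance(description, str):
        # Plain-string form: no per-statement ids are available
        return description, []
    # Structured form: a list of {"id", "normative", "text"} mappings
    text = "\n".join(item["text"].rstrip("\n") for item in description)
    normative_ids = [item["id"] for item in description if item.get("normative")]
    return text, normative_ids
```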

@@ -4,13 +4,12 @@ $schema: "inst_schema.json#"
 kind: instruction
 name: aes64ds
 long_name: AES decrypt final round
-description: |
-  -id: inst-aes64ds-behavior
-  -normative: true
-  -text: |
-    Uses the two 64-bit source registers to represent the entire AES state, and produces _half_ of the next
-    round output, applying the Inverse ShiftRows and SubBytes steps. This instruction must _always_ be
-    implemented such that its execution latency does not depend on the data being operated on.
+description:
+  - id: inst-aes64ds-behavior
+    normative: true
+    text: |
+      Uses the two 64-bit source registers to represent the entire AES state, and produces _half_ of the next
+      round output, applying the Inverse ShiftRows and SubBytes steps.
 definedBy: Zknd
 base: 64
 assembly: xd, xs1, xs2

@@ -4,13 +4,12 @@ $schema: "inst_schema.json#"
 kind: instruction
 name: aes64dsm
 long_name: AES decrypt middle round
-description: |
-  -id: inst-aes64dsm-behavior
-  -normative: true
-  -text: |
-    Uses the two 64-bit source registers to represent the entire AES state, and produces _half_ of the next
-    round output, applying the Inverse ShiftRows, SubBytes and MixColumns steps. This instruction must _always_
-    be implemented such that its execution latency does not depend on the data being operated on.
+description:
+  - id: inst-aes64dsm-behavior
+    normative: true
+    text: |
+      Uses the two 64-bit source registers to represent the entire AES state, and produces _half_ of the next
+      round output, applying the Inverse ShiftRows, SubBytes and MixColumns steps.
 definedBy: Zknd
 base: 64
 assembly: xd, xs1, xs2

@@ -4,15 +4,13 @@ $schema: "inst_schema.json#"
 kind: instruction
 name: aes64im
 long_name: AES Decrypt KeySchedule MixColumns
-description: |
-  -id: inst-aes64im-behavior
-  -normative: true
-  -text: |
-    The instruction applies the inverse MixColumns transformation to two columns of the state array,
-    packed into a single 64-bit register. It is used to create the inverse cipher KeySchedule, according to
-    the equivalent inverse cipher construction in (NIST, 2001) (Page 23, Section 5.3.5). This
-    instruction must always be implemented such that its execution latency does not depend on the
-    data being operated on.
+description:
+  - id: inst-aes64im-behavior
+    normative: true
+    text: |
+      The instruction applies the inverse MixColumns transformation to two columns of the state array,
+      packed into a single 64-bit register. It is used to create the inverse cipher KeySchedule, according to
+      the equivalent inverse cipher construction in (NIST, 2001) (Page 23, Section 5.3.5).
 definedBy: Zknd
 base: 64
 assembly: xd, xs1

@@ -4,13 +4,12 @@ $schema: "inst_schema.json#"
 kind: instruction
 name: aes64es
 long_name: AES encrypt final round
-description: |
-  -id: inst-aes64es-behavior
-  -normative: true
-  -text: |
-    Uses the two 64-bit source registers to represent the entire AES state, and produces _half_ of the next
-    round output, applying the ShiftRows and SubBytes steps. This instruction must _always_ be
-    implemented such that its execution latency does not depend on the data being operated on.
+description:
+  - id: inst-aes64es-behavior
+    normative: true
+    text: |
+      Uses the two 64-bit source registers to represent the entire AES state, and produces _half_ of the next
+      round output, applying the ShiftRows and SubBytes steps.
 definedBy: Zkne
 base: 64
 assembly: xd, xs1, xs2