From 9cfaeb6660344fb74903294af06b022199ba9e14 Mon Sep 17 00:00:00 2001
From: aottaviano
Date: Tue, 23 Jan 2024 23:31:35 +0100
Subject: [PATCH] docs

---
 docs/img/arch.svg |  2 +-
 docs/um/sw.md     | 61 ++++++++++++++++++++++++-----------------------
 2 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/docs/img/arch.svg b/docs/img/arch.svg
index 82e97a623..6634abd80 100644
--- a/docs/img/arch.svg
+++ b/docs/img/arch.svg
@@ -1,4 +1,4 @@
-
[docs/img/arch.svg, removed revision: extracted text labels of the Carfield architecture diagram omitted (Host Domain (Cheshire), Safe domain, Secure domain (OpenTitan), Accelerator Domain with Integer PMCA and Vector PMCA, Dynamic SPM, partitionable hybrid LLC/SPM, HyperRAM controller, peripherals, legend). In this revision one bus label reads "Peripheral bus (APB)".]
\ No newline at end of file
+
[docs/img/arch.svg, added revision: extracted text labels of the Carfield architecture diagram omitted (Host Domain (Cheshire), Safe domain, Secure domain (OpenTitan), Accelerator Domain with Integer PMCA and Vector PMCA, Dynamic SPM, partitionable hybrid LLC/SPM, HyperRAM controller, peripherals, legend). In this revision the corresponding bus label reads "Peripheral bus (Regbus/APB)".]
\ No newline at end of file
diff --git a/docs/um/sw.md b/docs/um/sw.md
index 0cc7e4fc2..6524a41ae 100644
--- a/docs/um/sw.md
+++ b/docs/um/sw.md
@@ -32,10 +32,9 @@ This means that it shares the same:
 
 Therefore, we defer the reader to Cheshire's Software Stack description for more information.
 
-Programs compiled for Carfield that requiree additional, Carfield-specific drivers (for domains'
-offload, peripheral control, etc) are linked against Cheshire's static library (`libcheshire.a`).
-This operation is transparent to the programmer, that can take advantage of Cheshire's device
-drivers and SW routines within Carfield seamlessly.
+Programs compiled for Carfield are linked against Cheshire's static library (`libcheshire.a`). This
+operation is transparent to the programmer, who can take advantage of Cheshire's device drivers and
+SW routines within Carfield seamlessly.
 
 Provided the equivalence and reuse between Carfield and Cheshire, in this page we focus on
 Carfield-specific SW components and build flow, with an emphasis on domains different than Cheshire.
@@ -46,9 +45,9 @@ General-purpose processing elements (PEs) integrated in Carfield implement the R
 either RV64 (*host domain*) or RV32 (all the others: *safe domain*, *secure domain*, *integer PMCA*,
 and *vectorial PMCA*).
 
-To build programs written in plain C for a Carfield domain with the base ISA and its regular
-extensions (namely, `RV64G` and `RV32IMACF`) *without* using *custom* extensions that each domain
-provide, you simply need vanilla RV64 and RV32 compilers.
+To build programs for a Carfield domain with the base ISA and its regular extensions (namely,
+`RV64G` and `RV32IMACF`) *without* using *custom* extensions that each domain provides, you simply
+need vanilla RV64 and RV32 compilers.
 
 Otherwise, to use *custom* instruction supported in HW for a domain, specific compiler support is
 required. We are working to improve compiler support by providing pointers to pre-built releases or
@@ -96,12 +95,12 @@ supported by the platform, as described in [Simulation](../tg/sim.md) or on FPGA
 
 As in Cheshire, Carfield programs can be created to be executed from several memory locations:
 
-* Dynamic SPM (`l2`): the linkerscript is provided in Carfield's `sw/link/` folder, since Dynamic
-  SPM is not integrated in the minimal Cheshire
-* LLC SPM (`spm`): valid when the LLC is configured as such. In Carfield, half of the LLC is
+* Dynamic SPM (`*.l2.elf`): the linkerscript is provided in Carfield's `sw/link/` folder, since
+  Dynamic SPM is not integrated in the minimal Cheshire
+* LLC SPM (`*.spm.elf`): valid when the LLC is configured as such. In Carfield, half of the LLC is
   configured as SPM from the boot ROM during system bringup, as this is the default behavior in
   Cheshire.
-* DRAM (`dram`): the HyperRAM
+* DRAM (`*.dram.elf`): the off-chip DRAM, e.g., the HyperRAM
 
 For example, to build a specific BMP (here `sw/tests/bare-metal/hostd/helloworld.c` to be run on
 Cheshire) executing from the Dynamic SPM, run:
@@ -112,17 +111,18 @@ make sw/tests/bare-metal/hostd/helloworld.car.l2.elf
 
 To create the same program executing from DRAM, `sw/tests/bare-metal/hostd/helloworld.car.dram.elf`
 can instead be built from the same source. Depending on their assumptions and behavior, not all
-programs may be built to execute from both locations.
+programs may be built to execute from all locations.
 
-### Linux programs
+### GPOS (e.g., Linux) programs
 
-When executing *host domain* programs in Linux (on FPGA/ASIC targets) that require access to memory
-mapped components of other domains, SW intervention is needed to map virtual to physical addresses,
-since domains different than the host *currently* lack support for HW-based virtual memory
-translation.
+When executing *host domain* programs on a GPOS such as Linux (on FPGA/ASIC targets) requiring
+access to memory-mapped components of other domains, SW intervention is needed to map virtual to
+physical addresses, since domains different than the host *currently* lack support for HW-based
+virtual memory translation.
 
-In the current SW stack, this mapping is already provided and hence transparent to the user. Test
-programs targeting Linux that require it are located in different folder, `sw/tests/linux/`.
+In the current SW stack, this mapping is already provided and hence transparent to the user. For
+example, test programs targeting Linux that require it are located in a different folder,
+`sw/tests/linux/`.
 
 ## Inter-domain offload
 
@@ -134,14 +134,14 @@ Offload of programs to Carfield domains involves:
 
 Programs can be offloaded with:
 
-* **Simple baremetal offload (BMO)**, recommended for regression tests use cases that are simple
-  enough to be executed with cycle-accurate RTL simulations. For instance, this can be the case of
-  dynamic timing analysis (DTA) carried out during an ASIC development cycle.
+* **Simple baremetal offload (BMO)**, useful for regression tests that are simple enough to be
+  executed with cycle-accurate RTL simulations. For instance, this can be the case of dynamic timing
+  analysis (DTA) carried out during an ASIC development cycle.
 
 * **The [OpenMP](https://www.openmp.org/) API**, recommended when developing SW for Carfield on a
   FPGA or, eventually, ASIC implementing Carfield, because of the ready-to-use OS support
-  (currently, Linux). Usage of the OpenMP API with non OS-directed (baremetal) SW can be supported,
-  but is mostly suited for heterogeneous embedded systems with highly constrained resources
+  (currently, Linux). Note that usage of the OpenMP API with non OS-directed (baremetal) SW can be
+  supported, and would eventually replace the BMO described above.
 
 In the following, we briefly describe both.
 
@@ -151,16 +151,16 @@ In the following, we briefly describe both.
 
 Since by default all domains are clock gated and isolated after POR except for the *host domain*
 (Cheshire), as described in [Architecture](../um/arch.md), the wake-up process must be handled from
-the C source code.
+the application source code.
 
-### Baremetal offload
+### Baremetal offload (non OpenMP based)
 
 For BMO, the offloader takes care of bootstrapping the target device ELF in the correct memory
 location, initializing the target and launching its execution through a simple ELF Loader. The ELF
 Loader source code is located in the offloader's SW directory, and follows a naming convention:
 
 ```
 _offloader_.c
 ```
 
 The target device's ELF is included into the offloader's ELF Loader as a *header file*. The target
@@ -192,7 +192,7 @@ As an example, assume the *host domain* as offloader and the *integer PMCA* as t
 The resulting offloader ELF's name reads:
 
 ```
 _offloader_..car..elf
 ```
 
 According to the memory location where the BMP will be executed.
@@ -202,13 +202,14 @@ The final offloader ELF can be preloaded with simulation methods described in th
 
 ---
 
-*Note for the reader*
+**Note for the reader**
 
 BMO is in general not recommended for developing SW for Carfield, as it was introduced during ASIC
 development cycle and can be an effective litmus test to find and fix HW bugs, or during DTA.
 
 For SW development on Carfield and in particular domain-driven offload, it is recommended to use
-OpenMP offload on FPGA/ASIC, described below.
+OpenMP offload on FPGA/ASIC, described below. The latter will eventually replace the simple BMO also
+for baremetal regression checks in future releases of the project.
 
 ### OpenMP offload (recommended: use on FPGA/ASIC)