diff --git a/search/search_index.json b/search/search_index.json index 2995dc7a..d5caa6f2 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Carfield","text":"
Carfield is a mixed-criticality SoC targeting the automotive domain. It uses Cheshire
as main host domain.
Carfield is developed as part of the PULP project, a joint effort between ETH Zurich and the University of Bologna.
"},{"location":"#motivation","title":"Motivation","text":"The rapid evolution of AI algorithms, the massive amount of sensed data and the pervasive influence of AI-enhanced applications across application-domains such as Automotive, Space and Cyber-Physical embedded systems (CPSs), call for a paradigm shift from simple micro-controllers towards powerful and heterogeneous edge computers in the design of next generation of mixed-criticality systems (MCSs). These must not only deliver outstanding performance and energy efficiency but also ensure steadfast safety, resilience, and security.
The Carfield platform aims to tackle these architectural challenges establishing itself as a pre-competitive heterogeneous platform for MCSs, underpinned by fully open-source Intellectual Properties (IPs). Carfield showcases pioneering hardware solutions, addressing challenges related to time-predictable on/off-chip communication, robust fault recovery mechanisms, secure boot processes, cryptographic acceleration services, hardware-assisted virtualization, and accelerated computation for both floating-point and integer workloads.
"},{"location":"#quick-start","title":"Quick Start","text":"If you are impatient and have all needed dependencies, you can run:
make car-all\n
and then run a simulation by typing:
make car-hw-build car-hw-sim CHS_BINARY=./sw/tests/bare-metal/hostd/helloworld.car.l2.elf\n
"},{"location":"#license","title":"License","text":"Unless specified otherwise in the respective file headers, all code checked into this repository is made available under a permissive license. All hardware sources and tool scripts are licensed under the Solderpad Hardware License 0.51 (see LICENSE
) with the exception of generated register file code (e.g. hw/regs/*.sv
), which is generated by a fork of lowRISC's regtool
and licensed under Apache 2.0. All software sources are licensed under Apache 2.0.
We first discuss the Carfield's project structure, its dependencies, and how to build it.
"},{"location":"gs/#repository-structure","title":"Repository structure","text":"The project is structured as follows:
Directory Description Documentationdoc
Documentation Home hw
Hardware sources as SystemVerilog RTL Architecture sw
Software stack, build setup, and tests Software Stack target
Simulation, FPGA, and ASIC target setups Targets utils
Utility scripts scripts
Some helper scripts for env setup"},{"location":"gs/#dependencies","title":"Dependencies","text":"To build Carfield, you will need:
>= 3.82
>= 3.9
>= 0.27.1
>= 11.2.0
requirements.txt
We use Bender for hardware IP and dependency management; for more information on using Bender, please see its documentation. You can install Bender directly through the Rust package manager Cargo:
cargo install bender\n
Depending on your desired target, additional dependencies may be needed.
"},{"location":"gs/#building-carfield","title":"Building Carfield","text":"To build different parts of Carfield, the carfield.mk
run make
followed by these targets:
car-hw-init
: generated hardware, including IPs and boot ROMcar-sim-init
(\u2020): scripts and external models for simulationcar-sw-build
(\u2021): bare-metal software running on the hardware\u2020 car-sim-init
will download externally provided peripheral simulation models, some proprietary and with non-free license terms, from their publically accessible sources. By running car-sim-init
, you accept this.
\u2021 car-sw-build
requires RV64 and RV32 toolchains. See the Software Stack for more details.
To run all build targets above (\u2020)(\u2021):
make car-init\n
Running car-init
is required at least once to correctly configure IPs we depend on. On reconfiguring any generated hardware or changing IP versions, car-init
should be rerun.
The following additional targets are not invoked by the above, but also available:
chs-bootrom-all
- rebuilds Cheshire's boot ROM. This is not done by default as reproducible builds (as checked by CI) can only be guaranteed for fixed compiler versions.car-nonfree-init
- clones our internal repository with nonfree resources we cannot release, including our internal CI or technology-specific standard cells, scripts and tools. This is not necessary to use Carfield.Carfield uses Cheshire
as main dependency. Compared to the other dependencies, Cheshire provides most of the HW/SW infrastructure used by Carfield. All Cheshire's make
targets, described in the dedicated documentation, are available in Carfield through the inclusion of the makefrag cheshire.mk
in carfield.mk
.
A target is an end use for Carfield. Each target requires different steps from here; read the page for your desired target in the following Targets chapter.
"},{"location":"tg/","title":"Targets","text":"A target refers to an end use of Carfield. This could be a simulation setup, an FPGA or ASIC implementation, or the less common integration into other SoCs.
Target setups can either be included in this repository or live in an external repository and use Cheshire as a dependency.
"},{"location":"tg/#included-targets","title":"Included Targets","text":"Included target setups live in the target
directory. Each included target has a documentation page in this chapter:
For ASIC implementation target, where an additional wrapper is needed for clock generation blocks, bidirectional pads or additional circuitry, or the less common integration into larger SoCs, Carfield may be included either as a Bender dependency or Git submodule. For further information and best pratices, see SoC Integration.
"},{"location":"tg/integr/","title":"SoC Integration","text":"Carfield is a complex platform, therefore the case of it being integrated in larger SoCs is rare. A more common scenario is the use of Carfield in a ASIC wrapper that includes bidirectional pads, clock generation blocks (PLLs, FLLs...) or other circuitry.
This page explain how to integrate Carfield to fulfill on of these needs. Since Carfield heavily relies on Cheshire, for better understanding we suggest to integrate this reading with its equivalent in the Cheshire's documentation.
"},{"location":"tg/integr/#using-carfield-in-your-project","title":"Using Carfield In Your Project","text":"As for internal targets, Carfield must be built before use in external projects. We aim to simplify this as much as possible with a portable make fragment, carfield.mk
.
If you use GNU Make to build your project and Bender to handle dependencies, you can include the Carfield build system into your own makefile with:
include $(shell bender path carfield)/carfield.mk\n
All of Carfield's build targets are available with the prefix car-
.
You can leverage this to ensure your Carfield build is up to date and rebuild hardware and software whenever necessary. You can change the default value of any build parameter, replace source files to adapt Carfield, or reuse parts of its build system, such as the software stack or the register and ROM generators.
"},{"location":"tg/integr/#instantiating-carfield","title":"Instantiating Carfield","text":"A minimal clean instantiation would look as follows:
`include \"cheshire/typedef.svh\"\n\n// Define function to derive configuration from defaults.\n// This could also (preferrably) be done in a system package.\nfunction automatic cheshire_pkg::cheshire_cfg_t gen_cheshire_cfg();\n cheshire_pkg::cheshire_cfg_t ret = cheshire_pkg::DefaultCfg;\n // Make overriding changes. Here, we add two AXI manager ports\n ret.AxiExtNumMst = 2;\n return ret;\nendfunction\n\nlocalparam cheshire_cfg_t CheshireCfg = gen_cheshire_cfg();\n\n// Generate interface types prefixed by `csh_` from our configuration.\n`CHESHIRE_TYPEDEF_ALL(csh_, CheshireCfg)\n\n// Instantiate Cheshire with our configuration and interface types.\n carfield #(\n .Cfg ( DutCfg ),\n .HypNumPhys ( NumPhys ),\n .HypNumChips ( NumChips ),\n .reg_req_t ( reg_req_t ),\n .reg_rsp_t ( reg_rsp_t )\n ) dut (\n // ... IOs here ...\n );\n
"},{"location":"tg/integr/#verifying-cheshire-in-system","title":"Verifying Cheshire In-System","text":"To simplify the simulation and verification of Carfield in other systems, we provide a monolithic block of verification IPs called carfield_vip
. This is used along with the X_vip
modules of other domains, such as Cheshire, Safe domain and Secure domain. Their description can be found in the associated domain's documentation. In particular, carfield_ip
currently includes:
Additionally, we provide a module carfield_vip_tristate
which adapts the unidirectional IO of this module to bidirectional IOs which may be interfaced with pads where necessary.
This page describes how to simulate Carfield to execute baremetal programs. Please first read Getting Started to make sure you have all the dependencies and initialized your repository.
We currently provide working setups for:
>= 2022.3
We plan on supporting more simulators in the future. If your situation requires it, simulating Carfield on other setups should be straightforward.
"},{"location":"tg/sim/#testbench","title":"Testbench","text":"Carfield comprises several bootable domains, that are described in the Architecture section.
Each of these domains can be independently booted by keeping the rest of the SoC asleep through the domain JTAG, or Cheshire's JTAG and Serial Link, which have access to the whole platform except for the secure domain.
Alternatively, some domains can offload baremetal programs to other domains at runtime. This is common pratice when offloading programs to the accelerator domain from the host or safe domains.
Note that while runtime offloading can be exploited by RTL simulation with reasonably-sized programs, we suggest to follow the FPGA mapping steps and use OpenMP-based offload with heterogeneous cross-compilation.
We provide a single SystemVerilog testbench for carfield_soc
that handles standalone execution of baremetal programs for each domain. The code for domain X
is preloaded through simulated interface drivers. In addition, some domains can read from external memory models from their boot ROM and then jump to execution.
As for Cheshire, Carfield testbench employs physical interfaces (JTAG or Serial Link) for memory preload by default. This could increase the memory preload time (independently from the target memory: dynamic SPM, LLC-SPM, or DRAM), significantly based on the ELF size.
Since by default all domains are clock gated and isolated after POR except for the host domain (Cheshire), as described in Architecture, the testbench handles the wake-up process.
To speed up the process, the external DRAM can be initialized in simulation (namely, at time 0ns
) for domain X
through the make variable HYP_USER_PRELOAD
. Carfield SW Stack provides automatic generation of the required *.slm
files, targeting an HyperRAM configured with two physical chips. Note, this flow is not recommended during ASIC development cycle as it may hide bugs in the physical interfaces.
X
X_BOOTMODE
X_PRELMODE
Action CHS
, SAFED
, SECD
, PULPD
, SPATZD
0 0 Preload through JTAG CHS
, SAFED
, SECD
, PULPD
, SPATZD
0 1 Preload through serial link Preloading boot modes expect an ELF executable to be passed through X_BINARY
.
X
CHS_BOOTMODE
CHS_PRELMODE
Action CHS
0 2 Preload through UART CHS
1-3 - Autonomous boot, see Boot ROM Autonomous boot modes expect a disk image (GPT formatted or raw code) to be passed through X_IMAGE
. For more information on how to build software for Carfield and the details on the boot process of each domain, see Software Stack.
For simulation of Carfield in other designs, or in ASIC wrappers that reside in other repositories, we provide the module carfield_vip
encapsulating all verification IPs and their interfaces.
After building Carfield, the design can be compiled and simulated with QuestaSim. Below, we provide an example with Serial Link
passive preload of a baremetal program helloworld.car.l2.elf
to be executed on the host domain (Cheshire, i.e., X=CHS
):
# Compile design\nmake car-hw-build\n\n# Preload `helloworld.car.l2.elf` through serial link, then start and run simulation\nmake car-hw-sim CHS_BOOTMODE=0 CHS_PRELMODE=1 CHS_BINARY=./sw/tests/bare-metal/hostd/helloworld.car.l2.elf\n
The design needs to be recompiled only when hardware is changed.
"},{"location":"tg/sim/#debugging","title":"Debugging","text":"Per default, Questasim compilation is performance-optimised, and GUI and simulation logging are disabled. To enable full visibility, logging, and the Questa GUI, set DEBUG=1
when executing the steps above.
Currently, synthesis of Carfield is available with closed source tools, and hence its scripts are added in the nonfree
repository mentioned in the Getting Started section.
Once open-EDA and open-PDK flow is available, it will be updated in this page.
For independent synthesis of carfield by external users, we provide a wrapper under target/synth/carfield_synth_wrap.sv
.
This page describes how to map Carfield on Xilinx FPGAs to execute baremetal programs or boot CVA6 Linux. Please first read Getting Started to make sure have all dependencies. Additionally, for on-chip debugging you need:
>= 0.10.0
We currently provide working setups for:
>= 2020.2
We are working on support for more boards in the future.
The Carfield bitstreams are divided in two flavors at the moment. The flavor_vanilla
and the flavor_bd
.
flavor_vanilla
- The hardware to be mapped on the FPGA is fully described in System Verilog. This flow is lightweight, easily reproducible, and self contained. As each IP is integrated by hand in the RTL, only the Xilinx DDR, Xilinx VIO and Xilinx clock wizard are available (at the moment).
flavor_bd
- In order to allow for more complex top level, this flow relies on the Vivado block design flow to link Carfield with external IPs. This flow is less human readable but allows integrating more complex IPs as Xilinx Ethernet. Note that this may require you to own the respective licenses.
Due to the structure of the Makefile flow. All the following commands are to be executed at the root of the Carfield repository. If you want to see the Makefiles that you will be using, you can find the generic FPGA rules in target/xilinx/xilinx.mk
and the vanilla specific rules in target/xilinx/flavor_vanilla/flavor_vanilla.mk
.
First, make sure that you have fetch and generated all the RTL:
make car-init\n
Generate the bitstream in target/xilinx/out/
by running:
make car-xil-all XILINX_FLAVOR=vanilla [VIVADO=version]\n[VIVADO_MODE={batch,gui}] [XILINX_BOARD={vcu128}] [NO_HYPERBUS={0,1}]\n[GEN_EXT_JTAG={0,1}] [GEN_PULP_CLUSTER={0,1}] [GEN_SAFETY_ISLAND={0,1}]\n[GEN_SPATZ_CLUSTER={0,1}] [GEN_OPEN_TITAN={0,1}]\n
See the argument list below:
Argument Relevance Description VIVADO all Vivado command to use XILINX_BOARD allvcu128
NO_HYPERBUS all 0
Use the hyperram controller inside carfield.sv
1
Use the Xilinx DDR controller GEN_EXT_JTAG vcu128 0
Connect the JTAG debugger to the board's JTAG (see vcu128) 1
Connect the JTAG debugger to an external JTAG chain GEN_[IP] all 0
Replace the IP with an AXI error slave1
Instanciate the IP VIVADO_MODE all batch
Compile in Vivado shellgui
Compile in Vivado gui See below some typical building time for reference:
IPs Board Duration PULP vcu128 xxhxxmin SAFETY vcu128 xxhxxmin SPATZ vcu128 xxhxxmin PULP + SAFETY vcu128 xxhxxminYou can find which sources are used by looking at Bender.yml
(target all(xilinx, fpga, xilinx_vanilla)
). This file is used by bender to generate target/xilinx/flavor_vanilla/scripts/add_sources.tcl
. You can open this file to see all the file list of the project. (Note that even if you disable an IP, its files will still needed by Vivado and added to the add_sources.tcl
).
Note that the make
command above will first compile the Xilinx ips located in target/xilinx/xilinx_ips
before compiling the bitstream.
Please read and try to compile a vanilla bitstream first to identify potential issues.
You can find the bd specific rules in target/xilinx/flavor_vanilla/flavor_bd.mk
.
Again, make sure that you have fetched and generated all the RTL:
make car-init\n
Generate the bitstream in target/xilinx/out/
by running:
make car-xil-all XILINX_FLAVOR=bd [VIVADO=version] [VIVADO_MODE={batch,gui}]\n[XILINX_BOARD={vcu128}] [NO_HYPERBUS={0,1}] [GEN_EXT_JTAG={0,1}]\n[GEN_PULP_CLUSTER={0,1}] [GEN_SAFETY_ISLAND={0,1}] [GEN_SPATZ_CLUSTER={0,1}]\n[GEN_OPEN_TITAN={0,1}]\n
See the argument list below:
Argument Relevance Description VIVADO all Vivado command to use XILINX_BOARD allvcu128
NO_HYPERBUS all 0
Use the hyperram controller inside carfield.sv
1
Use the Xilinx DDR controller GEN_EXT_JTAG vcu128 0
Connect the JTAG debugger to the board's JTAG (see vcu128) 1
Connect the JTAG debugger to an external JTAG chain GEN_[IP] all 0
Replace the IP with an AXI error slave1
Instanciate the IP VIVADO_MODE all batch
Compile in Vivado shellgui
Compile in Vivado gui See below some typical building time for reference:
IPs Board Duration PULP vcu128 xxhxxmin SAFETY vcu128 xxhxxmin SPATZ vcu128 xxhxxmin PULP + SAFETY vcu128 xxhxxminYou can find which sources are used by looking at Bender.yml
(target all(xilinx, fpga, xilinx_bd)
). This file is used by bender to generate target/xilinx/flavor_bd/scripts/add_sources.tcl
. You can open this file to see all the file list of the project. (Note that even if you disable an IP, its files will still needed by Vivado and added to the add_sources.tcl
).
Note that the make
command above will first package a Carfield ip before compiling the bitstream.
As there are no switches on this board, the CVA6 bootmode (see Cheshire bootrom) is selected by Xilinx VIOs that can be set in the Vivado GUI (see Using Vivado GUI).
"},{"location":"tg/xilinx/#external-jtag-chain","title":"External JTAG chain","text":"The VCU128 development board only provides one JTAG chain, used by Vivado to program the bitstream, and interact with certain Xilinx IPs (ILAs, VIOs, ...). The RV64 requires access to a JTAG chain to connect GDB to the debug-module in the bitstream.
When using EXT_JTAG=0
it is possible to connect the debug module to the internal FPGA's JTAG by using the Xilinx BSCANE macro. With this, you will only need the normal Xilinx USB cable to interact with CVA6. Note that it means that Vivado and OpenOCD can not use the same cable at the same time. WARNING: this setup (with EXT_JTAG=0
) will only work for designs containing the host only as it is not possible to chain multiple devices on the BSCANE macro. If you need to use EXT_JTAG=0
consider modifying the RTL to remove the debug modules of the IPs.
When using EXT_JTAG=1
we add an external JTAG chain for the RV64 host and other island through the FPGA's GPIOs. Since the VCU128 does not have GPIOs we use we use a Digilent JTAG-HS2 cable connected to the Xilinx XM105 FMC debug card. See the connections in vcu128.xdc
.
If you have closed Vivado, or compiled in batch mode, you can open the Vivado GUI:
# Find your project\nfind . -name \"*.xpr\"\n# Open it in gui\nvitis-2020.2 vivado project.xpr\n
You can now open the Hardware Manager and program the FPGA. Once done, Vivado will give you access the to Virtual Inputs Outputs (VIOs). You can now assert the following signals (on Cheshire top level).
VIO Function vio_reset Positive edge-sensitive reset for the whole system vio_boot_mode Override the boot-mode switches described above vio_boot_mode_sel Select between 0: using boot mode switches 1: use boot mode VIO"},{"location":"tg/xilinx/#using-command-line","title":"Using command line","text":"A script program.tcl
is available to flash the bitstream without opening Vivado GUI. You will need to give the following variable to access your board (see target/xilinx/xilinx.mk
).
XILINX_PORT
- Vivado opened port (default 3121)FPGA_PATH
- Vivado path to your FPGA (default xilinx_tcf/Xilinx/[serial_id])XILINX_HOST
- Path to your Vivado server (default localhost)Change the values to the appropriate ones (they be found in the Hardware Manager in Vivado GUI) and program the board:
make chs-xil-program VIVADO_MODE=batch XILINX_BOARD=vcu128 XILINX_FLAVOR=flavor\n
"},{"location":"tg/xilinx/#loading-binary-and-debugging-with-openocd","title":"Loading binary and debugging with OpenOCD","text":"Tbd
"},{"location":"tg/xilinx/#running-baremetal-code","title":"Running Baremetal Code","text":"Tbd
"},{"location":"tg/xilinx/#jtag-preloading","title":"JTAG Preloading","text":"Tbd
"},{"location":"tg/xilinx/#booting-linux","title":"Booting Linux","text":"To boot Linux, we must load the OpenSBI firmware, which takes over M mode and launches the U-boot bootloader. U-boot then loads Linux. For more details, see Boot Flow.
Clone the carfield
branch of CVA6 SDK at the root of this repository and build the firmware (OpenSBI + U-boot) and Linux images (this will take about 30 minutes):
git clone https://github.com/pulp-platform/cva6-sdk.git --branch carfield\nmake -C cva6-sdk images\n
In principle, we can boot Linux through JTAG by loading all images into memory, launching OpenSBI, and instructing U-boot to load the kernel directly from memory. Here, we focus on autonomous boot from SD card or SPI flash.
In this case, OpenSBI is loaded by a regular baremetal program called the Zero-Stage Loader (ZSL). The boot ROM loads the ZSL from SD card, which then loads the device tree and firmware from other SD card partitions into memory and launches OpenSBI.
To create a full Linux disk image from the ZSL, device tree, firmware, and Linux, run:
# Place the cva6-sdk where they are expected:\nln -s cva6-sdk/install64 sw/boot/install64\n# Optional: Pre-uild explicitely the image\nmake CAR_ROOT=. sw/boot/linux_carfield_bd_vcu128.gpt.bin\n
You can now recompile the board, it should start booting automatically!
"},{"location":"tg/xilinx/#xilinx-vcu128_1","title":"Xilinx VCU128","text":"This board does not offer a SD card reader. We need to load the image in the integrated flash:
make chs-xil-flash VIVADO_MODE=batch XILINX_BOARD=vcu128 XILINX_FLAVOR=flavor
Use the parameters defined in Using command line (defaults are in target/xilinx/xilinx.mk
) to select your board:
This script will erase your bitstream, once the flash has been written (c.a. 10min) you will need to re-program the bitstream on the board.
"},{"location":"tg/xilinx/#add-your-own-board","title":"Add your own board","text":"If you wish to add a flow for a new FPGA board, please do the following steps: Please consider opening a pull request containing the necessary changes to integrate your new board (:
"},{"location":"tg/xilinx/#makefile","title":"Makefile","text":"Add your board on top of target/xilinx/xilinx.mk
, in particular xilinx_part
and xilinx_board_long
are identifying the FPGA chip and board (can be found in VIvado GUI). The parameters identifying your personal device XILINX_PORT
, XILINX_FPGA_PATH
, XILINX_HOST
can be left empty for now.
// Indicate that you need to debug a signal\n(* dont_touch = \"yes\" *) (* mark_debug = \"true\" *) logic signal_d0;\n// You can also use the following macro from phy_definitions.svh\n`ila(ila_signal_d0, signal_d0)\n
Then, re-build your bitstream.
"},{"location":"tg/xilinx/#re-arametrize-existing-ips","title":"Re-arametrize existing IPs","text":"Carfield's emulation requires a few Vivado IPs to work properly. They are defined and pre-compiled in target/xilinx/xilinx_ips/*
. If you add a new board, you will need to reconfigure your IPs for this board. For instance, to use the Vivado MIG DDR4 controller, modify target/xilinx/xilinx_ips/xlnx_mig_ddr4/run.tcl
. There, add the relevant $::env(XILINX_BOARD)
entry with your configuration. To know which configuration to use your board, you can open a blank project in Vivado GUI, create a blank block design, and instanciate the MIG DDR4 IP there. The Vivado TCL console should write the default parameters for your FPGA. You can later re-configure the IP in the block design and Vivado will print to the tcl console the modified parameters. Then you can copy these tcl lines to the run.tcl
file. Make sure that you added your ip to target/xilinx/flavor_vanilla/flavor_vanilla.mk
under \"xilinx_ips_names_vanilla_your_board\".
If your board require a new IP that has not been integrated already do the following :
target/xilinx/xilinx_ips/[your_ip]
taking the example of the xlnx_mig_ddr4
.target/xilinx/xilinx_ips/[your_ip]/tcl/run.tcl
and target/xilinx/xilinx_ips/[your_ip]/Makefile
accordingly. > - Add your IP to target/flavor_vanilla/flavor_vanilla.mk
under \"xilinx_ips_names_vanilla_your_board\".Connect it's top module in the top-level: target/xilinx/flavor_vanilla/src/cheshire_top_xilinx.sv
. If your IP is a DDR controller, please add it to target/xilinx/src/dram_wrapper_xilinx.sv
. Note that this file contains a pipeline to resize AXI transactions from Cheshire to your controller.
Add the relevant macro parameters to target/xilinx/flavor_vanilla/src/phy_definitions.sv
in order to disable your IP for non-relevant boards.
Each board is defined by a device-tree, when adding a new board, please add a device tree in sw/boot
for each supported flavors.
It is possible to use ILA (Integrated Logic Analyzers) in order to debug some signals on the running FPGA. Add the following before declaring your signals:
"},{"location":"um/","title":"User Manual","text":"The user manual provides detailed reference information on Carfield:
TODO @anga93: add figure
Carfield is organized in domains. As a mixed-criticality system (MCS), each domain serves different purposes in terms of functional safety and reliability, security, and computation capabiities.
Carfield relies on Cheshire as ain host domain, and extends its minimal SoC with additional interconnect ports and interrupts. Hence, several features described in this section can be found
The above block diagram depicts a fully-featured Carfield SoC, which currently provides:
Computing Domain:
Memory Domain:
Mailbox unit
Platform control registers (PCRs)
Interconnect (as in Cheshire):
Interrupts (as in Cheshire):
Peripheral Domain:
This section shows Carfield's memory map. The group Internal to Cheshire
in the table below mirrors the memory map described in the dedicatd documentation for Cheshire and is explicitely shown here for clarity.
0x0000_0000_0000
0x0000_0004_0000
0x04_0000
256 KiB Debug Debug CVA6 0x0000_0004_0000
0x0000_0100_0000
Reserved 0x0000_0100_0000
0x0000_0100_1000
0x00_1000
4 KiB Config AXI DMA Config 0x0000_0100_1000
0x0000_0200_0000
Reserved 0x0000_0200_0000
0x0000_0204_0000
0x04_0000
256 KiB Memory Boot ROM 0x0000_0204_0000
0x0000_0208_0000
0x04_0000
256 KiB Irq CLINT 0x0000_0208_0000
0x0000_020c_0000
0x04_0000
256 KiB Irq IRQ Routing 0x0000_020c_0000
0x0000_0210_0000
0x04_0000
256 KiB Irq AXI-REALM unit 0x0000_020c_0000
0x0000_0300_0000
Reserved 0x0000_0300_0000
0x0000_0300_1000
0x00_1000
4 KiB Config CSRs 0x0000_0300_1000
0x0000_0300_2000
0x00_1000
4 KiB Config LLC 0x0000_0300_2000
0x0000_0300_3000
0x00_1000
4 KiB I/O UART 0x0000_0300_3000
0x0000_0300_4000
0x00_1000
4 KiB I/O I2C 0x0000_0300_4000
0x0000_0300_5000
0x00_1000
4 KiB I/O SPIM 0x0000_0300_5000
0x0000_0300_6000
0x00_1000
4 KiB I/O GPIO 0x0000_0300_6000
0x0000_0300_7000
0x00_1000
4 KiB Config Serial Link 0x0000_0300_7000
0x0000_0300_8000
0x00_1000
4 KiB Config VGA 0x0000_0300_8000
0x0000_0300_A000
0x00_1000
8 KiB Config UNBENT (bus error unit) 0x0000_0300_A000
0x0000_0300_B000
0x00_1000
4 KiB Config Tagger (cache partitioning) 0x0000_0300_8000
0x0000_0400_0000
Reserved 0x0000_0400_0000
0x0000_1000_0000
0x40_0000
64 MiB Irq PLIC 0x0000_0800_0000
0x0000_0C00_0000
0x40_0000
64 MiB Irq CLICs 0x0000_1000_0000
0x0000_1400_0000
0x40_0000
64 MiB Memory LLC Scratchpad 0x0000_1400_0000
0x0000_1800_0000
0x40_0000
64 MiB Memory LLC Scratchpad 0x0000_1800_0000
0x0000_2000_0000
Reserved -------------------------- ------------------------- ------------------ ---------- -------------- ----------------------------------------- External to Cheshire -------------------------- ------------------------- ------------------ ---------- -------------- ----------------------------------------- 0x0000_2000_0000
0x0000_2000_1000
0x00_1000
4 KiB I/O ETHERNET 0x0000_2000_1000
0x0000_2000_2000
0x00_1000
4 KiB I/O CAN BUS 0x0000_2000_2000
0x0000_2000_3000
0x00_1000
4 KiB I/O (empty) 0x0000_2000_3000
0x0000_2000_4000
0x00_1000
4 KiB I/O (empty) 0x0000_2000_4000
0x0000_2000_5000
0x00_1000
4 KiB I/O GP TIMER 1 (System timer) 0x0000_2000_5000
0x0000_2000_6000
0x00_1000
4 KiB I/O GP TIMER 2 (Advanced timer) 0x0000_2000_6000
0x0000_2000_7000
0x00_1000
4 KiB I/O GP TIMER 3 0x0000_2000_7000
0x0000_2000_8000
0x00_1000
4 KiB I/O WATCHDOG timer 0x0000_2000_8000
0x0000_2000_9000
0x00_1000
4 KiB I/O (empty) 0x0000_2000_9000
0x0000_2000_a000
0x00_1000
4 KiB I/O HyperBUS 0x0000_2000_a000
0x0000_2000_b000
0x00_1000
4 KiB I/O Pad Config 0x0000_2000_b000
0x0000_2000_c000
0x00_1000
4 KiB I/O L2 ECC Config 0x0000_2001_0000
0x0000_2001_1000
0x00_1000
4 KiB I/O Carfield Control and Status 0x0000_2002_0000
0x0000_2002_1000
0x00_1000
4 KiB I/O PLL/CLOCK 0x0000_2800_1000
0x0000_4000_0000
Reserved 0x0000_4000_0000
0x0000_4000_1000
0x00_1000
4 KiB Irq Mailboxes 0x0000_4000_1000
0x0000_5000_0000
Reserved 0x0000_5000_0000
0x0000_5080_0000
0x80_0000
8 MiB Accelerators Integer Cluster 0x0000_5080_0000
0x0000_5100_0000
Reserved 0x0000_5100_0000
0x0000_5180_0000
0x80_0000
8 MiB Accelerators FP Cluster 0x0000_5100_0000
0x0000_6000_0000
Reserved 0x0000_6000_0000
0x0000_6002_0000
0x02_0000
128 KiB Safe domain Safety Island Memory 0x0000_6002_0000
0x0000_6020_0000
0x1e_0000
Safe domain reserved 0x0000_6020_0000
0x0000_6030_0000
0x10_0000
Safe domain Safety Island Peripherals 0x0000_6030_0000
0x0000_6080_0000
0x50_0000
Safe domain reserved 0x0000_6080_0000
0x0000_7000_0000
Reserved 0x0000_7000_0000
0x0000_7002_0000
0x02_0000
128 KiB Memory LLC Scratchpad 0x0000_7800_0000
0x0000_7810_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 1, interleaved) 0x0000_7810_0000
0x0000_7820_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 1, non-interleaved) 0x0000_7820_0000
0x0000_7830_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 2, interleaved) 0x0000_7830_0000
0x0000_7840_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 2, non-interleaved) 0x0000_8000_0000
0x0020_8000_0000
0x20_0000_0000
128 GiB Memory LLC/DRAM"},{"location":"um/arch/#interrupt-map","title":"Interrupt map","text":"Carfield's interrupt components are exhaustivly described in the dedicated section of the documentation for Cheshire. This section describes Carfield's interrupt map.
Interrupt Source Interrupt sink Bitwidth Connection Type Comment Carfield peripherals --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- intr_wkup_timer_expired_o 1 car_wdt_intrs[0] level-sensitive intr_wdog_timer_bark_o 1 car_wdt_intrs[1] level-sensitive nmi_wdog_timer_bark_o 1 car_wdt_intrs[2] level-sensitive wkup_req_o 1 car_wdt_intrs[3] level-sensitive aon_timer_rst_req_o 1 car_wdt_intrs[4] level-sensitive irq 1 car_can_intr level-sensitive ch_0_o[0] 1 car_adv_timer_ch0 edge-sensitive ch_0_o[1] 1 car_adv_timer_ch1 edge-sensitive ch_0_o[2] 1 car_adv_timer_ch2 edge-sensitive ch_0_o[3] 1 car_adv_timer_ch3 edge-sensitive events_o[0] 1 car_adv_timer_events[0] edge-sensitive events_o[1] 1 car_adv_timer_events[1] edge-sensitive events_o[2] 1 car_adv_timer_events[2] edge-sensitive events_o[3] 1 car_adv_timer_events[3] edge-sensitive irq_lo_o 1 car_sys_timer_lo edge-sensitive irq_hi_o 1 car_sys_timer_hi edge-sensitive --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Cheshire peripherals --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- zero 1 zero level-sensitive uart 1 uart level-sensitive i2c_fmt_threshold 1 i2c_fmt_threshold level-sensitive i2c_rx_threshold 1 i2c_rx_threshold level-sensitive i2c_fmt_overflow 1 i2c_fmt_overflow level-sensitive i2c_rx_overflow 1 i2c_rx_overflow level-sensitive i2c_nak 1 i2c_nak level-sensitive i2c_scl_interference 1 i2c_scl_interference level-sensitive i2c_sda_interference 1 i2c_sda_interference level-sensitive i2c_stretch_timeout 1 i2c_stretch_timeout level-sensitive i2c_sda_unstable 1 i2c_sda_unstable level-sensitive i2c_cmd_complete 1 i2c_cmd_complete level-sensitive i2c_tx_stretch 1 i2c_tx_stretch level-sensitive i2c_tx_overflow 1 i2c_tx_overflow level-sensitive i2c_acq_full 1 i2c_acq_full level-sensitive i2c_unexp_stop 1 i2c_unexp_stop level-sensitive i2c_host_timeout 1 i2c_host_timeout level-sensitive spih_error 1 spih_error level-sensitive spih_spi_event 1 spih_spi_event level-sensitive gpio 32 gpio level-sensitive --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Spatz cluster --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- msip_i[0] 1 (hostd_spatzcl_mb_intr_ored[0] | safed_spatzcl_intr_mb[0]) level-sensitive Snitch core #0 msip_i[1] 1 (hostd_spatzcl_mb_intr_ored[1] | safed_spatzcl_intr_mb[1]) level-sensitive Snitch core #1 mtip_i[0] 1 chs_mti[0] level-sensitive Snitch core #0 mtip_i[1] 1 chs_mti[1] level-sensitive Snitch core #1 meip_i 2 - unconnected seip_i 2 - unconnected --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- HRM integer cluster --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- eoc_o 1 pulpcl_eoc level-sensitive mbox_irq_i 1 (hostd_pulpcl_mb_intr_ored | safed_pulpcl_intr_mb) level-sensitive to 
offload binaries --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Secure domain --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- irq_ibex_i 1 (hostd_secd_mb_intr_ored | safed_secd_intr_mb) level-sensitive to wake-up Ibex core --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Safe domain --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- irqs_i[0] 1 hostd_safed_mbox_intr[0] level-sensitive from host domain CVA6#0 irqs_i[1] 1 hostd_safed_mbox_intr[1] level-sensitive from host domain CVA6#1 irqs_i[2] 1 secd_safed_mbox_intr level-sensitive from secure domain irqs_i[3] 1 pulpcl_safed_mbox_intr level-sensitive from HMR custer irqs_i[4] 1 spatzcl_safed_mbox_intr level-sensitive from vectorial cluster irqs[5] 1 irqs_distributed_249 level-sensitive tied to 0 irqs[6] 1 irqs_distributed_250 level-sensitive host domain UART irqs[7] 1 irqs_distributed_251 level-sensitive i2c_fmt_threshold irqs[8] 1 irqs_distributed_252 level-sensitive i2c_rx_threshold irqs[9] 1 irqs_distributed_253 level-sensitive i2c_fmt_overview irqs[10] 1 irqs_distributed_254 level-sensitive i2c_rx_overflow irqs[11] 1 irqs_distributed_255 level-sensitive i2c_nak irqs[12] 1 irqs_distributed_256 level-sensitive i2c_scl_interference irqs[13] 1 irqs_distributed_257 level-sensitive i2c_sda_interference irqs[14] 1 irqs_distributed_258 level-sensitive i2c_stret h_timeout irqs[15] 1 irqs_distributed_259 level-sensitive i2c_sda_unstable irqs[16] 1 irqs_distributed_260 level-sensitive i2c_cmd_complete irqs[17] 1 irqs_distributed_261 level-sensitive i2c_tx_stretch irqs[18] 1 irqs_distributed_262 level-sensitive i2c_tx_overflow irqs[19] 1 irqs_distributed_263 level-sensitive i2c_acq_full irqs[20] 1 irqs_distributed_264 level-sensitive i2c_unexp_stop irqs[21] 1 irqs_distributed_265 level-sensitive i2c_host_timeout irqs[22] 1 irqs_distributed_266 level-sensitive spih_error irqs[23] 1 irqs_distributed_267 level-sensitive spih_spi_event irqs[55:24] 32 irqs_distributed_299:268 level-sensitive gpio irqs_i[56] 1 irqs_distributed_300 level-sensitive pulpcl_eoc irqs_i[57] 1 irqs_distributed_309 level-sensitive car_wdt_intrs[0] irqs_i[58] 1 irqs_distributed_310 level-sensitive car_wdt_intrs[1] irqs_i[59] 1 irqs_distributed_311 level-sensitive car_wdt_intrs[2] irqs_i[60] 1 irqs_distributed_312 level-sensitive car_wdt_intrs[3] irqs_i[61] 1 irqs_distributed_313 level-sensitive car_wdt_intrs[4] irqs_i[62] 1 irqs_distributed_314 level-sensitive car_can_intr irqs_i[63] 1 irqs_distributed_315 edge-sensitive car_adv_timer_ch0 irqs_i[64] 1 irqs_distributed_316 edge-sensitive car_adv_timer_ch1 irqs_i[65] 1 irqs_distributed_317 edge-sensitive car_adv_timer_ch2 irqs_i[66] 1 irqs_distributed_318 edge-sensitive car_adv_timer_ch3 irqs_i[67] 1 irqs_distributed_319 edge-sensitive car_adv_timer_events[0] irqs_i[68] 1 irqs_distributed_320 edge-sensitive car_adv_timer_events[1] irqs_i[69] 1 irqs_distributed_321 edge-sensitive car_adv_timer_events[2] irqs_i[70] 1 irqs_distributed_322 edge-sensitive car_adv_timer_events[0] irqs_i[71] 1 irqs_distributed_323 edge-sensitive car_sys_timer_lo 
irqs_i[72] 1 irqs_distributed_324 edge-sensitive car_sys_timer_hi irqs_i[127:73] 54 irqs_distributed_331:325 - tied to 0 --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Cheshire --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- intr_ext_i[0] 1 pulpcl_eoc level-sensitive from HMR cluster intr_ext_i[2:1] 2 pulpcl_hostd_mbox_intr level-sensitive from HMR cluster intr_ext_i[4:3] 2 spatzcl_hostd_mbox_intr level-sensitive from vectorial cluster intr_ext_i[6:5] 2 safed_hostd_mbox_intr level-sensitive from safe domain intr_ext_i[8:7] 2 secd_hostd_mbox_intr level-sensitive from secure domain intr_ext_i[9] 1 car_wdt_intrs[0] level-sensitive from carfield peripherals intr_ext_i[10] 1 car_wdt_intrs[1] level-sensitive from carfield peripherals intr_ext_i[11] 1 car_wdt_intrs[2] level-sensitive from carfield peripherals intr_ext_i[12] 1 car_wdt_intrs[3] level-sensitive from carfield peripherals intr_ext_i[13] 1 car_wdt_intrs[4] level-sensitive from carfield peripherals intr_ext_i[14] 1 car_can_intr level-sensitive from carfield peripherals intr_ext_i[15] 1 car_adv_timer_ch0 edge-sensitive from carfield peripherals intr_ext_i[16] 1 car_adv_timer_ch1 edge-sensitive from carfield peripherals intr_ext_i[17] 1 car_adv_timer_ch2 edge-sensitive from carfield peripherals intr_ext_i[18] 1 car_adv_timer_ch3 edge-sensitive from carfield peripherals intr_ext_i[19] 1 car_adv_timer_events[0] edge-sensitive from carfield peripherals intr_ext_i[20] 1 car_adv_timer_events[1] edge-sensitive from carfield peripherals intr_ext_i[21] 1 car_adv_timer_events[2] edge-sensitive from carfield peripherals intr_ext_i[22] 1 car_adv_timer_events[3] edge-sensitive from carfield peripherals intr_ext_i[23] 1 car_sys_timer_lo edge-sensitive from carfield peripherals intr_ext_i[24] 1 car_sys_timer_hi edge-sensitive from carfield peripherals intr_ext_i[31:25] 7 0 tied to 0 meip_ext_o[0] - level-sensitive unconnected meip_ext_o[1] - level-sensitive unconnected meip_ext_o[2] - level-sensitive unconnected seip_ext_o[0] - level-sensitive unconnected seip_ext_o[1] - level-sensitive unconnected seip_ext_o[2] - level-sensitive unconnected msip_ext_o[0] - level-sensitive unconnected msip_ext_o[1] - level-sensitive unconnected msip_ext_o[2] - level-sensitive unconnected mtip_ext_o[0] - level-sensitive Snitch core #0 mtip_ext_o[1] - level-sensitive Snitch core #1 mtip_ext_o[2] - level-sensitive unconnected"},{"location":"um/arch/#domains","title":"Domains","text":"We divide Carfield domains in two macro groups, the Computing Domain and the Memory Domain. They are both fragmented into smaller domains, described in the following two sections.
The total number of domains is 7 (computing: host domain, safe domain, secure domain, integer PMCA domain, vectorial PMCA domain, peripheral domain, memory: dynamic SPM domain).
Note for the reader
Carfield's domains live in dedicated repositories. We invite the reader to consult the documentation of each domain for more information. Below, we focus on integration parameterization within Carfield.
"},{"location":"um/arch/#computing-domain","title":"Computing Domain","text":""},{"location":"um/arch/#host-domain-cheshire","title":"Host domain (Cheshire)","text":"The host domain (Cheshire) embeds all the necessary components required to run OSs such as embedded Linux. It has two orthogonal operation modes.
Untrusted mode: in this operation mode, the host domain is tasked to run untrusted services, i.e. non time- and safety-critical applications. For example, this could be the case of infotainment on a modern car. In this mode, as in traditional automotive platforms, safety and resiliency features are deferred to a dedicated 32-bit microcontroller-like system, called safe domain
in Carfield.
Hybrid trusted/untrusted mode: in this operation mode, the host domain is in charge of both critical and non-critical applications. Key features to achieve this are:
physical tagger
in front of the cores to mark partitions by acting directly on the physical address spaceHybrid operation mode is currently experimental, and mostly for research purposes. We advise of relying on a combination of host ad safe domain for a more traditional approach.
Cheshire is configured as follows:
CarfieldNumExtIntrs
), see Interrupt map. Unused are tied to 0 (currently 9/32)CarfieldNumInterruptibleHarts
). The interruptible harts are Snitch core #0 and #1 in the vectorial cluster.CarfieldNUmRouterTargets
), tasked to distribute N input interrupts to M targets. In Carfield, the external target is the safe domain
.The safe domain is a simple MCU-like domain that comprises three 32-bit real-time CV32E40P (CV32RT) RISC-V cores operating in triple-lockstep mode (TCLS).
These cores, enhanced with the RISC-V CLIC controller and optimized for fast interrupt handling and context switch, run RTOSs and safety-critical applications, embodying a core tenet of the platform reliability.
The safe domain is essential when the host domain is operated in untrusted mode.
The safe domain is configured as follows:
The secure domain, based on the OpenTitan project, serves as the Hardware Root-of-Trust (HWRoT) of the platform. It handles secure boot and system integrity monitoring fully in HW through cryptographic acceleration services.
Compared to vanilla OpenTitan, the secure domain integrated in Carfield is modified/configured as follows:
TODO
TODO Mention SECURE BOOT
mode
To augment computational capabilities, Carfield incorporates two general-purpose accelerators
"},{"location":"um/arch/#hmr-integer-pmca","title":"HMR integer PMCA","text":"The hybrid modular redundancy (HMR) integer PMCA is specialized in executing reliable boosted Quantized Neural Network (QNN) operations, exploiting the HMR technique for rapid fault recovery and integer arithmetic support in the ISA of the RISC-V cores from 32-bit down to 2-bit and mixed-precision formats.
The HMR integer PMCA is configured as follows:
TODO
"},{"location":"um/arch/#vectorial-pmca","title":"Vectorial PMCA","text":"The vectorial PMCA, or Spatz PMCA handles vectorizable multi-format floating-point workloads (down to FP8).
The Spatz PMCA is configured as follows:
TODO
"},{"location":"um/arch/#memory-domain","title":"Memory Domain","text":""},{"location":"um/arch/#dynamic-scratchpad-memory-spm","title":"Dynamic scratchpad memory (SPM)","text":"The dynamic SPM features dynamically switching address mapping policy. It manages the following features:
Carfield integrates a in-house, open-source implementation of Infineon' HyperBus off-chip link to connect to external HyperRAM modules.
Despite describing it as part of the Memory Domain, the HyperBus is logically part of the peripheral domain.
It manages the following features:
The interconnect is composed of a main AXI4 matrix (or crossbar) with AXI5 atomic operations (ATOPs) support. The crossbar extends Cheshire's with additional external AXI manager and subordinate ports.
Cheshire's auxiliary Regbus demultiplexer is extended with additional peripheral configuration ports for external PLL/FLL and padmux configuration, which are specific of ASIC wrappers.
An additional peripheral subsystem based on APB hosts Carfield-specific peripherals.
"},{"location":"um/arch/#mailbox-unit","title":"Mailbox unit","text":"The mailbox unit consists in a number of configurable mailboxes. Each mailbox is the preferred communication vehicle between domains. It can be used to wake-up certain domains, notify an offloader (e.g., Cheshire) that a target device (e.g., the integer PMCA) has reached execution completion, dispatch entry points to a target device to jump-start its execution, and many others.
It manages the following features:
Assuming each mailbox is identified with id i
, the register file map reads:
The above register map can be found in the dedicated repository and is reported here for convenience.
TODO @alex96295: Add figure
"},{"location":"um/arch/#platform-control-registers","title":"Platform control registers","text":"PCRs provide basic system information, and control clock, reset and other functionalities of Carfield's domains.
A more detailed overview of each PCR (register subfields and description) can be found here. PCR base address is listed in the Memory Map as for the other devices.
Name Offset Length DescriptionVERSION0
0x0
4 Cheshire sha256 commit VERSION1
0x4
4 Safety Island sha256 commit VERSION2
0x8
4 Security Island sha256 commit VERSION3
0xc
4 PULP Cluster sha256 commit VERSION4
0x10
4 Spatz CLuster sha256 commit JEDEC_IDCODE
0x14
4 JEDEC ID CODE GENERIC_SCRATCH0
0x18
4 Scratch GENERIC_SCRATCH1
0x1c
4 Scratch HOST_RST
0x20
4 Host Domain reset -active high, inverted in HW- PERIPH_RST
0x24
4 Periph Domain reset -active high, inverted in HW- SAFETY_ISLAND_RST
0x28
4 Safety Island reset -active high, inverted in HW- SECURITY_ISLAND_RST
0x2c
4 Security Island reset -active high, inverted in HW- PULP_CLUSTER_RST
0x30
4 PULP Cluster reset -active high, inverted in HW- SPATZ_CLUSTER_RST
0x34
4 Spatz Cluster reset -active high, inverted in HW- L2_RST
0x38
4 L2 reset -active high, inverted in HW- PERIPH_ISOLATE
0x3c
4 Periph Domain AXI isolate SAFETY_ISLAND_ISOLATE
0x40
4 Safety Island AXI isolate SECURITY_ISLAND_ISOLATE
0x44
4 Security Island AXI isolate PULP_CLUSTER_ISOLATE
0x48
4 PULP Cluster AXI isolate SPATZ_CLUSTER_ISOLATE
0x4c
4 Spatz Cluster AXI isolate L2_ISOLATE
0x50
4 L2 AXI isolate PERIPH_ISOLATE_STATUS
0x54
4 Periph Domain AXI isolate status SAFETY_ISLAND_ISOLATE_STATUS
0x58
4 Safety Island AXI isolate status SECURITY_ISLAND_ISOLATE_STATUS
0x5c
4 Security Island AXI isolate status PULP_CLUSTER_ISOLATE_STATUS
0x60
4 PULP Cluster AXI isolate status SPATZ_CLUSTER_ISOLATE_STATUS
0x64
4 Spatz Cluster AXI isolate status L2_ISOLATE_STATUS
0x68
4 L2 AXI isolate status PERIPH_CLK_EN
0x6c
4 Periph Domain clk gate enable SAFETY_ISLAND_CLK_EN
0x70
4 Safety Island clk gate enable SECURITY_ISLAND_CLK_EN
0x74
4 Security Island clk gate enable PULP_CLUSTER_CLK_EN
0x78
4 PULP Cluster clk gate enable SPATZ_CLUSTER_CLK_EN
0x7c
4 Spatz Cluster clk gate enable L2_CLK_EN
0x80
4 Shared L2 memory clk gate enable PERIPH_CLK_SEL
0x84
4 Periph Domain pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) SAFETY_ISLAND_CLK_SEL
0x88
4 Safety Island pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) SECURITY_ISLAND_CLK_SEL
0x8c
4 Security Island pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) PULP_CLUSTER_CLK_SEL
0x90
4 PULP Cluster pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) SPATZ_CLUSTER_CLK_SEL
0x94
4 Spatz Cluster pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) L2_CLK_SEL
0x98
4 L2 Memory pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) PERIPH_CLK_DIV_VALUE
0x9c
4 Periph Domain clk divider value SAFETY_ISLAND_CLK_DIV_VALUE
0xa0
4 Safety Island clk divider value SECURITY_ISLAND_CLK_DIV_VALUE
0xa4
4 Security Island clk divider value PULP_CLUSTER_CLK_DIV_VALUE
0xa8
4 PULP Cluster clk divider value SPATZ_CLUSTER_CLK_DIV_VALUE
0xac
4 Spatz Cluster clk divider value L2_CLK_DIV_VALUE
0xb0
4 L2 Memory clk divider value HOST_FETCH_ENABLE
0xb4
4 Host Domain fetch enable SAFETY_ISLAND_FETCH_ENABLE
0xb8
4 Safety Island fetch enable SECURITY_ISLAND_FETCH_ENABLE
0xbc
4 Security Island fetch enable PULP_CLUSTER_FETCH_ENABLE
0xc0
4 PULP Cluster fetch enable SPATZ_CLUSTER_DEBUG_REQ
0xc4
4 Spatz Cluster debug req HOST_BOOT_ADDR
0xc8
4 Host boot address SAFETY_ISLAND_BOOT_ADDR
0xcc
4 Safety Island boot address SECURITY_ISLAND_BOOT_ADDR
0xd0
4 Security Island boot address PULP_CLUSTER_BOOT_ADDR
0xd4
4 PULP Cluster boot address SPATZ_CLUSTER_BOOT_ADDR
0xd8
4 Spatz Cluster boot address PULP_CLUSTER_BOOT_ENABLE
0xdc
4 PULP Cluster boot enable SPATZ_CLUSTER_BUSY
0xe0
4 Spatz Cluster busy PULP_CLUSTER_BUSY
0xe4
4 PULP Cluster busy PULP_CLUSTER_EOC
0xe8
4 PULP Cluster end of computation ETH_RGMII_PHY_CLK_DIV_EN
0xec
4 Ethernet RGMII PHY clock divider enable bit ETH_RGMII_PHY_CLK_DIV_VALUE
0xf0
4 Ethernet RGMII PHY clock divider value ETH_MDIO_CLK_DIV_EN
0xf4
4 Ethernet MDIO clock divider enable bit ETH_MDIO_CLK_DIV_VALUE
0xf8
4 Ethernet MDIO clock divider value"},{"location":"um/arch/#peripheral-domain","title":"Peripheral Domain","text":"Carfield enhances Cheshire's peripheral subsystem with additional capabilities.
An external AXI manager port is attached to the matrix crossbar. The 64-bit data, 48-bit address AXI protocol is converted to the slower, 32-bit data and address APB protocol. An APB demultiplexer allows attaching several peripherals, described below.
"},{"location":"um/arch/#generic-and-advanced-timer","title":"Generic and advanced timer","text":"Carfield integrates a generic timer and an advanced timer.
The generic timer manages the following features:
For more information, read the dedicated documentation.
The advanced timer manages the following features:
For more information, read the dedicated documentation.
"},{"location":"um/arch/#watchdog-timer","title":"Watchdog timer","text":"We employ the watchdog timer developed by the OpenTitan project project. It manages the following features:
For more information, read the dedicated documentation.
"},{"location":"um/arch/#can","title":"CAN","text":"We employ a CAN device developed by the Czech Technical University in Prague. It manages the following features:
For more information, read the dedicated documentation
"},{"location":"um/arch/#ethernet","title":"Ethernet","text":"We employ Ethernet IPs developed by Alex Forencich and assemble them with a high-performant DMA, the same used in Cheshire.
We use Reduced gigabit media-independent interface (RGMII) that supports speed up to 1000Mbit/s (1GHz).
For more information, read the dedicated documentation of Ethernet components from its original repository.
"},{"location":"um/arch/#clock-and-reset","title":"Clock and reset","text":"The two figures above show the clock, reset and isolation distribution for a domain X
in Carfield, and their relationship. A more detailed description is provided below.
Carfield is provided with 3 clocks sources. They can be fully asynchronous and not bound to any phase relationship, since dual-clock FIFOs are placed between domains to allow clock domain crossing (CDC):
host_clk_i
: preferably, clock of the host domainalt_clk_i
: preferably, clock of alternate domains, namely safe domain, secure domain, accelerator domainper_clk_i
: preferably, clock of peripheral domainIn addition, a real-time clock (RTC, rt_clk_i
) is provided externally, at crystal frequency (32kHz) or higher.
These clocks are supplied externally, by a dedicated PLL per clock source or by a single PLL that supplies all three clock sources. The configuration of the clock source can be handled by the external PLL wrapper configuration registers, e.g. in a ASIC top level
Regardless of the specific name used for the clock signals in HW, Carfield has a flexible clock distribution that allows each of the 3 clock sources to be assigned to a domain, as explained below.
As the top figure shows, out of the 7 domains described in Domains, 6 can be clock gated and isolated: safe domain, secure domain, accelerator domain, peripheral domain, dynamic SPM.
When isolation for a domain X
is enabled, data transfers towards a domain are terminated and never reach it. To achieve this, an AXI4 compliant isolation module is placed in front of each domain. The bottom figure shows in detail the architecture of the isolation scheme between the host domain and a generic X
domain, highlighting its relationship with the domain's reset and cloc signals.
For each of the 6 clock gateable domains, the following clock distribution scheme applies:
HW resources for the clock distribution (steps 1., 2., and 3.) and isolation of a domain X
, are SW-controlled via dedicated PCRs. Refer to Platform Control Registers in this page for more information.
The only domain that is always-on and de-isolated is the host domain (Cheshire). If required, clock gating and/or isolation of it can be handled at higher levels of hierarchy, e.g. in a dedicated ASIC wrapper.
"},{"location":"um/arch/#startup-behavior-after-power-on-reset-por","title":"Startup behavior after Power-on reset (POR)","text":"The user can decide whether secure boot must be performed on the executing code before runtime. If so, the secure domain must be active after POR, i.e., clocked and de-isolated. This behavior is regulated by the input pin secure_boot_i
according to the following table:
secure_boot_i
Secure Boot System status after POR 0
OFF
secure domain gated and isolated as the other 5 domains, host domain always-on and idle 1
ON
host domain always-on and idle, secure domain active, takes over secure boot and can't be warm reset-ed; other 5 domains gated and isolated Regardless of the value of secure_boot_i
, since by default some domains are clock gated and isolated after POR, SW or external physical interfaces (JTAG/Serial Link) must handle their wake-up process. Routines are provided in the Software Stack.
Carfield is provided with one POR (active-low), pwr_on_rst_ni
, responsible for the platform's cold reset.
The POR is synchronized with the clock of each domain, user-selected as explained above, and propagated to the domain.
In addition, a warm reset can be initiated from SW through the PCRs for each domain. Exceptions to this are the host domain (always-on), and the secure domain when secure_boot_i
is asserted.
Carfield's Software Stack is provided in the sw/
folder, organized as follows:
sw\n\u251c\u2500\u2500 boot\n\u251c\u2500\u2500 include\n\u251c\u2500\u2500 lib\n\u251c\u2500\u2500 link\n\u251c\u2500\u2500 sw.mk\n\u251c\u2500\u2500 tests\n \u00a0\u00a0 \u251c\u2500\u2500 bare-metal\n \u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 hostd\n \u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 pulpd\n \u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 safed\n \u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 secd\n \u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 spatzd\n \u00a0\u00a0 \u2514\u2500\u2500 linux\n
Employing Cheshire as host domain, Carfield's software stack is largely based on, and built on top of, Cheshire's.
This means that it shares the same:
Therefore, we defer the reader to Cheshire's Software Stack description for more information.
Programs compiled for Carfield that requiree additional, Carfield-specific drivers (for domains' offload, peripheral control, etc) are linked against Cheshire's static library (libcheshire.a
). This operation is transparent to the programmer, that can take advantage of Cheshire's device drivers and SW routines within Carfield seamlessly.
Given the equivalence and reuse between Carfield and Cheshire, on this page we focus on Carfield-specific SW components and build flow, with an emphasis on domains other than Cheshire.
"},{"location":"um/sw/#compiler-requirements","title":"Compiler requirements","text":"General-purpose processing elements (PEs) integrated in Carfield implement the RISC-V ISA, targeting either RV64 (host domain) or RV32 (all the others: safe domain, secure domain, integer PMCA, and vectorial PMCA).
To build programs written in plain C for a Carfield domain with the base ISA and its regular extensions (namely, `RV64G` and `RV32IMACF`), without using the custom extensions that each domain provides, you simply need vanilla RV64 and RV32 compilers. Otherwise, to use custom instructions supported in HW for a domain, specific compiler support is required. We are working to improve compiler support by providing pointers to pre-built releases or a container-based build flow.
"},{"location":"um/sw/#boot-flow-and-secure-boot","title":"Boot Flow and Secure Boot","text":"Carfield supports two operative boot flows:
Non-secure: in this boot flow, Cheshire, being an always-on domain, takes over Carfield's boot. This means that passive and autonomous boot are equivalent to those described in Cheshire's Software Stack. Since the other domains are clock gated, SW to be executed on them requires Cheshire to handle their wake-up sequence.
Secure: The secure domain performs the secure boot process on the code that will be executed on the Carfield system, independently of the domain. For more information, read the dedicated secure boot documentation of the OpenTitan project.
Baremetal programs (BMPs) for all domains can be built from the root of Carfield through a portable make fragment, `sw.mk`, located in the `sw/` folder.
To simplify each domain's SW build as much as possible, we provide a make fragment located at `sw/tests/bare-metal/<domain>/sw.mk`, included in the main `sw.mk`.
BMPs for each domain are compiled in situ in the domain repository, since each IP was designed for, or also supports, standalone execution and has its own build flow.
The global command
make car-sw-build\n
builds program binaries in ELF format for each domain, which can be used with the simulation methods supported by the platform, as described in Simulation or on FPGA as described in Xilinx FPGAs.
As in Cheshire, Carfield programs can be created to be executed from several memory locations:
- Dynamic SPM (`l2`): the linkerscript is provided in Carfield's `sw/link/` folder, since the dynamic SPM is not integrated in the minimal Cheshire.
- LLC-SPM (`spm`): valid when the LLC is configured as such. In Carfield, half of the LLC is configured as SPM from the boot ROM during system bringup, as this is the default behavior in Cheshire.
- DRAM (`dram`): the HyperRAM.

For example, to build a specific BMP (here `sw/tests/bare-metal/hostd/helloworld.c` to be run on Cheshire) executing from the dynamic SPM, run:
make sw/tests/bare-metal/hostd/helloworld.car.l2.elf\n
To create the same program executing from DRAM, sw/tests/bare-metal/hostd/helloworld.car.dram.elf
can instead be built from the same source. Depending on their assumptions and behavior, not all programs may be built to execute from both locations.
When executing host domain programs in Linux (on FPGA/ASIC targets) that require access to memory-mapped components of other domains, SW intervention is needed to map virtual to physical addresses, since domains other than the host currently lack support for HW-based virtual memory translation.
In the current SW stack, this mapping is already provided and hence transparent to the user. Test programs targeting Linux that require it are located in a different folder, `sw/tests/linux/<domain>`.
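As an illustration of what such a mapping involves, the hedged user-space sketch below maps a physical region through `/dev/mem` (the actual SW stack mechanism may differ); the L2 SPM base address 0x7800_0000 and size are taken from the memory map in Architecture:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
  // L2 scratchpad (port 1, interleaved) physical base and size, from the
  // memory map in Architecture.
  const off_t  phys_base = 0x78000000;
  const size_t size      = 0x100000;  // 1 MiB

  int fd = open("/dev/mem", O_RDWR | O_SYNC);
  if (fd < 0) { perror("open /dev/mem"); return 1; }

  // Map the physical region into this process' virtual address space.
  volatile uint32_t *l2 = (volatile uint32_t *)
      mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys_base);
  if (l2 == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

  l2[0] = 0xdeadbeef;  // write through the mapping
  printf("l2[0] = 0x%x\n", (unsigned)l2[0]);

  munmap((void *)l2, size);
  close(fd);
  return 0;
}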
Offload of programs to Carfield domains involves:
Programs can be offloaded with:
Simple baremetal offload (BMO), recommended for regression-test use cases that are simple enough to be executed with cycle-accurate RTL simulations. For instance, this can be the case for dynamic timing analysis (DTA) carried out during an ASIC development cycle.
The OpenMP API, recommended when developing SW for Carfield on an FPGA or, eventually, an ASIC implementing Carfield, because of the ready-to-use OS support (currently, Linux). Usage of the OpenMP API with non-OS-directed (baremetal) SW can be supported, but is mostly suited for heterogeneous embedded systems with highly constrained resources.
In the following, we briefly describe both.
Note for the reader
Since by default all domains are clock gated and isolated after POR except for the host domain (Cheshire), as described in Architecture, the wake-up process must be handled from the C source code.
"},{"location":"um/sw/#baremetal-offload","title":"Baremetal offload","text":"For BMO, the offloader takes care of bootstrapping the target device ELF in the correct memory location, initializing the target and launching its execution through a simple ELF Loader. The ELF Loader source code is located in the offloader's SW directory, and follows a naming convention:
<target_device>_offloader_<blocking | non_blocking>.c \n
The target device's ELF is included in the offloader's ELF Loader as a header file. The target device's ELF sections are first pre-processed offline to extract instruction addresses. The resulting header file drives the ELF loading process into the selected memory location. The loading process can be carried out by the offloader as R/W sequences, or deferred to a DMA-driven memcopy. In addition, the offloader takes care of bootstrapping the target device, i.e. initializing it and launching its execution.
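To make the mechanism concrete, the following is a hedged C sketch of what the generated header contents and a blocking load loop could look like; all names are illustrative, not the actual generated interface:

#include <stdint.h>
#include <string.h>

// Illustrative shape of the header generated from the target device's ELF:
// each loadable section with its destination address, size, and payload.
typedef struct {
  uintptr_t      addr;  // destination address in the target's memory map
  size_t         size;  // section size in bytes
  const uint8_t *data;  // section payload embedded in the offloader ELF
} elf_section_t;

extern const elf_section_t target_sections[];
extern const unsigned      target_num_sections;

// Blocking load: copy every section to its destination with R/W sequences
// (alternatively, each copy could be deferred to a DMA-driven memcopy).
void load_target_elf(void) {
  for (unsigned i = 0; i < target_num_sections; ++i) {
    memcpy((void *)target_sections[i].addr,
           target_sections[i].data,
           target_sections[i].size);
  }
}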
Upon target device completion, the offloader:
Currently, blocking BMO is implemented.
As an example, assume the host domain as offloader and the integer PMCA as target device.
- Offloader (host domain) SW directory: `sw/tests/bare-metal/hostd`
- Target device (integer PMCA) SW directory: `sw/tests/bare-metal/pulpd`
The resulting offloader ELF's name reads:
<target_device>_offloader_<blocking | non_blocking>.<target_device_test_name>.car.<l2 | spm | dram>.elf\n
according to the memory location where the BMP will be executed.
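For instance, offloading a hypothetical test named `matmul` from the host domain to the integer PMCA, executing from the dynamic SPM, would yield an offloader ELF named `pulpd_offloader_blocking.matmul.car.l2.elf` (the test name here is purely illustrative).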
The final offloader ELF can be preloaded with the simulation methods described in the Simulation section, and can be built as explained above.
Note for the reader
BMO is in general not recommended for developing SW for Carfield: it was introduced during the ASIC development cycle, where it serves as an effective litmus test to find and fix HW bugs, and during DTA.
For SW development on Carfield and in particular domain-driven offload, it is recommended to use OpenMP offload on FPGA/ASIC, described below.
"},{"location":"um/sw/#openmp-offload-recommended-use-on-fpgaasic","title":"OpenMP offload (recommended: use on FPGA/ASIC)","text":"TODO Cyril
"},{"location":"um/sw/#external-benchmarks","title":"External benchmarks","text":"We support several external benchmarks, whose build flow has been slightly adapted to align with Carfield's. Currently, they are:
"},{"location":"gs/#building-carfield","title":"Building Carfield","text":"To build different parts of Carfield, the carfield.mk
run make
followed by these targets:
car-hw-init
: generated hardware, including IPs and boot ROMcar-sim-init
(†): scripts and external models for simulation
`car-sw-build` (‡): bare-metal software running on the hardware

† `car-sim-init` will download externally provided peripheral simulation models, some proprietary and with non-free license terms, from their publicly accessible sources. By running `car-sim-init`, you accept this.

‡ `car-sw-build` requires RV64 and RV32 toolchains. See the Software Stack for more details.

To run all build targets above (†)(‡):
make car-init\n
Running car-init
is required at least once to correctly configure IPs we depend on. On reconfiguring any generated hardware or changing IP versions, car-init
should be rerun.
The following additional targets are not invoked by the above, but also available:
chs-bootrom-all
- rebuilds Cheshire's boot ROM. This is not done by default as reproducible builds (as checked by CI) can only be guaranteed for fixed compiler versions.car-nonfree-init
- clones our internal repository with nonfree resources we cannot release, including our internal CI or technology-specific standard cells, scripts and tools. This is not necessary to use Carfield.Carfield uses Cheshire
as its main dependency. Compared to the other dependencies, Cheshire provides most of the HW/SW infrastructure used by Carfield. All of Cheshire's `make` targets, described in the dedicated documentation, are available in Carfield through the inclusion of the makefrag `cheshire.mk` in `carfield.mk`.
A target is an end use for Carfield. Each target requires different steps from here; read the page for your desired target in the following Targets chapter.
"},{"location":"tg/","title":"Targets","text":"A target refers to an end use of Carfield. This could be a simulation setup, an FPGA or ASIC implementation, or the less common integration into other SoCs.
Target setups can either be included in this repository, or live in an external repository and use Carfield as a dependency.
"},{"location":"tg/#included-targets","title":"Included Targets","text":"Included target setups live in the target
directory. Each included target has a documentation page in this chapter:
For the ASIC implementation target, where an additional wrapper is needed for clock generation blocks, bidirectional pads or additional circuitry, or for the less common integration into larger SoCs, Carfield may be included either as a Bender dependency or a Git submodule. For further information and best practices, see SoC Integration.
"},{"location":"tg/integr/","title":"SoC Integration","text":"Carfield is a complex platform, therefore the case of it being integrated in larger SoCs is rare. A more common scenario is the use of Carfield in a ASIC wrapper that includes bidirectional pads, clock generation blocks (PLLs, FLLs...) or other circuitry.
This page explain how to integrate Carfield to fulfill on of these needs. Since Carfield heavily relies on Cheshire, for better understanding we suggest to integrate this reading with its equivalent in the Cheshire's documentation.
"},{"location":"tg/integr/#using-carfield-in-your-project","title":"Using Carfield In Your Project","text":"As for internal targets, Carfield must be built before use in external projects. We aim to simplify this as much as possible with a portable make fragment, carfield.mk
.
If you use GNU Make to build your project and Bender to handle dependencies, you can include the Carfield build system into your own makefile with:
include $(shell bender path carfield)/carfield.mk\n
All of Carfield's build targets are available with the prefix car-
.
You can leverage this to ensure your Carfield build is up to date and rebuild hardware and software whenever necessary. You can change the default value of any build parameter, replace source files to adapt Carfield, or reuse parts of its build system, such as the software stack or the register and ROM generators.
"},{"location":"tg/integr/#instantiating-carfield","title":"Instantiating Carfield","text":"A minimal clean instantiation would look as follows:
`include \"cheshire/typedef.svh\"\n\n// Define function to derive configuration from defaults.\n// This could also (preferrably) be done in a system package.\nfunction automatic cheshire_pkg::cheshire_cfg_t gen_cheshire_cfg();\n cheshire_pkg::cheshire_cfg_t ret = cheshire_pkg::DefaultCfg;\n // Make overriding changes. Here, we add two AXI manager ports\n ret.AxiExtNumMst = 2;\n return ret;\nendfunction\n\nlocalparam cheshire_cfg_t CheshireCfg = gen_cheshire_cfg();\n\n// Generate interface types prefixed by `csh_` from our configuration.\n`CHESHIRE_TYPEDEF_ALL(csh_, CheshireCfg)\n\n// Instantiate Cheshire with our configuration and interface types.\n carfield #(\n .Cfg ( DutCfg ),\n .HypNumPhys ( NumPhys ),\n .HypNumChips ( NumChips ),\n .reg_req_t ( reg_req_t ),\n .reg_rsp_t ( reg_rsp_t )\n ) dut (\n // ... IOs here ...\n );\n
"},{"location":"tg/integr/#verifying-cheshire-in-system","title":"Verifying Cheshire In-System","text":"To simplify the simulation and verification of Carfield in other systems, we provide a monolithic block of verification IPs called carfield_vip
. This is used along with the X_vip
modules of other domains, such as Cheshire, Safe domain and Secure domain. Their description can be found in the associated domain's documentation. In particular, carfield_ip
currently includes:
Additionally, we provide a module carfield_vip_tristate
which adapts the unidirectional IO of this module to bidirectional IOs which may be interfaced with pads where necessary.
This page describes how to simulate Carfield to execute baremetal programs. Please first read Getting Started to make sure you have all the dependencies and initialized your repository.
We currently provide working setups for:
>= 2022.3
We plan on supporting more simulators in the future. If your situation requires it, simulating Carfield on other setups should be straightforward.
"},{"location":"tg/sim/#testbench","title":"Testbench","text":"Carfield comprises several bootable domains, that are described in the Architecture section.
Each of these domains can be independently booted by keeping the rest of the SoC asleep through the domain JTAG, or Cheshire's JTAG and Serial Link, which have access to the whole platform except for the secure domain.
Alternatively, some domains can offload baremetal programs to other domains at runtime. This is common pratice when offloading programs to the accelerator domain from the host or safe domains.
Note that while runtime offloading can be exploited by RTL simulation with reasonably-sized programs, we suggest to follow the FPGA mapping steps and use OpenMP-based offload with heterogeneous cross-compilation.
We provide a single SystemVerilog testbench for carfield_soc
that handles standalone execution of baremetal programs for each domain. The code for domain X
is preloaded through simulated interface drivers. In addition, some domains can read from external memory models from their boot ROM and then jump to execution.
As for Cheshire, Carfield testbench employs physical interfaces (JTAG or Serial Link) for memory preload by default. This could increase the memory preload time (independently from the target memory: dynamic SPM, LLC-SPM, or DRAM), significantly based on the ELF size.
Since by default all domains are clock gated and isolated after POR except for the host domain (Cheshire), as described in Architecture, the testbench handles the wake-up process.
To speed up the process, the external DRAM can be initialized in simulation (namely, at time 0ns
) for domain X
through the make variable HYP_USER_PRELOAD
. Carfield SW Stack provides automatic generation of the required *.slm
files, targeting an HyperRAM configured with two physical chips. Note, this flow is not recommended during ASIC development cycle as it may hide bugs in the physical interfaces.
| `X` | `X_BOOTMODE` | `X_PRELMODE` | Action |
| --- | --- | --- | --- |
| `CHS`, `SAFED`, `SECD`, `PULPD`, `SPATZD` | 0 | 0 | Preload through JTAG |
| `CHS`, `SAFED`, `SECD`, `PULPD`, `SPATZD` | 0 | 1 | Preload through serial link |

Preloading boot modes expect an ELF executable to be passed through `X_BINARY`.
| `X` | `CHS_BOOTMODE` | `CHS_PRELMODE` | Action |
| --- | --- | --- | --- |
| `CHS` | 0 | 2 | Preload through UART |
| `CHS` | 1-3 | - | Autonomous boot, see Boot ROM |

Autonomous boot modes expect a disk image (GPT-formatted or raw code) to be passed through `X_IMAGE`. For more information on how to build software for Carfield and the details on the boot process of each domain, see Software Stack.
For simulation of Carfield in other designs, or in ASIC wrappers that reside in other repositories, we provide the module carfield_vip
encapsulating all verification IPs and their interfaces.
After building Carfield, the design can be compiled and simulated with QuestaSim. Below, we provide an example with Serial Link
passive preload of a baremetal program helloworld.car.l2.elf
to be executed on the host domain (Cheshire, i.e., X=CHS
):
# Compile design
make car-hw-build

# Preload `helloworld.car.l2.elf` through serial link, then start and run simulation
make car-hw-sim CHS_BOOTMODE=0 CHS_PRELMODE=1 CHS_BINARY=./sw/tests/bare-metal/hostd/helloworld.car.l2.elf
The design needs to be recompiled only when hardware is changed.
"},{"location":"tg/sim/#debugging","title":"Debugging","text":"Per default, Questasim compilation is performance-optimised, and GUI and simulation logging are disabled. To enable full visibility, logging, and the Questa GUI, set DEBUG=1
when executing the steps above.
Currently, synthesis of Carfield is available with closed source tools, and hence its scripts are added in the nonfree
repository mentioned in the Getting Started section.
Once an open-EDA and open-PDK flow is available, this page will be updated accordingly.
For independent synthesis of Carfield by external users, we provide a wrapper under `target/synth/carfield_synth_wrap.sv`.
This page describes how to map Carfield on Xilinx FPGAs to execute baremetal programs or boot CVA6 Linux. Please first read Getting Started to make sure you have all dependencies. Additionally, for on-chip debugging you need:
>= 0.10.0
We currently provide working setups for:
>= 2020.2
We are working on support for more boards in the future.
The Carfield bitstreams currently come in two flavors: `flavor_vanilla` and `flavor_bd`.
flavor_vanilla
- The hardware to be mapped on the FPGA is fully described in SystemVerilog. This flow is lightweight, easily reproducible, and self-contained. As each IP is integrated by hand in the RTL, only the Xilinx DDR, Xilinx VIO and Xilinx clock wizard IPs are available (at the moment).
flavor_bd
- In order to allow for more complex top levels, this flow relies on the Vivado block design flow to link Carfield with external IPs. This flow is less human-readable, but allows integrating more complex IPs such as Xilinx Ethernet. Note that this may require you to own the respective licenses.
Due to the structure of the Makefile flow, all the following commands are to be executed at the root of the Carfield repository. If you want to see the Makefiles that you will be using, you can find the generic FPGA rules in `target/xilinx/xilinx.mk` and the vanilla-specific rules in `target/xilinx/flavor_vanilla/flavor_vanilla.mk`.
First, make sure that you have fetched and generated all the RTL:
make car-init\n
Generate the bitstream in target/xilinx/out/
by running:
make car-xil-all XILINX_FLAVOR=vanilla [VIVADO=version] \
  [VIVADO_MODE={batch,gui}] [XILINX_BOARD={vcu128}] [NO_HYPERBUS={0,1}] \
  [GEN_EXT_JTAG={0,1}] [GEN_PULP_CLUSTER={0,1}] [GEN_SAFETY_ISLAND={0,1}] \
  [GEN_SPATZ_CLUSTER={0,1}] [GEN_OPEN_TITAN={0,1}]
See the argument list below:
| Argument | Relevance | Description |
| --- | --- | --- |
| `VIVADO` | all | Vivado command to use |
| `XILINX_BOARD` | all | `vcu128` |
| `NO_HYPERBUS` | all | `0`: use the HyperRAM controller inside `carfield.sv`; `1`: use the Xilinx DDR controller |
| `GEN_EXT_JTAG` | vcu128 | `0`: connect the JTAG debugger to the board's JTAG (see vcu128); `1`: connect the JTAG debugger to an external JTAG chain |
| `GEN_[IP]` | all | `0`: replace the IP with an AXI error slave; `1`: instantiate the IP |
| `VIVADO_MODE` | all | `batch`: compile in the Vivado shell; `gui`: compile in the Vivado GUI |

See below some typical build times for reference:

| IPs | Board | Duration |
| --- | --- | --- |
| PULP | vcu128 | xxhxxmin |
| SAFETY | vcu128 | xxhxxmin |
| SPATZ | vcu128 | xxhxxmin |
| PULP + SAFETY | vcu128 | xxhxxmin |

You can find which sources are used by looking at `Bender.yml`
(target `all(xilinx, fpga, xilinx_vanilla)`). This file is used by bender to generate `target/xilinx/flavor_vanilla/scripts/add_sources.tcl`. You can open this file to see the full file list of the project. (Note that even if you disable an IP, its files will still be needed by Vivado and added to `add_sources.tcl`.)
Note that the `make` command above will first compile the Xilinx IPs located in `target/xilinx/xilinx_ips` before compiling the bitstream.
Please read and try to compile a vanilla bitstream first to identify potential issues.
You can find the bd-specific rules in `target/xilinx/flavor_bd/flavor_bd.mk`.
Again, make sure that you have fetched and generated all the RTL:
make car-init\n
Generate the bitstream in target/xilinx/out/
by running:
make car-xil-all XILINX_FLAVOR=bd [VIVADO=version] \
  [VIVADO_MODE={batch,gui}] [XILINX_BOARD={vcu128}] [NO_HYPERBUS={0,1}] \
  [GEN_EXT_JTAG={0,1}] [GEN_PULP_CLUSTER={0,1}] [GEN_SAFETY_ISLAND={0,1}] \
  [GEN_SPATZ_CLUSTER={0,1}] [GEN_OPEN_TITAN={0,1}]
See the argument list below:
| Argument | Relevance | Description |
| --- | --- | --- |
| `VIVADO` | all | Vivado command to use |
| `XILINX_BOARD` | all | `vcu128` |
| `NO_HYPERBUS` | all | `0`: use the HyperRAM controller inside `carfield.sv`; `1`: use the Xilinx DDR controller |
| `GEN_EXT_JTAG` | vcu128 | `0`: connect the JTAG debugger to the board's JTAG (see vcu128); `1`: connect the JTAG debugger to an external JTAG chain |
| `GEN_[IP]` | all | `0`: replace the IP with an AXI error slave; `1`: instantiate the IP |
| `VIVADO_MODE` | all | `batch`: compile in the Vivado shell; `gui`: compile in the Vivado GUI |

See below some typical build times for reference:

| IPs | Board | Duration |
| --- | --- | --- |
| PULP | vcu128 | xxhxxmin |
| SAFETY | vcu128 | xxhxxmin |
| SPATZ | vcu128 | xxhxxmin |
| PULP + SAFETY | vcu128 | xxhxxmin |

You can find which sources are used by looking at `Bender.yml`
(target `all(xilinx, fpga, xilinx_bd)`). This file is used by bender to generate `target/xilinx/flavor_bd/scripts/add_sources.tcl`. You can open this file to see the full file list of the project. (Note that even if you disable an IP, its files will still be needed by Vivado and added to `add_sources.tcl`.)
Note that the `make` command above will first package a Carfield IP before compiling the bitstream.
As there are no switches on this board, the CVA6 bootmode (see Cheshire bootrom) is selected by Xilinx VIOs that can be set in the Vivado GUI (see Using Vivado GUI).
"},{"location":"tg/xilinx/#external-jtag-chain","title":"External JTAG chain","text":"The VCU128 development board only provides one JTAG chain, used by Vivado to program the bitstream, and interact with certain Xilinx IPs (ILAs, VIOs, ...). The RV64 requires access to a JTAG chain to connect GDB to the debug-module in the bitstream.
When using `EXT_JTAG=0`, it is possible to connect the debug module to the FPGA's internal JTAG by using the Xilinx BSCANE macro. With this, you will only need the normal Xilinx USB cable to interact with CVA6. Note that this means that Vivado and OpenOCD cannot use the same cable at the same time. WARNING: this setup (with `EXT_JTAG=0`) will only work for designs containing only the host, as it is not possible to chain multiple devices on the BSCANE macro. If you need to use `EXT_JTAG=0`, consider modifying the RTL to remove the debug modules of the other IPs.
When using `EXT_JTAG=1`, we add an external JTAG chain for the RV64 host and the other islands through the FPGA's GPIOs. Since the VCU128 does not have GPIOs, we use a Digilent JTAG-HS2 cable connected to the Xilinx XM105 FMC debug card. See the connections in `vcu128.xdc`.
If you have closed Vivado, or compiled in batch mode, you can open the Vivado GUI:
# Find your project
find . -name "*.xpr"
# Open it in the GUI
vitis-2020.2 vivado project.xpr
You can now open the Hardware Manager and program the FPGA. Once done, Vivado will give you access to the Virtual Input/Outputs (VIOs). You can now assert the following signals (on the Cheshire top level).
| VIO | Function |
| --- | --- |
| vio_reset | Positive edge-sensitive reset for the whole system |
| vio_boot_mode | Override the boot-mode switches described above |
| vio_boot_mode_sel | Select between 0: use boot-mode switches, 1: use boot-mode VIO |
"},{"location":"tg/xilinx/#using-command-line","title":"Using command line","text":"A script `program.tcl`
is available to flash the bitstream without opening the Vivado GUI. You will need to set the following variables to access your board (see `target/xilinx/xilinx.mk`).
- `XILINX_PORT` - Vivado opened port (default 3121)
- `FPGA_PATH` - Vivado path to your FPGA (default xilinx_tcf/Xilinx/[serial_id])
- `XILINX_HOST` - Path to your Vivado server (default localhost)

Change the values to the appropriate ones (they can be found in the Hardware Manager in the Vivado GUI) and program the board:
make chs-xil-program VIVADO_MODE=batch XILINX_BOARD=vcu128 XILINX_FLAVOR=flavor\n
"},{"location":"tg/xilinx/#loading-binary-and-debugging-with-openocd","title":"Loading binary and debugging with OpenOCD","text":"Tbd
"},{"location":"tg/xilinx/#running-baremetal-code","title":"Running Baremetal Code","text":"Tbd
"},{"location":"tg/xilinx/#jtag-preloading","title":"JTAG Preloading","text":"Tbd
"},{"location":"tg/xilinx/#booting-linux","title":"Booting Linux","text":"To boot Linux, we must load the OpenSBI firmware, which takes over M mode and launches the U-boot bootloader. U-boot then loads Linux. For more details, see Boot Flow.
Clone the carfield
branch of CVA6 SDK at the root of this repository and build the firmware (OpenSBI + U-boot) and Linux images (this will take about 30 minutes):
git clone https://github.com/pulp-platform/cva6-sdk.git --branch carfield
make -C cva6-sdk images
In principle, we can boot Linux through JTAG by loading all images into memory, launching OpenSBI, and instructing U-boot to load the kernel directly from memory. Here, we focus on autonomous boot from SD card or SPI flash.
In this case, OpenSBI is loaded by a regular baremetal program called the Zero-Stage Loader (ZSL). The boot ROM loads the ZSL from SD card, which then loads the device tree and firmware from other SD card partitions into memory and launches OpenSBI.
To create a full Linux disk image from the ZSL, device tree, firmware, and Linux, run:
# Place the cva6-sdk install directory where it is expected:
ln -s cva6-sdk/install64 sw/boot/install64
# Optional: explicitly pre-build the image
make CAR_ROOT=. sw/boot/linux_carfield_bd_vcu128.gpt.bin
You can now recompile the bitstream and re-program the board; it should start booting automatically!
"},{"location":"tg/xilinx/#xilinx-vcu128_1","title":"Xilinx VCU128","text":"This board does not offer a SD card reader. We need to load the image in the integrated flash:
make chs-xil-flash VIVADO_MODE=batch XILINX_BOARD=vcu128 XILINX_FLAVOR=flavor
Use the parameters defined in Using command line (defaults are in target/xilinx/xilinx.mk
) to select your board:
This script will erase your bitstream; once the flash has been written (ca. 10 min), you will need to re-program the bitstream on the board.
"},{"location":"tg/xilinx/#add-your-own-board","title":"Add your own board","text":"If you wish to add a flow for a new FPGA board, please do the following steps: Please consider opening a pull request containing the necessary changes to integrate your new board (:
"},{"location":"tg/xilinx/#makefile","title":"Makefile","text":"Add your board on top of target/xilinx/xilinx.mk
, in particular xilinx_part
and xilinx_board_long
are identifying the FPGA chip and board (can be found in VIvado GUI). The parameters identifying your personal device XILINX_PORT
, XILINX_FPGA_PATH
, XILINX_HOST
can be left empty for now.
// Indicate that you need to debug a signal
(* dont_touch = "yes" *) (* mark_debug = "true" *) logic signal_d0;
// You can also use the following macro from phy_definitions.svh
`ila(ila_signal_d0, signal_d0)
Then, re-build your bitstream.
"},{"location":"tg/xilinx/#re-arametrize-existing-ips","title":"Re-arametrize existing IPs","text":"Carfield's emulation requires a few Vivado IPs to work properly. They are defined and pre-compiled in target/xilinx/xilinx_ips/*
. If you add a new board, you will need to reconfigure your IPs for this board. For instance, to use the Vivado MIG DDR4 controller, modify target/xilinx/xilinx_ips/xlnx_mig_ddr4/run.tcl
. There, add the relevant $::env(XILINX_BOARD)
entry with your configuration. To know which configuration to use your board, you can open a blank project in Vivado GUI, create a blank block design, and instanciate the MIG DDR4 IP there. The Vivado TCL console should write the default parameters for your FPGA. You can later re-configure the IP in the block design and Vivado will print to the tcl console the modified parameters. Then you can copy these tcl lines to the run.tcl
file. Make sure that you added your ip to target/xilinx/flavor_vanilla/flavor_vanilla.mk
under \"xilinx_ips_names_vanilla_your_board\".
If your board requires a new IP that has not been integrated already, do the following:
- Create a folder `target/xilinx/xilinx_ips/[your_ip]`, taking the example of `xlnx_mig_ddr4`.
- Adapt `target/xilinx/xilinx_ips/[your_ip]/tcl/run.tcl` and `target/xilinx/xilinx_ips/[your_ip]/Makefile` accordingly.
- Add your IP to `target/xilinx/flavor_vanilla/flavor_vanilla.mk` under "xilinx_ips_names_vanilla_your_board".
- Connect its top module in the top level: `target/xilinx/flavor_vanilla/src/cheshire_top_xilinx.sv`. If your IP is a DDR controller, please add it to `target/xilinx/src/dram_wrapper_xilinx.sv`. Note that this file contains a pipeline to resize AXI transactions from Cheshire to your controller.
Add the relevant macro parameters to target/xilinx/flavor_vanilla/src/phy_definitions.sv
in order to disable your IP for non-relevant boards.
Each board is defined by a device tree; when adding a new board, please add a device tree in `sw/boot` for each supported flavor.
It is possible to use ILAs (Integrated Logic Analyzers) in order to debug signals on the running FPGA. Add the snippet shown above (in the Makefile section) before declaring your signals.
"},{"location":"um/","title":"User Manual","text":"The user manual provides detailed reference information on Carfield:
TODO @anga93: add figure
Carfield is organized in domains. As a mixed-criticality system (MCS), each domain serves different purposes in terms of functional safety and reliability, security, and computation capabilities.
Carfield relies on Cheshire as its main host domain, and extends its minimal SoC with additional interconnect ports and interrupts. Hence, several features described in this section can be found in the dedicated documentation for Cheshire.
The above block diagram depicts a fully-featured Carfield SoC, which currently provides:
Computing Domain:
Memory Domain:
Mailbox unit
Platform control registers (PCRs)
Interconnect (as in Cheshire):
Interrupts (as in Cheshire):
Peripheral Domain:
This section shows Carfield's memory map. The group Internal to Cheshire in the table below mirrors the memory map described in the dedicated documentation for Cheshire and is explicitly shown here for clarity.
0x0000_0000_0000
0x0000_0004_0000
0x04_0000
256 KiB Debug Debug CVA6 0x0000_0004_0000
0x0000_0100_0000
Reserved 0x0000_0100_0000
0x0000_0100_1000
0x00_1000
4 KiB Config AXI DMA Config 0x0000_0100_1000
0x0000_0200_0000
Reserved 0x0000_0200_0000
0x0000_0204_0000
0x04_0000
256 KiB Memory Boot ROM 0x0000_0204_0000
0x0000_0208_0000
0x04_0000
256 KiB Irq CLINT 0x0000_0208_0000
0x0000_020c_0000
0x04_0000
256 KiB Irq IRQ Routing 0x0000_020c_0000
0x0000_0210_0000
0x04_0000
256 KiB Irq AXI-REALM unit 0x0000_020c_0000
0x0000_0300_0000
Reserved 0x0000_0300_0000
0x0000_0300_1000
0x00_1000
4 KiB Config CSRs 0x0000_0300_1000
0x0000_0300_2000
0x00_1000
4 KiB Config LLC 0x0000_0300_2000
0x0000_0300_3000
0x00_1000
4 KiB I/O UART 0x0000_0300_3000
0x0000_0300_4000
0x00_1000
4 KiB I/O I2C 0x0000_0300_4000
0x0000_0300_5000
0x00_1000
4 KiB I/O SPIM 0x0000_0300_5000
0x0000_0300_6000
0x00_1000
4 KiB I/O GPIO 0x0000_0300_6000
0x0000_0300_7000
0x00_1000
4 KiB Config Serial Link 0x0000_0300_7000
0x0000_0300_8000
0x00_1000
4 KiB Config VGA 0x0000_0300_8000
0x0000_0300_A000
0x00_1000
8 KiB Config UNBENT (bus error unit) 0x0000_0300_A000
0x0000_0300_B000
0x00_1000
4 KiB Config Tagger (cache partitioning) 0x0000_0300_8000
0x0000_0400_0000
Reserved 0x0000_0400_0000
0x0000_1000_0000
0x40_0000
64 MiB Irq PLIC 0x0000_0800_0000
0x0000_0C00_0000
0x40_0000
64 MiB Irq CLICs 0x0000_1000_0000
0x0000_1400_0000
0x40_0000
64 MiB Memory LLC Scratchpad 0x0000_1400_0000
0x0000_1800_0000
0x40_0000
64 MiB Memory LLC Scratchpad 0x0000_1800_0000
0x0000_2000_0000
Reserved -------------------------- ------------------------- ------------------ ---------- -------------- ----------------------------------------- External to Cheshire -------------------------- ------------------------- ------------------ ---------- -------------- ----------------------------------------- 0x0000_2000_0000
0x0000_2000_1000
0x00_1000
4 KiB I/O ETHERNET 0x0000_2000_1000
0x0000_2000_2000
0x00_1000
4 KiB I/O CAN BUS 0x0000_2000_2000
0x0000_2000_3000
0x00_1000
4 KiB I/O (empty) 0x0000_2000_3000
0x0000_2000_4000
0x00_1000
4 KiB I/O (empty) 0x0000_2000_4000
0x0000_2000_5000
0x00_1000
4 KiB I/O GP TIMER 1 (System timer) 0x0000_2000_5000
0x0000_2000_6000
0x00_1000
4 KiB I/O GP TIMER 2 (Advanced timer) 0x0000_2000_6000
0x0000_2000_7000
0x00_1000
4 KiB I/O GP TIMER 3 0x0000_2000_7000
0x0000_2000_8000
0x00_1000
4 KiB I/O WATCHDOG timer 0x0000_2000_8000
0x0000_2000_9000
0x00_1000
4 KiB I/O (empty) 0x0000_2000_9000
0x0000_2000_a000
0x00_1000
4 KiB I/O HyperBUS 0x0000_2000_a000
0x0000_2000_b000
0x00_1000
4 KiB I/O Pad Config 0x0000_2000_b000
0x0000_2000_c000
0x00_1000
4 KiB I/O L2 ECC Config 0x0000_2001_0000
0x0000_2001_1000
0x00_1000
4 KiB I/O Carfield Control and Status 0x0000_2002_0000
0x0000_2002_1000
0x00_1000
4 KiB I/O PLL/CLOCK 0x0000_2800_1000
0x0000_4000_0000
Reserved 0x0000_4000_0000
0x0000_4000_1000
0x00_1000
4 KiB Irq Mailboxes 0x0000_4000_1000
0x0000_5000_0000
Reserved 0x0000_5000_0000
0x0000_5080_0000
0x80_0000
8 MiB Accelerators Integer Cluster 0x0000_5080_0000
0x0000_5100_0000
Reserved 0x0000_5100_0000
0x0000_5180_0000
0x80_0000
8 MiB Accelerators FP Cluster 0x0000_5100_0000
0x0000_6000_0000
Reserved 0x0000_6000_0000
0x0000_6002_0000
0x02_0000
128 KiB Safe domain Safety Island Memory 0x0000_6002_0000
0x0000_6020_0000
0x1e_0000
Safe domain reserved 0x0000_6020_0000
0x0000_6030_0000
0x10_0000
Safe domain Safety Island Peripherals 0x0000_6030_0000
0x0000_6080_0000
0x50_0000
Safe domain reserved 0x0000_6080_0000
0x0000_7000_0000
Reserved 0x0000_7000_0000
0x0000_7002_0000
0x02_0000
128 KiB Memory LLC Scratchpad 0x0000_7800_0000
0x0000_7810_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 1, interleaved) 0x0000_7810_0000
0x0000_7820_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 1, non-interleaved) 0x0000_7820_0000
0x0000_7830_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 2, interleaved) 0x0000_7830_0000
0x0000_7840_0000
0x10_0000
1 MiB Memory L2 Scratchpad (Port 2, non-interleaved) 0x0000_8000_0000
0x0020_8000_0000
0x20_0000_0000
128 GiB Memory LLC/DRAM"},{"location":"um/arch/#interrupt-map","title":"Interrupt map","text":"Carfield's interrupt components are exhaustivly described in the dedicated section of the documentation for Cheshire. This section describes Carfield's interrupt map.
Interrupt Source Interrupt sink Bitwidth Connection Type Comment Carfield peripherals --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- intr_wkup_timer_expired_o 1 car_wdt_intrs[0] level-sensitive intr_wdog_timer_bark_o 1 car_wdt_intrs[1] level-sensitive nmi_wdog_timer_bark_o 1 car_wdt_intrs[2] level-sensitive wkup_req_o 1 car_wdt_intrs[3] level-sensitive aon_timer_rst_req_o 1 car_wdt_intrs[4] level-sensitive irq 1 car_can_intr level-sensitive ch_0_o[0] 1 car_adv_timer_ch0 edge-sensitive ch_0_o[1] 1 car_adv_timer_ch1 edge-sensitive ch_0_o[2] 1 car_adv_timer_ch2 edge-sensitive ch_0_o[3] 1 car_adv_timer_ch3 edge-sensitive events_o[0] 1 car_adv_timer_events[0] edge-sensitive events_o[1] 1 car_adv_timer_events[1] edge-sensitive events_o[2] 1 car_adv_timer_events[2] edge-sensitive events_o[3] 1 car_adv_timer_events[3] edge-sensitive irq_lo_o 1 car_sys_timer_lo edge-sensitive irq_hi_o 1 car_sys_timer_hi edge-sensitive --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Cheshire peripherals --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- zero 1 zero level-sensitive uart 1 uart level-sensitive i2c_fmt_threshold 1 i2c_fmt_threshold level-sensitive i2c_rx_threshold 1 i2c_rx_threshold level-sensitive i2c_fmt_overflow 1 i2c_fmt_overflow level-sensitive i2c_rx_overflow 1 i2c_rx_overflow level-sensitive i2c_nak 1 i2c_nak level-sensitive i2c_scl_interference 1 i2c_scl_interference level-sensitive i2c_sda_interference 1 i2c_sda_interference level-sensitive i2c_stretch_timeout 1 i2c_stretch_timeout level-sensitive i2c_sda_unstable 1 i2c_sda_unstable level-sensitive i2c_cmd_complete 1 i2c_cmd_complete level-sensitive i2c_tx_stretch 1 i2c_tx_stretch level-sensitive i2c_tx_overflow 1 i2c_tx_overflow level-sensitive i2c_acq_full 1 i2c_acq_full level-sensitive i2c_unexp_stop 1 i2c_unexp_stop level-sensitive i2c_host_timeout 1 i2c_host_timeout level-sensitive spih_error 1 spih_error level-sensitive spih_spi_event 1 spih_spi_event level-sensitive gpio 32 gpio level-sensitive --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Spatz cluster --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- msip_i[0] 1 (hostd_spatzcl_mb_intr_ored[0] | safed_spatzcl_intr_mb[0]) level-sensitive Snitch core #0 msip_i[1] 1 (hostd_spatzcl_mb_intr_ored[1] | safed_spatzcl_intr_mb[1]) level-sensitive Snitch core #1 mtip_i[0] 1 chs_mti[0] level-sensitive Snitch core #0 mtip_i[1] 1 chs_mti[1] level-sensitive Snitch core #1 meip_i 2 - unconnected seip_i 2 - unconnected --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- HRM integer cluster --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- eoc_o 1 pulpcl_eoc level-sensitive mbox_irq_i 1 (hostd_pulpcl_mb_intr_ored | safed_pulpcl_intr_mb) level-sensitive to 
offload binaries --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Secure domain --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- irq_ibex_i 1 (hostd_secd_mb_intr_ored | safed_secd_intr_mb) level-sensitive to wake-up Ibex core --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Safe domain --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- irqs_i[0] 1 hostd_safed_mbox_intr[0] level-sensitive from host domain CVA6#0 irqs_i[1] 1 hostd_safed_mbox_intr[1] level-sensitive from host domain CVA6#1 irqs_i[2] 1 secd_safed_mbox_intr level-sensitive from secure domain irqs_i[3] 1 pulpcl_safed_mbox_intr level-sensitive from HMR custer irqs_i[4] 1 spatzcl_safed_mbox_intr level-sensitive from vectorial cluster irqs[5] 1 irqs_distributed_249 level-sensitive tied to 0 irqs[6] 1 irqs_distributed_250 level-sensitive host domain UART irqs[7] 1 irqs_distributed_251 level-sensitive i2c_fmt_threshold irqs[8] 1 irqs_distributed_252 level-sensitive i2c_rx_threshold irqs[9] 1 irqs_distributed_253 level-sensitive i2c_fmt_overview irqs[10] 1 irqs_distributed_254 level-sensitive i2c_rx_overflow irqs[11] 1 irqs_distributed_255 level-sensitive i2c_nak irqs[12] 1 irqs_distributed_256 level-sensitive i2c_scl_interference irqs[13] 1 irqs_distributed_257 level-sensitive i2c_sda_interference irqs[14] 1 irqs_distributed_258 level-sensitive i2c_stret h_timeout irqs[15] 1 irqs_distributed_259 level-sensitive i2c_sda_unstable irqs[16] 1 irqs_distributed_260 level-sensitive i2c_cmd_complete irqs[17] 1 irqs_distributed_261 level-sensitive i2c_tx_stretch irqs[18] 1 irqs_distributed_262 level-sensitive i2c_tx_overflow irqs[19] 1 irqs_distributed_263 level-sensitive i2c_acq_full irqs[20] 1 irqs_distributed_264 level-sensitive i2c_unexp_stop irqs[21] 1 irqs_distributed_265 level-sensitive i2c_host_timeout irqs[22] 1 irqs_distributed_266 level-sensitive spih_error irqs[23] 1 irqs_distributed_267 level-sensitive spih_spi_event irqs[55:24] 32 irqs_distributed_299:268 level-sensitive gpio irqs_i[56] 1 irqs_distributed_300 level-sensitive pulpcl_eoc irqs_i[57] 1 irqs_distributed_309 level-sensitive car_wdt_intrs[0] irqs_i[58] 1 irqs_distributed_310 level-sensitive car_wdt_intrs[1] irqs_i[59] 1 irqs_distributed_311 level-sensitive car_wdt_intrs[2] irqs_i[60] 1 irqs_distributed_312 level-sensitive car_wdt_intrs[3] irqs_i[61] 1 irqs_distributed_313 level-sensitive car_wdt_intrs[4] irqs_i[62] 1 irqs_distributed_314 level-sensitive car_can_intr irqs_i[63] 1 irqs_distributed_315 edge-sensitive car_adv_timer_ch0 irqs_i[64] 1 irqs_distributed_316 edge-sensitive car_adv_timer_ch1 irqs_i[65] 1 irqs_distributed_317 edge-sensitive car_adv_timer_ch2 irqs_i[66] 1 irqs_distributed_318 edge-sensitive car_adv_timer_ch3 irqs_i[67] 1 irqs_distributed_319 edge-sensitive car_adv_timer_events[0] irqs_i[68] 1 irqs_distributed_320 edge-sensitive car_adv_timer_events[1] irqs_i[69] 1 irqs_distributed_321 edge-sensitive car_adv_timer_events[2] irqs_i[70] 1 irqs_distributed_322 edge-sensitive car_adv_timer_events[0] irqs_i[71] 1 irqs_distributed_323 edge-sensitive car_sys_timer_lo 
irqs_i[72] 1 irqs_distributed_324 edge-sensitive car_sys_timer_hi irqs_i[127:73] 54 irqs_distributed_331:325 - tied to 0 --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- Cheshire --------------------------- -------------------- -------------- ------------------------------------------------------------- ----------------- --------------------------- intr_ext_i[0] 1 pulpcl_eoc level-sensitive from HMR cluster intr_ext_i[2:1] 2 pulpcl_hostd_mbox_intr level-sensitive from HMR cluster intr_ext_i[4:3] 2 spatzcl_hostd_mbox_intr level-sensitive from vectorial cluster intr_ext_i[6:5] 2 safed_hostd_mbox_intr level-sensitive from safe domain intr_ext_i[8:7] 2 secd_hostd_mbox_intr level-sensitive from secure domain intr_ext_i[9] 1 car_wdt_intrs[0] level-sensitive from carfield peripherals intr_ext_i[10] 1 car_wdt_intrs[1] level-sensitive from carfield peripherals intr_ext_i[11] 1 car_wdt_intrs[2] level-sensitive from carfield peripherals intr_ext_i[12] 1 car_wdt_intrs[3] level-sensitive from carfield peripherals intr_ext_i[13] 1 car_wdt_intrs[4] level-sensitive from carfield peripherals intr_ext_i[14] 1 car_can_intr level-sensitive from carfield peripherals intr_ext_i[15] 1 car_adv_timer_ch0 edge-sensitive from carfield peripherals intr_ext_i[16] 1 car_adv_timer_ch1 edge-sensitive from carfield peripherals intr_ext_i[17] 1 car_adv_timer_ch2 edge-sensitive from carfield peripherals intr_ext_i[18] 1 car_adv_timer_ch3 edge-sensitive from carfield peripherals intr_ext_i[19] 1 car_adv_timer_events[0] edge-sensitive from carfield peripherals intr_ext_i[20] 1 car_adv_timer_events[1] edge-sensitive from carfield peripherals intr_ext_i[21] 1 car_adv_timer_events[2] edge-sensitive from carfield peripherals intr_ext_i[22] 1 car_adv_timer_events[3] edge-sensitive from carfield peripherals intr_ext_i[23] 1 car_sys_timer_lo edge-sensitive from carfield peripherals intr_ext_i[24] 1 car_sys_timer_hi edge-sensitive from carfield peripherals intr_ext_i[31:25] 7 0 tied to 0 meip_ext_o[0] - level-sensitive unconnected meip_ext_o[1] - level-sensitive unconnected meip_ext_o[2] - level-sensitive unconnected seip_ext_o[0] - level-sensitive unconnected seip_ext_o[1] - level-sensitive unconnected seip_ext_o[2] - level-sensitive unconnected msip_ext_o[0] - level-sensitive unconnected msip_ext_o[1] - level-sensitive unconnected msip_ext_o[2] - level-sensitive unconnected mtip_ext_o[0] - level-sensitive Snitch core #0 mtip_ext_o[1] - level-sensitive Snitch core #1 mtip_ext_o[2] - level-sensitive unconnected"},{"location":"um/arch/#domains","title":"Domains","text":"We divide Carfield domains in two macro groups, the Computing Domain and the Memory Domain. They are both fragmented into smaller domains, described in the following two sections.
The total number of domains is 7 (computing: host domain, safe domain, secure domain, integer PMCA domain, vectorial PMCA domain, peripheral domain, memory: dynamic SPM domain).
Note for the reader
Carfield's domains live in dedicated repositories. We invite the reader to consult the documentation of each domain for more information. Below, we focus on integration parameterization within Carfield.
"},{"location":"um/arch/#computing-domain","title":"Computing Domain","text":""},{"location":"um/arch/#host-domain-cheshire","title":"Host domain (Cheshire)","text":"The host domain (Cheshire) embeds all the necessary components required to run OSs such as embedded Linux. It has two orthogonal operation modes.
Untrusted mode: in this operation mode, the host domain is tasked to run untrusted services, i.e. non time- and safety-critical applications. For example, this could be the case of infotainment on a modern car. In this mode, as in traditional automotive platforms, safety and resiliency features are deferred to a dedicated 32-bit microcontroller-like system, called safe domain
in Carfield.
Hybrid trusted/untrusted mode: in this operation mode, the host domain is in charge of both critical and non-critical applications. Key features to achieve this are:
physical tagger
in front of the cores to mark partitions by acting directly on the physical address spaceHybrid operation mode is currently experimental, and mostly for research purposes. We advise of relying on a combination of host ad safe domain for a more traditional approach.
Cheshire is configured as follows:
CarfieldNumExtIntrs
), see Interrupt map. Unused are tied to 0 (currently 9/32)CarfieldNumInterruptibleHarts
). The interruptible harts are Snitch core #0 and #1 in the vectorial cluster.CarfieldNUmRouterTargets
), tasked to distribute N input interrupts to M targets. In Carfield, the external target is the safe domain
.The safe domain is a simple MCU-like domain that comprises three 32-bit real-time CV32E40P (CV32RT) RISC-V cores operating in triple-lockstep mode (TCLS).
These cores, enhanced with the RISC-V CLIC controller and optimized for fast interrupt handling and context switch, run RTOSs and safety-critical applications, embodying a core tenet of the platform reliability.
The safe domain is essential when the host domain is operated in untrusted mode.
The safe domain is configured as follows:
The secure domain, based on the OpenTitan project, serves as the Hardware Root-of-Trust (HWRoT) of the platform. It handles secure boot and system integrity monitoring fully in HW through cryptographic acceleration services.
Compared to vanilla OpenTitan, the secure domain integrated in Carfield is modified/configured as follows:
1 AXI4 manager interface to Carfield, with a bridge between AXI4 and TileLink Uncached Lightweight (TL-UL) internally used by OpenTitan. By only exposing a manager port, unwanted access to the secure domain is prevented.
Embedded flash memory replaced with a simple SRAM preloaded before secure boot procedure from an external SPI flash through OpenTitan private SPI peripheral. Once preload is over, the OpenTitan secure boot framework is unchanged compared to the original.
Finally, a boot manager module has been designed and integrated to manage the two available bootmodes. In secure mode, the systems executes the secure boot as soon as the reset is asserted, loading code from the external SPI and performing the signature check on its content. Otherwise, the secure domain is clock gated and must be clocked and woken-up by an external entity (e.g., host domain)
To augment computational capabilities, Carfield incorporates two general-purpose accelerators
"},{"location":"um/arch/#hmr-integer-pmca","title":"HMR integer PMCA","text":"The hybrid modular redundancy (HMR) integer PMCA is specialized in executing reliable boosted Quantized Neural Network (QNN) operations, exploiting the HMR technique for rapid fault recovery and integer arithmetic support in the ISA of the RISC-V cores from 32-bit down to 2-bit and mixed-precision formats.
The HMR integer PMCA is configured as follows:
TODO
"},{"location":"um/arch/#vectorial-pmca","title":"Vectorial PMCA","text":"The vectorial PMCA, or Spatz PMCA handles vectorizable multi-format floating-point workloads.
It acts as a coprocessor of the Snitch core, a tiny 64-bit scalar core which decodes and forwards vector instructions to the vector unit. Together they are referred to as Complex Cores (CCs).
The vectorial PMCA is composed by two CCs, each with the following configurations:
Each FPU supports FP8, FP16, FP32, and FP64 computation, while the IPU supports 8, 16, 32, and 64-bit integer computation.
The CCs share access to 128KB of L1 scratchpad memory divided into 16 SRAM banks.
We
"},{"location":"um/arch/#memory-domain","title":"Memory Domain","text":""},{"location":"um/arch/#dynamic-scratchpad-memory-spm","title":"Dynamic scratchpad memory (SPM)","text":"The dynamic SPM features dynamically switching address mapping policy. It manages the following features:
Carfield integrates a in-house, open-source implementation of Infineon' HyperBus off-chip link to connect to external HyperRAM modules.
Despite describing it as part of the Memory Domain, the HyperBus is logically part of the peripheral domain.
It manages the following features:
The interconnect is composed of a main AXI4 matrix (or crossbar) with AXI5 atomic operations (ATOPs) support. The crossbar extends Cheshire's with additional external AXI manager and subordinate ports.
Cheshire's auxiliary Regbus demultiplexer is extended with additional peripheral configuration ports for external PLL/FLL and padmux configuration, which are specific of ASIC wrappers.
An additional peripheral subsystem based on APB hosts Carfield-specific peripherals.
"},{"location":"um/arch/#mailbox-unit","title":"Mailbox unit","text":"The mailbox unit consists in a number of configurable mailboxes. Each mailbox is the preferred communication vehicle between domains. It can be used to wake-up certain domains, notify an offloader (e.g., Cheshire) that a target device (e.g., the integer PMCA) has reached execution completion, dispatch entry points to a target device to jump-start its execution, and many others.
It manages the following features:
Assuming each mailbox is identified with id i
, the register file map reads:
0x00 + i * 0x100
INT_SND_STAT
1
current irq status 0x04 + i * 0x100
INT_SND_SET
1
set irq 0x08 + i * 0x100
INT_SND_CLR
1
clear irq 0x0C + i * 0x100
INT_SND_EN
1
enable irq 0x40 + i * 0x100
INT_RCV_STAT
1
current irq status 0x44 + i * 0x100
INT_RCV_SET
1
set irq 0x48 + i * 0x100
INT_RCV_CLR
1
clear irq 0x4C + i * 0x100
INT_RCV_EN
1
enable irq 0x80 + i * 0x100
LETTER0
32
message 0x8C + i * 0x100
LETTER1
32
message The above register map can be found in the dedicated repository and is reported here for convenience.
TODO @alex96295: Add figure
"},{"location":"um/arch/#platform-control-registers","title":"Platform control registers","text":"PCRs provide basic system information, and control clock, reset and other functionalities of Carfield's domains.
A more detailed overview of each PCR (register subfields and description) can be found here. PCR base address is listed in the Memory Map as for the other devices.
Name Offset Length DescriptionVERSION0
0x0
4
Cheshire sha256 commit VERSION1
0x4
4
Safety Island sha256 commit VERSION2
0x8
4
Security Island sha256 commit VERSION3
0xc
4
PULP Cluster sha256 commit VERSION4
0x10
4
Spatz Cluster sha256 commit JEDEC_IDCODE
0x14
4
JEDEC ID CODE GENERIC_SCRATCH0
0x18
4
Scratch GENERIC_SCRATCH1
0x1c
4
Scratch HOST_RST
0x20
4
Host Domain reset -active high, inverted in HW- PERIPH_RST
0x24
4
Periph Domain reset -active high, inverted in HW- SAFETY_ISLAND_RST
0x28
4
Safety Island reset -active high, inverted in HW- SECURITY_ISLAND_RST
0x2c
4
Security Island reset -active high, inverted in HW- PULP_CLUSTER_RST
0x30
4
PULP Cluster reset -active high, inverted in HW- SPATZ_CLUSTER_RST
0x34
4
Spatz Cluster reset -active high, inverted in HW- L2_RST
0x38
4
L2 reset -active high, inverted in HW- PERIPH_ISOLATE
0x3c
4
Periph Domain AXI isolate SAFETY_ISLAND_ISOLATE
0x40
4
Safety Island AXI isolate SECURITY_ISLAND_ISOLATE
0x44
4
Security Island AXI isolate PULP_CLUSTER_ISOLATE
0x48
4
PULP Cluster AXI isolate SPATZ_CLUSTER_ISOLATE
0x4c
4
Spatz Cluster AXI isolate L2_ISOLATE
0x50
4
L2 AXI isolate PERIPH_ISOLATE_STATUS
0x54
4
Periph Domain AXI isolate status SAFETY_ISLAND_ISOLATE_STATUS
0x58
4
Safety Island AXI isolate status SECURITY_ISLAND_ISOLATE_STATUS
0x5c
4
Security Island AXI isolate status PULP_CLUSTER_ISOLATE_STATUS
0x60
4
PULP Cluster AXI isolate status SPATZ_CLUSTER_ISOLATE_STATUS
0x64
4
Spatz Cluster AXI isolate status L2_ISOLATE_STATUS
0x68
4
L2 AXI isolate status PERIPH_CLK_EN
0x6c
4
Periph Domain clk gate enable SAFETY_ISLAND_CLK_EN
0x70
4
Safety Island clk gate enable SECURITY_ISLAND_CLK_EN
0x74
4
Security Island clk gate enable PULP_CLUSTER_CLK_EN
0x78
4
PULP Cluster clk gate enable SPATZ_CLUSTER_CLK_EN
0x7c
4
Spatz Cluster clk gate enable L2_CLK_EN
0x80
4
Shared L2 memory clk gate enable PERIPH_CLK_SEL
0x84
4
Periph Domain pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) SAFETY_ISLAND_CLK_SEL
0x88
4
Safety Island pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) SECURITY_ISLAND_CLK_SEL
0x8c
4
Security Island pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) PULP_CLUSTER_CLK_SEL
0x90
4
PULP Cluster pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) SPATZ_CLUSTER_CLK_SEL
0x94
4
Spatz Cluster pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) L2_CLK_SEL
0x98
4
L2 Memory pll select (0 -> host pll, 1 -> alt PLL, 2 -> per pll) PERIPH_CLK_DIV_VALUE
0x9c
4
Periph Domain clk divider value SAFETY_ISLAND_CLK_DIV_VALUE
0xa0
4
Safety Island clk divider value SECURITY_ISLAND_CLK_DIV_VALUE
0xa4
4
Security Island clk divider value PULP_CLUSTER_CLK_DIV_VALUE
0xa8
4
PULP Cluster clk divider value SPATZ_CLUSTER_CLK_DIV_VALUE
0xac
4
Spatz Cluster clk divider value L2_CLK_DIV_VALUE
0xb0
4
L2 Memory clk divider value HOST_FETCH_ENABLE
0xb4
4
Host Domain fetch enable SAFETY_ISLAND_FETCH_ENABLE
0xb8
4
Safety Island fetch enable SECURITY_ISLAND_FETCH_ENABLE
0xbc
4
Security Island fetch enable PULP_CLUSTER_FETCH_ENABLE
0xc0
4
PULP Cluster fetch enable SPATZ_CLUSTER_DEBUG_REQ
0xc4
4
Spatz Cluster debug req HOST_BOOT_ADDR
0xc8
4
Host boot address SAFETY_ISLAND_BOOT_ADDR
0xcc
4
Safety Island boot address SECURITY_ISLAND_BOOT_ADDR
0xd0
4
Security Island boot address PULP_CLUSTER_BOOT_ADDR
0xd4
4
PULP Cluster boot address SPATZ_CLUSTER_BOOT_ADDR
0xd8
4
Spatz Cluster boot address PULP_CLUSTER_BOOT_ENABLE
0xdc
4
PULP Cluster boot enable SPATZ_CLUSTER_BUSY
0xe0
4
Spatz Cluster busy PULP_CLUSTER_BUSY
0xe4
4
PULP Cluster busy PULP_CLUSTER_EOC
0xe8
4
PULP Cluster end of computation ETH_RGMII_PHY_CLK_DIV_EN
0xec
4
Ethernet RGMII PHY clock divider enable bit ETH_RGMII_PHY_CLK_DIV_VALUE
0xf0
4
Ethernet RGMII PHY clock divider value ETH_MDIO_CLK_DIV_EN
0xf4
4
Ethernet MDIO clock divider enable bit ETH_MDIO_CLK_DIV_VALUE
0xf8
4
Ethernet MDIO clock divider value"},{"location":"um/arch/#peripheral-domain","title":"Peripheral Domain","text":"Carfield enhances Cheshire's peripheral subsystem with additional capabilities.
An external AXI manager port is attached to the matrix crossbar. The 64-bit data, 48-bit address AXI protocol is converted to the slower, 32-bit data and address APB protocol. An APB demultiplexer allows attaching several peripherals, described below.
"},{"location":"um/arch/#generic-and-advanced-timer","title":"Generic and advanced timer","text":"Carfield integrates a generic timer and an advanced timer.
The generic timer manages the following features:
For more information, read the dedicated documentation.
The advanced timer manages the following features:
For more information, read the dedicated documentation.
"},{"location":"um/arch/#watchdog-timer","title":"Watchdog timer","text":"We employ the watchdog timer developed by the OpenTitan project project. It manages the following features:
For more information, read the dedicated documentation.
"},{"location":"um/arch/#can","title":"CAN","text":"We employ a CAN device developed by the Czech Technical University in Prague. It manages the following features:
For more information, read the dedicated documentation.
"},{"location":"um/arch/#ethernet","title":"Ethernet","text":"We employ Ethernet IPs developed by Alex Forencich and assemble them with a high-performant DMA, the same used in Cheshire.
We use Reduced gigabit media-independent interface (RGMII) that supports speed up to 1000Mbit/s (1GHz).
For more information, read the dedicated documentation of the Ethernet components in their original repository.
"},{"location":"um/arch/#clock-and-reset","title":"Clock and reset","text":"The two figures above show the clock, reset and isolation distribution for a domain X
in Carfield, and their relationship. A more detailed description is provided below.
Carfield is provided with 3 clock sources. They can be fully asynchronous, with no phase relationship bound between them, since dual-clock FIFOs are placed between domains to allow clock domain crossing (CDC):

- host_clk_i: preferably, the clock of the host domain
- alt_clk_i: preferably, the clock of the alternate domains, namely the safe domain, the secure domain, and the accelerator domain
- per_clk_i: preferably, the clock of the peripheral domain

In addition, a real-time clock (RTC, rt_clk_i) is provided externally, at crystal frequency (32 kHz) or higher.
These clocks are supplied externally, either by a dedicated PLL per clock source or by a single PLL that supplies all three. The clock source configuration can be handled by the external PLL wrapper configuration registers, e.g. in an ASIC top level.
Regardless of the specific name used for the clock signals in HW, Carfield has a flexible clock distribution that allows each of the 3 clock sources to be assigned to a domain, as explained below.
As the top figure shows, out of the 7 domains described in Domains, 6 can be clock gated and isolated: the safe domain, the secure domain, the two accelerator domains (integer and vectorial PMCAs), the peripheral domain, and the dynamic SPM.
When isolation for a domain X is enabled, data transfers towards that domain are terminated and never reach it. To achieve this, an AXI4-compliant isolation module is placed in front of each domain. The bottom figure shows in detail the architecture of the isolation scheme between the host domain and a generic domain X, highlighting its relationship with the domain's reset and clock signals.
For each of the 6 clock gateable domains, the following clock distribution scheme applies:
HW resources for the clock distribution (steps 1, 2, and 3) and for the isolation of a domain X are SW-controlled via dedicated PCRs. Refer to Platform Control Registers on this page for more information; a minimal wake-up sketch is shown below.
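As a rough illustration of this SW-controlled flow, the sketch below wakes up the PULP cluster using the PCR offsets listed in the Platform Control Registers table. CAR_PCR_BASE is a placeholder (the real base address is listed in the Memory Map), the divider encoding and the exact step ordering are assumptions, and the routines provided in the Software Stack remain the authoritative reference.

```c
#include <stdint.h>

// Placeholder PCR base address: the real one is listed in the Memory Map.
#define CAR_PCR_BASE 0x20020000UL  /* assumption, not the real address */
#define PCR(off)     ((volatile uint32_t *)(CAR_PCR_BASE + (off)))

// Offsets from the PCR table above.
#define PULP_CLUSTER_CLK_SEL        0x90
#define PULP_CLUSTER_CLK_DIV_VALUE  0xa8
#define PULP_CLUSTER_CLK_EN         0x78
#define PULP_CLUSTER_RST            0x30
#define PULP_CLUSTER_ISOLATE        0x48
#define PULP_CLUSTER_ISOLATE_STATUS 0x60

// Wake up the PULP cluster: pick a clock source, set the divider, ungate the
// clock, release the reset, then lift AXI isolation and wait for confirmation.
void pulp_cluster_wakeup(void) {
    *PCR(PULP_CLUSTER_CLK_SEL)       = 1;  // 1 -> alt PLL
    *PCR(PULP_CLUSTER_CLK_DIV_VALUE) = 1;  // assumed "no division" encoding
    *PCR(PULP_CLUSTER_CLK_EN)        = 1;  // ungate the domain clock
    *PCR(PULP_CLUSTER_RST)           = 0;  // deassert reset (active high, inverted in HW)
    *PCR(PULP_CLUSTER_ISOLATE)       = 0;  // request de-isolation
    while (*PCR(PULP_CLUSTER_ISOLATE_STATUS) != 0)
        ;                                  // wait until the AXI path is open
}
```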
The only domain that is always-on and de-isolated is the host domain (Cheshire). If required, clock gating and/or isolation of it can be handled at higher levels of hierarchy, e.g. in a dedicated ASIC wrapper.
"},{"location":"um/arch/#startup-behavior-after-power-on-reset-por","title":"Startup behavior after Power-on reset (POR)","text":"The user can decide whether secure boot must be performed on the executing code before runtime. If so, the secure domain must be active after POR, i.e., clocked and de-isolated. This behavior is regulated by the input pin secure_boot_i
according to the following table:
| secure_boot_i | Secure Boot | System status after POR |
|---|---|---|
| 0 | OFF | secure domain gated and isolated like the other 5 domains; host domain always-on and idle |
| 1 | ON | host domain always-on and idle; secure domain active, takes over secure boot, and cannot be warm-reset; other 5 domains gated and isolated |

Regardless of the value of secure_boot_i, since by default some domains are clock gated and isolated after POR, SW or external physical interfaces (JTAG/Serial Link) must handle their wake-up process. Routines are provided in the Software Stack.
Carfield is provided with one POR (active-low), pwr_on_rst_ni
, responsible for the platform's cold reset.
The POR is synchronized with the clock of each domain, user-selected as explained above, and propagated to the domain.
In addition, a warm reset can be initiated from SW through the PCRs for each domain. Exceptions to this are the host domain (always-on), and the secure domain when secure_boot_i
is asserted.
Carfield's Software Stack is provided in the sw/
folder, organized as follows:
sw
├── boot
├── include
├── lib
├── link
├── sw.mk
└── tests
    ├── bare-metal
    │   ├── hostd
    │   ├── pulpd
    │   ├── safed
    │   ├── secd
    │   └── spatzd
    └── linux
Employing Cheshire as host domain, Carfield's software stack is largely based on, and built on top of, Cheshire's.
This means that it shares the same:
Therefore, we refer the reader to Cheshire's Software Stack description for more information.
Programs compiled for Carfield that require additional, Carfield-specific drivers (for domain offload, peripheral control, etc.) are linked against Cheshire's static library (libcheshire.a). This operation is transparent to the programmer, who can seamlessly take advantage of Cheshire's device drivers and SW routines within Carfield.
Given the equivalence and reuse between Carfield and Cheshire, on this page we focus on Carfield-specific SW components and build flow, with an emphasis on domains other than Cheshire.
"},{"location":"um/sw/#compiler-requirements","title":"Compiler requirements","text":"General-purpose processing elements (PEs) integrated in Carfield implement the RISC-V ISA, targeting either RV64 (host domain) or RV32 (all the others: safe domain, secure domain, integer PMCA, and vectorial PMCA).
To build programs written in plain C for a Carfield domain with the base ISA and its regular extensions (namely, RV64G and RV32IMACF), without using the custom extensions that each domain provides, you simply need vanilla RV64 and RV32 compilers.
Otherwise, to use the custom instructions supported in HW for a domain, specific compiler support is required. We are working to improve compiler support by providing pointers to pre-built releases or a container-based build flow.
"},{"location":"um/sw/#boot-flow-and-secure-boot","title":"Boot Flow and Secure Boot","text":"Carfield supports two operative boot flows:
- Non-secure: since Cheshire is an always-on domain, it takes over Carfield's boot flow. Passive and autonomous boot are thus equivalent to those described in Cheshire's Software Stack. Since the other domains are clock gated, SW to be executed on them requires Cheshire to handle their wake-up sequence.
- Secure: the secure domain performs the secure boot process on the code that will be executed on the Carfield system, independently of the domain. For more information, read the dedicated secure boot documentation of the OpenTitan project.
Bare-metal programs (BMPs) for all domains can be built from the root of Carfield through a portable make fragment, sw.mk, located in the sw/ folder.
To simplify each domain SW build as much as possible, we provide a make fragment located at sw/tests/bare-metal/<domain>/sw.mk
, included in the main sw.mk
.
BMPs for each domain are compiled in situ in the domain's repository, since each IP was designed for, or also supports, standalone execution and has its own build flow.
The global command
make car-sw-build\n
builds program binaries in ELF format for each domain, which can be used with the simulation methods supported by the platform, as described in Simulation or on FPGA as described in Xilinx FPGAs.
As in Cheshire, Carfield programs can be created to be executed from several memory locations:
- Dynamic SPM (l2): the linkerscript is provided in Carfield's sw/link/ folder, since the Dynamic SPM is not integrated in the minimal Cheshire
- LLC scratchpad (spm): valid when the LLC is configured as scratchpad. In Carfield, half of the LLC is configured as SPM from the boot ROM during system bringup, as this is the default behavior in Cheshire
- DRAM (dram): the HyperRAM

For example, to build a specific BMP (here sw/tests/bare-metal/hostd/helloworld.c, to be run on Cheshire) executing from the Dynamic SPM, run:
make sw/tests/bare-metal/hostd/helloworld.car.l2.elf\n
To create the same program executing from DRAM, sw/tests/bare-metal/hostd/helloworld.car.dram.elf
can instead be built from the same source. Depending on their assumptions and behavior, not all programs may be built to execute from both locations.
When executing host domain programs in Linux (on FPGA/ASIC targets) that require access to memory-mapped components of other domains, SW intervention is needed to map virtual to physical addresses, since domains other than the host currently lack support for HW-based virtual memory translation.
In the current SW stack, this mapping is already provided and hence transparent to the user. Test programs targeting Linux that require it are located in a different folder, sw/tests/linux/<domain>.
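Under the hood, such user-space mappings are conventionally obtained by mmap-ing /dev/mem; the minimal sketch below illustrates the idea. DOMAIN_PHYS_BASE and DOMAIN_MAP_SIZE are placeholders rather than real Carfield addresses, and the mapping already provided by the SW stack should be preferred over hand-rolled code like this.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

// Placeholder physical address/size of a domain's memory-mapped region;
// the real values come from Carfield's memory map.
#define DOMAIN_PHYS_BASE 0x50000000UL  /* assumption */
#define DOMAIN_MAP_SIZE  0x1000UL

int main(void) {
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    // Map the physical region into this process's virtual address space.
    volatile uint32_t *regs = mmap(NULL, DOMAIN_MAP_SIZE,
                                   PROT_READ | PROT_WRITE, MAP_SHARED,
                                   fd, (off_t)DOMAIN_PHYS_BASE);
    if (regs == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("reg[0] = 0x%08x\n", regs[0]);  // read the first 32-bit register

    munmap((void *)regs, DOMAIN_MAP_SIZE);
    close(fd);
    return 0;
}
```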
Offload of programs to Carfield domains involves:
Programs can be offloaded with:
- Simple baremetal offload (BMO), recommended for regression-test use cases that are simple enough to be executed with cycle-accurate RTL simulations, for instance dynamic timing analysis (DTA) carried out during an ASIC development cycle.
- The OpenMP API, recommended when developing SW for Carfield on an FPGA or, eventually, an ASIC implementing Carfield, because of the ready-to-use OS support (currently, Linux). Usage of the OpenMP API with non-OS-directed (baremetal) SW can be supported, but is mostly suited for heterogeneous embedded systems with highly constrained resources.
In the following, we briefly describe both.
Note for the reader
Since by default all domains are clock gated and isolated after POR except for the host domain (Cheshire), as described in Architecture, the wake-up process must be handled from the C source code.
"},{"location":"um/sw/#baremetal-offload","title":"Baremetal offload","text":"For BMO, the offloader takes care of bootstrapping the target device ELF in the correct memory location, initializing the target and launching its execution through a simple ELF Loader. The ELF Loader source code is located in the offloader's SW directory, and follows a naming convention:
<target_device>_offloader_<blocking | non_blocking>.c \n
The target device's ELF is included in the offloader's ELF Loader as a header file. The target device's ELF sections are first pre-processed offline to extract instruction addresses. The resulting header file drives the ELF loading process at the selected memory location. The loading process can be carried out by the offloader as R/W sequences, or deferred to a DMA-driven memcopy. In addition, the offloader takes care of bootstrapping the target device, i.e. initializing it and launching its execution.
Upon target device completion, the offloader:
Currently, blocking BMO is implemented.
As an example, assume the host domain as offloader and the integer PMCA as target device.
sw/tests/bare-metal/hostd
sw/tests/bare-metal/pulpd
The resulting offloader ELF's name reads:
<target_device>_offloader_<blocking | non_blocking>.<target_device_test_name>.car.<l2 | spm | dram>.elf\n
According to the memory location where the BMP will be executed.
The final offloader ELF can be preloaded with simulation methods described in the Simulation section, and can be built again as explained above.
Note for the reader
BMO is in general not recommended for developing SW for Carfield, as it was introduced during ASIC development cycle and can be an effective litmus test to find and fix HW bugs, or during DTA.
For SW development on Carfield and in particular domain-driven offload, it is recommended to use OpenMP offload on FPGA/ASIC, described below.
"},{"location":"um/sw/#openmp-offload-recommended-use-on-fpgaasic","title":"OpenMP offload (recommended: use on FPGA/ASIC)","text":"TODO Cyril
"},{"location":"um/sw/#external-benchmarks","title":"External benchmarks","text":"We support several external benchmarks, whose build flow has been slightly adapted to align with Carfield's. Currently, they are:
To augment computational capabilities, Carfield incorporates two general-purpose accelerators
The vectorial PMCA, or Spatz PMCA handles -vectorizable multi-format floating-point workloads (down to FP8).
-The Spatz PMCA is configured as follows:
-TODO
+vectorizable multi-format floating-point workloads. +It acts as a coprocessor of the Snitch core, a +tiny 64-bit scalar core which decodes and forwards vector instructions to the vector unit. Together +they are referred to as Complex Cores (CCs).
+The vectorial PMCA is composed by two CCs, each with the following configurations:
+Each FPU supports FP8, FP16, FP32, and FP64 computation, while the IPU supports 8, 16, 32, +and 64-bit integer computation.
+The CCs share access to 128KB of L1 scratchpad memory divided into 16 SRAM banks.
+We
The dynamic SPM features dynamically switching address mapping policy. It manages the following @@ -3061,71 +3089,71 @@