Setting up charter and prespecification for the new TG

riscv-admin · Jul 18, 2024 · 917803d
1 parent b8f5b5b
Showing 4 changed files with 138 additions and 27 deletions.
diff --git a/CHARTER.md b/CHARTER.md
@@ -1,11 +1,17 @@
-# Microarchitecture Side-Channel Resistant Instruction Spans Task Group TG Charter
-Timing covert channels are used to exfiltrate confidential data using microarchitectural states as a medium for communications. These channels are particularly relevant in the context of microarchitectural attacks such as Spectre and Meltdown.
+# Timing Fences Task Group TG Charter
+Covert channels are communication channels that a supervisor can neither observe nor control.
+Timing channels are covert channels that exploit timing interferences caused by competition for shared microarchitectural resources, such as caches, buffers, and branch predictors.
+For instance, timing channels can be used to extract secrets as part of a Spectre-like microarchitectural speculation attack.
 
- The Microarchitecture Side-Channel Resistant Instruction Spans Task Group (proposed short name: uSCR-IS TG) will define a small ISA extension to prevent malicious covert channels. More precisely, we will introduce a notion of side-channel resistant instruction spans, such that covert channels can be prevented across instruction spans by adapting the microarchitecture. Introducing instruction spans as an architectural feature makes it possible for higher-level program logic to declare that a sequence of instructions should be microarchitecturally isolated within a larger instruction stream (for example, a span of instructions that implement a cryptographic operation may be isolated to protect against side-channel attacks). The proposed RISC-V uSCR-IS TG will collaborate to produce:
+To prevent timing channels, shared hardware resources must be strictly partitioned between isolated applications.
+The Timing Fences Task Group will propose a small ISA extension to enable such partitioning of shared microarchitectural state.
+For instance, we will introduce a temporal fence instruction which can be used to *temporally* partition shared on-core microarchitectural state by clearing it, e.g., when switching between isolated applications.
+
+The proposed RISC-V Timing Fences TG will collaborate to produce:
  1. A small ISA extension (possibly no more than one or two instructions, or only a new CSR).
- 2. A non-normative security guide: defining threat models, developing rationale, etc. -> intended for security engineers.
- 3. A non-normative implementation guide, focusing on the principles of microarchitecture design that enable protection against covert channels. -> intended for hardware engineers.
- 4. A proof-of-concept implementation, including both a prototype RISC-V core and compiler managing the necessary intrinsics.
- 5. A test strategy guide, including a test suite for common covert channels.
+ 2. A non-normative short guide: defining threat models, developing rationale, etc.
+ 3. A proof-of-concept implementation, including both a prototype RISC-V core and compiler that manages the necessary intrinsics.
+ 4. A test strategy guide, including a test suite for common covert channels.
+ 5. The Sail model corresponding to this extension.
 
-The TG will work with the appropriate Priv/Unpriv ISA committee, Architecture Review Committee, and Security HC to determine which parts of the work should follow the standard ISA specification process, Fast Track process, or non-ISA process, and how other recent policy or process changes may apply (such as the discussion around the use of hint instructions in CFI).
+The TG will work with the appropriate Priv/Unpriv ISA committee, Architecture Review Committee, and Security HC.
diff --git a/CHARTERv2.md b/CHARTERv2.md
diff --git a/README.md b/README.md
@@ -1,7 +1,7 @@
 
-# Microarchitecture Side-Channel Resistant Instruction Spans (uSCR-IS)
+# Timing fences
 
-This repository represents an administrative repository for the Microarchitecture Side-Channel Resistant Instruction Spans (uSCR-IS).  
+This is the administrative repository for the Timing Fences Task Group.  
 It should contain documents to facilitate the group function, e.g. meeting minutes and supporting documents.
 It should not contain code nor specifications.
 
diff --git a/prespecifications/fence_time.adoc b/prespecifications/fence_time.adoc
@@ -0,0 +1,122 @@
+= fence.time, a RISC-V extension proposal
+
+== Why fence.time: the rationale
+
+Covert channels allow unauthorized communication across security boundaries.
+Attackers can leverage these covert channels to leak data, for example from supervisor to user privilege level or from one process to another.
+Timing channels use the timing of events to signal information.
+Microarchitectural timing channels control event timing through the use of microarchitectural state that depends on execution history.
+
+While microarchitectural state can affect other physical quantities, such as temperature, power draw, or electromagnetic emanation,
+our threat model focuses on timing covert channels since they are easy to exploit remotely.
+We therefore consider other physical channels out of scope.
+
+Covert channels are used, among other things, in transient-execution attacks.
+Such attacks typically use speculative execution to read a secret and then use a covert timing channel to exfiltrate the data.
+
+*Any microarchitectural state that depends on execution history* can be used to implement a covert channel. This includes CPU caches and other forms of caching (such as the TLB or branch target buffers), as well as state machines such as those used in prefetchers or for cache-line replacement. Prohibiting such microarchitectural state outright is infeasible, so we need ways to manage it securely.
+
+Furthermore, most such state is inherently shared, so requiring strict partitioning between security domains is also infeasible. Automatically resetting microarchitectural state on hardware-detectable events (such as writing the page table pointer) would be overkill, as multiple address spaces may share the same security domain.
+
+It is therefore necessary to provide a mechanism by which software can inform hardware that a switch of security domain is being performed, and that sharing of microarchitectural state across this switch must be prevented.
+This is the purpose of _fence.time_.
+
+== fence.time semantics
+
+This is a first draft proposal for _fence.time_.
+
+We define the following RISC-V instruction, with neither source nor destination registers, but with optional flags.
+
+[,asm]
+----
+fence.time [flags]
+----
+
+The flags may be used to restrict the action of _fence.time_ to specific subsets of microarchitectural state.
+The core must guarantee the following semantics.
+
+[literal]
+The timing of any instruction or sequence of instructions executing after the fence must be independent of any microarchitectural state before the fence. The flags may waive this requirement for some subsets of microarchitectural state.
+
+In this definition, timing is any measurable latency, regardless of whether it is the actual execution latency of an instruction, time spent in the issue queue, or something else.
+
+The following flags are defined:
+
+- `PRIV_SWITCH`: the _fence.time_ is associated with a privilege level change.
+- `AS_SWITCH`: the _fence.time_ is associated with an address space change.
+- `INT_SWITCH`: the _fence.time_ is associated with an interrupt.
+- `VM_SWITCH`: the _fence.time_ is associated with a VM change.
+
+Any combination of flags is valid. See <<section-split,Partitioning section>> for more details on how these flags may be used.
+
+== Implementation guidelines
+
+
+=== No timing dependencies
+
+One way of enforcing _fence.time_ semantics is to have instructions execute in constant time.
+Arithmetic operations (usually excluding division and sometimes multiplication) typically execute in constant time.
+
+The RISC-V extension Zkt mandates data-independent execution latency for a given list of instructions.
+*Our requirements here are different*.
+
+We recognize that instruction execution time is a fuzzy concept on a modern complex microarchitecture: is it only the time spent in the execute stage? Or does it include microarchitectural behaviours such as instruction scheduling?
+
+In our case, timing means *any* measurable timing (for example, as observed with `RDTIME`).
+
+=== Flush
+
+The most direct way to prevent microarchitectural state timing dependency is to reset the state to a deterministic value.
+Hence, _fence.time_ may be implemented as flushing all state, as long as the flush completes before any subsequent access to that state.
+
+[example]
+In the case of a simple data cache, flushing includes invalidating all data present. *However, that on its own is not enough.* For example, it is also necessary to erase any metadata used for cache-line allocation, to ensure that future execution is completely independent of any execution prior to _fence.time_.
+
+WARNING: Any microarchitectural state left intact across _fence.time_ may still be used to support a covert channel.
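
The flush example can be made concrete with a toy model. The sketch below is illustrative only and not part of the proposal: `ToyCacheSet`, its round-robin replacement pointer, and the helper `state_after` are hypothetical names. It compares the state left behind by two different execution histories, using state equality as a sufficient condition for timing independence: if the state after the fence is identical for every history, later timing cannot depend on that history.

```python
class ToyCacheSet:
    """Toy model of one cache set (hypothetical, for illustration only)."""

    def __init__(self, ways=2):
        self.tags = [None] * ways   # cached data, identified by tag
        self.next_victim = 0        # round-robin replacement pointer (metadata)

    def access(self, tag):
        """Look up a tag; on a miss, fill the way chosen by the pointer."""
        if tag in self.tags:
            return True             # hit
        self.tags[self.next_victim] = tag
        self.next_victim = (self.next_victim + 1) % len(self.tags)
        return False                # miss

    def invalidate_data(self):
        """Incomplete flush: drops the data but keeps the replacement pointer."""
        self.tags = [None] * len(self.tags)

    def full_reset(self):
        """Complete flush: data and allocation metadata return to fixed values."""
        self.invalidate_data()
        self.next_victim = 0

    def state(self):
        return (tuple(self.tags), self.next_victim)


def state_after(history, fence):
    """Replay a history of accesses, apply a fence, return the remaining state."""
    cache_set = ToyCacheSet()
    for tag in history:
        cache_set.access(tag)
    fence(cache_set)
    return cache_set.state()


# Two histories that differ only in how many lines were touched.
leaky = (state_after(["a"], ToyCacheSet.invalidate_data),
         state_after(["a", "b"], ToyCacheSet.invalidate_data))
clean = (state_after(["a"], ToyCacheSet.full_reset),
         state_after(["a", "b"], ToyCacheSet.full_reset))
```

In this model, invalidating only the data leaves the replacement pointer history-dependent (`leaky` holds two different states), whereas the full reset yields the same state for both histories (`clean` holds two equal states).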
+
+[[section-split]]
+=== Partitioning
+
+Partitioning of state can be an alternative to flushing.
+
+What to flush is microarchitecture dependent,
+but the biggest threats usually come from caches and branch prediction mechanisms.
+Flushing can be costly, which is why, in some specific cases, we may prefer to partition resources instead.
+Supporting _fence.time_ may therefore imply microarchitectural changes.
+
+Some structures would in principle require a full flush whose cost is prohibitive. A partitioning scheme can be used instead.
+In this case, _fence.time_ *may* skip flushing the corresponding structures.
+
+This is the purpose of the flags: they indicate that some microarchitectural state need *not* be flushed.
+
+[example]
+An application is performing a system call and the privilege level is switched to supervisor.
+You want to prevent covert channels, but your core already has branch predictor state partitioned by privilege level.
+By flagging the switch as `fence.time PRIV_SWITCH`, the hardware can decide not to flush the branch predictor state.
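
One possible reading of the flags can be sketched as a small Python model. Everything here is an assumption made for illustration: the numeric flag values, the `PARTITIONED_BY` table describing a hypothetical core, and the conservative rule that a structure is skipped only when every flagged kind of switch is covered by its partitioning. The draft defines neither an encoding nor a decision rule.

```python
from enum import Flag


class FenceFlag(Flag):
    """Flag names from the draft; the bit values are hypothetical."""
    NONE = 0
    PRIV_SWITCH = 1
    AS_SWITCH = 2
    INT_SWITCH = 4
    VM_SWITCH = 8


# Hypothetical core: for each structure, the kinds of switch across which
# the hardware already keeps its state partitioned.
PARTITIONED_BY = {
    "branch_predictor": FenceFlag.PRIV_SWITCH,
    "l1_dcache": FenceFlag.NONE,
    "tlb": FenceFlag.AS_SWITCH | FenceFlag.VM_SWITCH,
}


def structures_to_flush(flags, partitioned_by=PARTITIONED_BY):
    """Return the structures a fence.time with these flags must still flush.

    Assumed rule: a structure may be skipped only if at least one flag is
    given and every flagged switch kind is covered by its partitioning.
    Without flags, everything is flushed.
    """
    to_flush = []
    for name, covered in partitioned_by.items():
        uncovered = flags & ~covered   # flagged switch kinds not partitioned
        may_skip = bool(flags) and not uncovered
        if not may_skip:
            to_flush.append(name)
    return sorted(to_flush)
```

With this table, `structures_to_flush(FenceFlag.PRIV_SWITCH)` leaves the branch predictor out of the flush, matching the system-call example, while a fence without flags flushes every structure.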
+
+=== Reorder barrier
+
+With its semantics so defined, _fence.time_ requires that out-of-order cores do not reorder *instructions that affect microarchitectural state* across the fence.
+It is effectively a reorder barrier.
+
+=== Propagation to the bus / peripherals
+
+Covert channels are not only supported by the core microarchitectural state.
+An attacker can also use peripheral state that is accessible and modifiable from the core.
+
+The _fence.time_ semantics MUST be propagated over the core bus to all peripherals so that they can handle it correctly.
+
+[example]
+The simplest example is the last level cache (LLC), which is shared by several cores and can be used to create a covert channel. But flushing the LLC is not a good solution for either performance or security (risk of contention). To adhere to the _fence.time_ semantics, the LLC is thus required to be partitioned.
+
+== fence.time timing variability
+
+Most microarchitectural state is read-only, so it should be possible to reset it in constant time. But this obviously does not apply to the data cache, which (if it is a write-back cache) must have all dirty lines written back before resetting. This makes the _fence.time_ execution latency inherently history-dependent. There must be a way to prevent this variable latency from being observable.
+
+One way to address this would be to force _fence.time_ to execute in constant time. Alternatively, the privileged software could contain a delay loop that pads execution time to a constant value. However, in practice this software padding may be difficult to do accurately.
+
+Nevertheless, there are benefits to decoupling latency padding from flushing. For example, software is likely to perform other operations during a context switch that also have a history-dependent latency. An example is an interrupt that arrives at a known time and performs some operations before the context switch.
+It therefore makes sense to defer the padding until after all such operations have been performed.
+
+A better approach seems to be a separate instruction for time padding, such as a `pause` instruction that halts execution until a given cycle count is reached.
+For instance, an OS can compute the target cycle count by adding a constant worst-case execution time for all history-dependent execution preceding `pause` (e.g. _fence.time_) to the cycle count of the most recent CLINT timer interrupt, which generally arrives at a history-independent time.
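
The target computation described above amounts to a single addition. Here is a minimal sketch, assuming cycle counts are plain integers; `pause_target` and `switch_completion_cycle` are hypothetical helpers that model the effect of the proposed `pause` instruction, not an implementation of it.

```python
def pause_target(last_timer_cycle: int, wcet_cycles: int) -> int:
    """Cycle count at which `pause` should release execution."""
    return last_timer_cycle + wcet_cycles


def switch_completion_cycle(last_timer_cycle: int,
                            actual_cycles: int,
                            wcet_cycles: int) -> int:
    """Model the cycle at which the next domain starts running.

    actual_cycles is the history-dependent time spent between the timer
    interrupt and `pause` (interrupt handling, fence.time, and so on).
    """
    if actual_cycles > wcet_cycles:
        # The bound was too optimistic: the latency would become observable.
        raise ValueError("worst-case bound exceeded; padding cannot hide it")
    arrival = last_timer_cycle + actual_cycles
    # `pause` halts until the target; arriving early only costs idle cycles.
    return max(arrival, pause_target(last_timer_cycle, wcet_cycles))


# A fast and a slow switch complete at the same cycle once padded.
fast = switch_completion_cycle(1000, actual_cycles=120, wcet_cycles=500)
slow = switch_completion_cycle(1000, actual_cycles=480, wcet_cycles=500)
```

Both `fast` and `slow` equal the padded target, so the history-dependent part of the switch is not visible to the next domain. The price is that every switch costs the worst-case time, which is why the bound should be as tight as the platform allows.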