diff --git a/content/en/platform/corda/4.13/enterprise/cordapps/thread-pools.md b/content/en/platform/corda/4.13/enterprise/cordapps/thread-pools.md new file mode 100644 index 0000000000..786957d905 --- /dev/null +++ b/content/en/platform/corda/4.13/enterprise/cordapps/thread-pools.md @@ -0,0 +1,163 @@ +--- +date: '2025-04-20' +menu: + corda-enterprise-4-13: + identifier: corda-enterprise-4-13-cordapps-flows-segthreadpools + parent: corda-enterprise-4-13-cordapps-flows +tags: +- api +- service +- classes +title: Using additional thread pools +weight: 10 +--- + +Corda Enterprise executes flows in *thread pools*. A thread pool is a group of pre-created, idle threads, ready to execute tasks. The default Corda Enterprise configuration creates a single thread pool, whose size is configured by the *[flowThreadPoolSize]({{< relref "../node/setup/corda-configuration-fields.html#enterpriseconfiguration" >}})* parameter. Open Source Corda is single-threaded. + +In Corda 4.12 and previous versions, only the single, default thread pool described above was supported. From Corda 4.13 onward, the Enterprise version enables operators to define *multiple* thread pools and assign flows to them. The reason for this is to enable operators to prioritize particular flows and to segregate them from other flows. + +For example, if there are slow-running reporting flows and more important transactional flows on the same system, the reporting flows can be separated into a dedicated thread pool so that they do not block the transactional flows. + +## Configuring thread pools + +Thread pools are defined in the [node configuration]({{< relref "../node/setup/corda-configuration-file.md" >}}) by adding an `additionalFlowThreadPools` array within the `tuning` object. The `additionalFlowThreadPools` array can contain one or more objects, each specifying the details of an additional thread pool. 
Each object contains a `threadPool` and a `size` property, which define the name of the thread pool and its size in number of threads, respectively. + +### Example 1: Two Defined Thread Pools + +The following sample configuration defines two thread pools based on the example above, `reporting` and `transactions`, each with three available threads: + +```json +enterpriseConfiguration { + tuning { + additionalFlowThreadPools= [ + { + threadPool=reporting, + size=3 + }, + { + threadPool=transactions, + size=3 + }, + ] + } +} +``` + +The related flows then need to be annotated accordingly: + +``` +@FlowThreadPool("reporting") +``` + +and + +``` +@FlowThreadPool("transactions") +``` + +### Example 2: One Defined Thread Pool and Default Thread Pool + +An alternative configuration defines only one additional thread pool (in this case, `reporting`) and runs all other flows in the default thread pool. As in previous versions of Corda, the size of the default thread pool (name: "default") is still specified by the *[flowThreadPoolSize]({{< relref "../node/setup/corda-configuration-fields.html#enterpriseconfiguration" >}})* parameter. 
+ +```json +enterpriseConfiguration { + tuning { + flowThreadPoolSize = 3, + additionalFlowThreadPools= [ + { + threadPool=reporting, + size=3 + }, + ] + } +} +``` + +Only the flows related to reporting then need to be annotated accordingly: + +``` +@FlowThreadPool("reporting") +``` + +## Logging + +The Corda node's [startup log]({{< relref "../node/operating/monitoring-and-logging/overview.md" >}}) outputs the defined thread pools and their sizes; for example: + +``` +Created flow thread pools: reporting(3), transactions(3), default(20) +``` + +## Default flow-to-thread pool mapping rules + +How flows are mapped to thread pools depends on: + +- The thread pool configuration +- Whether or not the installed CorDapps define custom flow-to-thread pool mapping rules + +The default Corda `FlowSchedulerMapper` follows these rules, in order of highest priority first: + +1. If a flow is annotated with `@FlowThreadPool("threadpoolname")` and the referenced thread pool is defined in the configuration, then that flow is executed in the specified pool. + If the specified thread pool is not present in the node configuration, then the default thread pool is used instead. + +2. If a thread pool named `Peer-Origin` is defined, then all flows started via a peer Corda node and **not** annotated with a specific thread pool will be executed in that thread pool. Otherwise, such flows are executed in the default thread pool. + +3. If a thread pool named `RPC-Origin` is defined, then all flows started via RPC (for example, by a client application) and **not** annotated with a specific thread pool will be executed in that thread pool. Otherwise, such flows are executed in the default thread pool. + +4. If none of the above rules apply to a flow, then the default behavior is the same as in previous versions of Corda: the flow is executed in the default thread pool. 
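The four rules above can be read as a single decision function. The following Kotlin sketch is illustrative only and is not the Corda implementation: `FlowOrigin`, `DEFAULT_POOL`, and `resolveThreadPool` are hypothetical names invented for this example. It assumes that an annotated flow whose pool is missing from the configuration goes straight to the default pool, without consulting `Peer-Origin` or `RPC-Origin`.

```kotlin
// Hypothetical stand-ins for illustration; none of these names are Corda API.
enum class FlowOrigin { PEER, RPC, OTHER }

const val DEFAULT_POOL = "default"

fun resolveThreadPool(
    annotatedPool: String?,       // value of @FlowThreadPool, or null if the flow is not annotated
    origin: FlowOrigin,           // how the flow was started
    configuredPools: Set<String>  // pool names taken from additionalFlowThreadPools
): String {
    // Rule 1: an annotated flow uses its pool if configured, otherwise the default pool.
    if (annotatedPool != null) {
        return if (annotatedPool in configuredPools) annotatedPool else DEFAULT_POOL
    }
    // Rule 2: unannotated flows started by a peer node.
    if (origin == FlowOrigin.PEER && "Peer-Origin" in configuredPools) return "Peer-Origin"
    // Rule 3: unannotated flows started via RPC.
    if (origin == FlowOrigin.RPC && "RPC-Origin" in configuredPools) return "RPC-Origin"
    // Rule 4: everything else runs in the default pool.
    return DEFAULT_POOL
}

fun main() {
    val pools = setOf("reporting", "Peer-Origin")
    println(resolveThreadPool("reporting", FlowOrigin.RPC, pools))     // reporting
    println(resolveThreadPool("transactions", FlowOrigin.PEER, pools)) // default
    println(resolveThreadPool(null, FlowOrigin.PEER, pools))           // Peer-Origin
    println(resolveThreadPool(null, FlowOrigin.RPC, pools))            // default
}
```

In this sketch, the second call falls back to `default` because `transactions` is not among the configured pools, and the last call does so because no `RPC-Origin` pool is defined.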
+ + +## Customizing flow-to-thread pool mapping rules + +CorDapps can override the above default flow mapping logic by defining a class which implements [the FlowSchedulerMapper interface](https://github.com/corda/corda/blob/feature/segregated-threadpools/core/src/main/kotlin/net/corda/core/flows/scheduler/mapper/FlowSchedulerMapper.kt), which is defined as follows: + +```kotlin +interface FlowSchedulerMapper { + fun getScheduler( + invocationContext: InvocationContext, + flowLogic: Class<out FlowLogic<*>>, + ourIdentity: CordaX500Name + ): String +} +``` + +The default mapping logic is available [here](https://github.com/corda/corda/blob/feature/segregated-threadpools/core/src/main/kotlin/net/corda/core/flows/scheduler/mapper/FlowSchedulerMapperImpl.kt). + +**(TODO: Adjust above links later to point to the release branch.)** + +Corda scans CorDapps at startup time for classes implementing the `FlowSchedulerMapper` interface. +Corda logs this message if it finds a single candidate: + +``` +Using custom flow scheduler mapper. Class {classname} +``` + +If the class has a constructor which accepts a set of strings, Corda uses that class as the flow mapper. +Corda aborts with an exception if it finds more than one such class, or if the class has no matching constructor. + +The constructor receives the set of available additional thread pool names as its argument. +The `getScheduler` method is called when a flow is scheduled. +It must return the name of the thread pool in which the flow should be executed. + +Users should package their custom scheduler mapper in a separate CorDapp. This simplifies adding or removing it from the system. +Also, bundling the mapper with the main application would make it impossible to install multiple such applications on the same node, because Corda aborts when it finds more than one custom scheduler mapper. 
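As a rough illustration of the contract described above, the following Kotlin sketch shows what a custom mapper might look like. So that it compiles without Corda on the classpath, it declares simplified stand-ins for `InvocationContext`, `FlowLogic`, and `CordaX500Name`, and repeats the interface locally. The shapes of these stand-ins, the `Class<out FlowLogic<*>>` parameter type, the flows `DailyReportFlow` and `PaymentFlow`, and the mapper `ReportingFirstMapper` are all assumptions made for this example. In a real CorDapp the types would come from the Corda libraries.

```kotlin
// Simplified stand-ins (assumptions) so the sketch is self-contained.
class InvocationContext
abstract class FlowLogic<out T>
data class CordaX500Name(val organisation: String)

// Local copy of the interface shape; the generic parameter type is an assumption.
interface FlowSchedulerMapper {
    fun getScheduler(
        invocationContext: InvocationContext,
        flowLogic: Class<out FlowLogic<*>>,
        ourIdentity: CordaX500Name
    ): String
}

// Hypothetical flows used only to exercise the mapper.
class DailyReportFlow : FlowLogic<Unit>()
class PaymentFlow : FlowLogic<Unit>()

// Hypothetical mapper: the constructor takes the set of additional thread pool names.
class ReportingFirstMapper(private val availablePools: Set<String>) : FlowSchedulerMapper {
    override fun getScheduler(
        invocationContext: InvocationContext,
        flowLogic: Class<out FlowLogic<*>>,
        ourIdentity: CordaX500Name
    ): String {
        // Route flows whose class name contains "Report" to the reporting pool, if it is configured.
        val isReport = flowLogic.simpleName.contains("Report")
        return if (isReport && "reporting" in availablePools) "reporting" else "default"
    }
}

fun main() {
    val mapper = ReportingFirstMapper(setOf("reporting"))
    val me = CordaX500Name("ExampleOrg")
    println(mapper.getScheduler(InvocationContext(), DailyReportFlow::class.java, me)) // reporting
    println(mapper.getScheduler(InvocationContext(), PaymentFlow::class.java, me))     // default
}
```

Checking the requested pool against `availablePools` before returning it keeps the mapper safe when the node configuration does not define a `reporting` pool. Returning `"default"` for the default pool follows the pool name shown in the startup log example and is likewise an assumption.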
+ +## Thread pool metrics + +The following [metric]({{< relref "../node/operating/monitoring-and-logging/node-metrics.md" >}}) was introduced in 4.13 specifically for thread pools: + +| Name | Description | +|--------------------------|-------------------------------------| +| QueueSizeTotal | The sum of the queue sizes of all thread pools | + +The following metrics are now reported per thread pool: + +| Previously | Corda 4.13 onward | +|------------------------------------------------|--------------------------------------------------------------------| +| ActiveThreads | ActiveThreads.{threadpoolname} | +| QueueSize | QueueSize.{threadpoolname} | +| QueueSizeOnInsert | QueueSizeOnInsert.{threadpoolname} | +| StartupQueueTime | StartupQueueTime.{threadpoolname} | +| FlowDuration.{Success/Failure}.{flowclassname} | FlowDuration.{Success/Failure}.{flowclassname}.{threadpoolname} | + +Metrics related to the default thread pool do not have a *.default* suffix; this is for backward compatibility. + diff --git a/content/en/platform/corda/4.13/enterprise/node/operating/monitoring-and-logging/node-metrics.md b/content/en/platform/corda/4.13/enterprise/node/operating/monitoring-and-logging/node-metrics.md index e9c4c42121..d829bb8050 100644 --- a/content/en/platform/corda/4.13/enterprise/node/operating/monitoring-and-logging/node-metrics.md +++ b/content/en/platform/corda/4.13/enterprise/node/operating/monitoring-and-logging/node-metrics.md @@ -64,7 +64,7 @@ There are two types of caches: *size-based* and *weight-based*. Size-based cache of entries in the cache, while weight-based caches are measured in the bytes of memory occupied by the entries. {{< note >}} -The avalable set of metrics depends on the cache type. The `maximum-size` and `sizePercent` metrics are only available for size-based caches, while `maximum-weight`, `weight`, and `weightPercent` metrics are only available for weight-based caches. +The available set of metrics depends on the cache type. 
The `maximum-size` and `sizePercent` metrics are only available for size-based caches, while `maximum-weight`, `weight`, and `weightPercent` metrics are only available for weight-based caches. {{< /note >}} {{< table >}} @@ -90,12 +90,13 @@ The avalable set of metrics depends on the cache type. The `maximum-size` and `s ## Flows +Note that metrics related to the default thread pool do not have a *.default* suffix; this is for backward compatibility. {{< table >}} |Metric Query|Description| |----------------------------------------------------------------|--------------------------------------------------------------------------------------| -|net.corda:type=Flows,name=ActiveThreads|The total number of threads running flows.| +|net.corda:type=Flows,name=ActiveThreads.{threadpool}|The total number of threads running flows for the specified [thread pool](../../../cordapps/thread-pools.md).| |net.corda:type=Flows,name=CheckpointVolumeBytesPerSecondCurrent|The current rate at which checkpoint data is being persisted.| |net.corda:type=Flows,name=CheckpointVolumeBytesPerSecondHist|A histogram indicating the rate at which bytes are being checkpointed.| |net.corda:type=Flows,name=Checkpointing Rate|The rate at which checkpoint events are occurring.| @@ -103,14 +104,18 @@ The avalable set of metrics depends on the cache type. 
The `maximum-size` and `s |net.corda:type=Flows,name=ErrorPerMinute|The rate at which flows fail with an error.| |net.corda:type=Flows,name=Finished|The total number of completed flows (both successfully and unsuccessfully).| |net.corda:type=Flows,name=InFlight|The number of in-flight flows.| -|net.corda:type=Flows,name=QueueSize|The current size of the queue for flows waiting to be executed.| -|net.corda:type=Flows,name=QueueSizeOnInsert|A histogram showing the queue size at the point new flows are added.| +|net.corda:type=Flows,name=QueueSize.{threadpool}|The current size of the queue for flows waiting to be executed for the specified thread pool| +|net.corda:type=Flows,name=QueueSizeOnInsert.{threadpool}|A histogram showing the queue size at the point new flows are added for the specified thread pool| +|net.corda:type=Flows,name=QueueSizeTotal | The sum of all thread pool queues. | |net.corda:type=Flows,name=Started|The total number of flows started.| |net.corda:type=Flows,name=StartedPerMinute|The rate at which flows are started.| -|net.corda:type=Flows,name=StartupQueueTime|This timer measures the time a flow spends queued before it is executed.| +|net.corda:type=Flows,name=StartupQueueTime.{threadpool} |This timer measures the time a flow spends queued before it is executed for the specified thread pool. | |net.corda:type=Flows,name=Success|The total number of successful flows.| |net.corda:type=Flows,name=|A histogram indicating the time taken to execute a particular action. See the following section for more details.| - +|net.corda:type=Flows,name=FlowDuration.Success.{flowclassname} | The flow duration for the default thread pool of the specified flow, if successful. | +|net.corda:type=Flows,name=FlowDuration.Failure.{flowclassname}| The flow duration for the default thread pool of the specified flow, if failed. 
| +|net.corda:type=Flows,name=FlowDuration.Success.{flowclassname}.{threadpoolname} | The flow duration for the specified thread pool of the specified flow, if successful. | +|net.corda:type=Flows,name=FlowDuration.Failure.{flowclassname}.{threadpoolname} | The flow duration for the specified thread pool of the specified flow, if failed. | {{< /table >}} diff --git a/content/en/platform/corda/4.13/enterprise/node/operating/optimizing.md b/content/en/platform/corda/4.13/enterprise/node/operating/optimizing.md index bb9a36eda3..8c2ede6767 100644 --- a/content/en/platform/corda/4.13/enterprise/node/operating/optimizing.md +++ b/content/en/platform/corda/4.13/enterprise/node/operating/optimizing.md @@ -17,9 +17,9 @@ Node performance optimisation can be achieved by adjusting node configuration, n ## Adjusting the node settings -The main parameters that can be tweaked for a Corda Enterprise node are - +The main parameters that can be tweaked for a Corda Enterprise node are: +* The number of thread pools used; for more information, see [thread pools]({{< relref "../../cordapps/thread-pools.md" >}}). * The number of flow threads (the number of flows that can be live and active in the state machine at the same time). The default value for this is twice the number of processor cores available on the machine, capped at 30. * The number of RPC threads (the number of calls the RPC server can handle in parallel, enqueuing requests to the state machine). The default for this is the number of processor cores available on the machine * The amount of heap space the node process can allocate. The default for this is 512 megabytes. @@ -55,6 +55,7 @@ enterpriseConfiguration = { The recommended approach is to start with a low number of flow threads (e.g. 1 per gigabyte of heap memory), and increase the number of threads over a number of runs. 
In tests at R3, it seems that giving a node twice the number of flow threads than RPC threads seemed a sensible number, but that might depend on the hardware and the use case, so it is worthwhile to experiment with this ratio. +You can also define additional thread pools; for more information, see [Using additional thread pools]({{< relref "../../cordapps/thread-pools.md" >}}). ## Disk access diff --git a/content/en/platform/corda/4.13/enterprise/node/setup/corda-configuration-fields.md b/content/en/platform/corda/4.13/enterprise/node/setup/corda-configuration-fields.md index 31b03e5bc8..cbf2389ceb 100644 --- a/content/en/platform/corda/4.13/enterprise/node/setup/corda-configuration-fields.md +++ b/content/en/platform/corda/4.13/enterprise/node/setup/corda-configuration-fields.md @@ -346,8 +346,14 @@ Allows fine-grained controls of various features only available in the enterpris * `tuning` - * The Corda Node configuration file section that contains performance tuning parameters for Corda Enterprise Nodes. + * The Corda Node configuration file section that contains performance tuning parameters for Corda Enterprise nodes. + - `additionalFlowThreadPools` + + * The default Corda configuration creates a single thread pool whose size is configured by the *[flowThreadPoolSize]({{< relref "#enterpriseconfiguration" >}})* parameter. You can define *multiple* thread pools and assign flows to them; for example, to prioritize particular flows and to segregate them from other flows. Thread pools are defined by adding an `additionalFlowThreadPools` array within the `tuning` object. The `additionalFlowThreadPools` array can contain one or more objects, each specifying the details of an additional thread pool. Each object contains a `threadPool` and a `size` property, which define the name of the thread pool and its size in number of threads, respectively. + + For more information and examples, see [Using additional thread pools]({{< relref "../../cordapps/thread-pools.md" >}}). 
+ - `backchainFetchBatchSize` * This is an optimization for sharing transaction backchains. Corda Enterprise nodes can request backchain items in bulk instead of one at a time. This field specifies the size of the batch. The value is just an integer indicating the maximum number of states that can be requested at a time during backchain resolution. @@ -360,18 +366,20 @@ Allows fine-grained controls of various features only available in the enterpris - `flowThreadPoolSize` - The number of threads available to handle flows in parallel. This is the number of flows + The number of threads available to handle flows in parallel by the default thread pool. This is the number of flows that can run in parallel doing something and/or holding resources like database connections. - A larger number of flows can be suspended, for example, waiting for reply from a counterparty. + + Note that this property does not affect the size of additional thread pools as described in [Using additional thread pools]({{< relref "../../cordapps/thread-pools.md" >}}). + + A larger number of flows can be suspended; for example, waiting for reply from a counterparty. When a response arrives, a suspended flow will be woken up if there are any available threads in the thread pool. - Otherwise, a currently active flow must be finished or suspended before the suspended flow can be woken + Otherwise, a currently active flow must be finished or suspended before the suspended flow can be woken up to handle the event. This can have serious performance implications if the flow thread pool is too small, as a flow cannot be suspended while in a database transaction, or without checkpointing its state first. - Corda Enterprise allows the node operators to configure the number of threads the state machine manager can use to execute flows in parallel, allowing more than one flow to be active and/or use resources at the same time. 
+ Corda Enterprise allows the node operators to configure the number of threads the state machine manager can use to execute flows in parallel, allowing more than one flow to be active and/or use resources at the same time. - The ideal value for this parameter depends on a number of factors. These include the hardware the node is running on, the performance profile of the flows, and the database instance backing the node as datastore. Every thread will open a database connection, so for n threads, the database system must have at least n+1 connections available. Also, the database - must be able to actually cope with the level of parallelism to make the number of threads worthwhile - if + The ideal value for this parameter depends on a number of factors. These include the hardware the node is running on, the performance profile of the flows, and the database instance backing the node as datastore. Every thread will open a database connection, so for n threads, the database system must have at least n+1 connections available. Also, the database must be able to actually cope with the level of parallelism to make the number of threads worthwhile - if using for example H2, any number beyond eight does not add any substantial benefit due to limitations with its internal architecture. For these reasons, the default size for the flow framework thread pool is the lower number between either the available number of processors times two, and 30. Overriding this value in the configuration allows you to specify any number. 
diff --git a/content/en/platform/corda/4.13/enterprise/performance-testing/performance-tuning.md b/content/en/platform/corda/4.13/enterprise/performance-testing/performance-tuning.md index 511654135a..72d8879835 100644 --- a/content/en/platform/corda/4.13/enterprise/performance-testing/performance-tuning.md +++ b/content/en/platform/corda/4.13/enterprise/performance-testing/performance-tuning.md @@ -82,6 +82,7 @@ The recommended approach is to start with a low number of flow threads (e.g. 1 p threads over a number of runs. In tests at R3, it seems that giving a node twice the number of flow threads than RPC threads seemed a sensible number, but that might depend on the hardware and the use case, so it is worthwhile to experiment with this ratio. +You can also define additional thread pools; for more information, see [Using additional thread pools]({{< relref "../cordapps/thread-pools.md" >}}). ### Disk access