diff --git a/.github/ISSUE_TEMPLATE/client_enhancement.md b/.github/ISSUE_TEMPLATE/client_enhancement.md index 9c630e81..b42dfaab 100644 --- a/.github/ISSUE_TEMPLATE/client_enhancement.md +++ b/.github/ISSUE_TEMPLATE/client_enhancement.md @@ -2,7 +2,7 @@ name: Client Enhancement about: Request a new feature to be added to NATS clients labels: enhancement, client -assignees: ripienaar, wallyqs, scottf, aricart, derekcollison, Jarema, piotrpio, jnmoyne, bruth, caspervonb, marthacp, levb +assignees: ripienaar, wallyqs, scottf, aricart, derekcollison, Jarema, piotrpio, jnmoyne, bruth, levb, mtmk --- ## Overview @@ -22,9 +22,10 @@ The behavior is documented in ADR-X. - [ ] JavaScript @aricart - [ ] .Net @scottf - [ ] C @levb - - [ ] Python @wallyqs + - [ ] Python @caspervonb - [ ] Ruby @wallyqs - - [ ] Rust @Jarema @caspervonb + - [ ] Rust @Jarema + - [ ] .Net V2 @mtmk ## Other Tasks diff --git a/.github/workflows/validate.yaml b/.github/workflows/validate.yaml index cf106dc0..8a7a558e 100644 --- a/.github/workflows/validate.yaml +++ b/.github/workflows/validate.yaml @@ -16,7 +16,7 @@ jobs: - name: Setup Go uses: actions/setup-go@v1 with: - go-version: 1.16 + go-version: "1.22" - name: Valid metadata and readme updated shell: bash --noprofile --norc -x -eo pipefail {0} diff --git a/.readme.templ b/.readme.templ index 1b637dec..16692a46 100644 --- a/.readme.templ +++ b/.readme.templ @@ -2,7 +2,7 @@ # NATS Architecture And Design -This repo is used to capture architectural and design decisions as a reference of the server implementation and expected client behavior. +This repository captures Architecture, Design Specifications and Feature Guidance for the NATS ecosystem. # Architecture Decision Records {{- range . }} @@ -16,29 +16,16 @@ This repo is used to capture architectural and design decisions as a reference o {{ end }} ## When to write an ADR -Not every little decision needs an ADR, and we are not overly prescriptive about the format apart from the initial header format. -The kind of change that should have an ADR are ones likely to impact many client libraries, server configuration, security, deployment -and those where we specifically wish to solicit wider community input. +We use this repository in a few ways: -For a background of the rationale driving ADRs see [Documenting Architecture Decisions](https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions) by -Michael Nygard + 1. Design specifications where a single document captures everything about a feature, examples are ADR-8, ADR-32, ADR-37 and ADR-40 + 1. Guidance on conventions and design such as ADR-6 which documents all the valid naming rules + 1. Capturing design that might impact many areas of the system such as ADR-2 + +We want to move away from using these to document individual minor decisions, moving instead to spec like documents that are living documents and can change over time. Each capturing revisions and history. ## Template Please see the [template](adr-template.md). The template body is a guideline. Feel free to add sections as you feel appropriate. Look at the other ADRs for examples. However the initial Table of metadata and header format is required to match. After editing / adding a ADR please run `go run main.go > README.md` to update the embedded index. This will also validate the header part of your ADR. 
- -## Related Repositories - - * Server [nats-server](https://github.com/nats-io/nats-server) - * Reference implementation [nats.go](https://github.com/nats-io/nats.go) - * Java Client [nats.java](https://github.com/nats-io/nats..java) - * .NET / C# client [nats.net](https://github.com/nats-io/nats.net) - * JavaScript [nats.ws](https://github.com/nats-io/nats.ws) [nats.deno](https://github.com/nats-io/nats.deno) - * C Client [nats.c](https://github.com/nats-io/nats.c) - * Python3 Client for Asyncio [nats.py](https://github.com/nats-io/nats.py) - -### Client Tracking - -There is a [Client Feature Parity](https://docs.google.com/spreadsheets/d/1VcYcKqwOp8h8zZwNSRXMS5wrdA1jZz6AumMTHZbXrmY/edit#gid=1032495336) spreadsheet that tracks the clients somewhat, but it is not guaranteed to be complete or up to date. diff --git a/README.md b/README.md index 5f769636..4ed35977 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ # NATS Architecture And Design -This repo is used to capture architectural and design decisions as a reference of the server implementation and expected client behavior. +This repository captures Architecture, Design Specifications and Feature Guidance for the NATS ecosystem. # Architecture Decision Records ## Client @@ -15,26 +15,26 @@ This repo is used to capture architectural and design decisions as a reference o |[ADR-5](adr/ADR-5.md)|server, client|Lame Duck Notification| |[ADR-6](adr/ADR-6.md)|server, client|Naming Rules| |[ADR-7](adr/ADR-7.md)|server, client, jetstream|NATS Server Error Codes| -|[ADR-8](adr/ADR-8.md)|jetstream, client, kv|JetStream based Key-Value Stores| +|[ADR-8](adr/ADR-8.md)|jetstream, client, kv, spec|JetStream based Key-Value Stores| |[ADR-9](adr/ADR-9.md)|server, client, jetstream|JetStream Consumer Idle Heartbeats| |[ADR-10](adr/ADR-10.md)|server, client, jetstream|JetStream Extended Purge| |[ADR-11](adr/ADR-11.md)|client|Hostname resolution| |[ADR-13](adr/ADR-13.md)|jetstream, client|Pull Subscribe internals| |[ADR-14](adr/ADR-14.md)|client, security|JWT library free jwt user generation| -|[ADR-15](adr/ADR-15.md)|jetstream, client|JetStream Subscribe Workflow| |[ADR-17](adr/ADR-17.md)|jetstream, client|Ordered Consumer| |[ADR-18](adr/ADR-18.md)|client|URL support for all client options| -|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views:| -|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore|JetStream based Object Stores| +|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views| +|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore, spec|JetStream based Object Stores| |[ADR-21](adr/ADR-21.md)|client|NATS Configuration Contexts| |[ADR-22](adr/ADR-22.md)|jetstream, client|JetStream Publish Retries on No Responders| |[ADR-31](adr/ADR-31.md)|jetstream, client, server|JetStream Direct Get| -|[ADR-32](adr/ADR-32.md)|client|Service API| +|[ADR-32](adr/ADR-32.md)|client, spec|Service API| |[ADR-33](adr/ADR-33.md)|jetstream, client, server|Metadata for Stream and Consumer| |[ADR-34](adr/ADR-34.md)|jetstream, client, server|JetStream Consumers Multiple Filters| |[ADR-36](adr/ADR-36.md)|jetstream, client, server|Subject Mapping Transforms in Streams| -|[ADR-37](adr/ADR-37.md)|jetstream, client|JetStream Simplification| -|[ADR-40](adr/ADR-40.md)|client|Request Many| +|[ADR-37](adr/ADR-37.md)|jetstream, client, spec|JetStream Simplification| +|[ADR-40](adr/ADR-40.md)|client, server, spec|NATS Connection| +|[ADR-43](adr/ADR-43.md)|jetstream, 
client, server|JetStream Per-Message TTL| ## Jetstream @@ -43,42 +43,45 @@ This repo is used to capture architectural and design decisions as a reference o |[ADR-1](adr/ADR-1.md)|jetstream, client, server|JetStream JSON API Design| |[ADR-2](adr/ADR-2.md)|jetstream, server, client|NATS Typed Messages| |[ADR-7](adr/ADR-7.md)|server, client, jetstream|NATS Server Error Codes| -|[ADR-8](adr/ADR-8.md)|jetstream, client, kv|JetStream based Key-Value Stores| +|[ADR-8](adr/ADR-8.md)|jetstream, client, kv, spec|JetStream based Key-Value Stores| |[ADR-9](adr/ADR-9.md)|server, client, jetstream|JetStream Consumer Idle Heartbeats| |[ADR-10](adr/ADR-10.md)|server, client, jetstream|JetStream Extended Purge| |[ADR-12](adr/ADR-12.md)|jetstream|JetStream Encryption At Rest| |[ADR-13](adr/ADR-13.md)|jetstream, client|Pull Subscribe internals| -|[ADR-15](adr/ADR-15.md)|jetstream, client|JetStream Subscribe Workflow| |[ADR-17](adr/ADR-17.md)|jetstream, client|Ordered Consumer| -|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views:| -|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore|JetStream based Object Stores| +|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views| +|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore, spec|JetStream based Object Stores| |[ADR-22](adr/ADR-22.md)|jetstream, client|JetStream Publish Retries on No Responders| |[ADR-28](adr/ADR-28.md)|jetstream, server|JetStream RePublish| |[ADR-31](adr/ADR-31.md)|jetstream, client, server|JetStream Direct Get| |[ADR-33](adr/ADR-33.md)|jetstream, client, server|Metadata for Stream and Consumer| |[ADR-34](adr/ADR-34.md)|jetstream, client, server|JetStream Consumers Multiple Filters| |[ADR-36](adr/ADR-36.md)|jetstream, client, server|Subject Mapping Transforms in Streams| -|[ADR-37](adr/ADR-37.md)|jetstream, client|JetStream Simplification| +|[ADR-37](adr/ADR-37.md)|jetstream, client, spec|JetStream Simplification| +|[ADR-42](adr/ADR-42.md)|jetstream, server|Pull Consumer Priority Groups| +|[ADR-43](adr/ADR-43.md)|jetstream, client, server|JetStream Per-Message TTL| +|[ADR-44](adr/ADR-44.md)|jetstream, server|Versioning for JetStream Assets| ## Kv |Index|Tags|Description| |-----|----|-----------| -|[ADR-8](adr/ADR-8.md)|jetstream, client, kv|JetStream based Key-Value Stores| -|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views:| +|[ADR-8](adr/ADR-8.md)|jetstream, client, kv, spec|JetStream based Key-Value Stores| +|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views| ## Objectstore |Index|Tags|Description| |-----|----|-----------| -|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views:| -|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore|JetStream based Object Stores| +|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views| +|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore, spec|JetStream based Object Stores| ## Observability |Index|Tags|Description| |-----|----|-----------| |[ADR-3](adr/ADR-3.md)|observability, server|NATS Service Latency Distributed Tracing Interoperability| +|[ADR-41](adr/ADR-41.md)|observability, server|NATS Message Path Tracing| ## Security @@ -110,32 +113,40 @@ This repo is used to capture architectural and design decisions as a reference o |[ADR-36](adr/ADR-36.md)|jetstream, client, 
server|Subject Mapping Transforms in Streams| |[ADR-38](adr/ADR-38.md)|server, security|OCSP Peer Verification| |[ADR-39](adr/ADR-39.md)|server, security|Certificate Store| +|[ADR-40](adr/ADR-40.md)|client, server, spec|NATS Connection| +|[ADR-41](adr/ADR-41.md)|observability, server|NATS Message Path Tracing| +|[ADR-42](adr/ADR-42.md)|jetstream, server|Pull Consumer Priority Groups| +|[ADR-43](adr/ADR-43.md)|jetstream, client, server|JetStream Per-Message TTL| +|[ADR-44](adr/ADR-44.md)|jetstream, server|Versioning for JetStream Assets| -## When to write an ADR +## Spec -Not every little decision needs an ADR, and we are not overly prescriptive about the format apart from the initial header format. -The kind of change that should have an ADR are ones likely to impact many client libraries, server configuration, security, deployment -and those where we specifically wish to solicit wider community input. +|Index|Tags|Description| +|-----|----|-----------| +|[ADR-8](adr/ADR-8.md)|jetstream, client, kv, spec|JetStream based Key-Value Stores| +|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore, spec|JetStream based Object Stores| +|[ADR-32](adr/ADR-32.md)|client, spec|Service API| +|[ADR-37](adr/ADR-37.md)|jetstream, client, spec|JetStream Simplification| +|[ADR-40](adr/ADR-40.md)|client, server, spec|NATS Connection| -For a background of the rationale driving ADRs see [Documenting Architecture Decisions](https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions) by -Michael Nygard +## Deprecated -## Template +|Index|Tags|Description| +|-----|----|-----------| +|[ADR-15](adr/ADR-15.md)|deprecated|JetStream Subscribe Workflow| -Please see the [template](adr-template.md). The template body is a guideline. Feel free to add sections as you feel appropriate. Look at the other ADRs for examples. However the initial Table of metadata and header format is required to match. +## When to write an ADR -After editing / adding a ADR please run `go run main.go > README.md` to update the embedded index. This will also validate the header part of your ADR. +We use this repository in a few ways: + + 1. Design specifications where a single document captures everything about a feature, examples are ADR-8, ADR-32, ADR-37 and ADR-40 + 1. Guidance on conventions and design such as ADR-6 which documents all the valid naming rules + 1. Capturing design that might impact many areas of the system such as ADR-2 -## Related Repositories +We want to move away from using these to document individual minor decisions, moving instead to spec like documents that are living documents and can change over time. Each capturing revisions and history. - * Server [nats-server](https://github.com/nats-io/nats-server) - * Reference implementation [nats.go](https://github.com/nats-io/nats.go) - * Java Client [nats.java](https://github.com/nats-io/nats..java) - * .NET / C# client [nats.net](https://github.com/nats-io/nats.net) - * JavaScript [nats.ws](https://github.com/nats-io/nats.ws) [nats.deno](https://github.com/nats-io/nats.deno) - * C Client [nats.c](https://github.com/nats-io/nats.c) - * Python3 Client for Asyncio [nats.py](https://github.com/nats-io/nats.py) +## Template -### Client Tracking +Please see the [template](adr-template.md). The template body is a guideline. Feel free to add sections as you feel appropriate. Look at the other ADRs for examples. However the initial Table of metadata and header format is required to match. 
-There is a [Client Feature Parity](https://docs.google.com/spreadsheets/d/1VcYcKqwOp8h8zZwNSRXMS5wrdA1jZz6AumMTHZbXrmY/edit#gid=1032495336) spreadsheet that tracks the clients somewhat, but it is not guaranteed to be complete or up to date. +After editing / adding a ADR please run `go run main.go > README.md` to update the embedded index. This will also validate the header part of your ADR. diff --git a/adr-template.md b/adr-template.md index 45349aa9..7f6b0235 100644 --- a/adr-template.md +++ b/adr-template.md @@ -1,11 +1,15 @@ # Title -|Metadata|Value| -|--------|-----| -|Date |YYYY-MM-DD| -|Author |@, @| -|Status |`Proposed`, `Approved` `Partially Implemented`, `Implemented`| -|Tags |jetstream, client, server| +| Metadata | Value | +|----------|-----------------------------------------------------------------| +| Date | YYYY-MM-DD | +| Author | @, @ | +| Status | `Approved` `Partially Implemented`, `Implemented`, `Deprecated` | +| Tags | jetstream, client, server | + +| Revision | Date | Author | Info | +|----------|------------|---------|----------------| +| 1 | YYYY-MM-DD | @author | Initial design | ## Context and Problem Statement diff --git a/adr/ADR-15.md b/adr/ADR-15.md index 0d514fa0..2d7f4d4b 100644 --- a/adr/ADR-15.md +++ b/adr/ADR-15.md @@ -1,11 +1,17 @@ # JetStream Subscribe Workflow -|Metadata|Value| -|--------|-----| -|Date |2021-08-11| -|Author |@kozlovic| -|Status |Partially Implemented| -|Tags |jetstream, client| +| Metadata | Value | +|----------|------------| +| Date | 2021-08-11 | +| Author | @kozlovic | +| Status | Deprecated | +| Tags | deprecated | + +## Revision History + +| Revision | Date | Description | +|----------|------------|------------------------------| +| final | 2024/07/17 | Deprecate in favor of ADR-37 | ## Context diff --git a/adr/ADR-19.md b/adr/ADR-19.md index cc6ea147..b1f28557 100644 --- a/adr/ADR-19.md +++ b/adr/ADR-19.md @@ -1,4 +1,4 @@ -# API prefixes for materialized JetStream views: +# API prefixes for materialized JetStream views |Metadata|Value| |--------|-----| @@ -9,20 +9,20 @@ ## Context -This document describes a design on how to support API prefixes for materialized JS views. +This document describes a design on how to support API prefixes for materialized JS views. API prefixes allow the client library to disambiguate access to independent JetStreams that run either in different domains or different accounts. By specifying the prefix in the API, a client program can essentially pick which one it wants to communicate with. This mechanism needs to be supported for materialized JS views as well. -## Overview +## Overview Each JetStream only listens to default API subjects with the prefix `$JS.API`. Thus, when the client uses `$JS.API`, it communicates with the JetStream, local to its account and domain. To avoid traffic going to some other JetStream the following mechanisms are in place: -1. Account: Since the API has to be imported with an altered prefix, the request will not cross account boundaries. - In a JetStream enabled account, an import without a prefix set will result in an error as JetStream is imported as well and we error on import overlaps. +1. Account: Since the API has to be imported with an altered prefix, the request will not cross account boundaries. + In a JetStream enabled account, an import without a prefix set will result in an error as JetStream is imported as well and we error on import overlaps. 2. Domain: On leaf node connections, the nats server adds in denies for `$JS.API.>`. 
When the client library sets an API prefix, all API subjects, the client publishes and subscribes to start with that instead of `$JS.API`. @@ -33,16 +33,16 @@ As messages and subscriptions cross boundaries the following happens: 2. Domain: When crossing the leaf node connection between domains, an automatically inserted mapping strips the API prefix and replaces it with `$JS.API`. This JetStream disambiguation mechanism needs to be added to materialized views such as KV and object store as well. -Specifically, we need to tag along on the same API prefix. +Specifically, we need to tag along on the same API prefix. Setting different values to reach the same JetStream is a non starter. ## Design The first token of any API or materialized view is considered the default API prefix and will mean have local semantics. -Thus, for the concrete views we treat the tokens `$KV` and `$OBJ` as the respective default API prefixes. -For publishes and subscribes the API just replaces the first token with the specified (non default) prefix. +Thus, for the concrete views we treat the tokens `$KV` and `$OBJ` as the respective default API prefixes. +For publishes and subscribes the API just replaces the first token with the specified (non default) prefix. -To share API access across accounts this is sufficient as account export/import takes care of the rest. +To share API access across accounts this is sufficient as account export/import takes care of the rest. To access an API in the same account but different domain, the `nats-server` maintaining a leaf node connection will add in the appropriate mappings from domain specific API to local API and deny local API traffic. ## KV Example @@ -52,21 +52,21 @@ For JetStream specific API calls, the local API prefix `$JS.API` has to be repla Because the JetStream specified API prefix differs from `$JS.API`, the KV API uses the same prefix as is specififed for JetStream API. -For the KV API we prefix `$KV` with `JS.from-acc1`, resulting in `JS.from-acc1.$KV`. -Thus, in order to put `key` in the bin `bin`, we send to `JS.from-acc1.$KV.bin.key` +For the KV API we prefix `$KV` with `JS.from-acc1`, resulting in `JS.from-acc1.$KV`. +Thus, in order to put `key` in the bin `bin`, we send to `JS.from-acc1.$KV.bin.key` When crossing the account boundaries, this is then translated back to `$KV.bin.key`. -Thus, the underlying stream still needs to be created with a subscription to `$KV.bin.key`. +Thus, the underlying stream still needs to be created with a subscription to `$KV.bin.key`. Domains are just a special case of API prefixes and will work the same way. The API prefix `$JS..API` will lead to `$JS..API.$KV.bin.key`. As the leaf node connection into the domain is crossed, the inserted mapping will changes the subject back to `$KV.bin.key`. -## Consequences +## Consequences -The proposed change is backwards compatible with materialized views already in use. +The proposed change is backwards compatible with materialized views already in use. -I suggested prefixing so we can have one prefix across different APIs. +I suggested prefixing so we can have one prefix across different APIs. To avoid accidental API overlaps going forward, the implication for JetStream is to NOT start the first token after the API prefix with `$`. Specifically, JetStream will never expose any functionality under `$JS.API.$KV.>`. @@ -78,7 +78,7 @@ This however will not be backwards compatible. ## Testing -Here is a server config to test your changes. +Here is a server config to test your changes. 
* The JetStream prefix to use is `fromA` * The inbox prefix to use is `forI` diff --git a/adr/ADR-20.md b/adr/ADR-20.md index 2369a40c..192bd7bc 100644 --- a/adr/ADR-20.md +++ b/adr/ADR-20.md @@ -5,13 +5,14 @@ |Date |2021-11-03| |Author |@scottf| |Status |Partially Implemented| -|Tags |jetstream, client, objectstore| +|Tags |jetstream, client, objectstore, spec| |Revision|Date|Author|Info| |--------|----|------|----| |1 |2021-11-03|@scottf|Initial design| |2 |2023-06-14|@Jarema|Add metadata| +|3 |2024-02-05|@Jarema|Add Compression| ## Context @@ -21,16 +22,17 @@ This document describes a design of a JetStream backed object store. This ADR is We intend to hit a basic initial feature set as below, with some future facing goals as indicated: -Initial feature list: +Current feature list: - Represent an object store. - Store a large quantity of related bytes in chunks as a single object. - Retrieve all the bytes from a single object - Store metadata regarding each object -- Store multiple objects in a single store +- Store multiple objects in a single store - Ability to specify chunk size - Ability to delete an object -- Ability to understand the state of the object store. +- Ability to understand the state of the object store +- Data Compression of Object Stores for NATS Server 2.10 Possible future features @@ -39,10 +41,10 @@ Possible future features - Archiving (tiered storage) - Searching/Indexing (tagging) - Versioning / Revisions -- Overriding digest algorithm +- Overriding digest algorithm - Capturing Content-Type (mime type) - Per chunk Content-Encoding (i.e. gzip) -- Read an individual chunk. +- Read an individual chunk. ## Basic Design @@ -57,7 +59,7 @@ Possible future features Protocol Naming Conventions are fully defined in [ADR-6](ADR-6.md) ### Object Store -The object store name or bucket name (`bucket`) will be used to formulate a stream name +The object store name or bucket name (`bucket`) will be used to formulate a stream name and is specified as: `restricted-term` (1 or more of `A-Z, a-z, 0-9, dash, underscore`) ### Object Id @@ -71,9 +73,9 @@ Currently `SHA-256` is the only supported digest. Please use the uppercase form when specifying the digest as in `SHA-256=IdgP4UYMGt47rgecOqFoLrd24AXukHf5-SVzqQ5Psg8=`. ### Modified Time -Modified time is never stored. +Modified time is never stored. * When putting an object or link into the store, the client should populate the ModTime with the current UTC time before returning it to the user. -* When getting an object or getting an object or link's info, the client should populate the ModTime with message time from the server. +* When getting an object or getting an object or link's info, the client should populate the ModTime with message time from the server. ### Default Settings @@ -98,9 +100,13 @@ type ObjectStoreConfig struct { Storage StorageType // stream storate_type Replicas int // stream replicas Placement Placement // stream placement + Compression bool // stream compression } ``` +* If Compression is requested in the configuration, set its value in the Stream config to `s2`. +Object Store does not expose internals of Stream config, therefore the bool value is used. + ### Stream Configuration and Subject Templates | Component | Template | @@ -132,7 +138,8 @@ type ObjectStoreConfig struct { "placement": { "cluster": "clstr", "tags": ["tag1", "tag2"] - } + }, + compression: "s2" } ``` @@ -144,7 +151,7 @@ type ObjectStoreConfig struct { type ObjectLink struct { // Bucket is the name of the other object store. 
Bucket string `json:"bucket"` - + // Name can be used to link to a single object. // If empty means this is a link to the whole store, like a directory. Name string `json:"name,omitempty"` @@ -160,7 +167,7 @@ type ObjectMetaOptions struct { } ``` -### ObjectMeta +### ObjectMeta Object Meta is high level information about an object. @@ -176,31 +183,24 @@ type ObjectMeta struct { } ``` -### ObjectInfo +### ObjectInfo -Object Info is meta plus instance information. -The fields in ObjectMeta are serialized in line as if they were -direct fields of ObjectInfo +Object Info is meta plus instance information. +The fields in ObjectMeta are serialized in line as if they were +direct fields of ObjectInfo ```go type ObjectInfo struct { ObjectMeta - Bucket string `json:"bucket"` - NUID string `json:"nuid"` - // the total object size in bytes Size uint64 `json:"size"` - ModTime time.Time `json:"mtime"` - // the total number of chunks Chunks uint32 `json:"chunks"` - // as in http, = Digest string `json:"digest,omitempty"` - Deleted bool `json:"deleted,omitempty"` } ``` @@ -248,37 +248,43 @@ The status of an object type ObjectStoreStatus interface { // Bucket is the name of the bucket Bucket() string - + // Description is the description supplied when creating the bucket Description() string // Bucket-level metadata Metadata() map[string]string - + // TTL indicates how long objects are kept in the bucket TTL() time.Duration - + // Storage indicates the underlying JetStream storage technology used to store data Storage() StorageType - + // Replicas indicates how many storage replicas are kept for the data in the bucket Replicas() int - + // Sealed indicates the stream is sealed and cannot be modified in any way Sealed() bool - + // Size is the combined size of all data in the bucket including metadata, in bytes Size() uint64 - - // BackingStore provides details about the underlying storage. + + // BackingStore provides details about the underlying storage. // Currently the only supported value is `JetStream` BackingStore() string -} + + // IsCompressed indicates if the data is compressed on disk + IsCompressed() bool +} ``` +The choice of `IsCompressed()` as a method name is idiomatic for Go, language maintainers can make a similar idiomatic +choice. + ## Functional Interfaces -### ObjectStoreManager +### ObjectStoreManager Object Store Manager creates, loads and deletes Object Stores @@ -295,7 +301,7 @@ CreateObjectStore(cfg ObjectStoreConfig) -> ObjectStore DeleteObjectStore(bucket string) ``` -### ObjectStore +### ObjectStore Storing large objects efficiently. API are required unless noted as "Optional/Convenience". @@ -320,7 +326,7 @@ PutFile(file [string/file reference]) -> ObjectInfo _Notes_ On convenience methods accepting file information only, consider that the reference could have -operating specific path information that is not transferable. One solution would be to only +operating specific path information that is not transferable. One solution would be to only use the actual file name as the object name and discard any path information. **Get** @@ -347,8 +353,8 @@ GetFile(name string, file string) **GetInfo** -GetInfo will retrieve the current information for the object. -* Do not return info for deleted objects, except with optional convenience methods. +GetInfo will retrieve the current information for the object. +* Do not return info for deleted objects, except with optional convenience methods. 
``` GetInfo(name string) -> ObjectInfo @@ -424,11 +430,11 @@ Status() -> ObjectStoreStatus ### ObjectStore Links -Links are currently under discussion whether they are necessary. +Links are currently under discussion whether they are necessary. Here is the required API as proposed. -Please note that in this version of the api, it is possible that +Please note that in this version of the api, it is possible that `obj ObjectInfo` or `bucket ObjectStore` could be stale, meaning their state -has changed since they were read, i.e. the object was deleted after it's info was read. +has changed since they were read, i.e. the object was deleted after it's info was read. **AddLink** diff --git a/adr/ADR-28.md b/adr/ADR-28.md index c6dd16d1..0bf9b117 100644 --- a/adr/ADR-28.md +++ b/adr/ADR-28.md @@ -7,6 +7,11 @@ | Status | Implemented | | Tags | jetstream, server | +## Update History +| Date | Author | Description | +|------------|---------|----------------------------------------------------| +| 2023-06-27 | @tbeets | Fix typo on JSON boolean in `headers_only` example | + ## Context and Problem Statement In some use cases it is useful for a subscriber to monitor messages that have been ingested by a stream (captured to @@ -55,7 +60,7 @@ Here is an example of a stream configuration with the RePublish option specified "republish": { "src": "one.>", "dest": "uno.>", - "headers_only": "false" + "headers_only": false }, "retention": "limits", ... omitted ... diff --git a/adr/ADR-3.md b/adr/ADR-3.md index 07d54f2c..19f10c32 100644 --- a/adr/ADR-3.md +++ b/adr/ADR-3.md @@ -1,19 +1,19 @@ # NATS Service Latency Distributed Tracing Interoperability -|Metadata|Value| -|--------|-----| -|Date |2020-05-21| -|Author |@ripienaar| -|Status |Approved| -|Tags |observability, server| +| Metadata | Value | +|----------|-----------------------| +| Date | 2020-05-21 | +| Author | @ripienaar | +| Status | Approved | +| Tags | observability, server | ## Context The goal is to enable the NATS internal latencies to be exported to distributed tracing systems, here we see a small architecture using Traefik, a Go microservice and a NATS hosted service all being observed in Jaeger. -![Jaeger](0003-jaeger-trace.png) +![Jaeger](images/0003-jaeger-trace.png) The lowest 3 spans were created from a NATS latency Advisory. 
diff --git a/adr/ADR-31.md b/adr/ADR-31.md index 426712d5..f0112062 100644 --- a/adr/ADR-31.md +++ b/adr/ADR-31.md @@ -1,11 +1,16 @@ # JetStream Direct Get -| Metadata | Value | -|----------|-----------------------------------------------| -| Date | 2022-08-03 | -| Author | @mh, @ivan, @derekcollison, @alberto, @tbeets | -| Status | Implemented | -| Tags | jetstream, client, server | +| Metadata | Value | +|----------|-----------------------------------------------------------| +| Date | 2022-08-03 | +| Author | @mh, @ivan, @derekcollison, @alberto, @tbeets, @ripienaar | +| Status | Implemented | +| Tags | jetstream, client, server | + +| Revision | Date | Author | Info | +|----------|------------|------------|------------------------------------------------| +| 1 | 2022-08-08 | @tbeets | Initial design | +| 2 | 2024-03-06 | @ripienaar | Adds Multi and Batch behaviors for Server 2.11 | ## Context and motivation @@ -70,20 +75,27 @@ When Allow Direct is true, each of the stream's servers configures a responder a Clients may make requests with the same payload as the Get message API populating the following server struct: ```text -Seq uint64 `json:"seq,omitempty"` -LastFor string `json:"last_by_subj,omitempty"` -NextFor string `json:"next_by_subj,omitempty"` +Seq uint64 `json:"seq,omitempty"` +LastFor string `json:"last_by_subj,omitempty"` +NextFor string `json:"next_by_subj,omitempty"` +Batch int `json:"batch,omitempty"` +MaxBytes int `json:"max_bytes,omitempty"` +StartTime *time.Time `json:"start_time,omitempty"` ``` Example request payloads: -* `{last_by_subj: string}` - get the last message having the subject + * `{seq: number}` - get a message by sequence +* `{last_by_subj: string}` - get the last message having the subject * `{next_by_subj: string}` - get the first message (lowest seq) having the specified subject +* `{start_time: string}` - get the first message at or newer than the time specified in RFC 3339 format (since server 2.11) * `{seq: number, next_by_subj: string}` - get the first message with a seq >= to the input seq that has the specified subject +* `{seq: number, batch: number, next_by_subj: string}` - gets up to batch number of messages >= than seq that has specified subject +* `{seq: number, batch: number, next_by_subj: string, max_bytes: number}` - as above but limited to a maximum size of messages received in bytes #### Subject-Appended Direct Get API -The purpose of this form is so that environmnents may choose to apply subject-based interest restrictions (user permissions +The purpose of this form is so that environments may choose to apply subject-based interest restrictions (user permissions within an account and/or cross-account export/import grants) such that only specific subjects in stream may be read (vs any subject in the stream). @@ -93,21 +105,66 @@ where `last_by_subj` is derived by the token (or series of tokens) following the It is an error (408) if a client calls Subject-Appended Direct Get and includes a request payload. +#### Batched requests + +Using the `batch` and `max_bytes` keys one can request multiple messages in a single API call. + +The server will send multiple messages without any flow control to the reply subject, it will send up to `max_bytes` messages. 
When `max_bytes` is unset, the server will use the `max_pending` configuration setting or the server default (currently 64MB).
+
+After the batch is sent, a zero-length payload message will be sent with the `Nats-Num-Pending` and `Nats-Last-Sequence` headers set, which clients can use to determine if further batch calls are needed. It will also have the `Status` header set to `204` with the `Description` header being `EOB`.
+
+When requests are made against servers that do not support `batch`, the first response will be received and nothing will follow. Old servers can be detected by the absence of the `Nats-Num-Pending` header in the first reply.
+
+#### Multi-subject requests
+
+Multiple subjects can be requested in the same manner as a batch. In this mode we support consistent point-in-time reads by allowing a group of subjects to be read as they were at a point in time - assuming the stream holds enough historical data.
+
+To help inform proper use of this feature vs just using a consumer, any multi-subject request may only match up to 1024 subjects. Any more will result in a `413` status reply.
+
+Using a request like `{"multi_last":["$KV.USERS.1234.>"]}`, the latest values for all subjects below that wildcard will be returned.
+
+Specific data for a user could be requested using `{"multi_last":["$KV.USERS.1234.name", "$KV.USERS.1234.address"]}`. Rather than getting all the user data, only the values for two specific keys will be returned.
+
+To facilitate consistent multi-key reads, the `up_to_seq` and `up_to_time` keys can be added, which restrict the results up to a certain point in time.
+
+Imagine we have a new bucket and we did:
+
+```
+$ nats kv put USERS 1234.name Bob # message seq 1
+$ nats kv put USERS 1234.surname Smith # message seq 2
+$ nats kv put USERS 1234.address "1 Main Street" # message seq 3
+$ nats kv put USERS 1234.address "10 Oak Lane" # message seq 4 updates message 3
+```
+
+If we did a normal multi read using `{"multi_last":["$KV.USERS.1234.>"]}` we would get the address `10 Oak Lane` returned. However, to get a record as it was at a certain point in the past we could supply the sequence or time: adding `"up_to_seq":3` to the request will return the address `1 Main Street` along with the other data. Likewise, assuming there was a noticeable gap in time between address changes, `up_to_time` could be used as a form of temporal querying.
+
+A `batch` parameter can be added to restrict the result set to a certain size; otherwise the server will decide when to end the batch using the same `EOB` marker message seen in batched mode, with the addition of the `Nats-UpTo-Sequence` header.
+
+When the server cannot send any more data it will respond, like the batch case above, with a zero-length payload message including the `Nats-Num-Pending` and `Nats-Last-Sequence` headers, enabling clients to determine if further batch calls are needed. In addition, it will also have the `Status` header set to `204` with the `Description` header being `EOB`. The `Nats-UpTo-Sequence` header will be set indicating the last message in the stream that matched the criteria. This number would be used in subsequent requests as the `up_to_seq` value to ensure batches of multi-gets are done around a consistent point in time.
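+
+As an illustration of the batched form described above, here is a minimal Go sketch using the
+nats.go client. The stream name and subject filter are hypothetical, and the sketch assumes the
+direct get responders listen on `$JS.API.DIRECT.GET.<stream>`:
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"time"
+
+	"github.com/nats-io/nats.go"
+)
+
+func main() {
+	nc, err := nats.Connect(nats.DefaultURL)
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer nc.Close()
+
+	// Ask for up to 50 messages with sequence >= 1 matching "orders.>".
+	req := []byte(`{"seq": 1, "batch": 50, "next_by_subj": "orders.>"}`)
+
+	// Replies arrive as individual messages, so use an inbox subscription
+	// rather than a single request/reply call.
+	inbox := nats.NewInbox()
+	sub, err := nc.SubscribeSync(inbox)
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer sub.Unsubscribe()
+
+	if err := nc.PublishRequest("$JS.API.DIRECT.GET.ORDERS", inbox, req); err != nil {
+		log.Fatal(err)
+	}
+
+	for {
+		msg, err := sub.NextMsg(2 * time.Second)
+		if err != nil {
+			// A timeout without an EOB marker can indicate an older server
+			// that answered with a single message and no batch support.
+			log.Fatal(err)
+		}
+		if msg.Header.Get("Status") == "204" { // zero-length EOB marker ends the batch
+			fmt.Printf("batch done, %s still pending, last sequence %s\n",
+				msg.Header.Get("Nats-Num-Pending"), msg.Header.Get("Nats-Last-Sequence"))
+			return
+		}
+		fmt.Printf("seq %s %s: %q\n",
+			msg.Header.Get("Nats-Sequence"), msg.Header.Get("Nats-Subject"), msg.Data)
+	}
+}
+```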
+
 #### Response Format

-Direct Get may return an error code:
+Responses may include these status codes:
+
+- `204` indicates the end of a batch of messages, the `Description` header will have the value `EOB`
 - `404` if the request is valid but no matching message found in stream
 - `408` if the request is empty or invalid
+- `413` when a multi subject get matches too many subjects

-> Error code is returned as a header, e.g. `NATS/1.0 408 Bad Request`. Success returned as `NATS/1.0` with no code.
+Error code is returned as a header, e.g. `NATS/1.0 408 Bad Request`. Success is returned as `NATS/1.0` with no code.
+
+Direct Get replies contain the message along with the following message headers:

-Otherwise, Direct Get reply contains the message along with the following message headers:
 - `Nats-Stream`: stream name
 - `Nats-Sequence`: message sequence number
 - `Nats-Time-Stamp`: message publish timestamp
 - `Nats-Subject`: message subject
+- `Nats-Num-Pending`: when batched, the number of messages left in the stream matching the batch parameters
+- `Nats-Last-Sequence`: when batched, the stream sequence of the previous message
+- `Nats-UpTo-Sequence`: when doing multi subject gets, the sequence that should be used in following requests to ensure consistent reads
+
+A _regular_ (not JSON-encoded) NATS message is returned (from the stream store).

-> A _regular_ (not JSON-encoded) NATS message is returned (from the stream store).

 ## Example calls
@@ -180,4 +237,4 @@ Reply:
 ```text
 HMSG _INBOX.4NogKOPzKEWTqhf4hFIUJV.yo2tE1ep 1 28 28
 NATS/1.0 408 Bad Request
-```
\ No newline at end of file
+```
diff --git a/adr/ADR-32.md b/adr/ADR-32.md
index c8af54b3..9372b051 100644
--- a/adr/ADR-32.md
+++ b/adr/ADR-32.md
@@ -1,11 +1,21 @@
 # Service API

-| Metadata | Value |
-| -------- | --------------------- |
-| Date | 2022-11-23 |
-| Author | @aricart |
-| Status | Partially Implemented |
-| Tags | client |
+| Metadata | Value |
+|----------|--------------|
+| Date | 2022-11-23 |
+| Author | @aricart |
+| Status | Implemented |
+| Tags | client, spec |
+
+## Release History
+
+| Revision | Date | Description |
+|----------|------------|-------------------------------------------|
+| 1 | 2022-11-23 | Initial release |
+| 2 | 2023-09-12 | Configurable queue group |
+| 3 | 2023-10-07 | Add version regex info |
+| 4 | 2023-11-10 | Explicit naming |
+| 5 | 2024-08-08 | Optional queue groups, immutable metadata |

 ## Context and Problem Statement

@@ -23,14 +33,17 @@ Service configuration relies on the following:

 - `name` - really the _kind_ of the service. Shared by all the services that have
   the same name. This `name` can only have `A-Z, a-z, 0-9, dash, underscore`.
-- `version` - a SemVer string - impl should validate that this is SemVer
+- `version` - a SemVer string - impl should validate that this is SemVer.
+  One of the [official semver](https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string)
+  regexes should be used.
 - `description` - a human-readable description about the service (optional)
 - `metadata` - (optional) an object of strings holding free form metadata about
   the deployed instance implemented consistently with
-  [Metadata for Stream and Consumer ADR-33](ADR-33.md).
+  [Metadata for Stream and Consumer ADR-33](ADR-33.md). Must be immutable once set.
 - `statsHandler` - an optional function that returns unknown data that can be
   serialized as JSON.
The handler will be provided the endpoint for which it is building a `EndpointStats` +- `queueGroup` - overrides a default queue group. All services are created using a function called `addService()` where the above options are passed. The function returns an object/struct that represents the @@ -107,7 +120,7 @@ name: string, id: string, /** * The version of the service -* Should be validated using official semver regexp: +* Should be validated using official semver regexp: * https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string */ version: string @@ -142,8 +155,18 @@ Returns a JSON having the following structure: ```typescript // EndpointInfo { + /** + * The name of the endpoint + */ name: string, + /** + * The subject on which the endpoint is listening. + */ subject: string, + /** + * Queue group to which this endpoint is assigned to + */ + queue_group: string, /** * Metadata of a specific endpoint */ @@ -182,7 +205,7 @@ The type for this is `io.nats.micro.v1.ping_response`. type: string, name: string, id: string, - version: string, + version: string, metadata: Record, /** * Individual endpoint stats @@ -196,17 +219,21 @@ The type for this is `io.nats.micro.v1.ping_response`. /** * EndpointStats - */ + */ { /** * The name of the endpoint */ name: string; /** - * The subject on which the endpoint is registered + * The subject on which the endpoint is listening. */ subject: string; /** + * Queue group to which this endpoint is assigned to + */ + queue_group: string; + /** * The number of requests received by the endpoint */ num_requests: number; @@ -245,6 +272,7 @@ A group serves as a common prefix to all endpoints registered in it. A group can be created using `addGroup(name)` method on a Service. Group name should be a valid NATS subject or an empty string, but cannot contain `>` wildcard (as group name serves as subject prefix). +Group can have a default `queueGroup` for endpoints that overrides service `queueGroup`. Group should expose following methods: @@ -264,14 +292,18 @@ Each service endpoint consists of the following fields: Multiple endpoints can have the same names. - `handler` - request handler - see [Request Handling](#Request-Handling) - `metadata` - an optional `Record` providing additional - information about the endpoint + information about the endpoint. Must be immutable once set. - `subject` - an optional NATS subject on which the endpoint will be registered. A subject is created by concatenating the subject provided by the user with group prefix (if applicable). If subject is not provided, use `name` instead. +- `queueGroup` - optional override for a service and group `queueGroup`. -Enpoints can be created either on the service directly (`Service.addEndpoint()`) +Endpoints can be created either on the service directly (`Service.addEndpoint()`) or on a group (`Group.addEndpoint`). +Clients should provide an idiomatic way to set no `queueGroup` when unset the subscription +for the endpoint will be a normal subscribe instead of a queue subscribe. + ## Error Handling Services may communicate request errors back to the client as they see fit, but @@ -304,13 +336,13 @@ of `respond()`. ## Request Handling -All service request handlers operate under the queue group `q`. This means that +All service request handlers operate under the default queue group `q`. This means that in order to scale up or down all the user needs to do is add or stop services. 
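+
+As a sketch of how the default and overridden queue groups interact, here is a minimal example
+using the Go client's `micro` package; the service name, subject and queue group values are
+illustrative:
+
+```go
+package main
+
+import (
+	"log"
+
+	"github.com/nats-io/nats.go"
+	"github.com/nats-io/nats.go/micro"
+)
+
+func main() {
+	nc, err := nats.Connect(nats.DefaultURL)
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer nc.Close()
+
+	// Instances sharing a queue group load-balance requests between them;
+	// instances started with different queue groups each receive the request.
+	_, err = micro.AddService(nc, micro.Config{
+		Name:       "echo",
+		Version:    "1.0.0",
+		QueueGroup: "region-a", // omit to use the default queue group "q"
+		Endpoint: &micro.EndpointConfig{
+			Subject: "svc.echo",
+			Handler: micro.HandlerFunc(func(req micro.Request) {
+				_ = req.Respond(req.Data())
+			}),
+		},
+	})
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	select {} // keep the service running
+}
+```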
-Note the name of the queue group is fixed to `q` and cannot be changed otherwise
-different implementations on different queue groups will respond to the same
-request.
+It's possible to send a request to multiple services, for example to minimize response time by using
+the quickest responder. To achieve that, run some service instances with different `queueGroup` values.

-For each configured endpoint, a queue subscription should be created.
+For each configured endpoint, a queue subscription should be created, unless the option to create
+a normal (non-queue) subscription is activated.

 > Note: Handler subject does not contain the `$SRV` prefix. This prefix is
 > reserved for internal handlers.
@@ -319,3 +351,9 @@ The handlers specified by the client to process requests should operate as any
 standard subscription handler. This means that no assumption is made on whether
 returning from the callback signals that the request is completed. The framework
 will dispatch requests as fast as the handler returns.
+
+### Naming
+
+For consistency of documentation and understanding by users, clients that implement the
+service API and tooling that interacts with services should use the term "service" or
+"services".
diff --git a/adr/ADR-36.md b/adr/ADR-36.md
index 9d8332b9..3db605d2 100644
--- a/adr/ADR-36.md
+++ b/adr/ADR-36.md
@@ -19,20 +19,98 @@ See [ADR-30](ADR-30.md) for Core NATS subject mapping and a description of the a

 ## Features introduced

-The new features introduced by the [PR](https://github.com/nats-io/nats-server/pull/3814) allow the application of subject mapping transformations in two places in the space configuration:
+The new features introduced by version 2.10 of the NATS server allow the application of subject mapping transformations in multiple places in the stream configuration:

+- You can apply a subject mapping transformation as part of a Stream mirror.
 - You can apply a subject mapping transformation as part of a Stream source.
   - Amongst other use cases, this enables the ability to do sourcing between KV buckets (as the name of the bucket is part of the subject name in the KV bucket streams, and therefore has to be transformed during the sourcing as the name of the sourcing bucket is different from the name(s) of the bucket(s) being sourced).
 - You can apply a subject mapping transformation at the ingress (input) of the stream, meaning after it's been received on Core NATS, or mirrored or sourced from another stream, and before limits are applied (and it gets persisted). This subject mapping transformation is only that, it does not filter messages, it only transforms the subjects of the messages matching the subject mapping source.
   - This enables the ability to insert a partition number as a token in the message subjects.
+- You can also apply a subject mapping transformation as part of the re-publishing of messages.
+
+Subject mapping transformation can be seen as an extension of subject filtering: there cannot be any subject mapping transformation without an associated subject filter.
+
+A subject filtering and mapping transform is composed of two parts: a subject filter (the 'source' part of the transform) and the destination transform (the 'destination' part of the transform). An empty (i.e. `""`) destination transform means _NO transformation_ of the subject.

 ![](images/stream-transform.png)

-In addition, it is now possible to source from the same stream more than once.
+Just like streams and consumers can now have more than one single subject filter, mirror and sources can have more than one set of subject filter and transform destination. + +Just like with consumers you can either specify a single subject filter and optional subject transform destination or an array of subject transform configs composed of a source filter and optionally empty transform destination. + +In addition, it is now possible to source not just from different streams but also from the same stream more than once. + +If you define a single source with multiple subject filters and transforms, in which case the ordering of the messages is guaranteed to be preserved, there can not be any overlap between the filters. If you define multiple sources from the same stream, subject filters can overlap between sources thereby making it possible to duplicate messages from the sourced stream, but the order of the messages between the sources is not guaranteed to be preserved. + +For example if a stream contains messages on subjects "foo", "bar" and "baz" and you want to source only "foo" and "bar" from that stream you could specify two subject transforms (with an empty destination) in a single source, or you can source twice from that stream once with the "foo" subject filter and a second time with the "bar" subject filter. + +## Stream config structure changes + +From the user's perspective these features manifest themselves as new fields in the Stream Configuration request and Stream Info response messages. + +In Mirror and Sources : +- Additional `"subject_transforms"` array in the `"sources"` array and in `"mirror"` containing objects made of two string fields: `"src"` and `"dest"`. Note that if you use the `"subject_transforms"` array then you can _NOT_ also use the single string subject filters. The `"dest"` can be empty or `""` in which case there is no transformation, just filtering. + +At the top level of the Stream Config: +- Additional `"subject_transform"` field in Stream Config containing two strings: `"src"` and `"dest"`. + +## KV bucket sourcing + +Subject transforms in streams open up the ability to do sourcing between KV buckets. The client library implements this by automatically adding the subject transform to the source configuration of the underlying stream for the bucket doing the sourcing. + +The transform in question should map the subject names from the sourced bucket name to the sourcing's bucket name. + +e.g. 
if bucket B sources A the transform config for the source from stream A in stream B should have the following transform in the SubjectTransforms array for that StreamSource: +``` +{ + "src": "$KV.A.>", + "dest": "$KV.B.>" +} +``` + +## Examples + +A stream that mirrors the `sourcedstream` stream using two subject filters and transform (in this example `foo` is transformed, but `bar` is not): + +```JSON +{ + "name": "sourcingstream", + "retention": "limits", + "max_consumers": -1, + "max_msgs_per_subject": -1, + "max_msgs": -1, + "max_bytes": -1, + "max_age": 0, + "max_msg_size": -1, + "storage": "file", + "discard": "old", + "num_replicas": 1, + "duplicate_window": 120000000000, + "mirror": + { + "name": "sourcedstream", + "subject_transforms": [ + { + "src": "foo", + "dest": "foo-transformed" + }, + { + "src": "bar", + "dest": "" + } + ] + }, + "sealed": false, + "deny_delete": false, + "deny_purge": false, + "allow_rollup_hdrs": false, + "allow_direct": false, + "mirror_direct": false +} +``` -For example if a stream contains messages on subjects "foo", "bar" and "baz" and you want to source only "foo" and "bar" from that stream you could stream twice from that stream once with the "foo" subject filter and a second time with the "bar" subject filter. +A stream that sources from the `sourcedstream` stream twice, each time using a single subject filter and transform: -E.g. ```JSON { "name": "sourcingstream", @@ -50,11 +128,21 @@ E.g. "sources": [ { "name": "sourcedstream", - "filter_subject": "foo" + "subject_transforms": [ + { + "src": "foo", + "dest": "foo-transformed" + } + ] }, { "name": "sourcedstream", - "filter_subject": "bar" + "subject_transforms": [ + { + "src": "bar", + "dest": "bar-transformed" + } + ] } ], "sealed": false, @@ -66,12 +154,8 @@ E.g. } ``` -From the user's perspective these features manifest themselves as new fields in the Stream Configuration request and Stream Info response messages. - -- Additional `"subject_transform_dest"` field in objects in the `"sources"` array of the Stream Config -- Additional `"subject_transform"` field in Stream Config containing two strings: `"src"` and `"dest"` +A Stream that sources from 2 streams and has a subject transform: -E.g. ```JSON { "name": "foo", @@ -89,23 +173,24 @@ E.g. "sources": [ { "name": "source1", - "filter_subject": "stream1.foo.>", - "subject_transform_dest": "foo.>" + "filter_subject": "stream1.foo.>" }, { "name": "source1", - "filter_subject": "stream1.bar.>", - "subject_transform_dest": "bar.>" - }, - { - "name": "source2", - "filter_subject": "stream2.foo.>", - "subject_transform_dest": "foo.>" + "filter_subject": "stream1.bar.>" }, { "name": "source2", - "filter_subject": "stream2.bar.>", - "subject_transform_dest": "bar.>" + "subject_transforms": [ + { + "src": "stream2.foo.>", + "dest": "foo2.>" + }, + { + "src": "stream2.bar.>", + "dest": "bar2.>" + } + ] } ], "subject_transform": { @@ -122,6 +207,6 @@ E.g. 
``` ## Client implementation PRs -- [jsm.go](https://github.com/nats-io/jsm.go/pull/436) -- [nats.go](https://github.com/nats-io/nats.go/pull/1200) -- [natscli](https://github.com/nats-io/natscli/pull/695) \ No newline at end of file +- [jsm.go](https://github.com/nats-io/jsm.go/pull/436) [and](https://github.com/nats-io/jsm.go/pull/461) +- [nats.go](https://github.com/nats-io/nats.go/pull/1200) [and](https://github.com/nats-io/nats.go/pull/1359) +- [natscli](https://github.com/nats-io/natscli/pull/695) [and](https://github.com/nats-io/natscli/pull/845) \ No newline at end of file diff --git a/adr/ADR-37.md b/adr/ADR-37.md index c18ad988..16b09894 100644 --- a/adr/ADR-37.md +++ b/adr/ADR-37.md @@ -5,7 +5,14 @@ | Date | 2022-11-23 | | Author | @aricart, @derekcollison, @tbeets, @scottf, @Jarema, @piotrpio | | Status | Approved | -| Tags | jetstream, client | +| Tags | jetstream, client, spec | + +## Release History + +| Revision | Date | Description | +|----------|------------|-----------------------------------------------------| +| 1 | 2023-05-30 | Initial stable release | +| 2 | 2024-06-07 | Change server reconnect behavior during `consume()` | ## Context and Problem Statement @@ -46,8 +53,10 @@ Example set of methods on JetStreamContext: - `deleteStream(streamName)` - `listStreams()` - Consumer operations: - - `addConsumer(streamName, consumerConfig)` - `getConsumer(streamName, consumerName)` + - `createConsumer(streamName, consumerConfig)` + - `updateConsumer(streamName, consumerConfig)` + - `createOrUpdateConsumer(streamName, consumerConfig)` - `deleteConsumer(streamName, consumerName)` - `accountInfo()` @@ -61,8 +70,10 @@ messages. Streams also allow for and managing consumers. Example set of methods on Stream: - operations on consumers: - - `addConsumer(consumerConfig)` - `getConsumer(consumerName)` + - `createConsumer(consumerConfig)` + - `updateConsumer(consumerConfig)` + - `createOrUpdateConsumer(consumerConfig)` - `deleteConsumer(consumerName)` - operations a stream: - `purge(purgeOpts)` @@ -288,6 +299,10 @@ Not Telegraphed: - 408 Request Timeout - 409 Message Size Exceeds MaxBytes +Calls to `next()` and `fetch()` should be concluded when the pull is terminated. On the other hand `consume()` should recover +while maintaing its state (e.g. pending counts) by issuing a new pull request unless the status is `409 Consumer Deleted` or `409 Consumer is push based` in +which case `consume()` call should conclude in an implementation specific way idiomatic to the language being used. + ###### Idle heartbeats `Consume()` should always utilize idle heartbeats. Heartbeat values are calculated as follows: @@ -305,16 +320,17 @@ Clients should detect server disconnect and reconnect. When a disconnect event is received, client should: -- Reset the heartbeat timer. - Pause the heartbeat timer. +- Stop publishing new pull requests. When a reconnect event is received, client should: -- Resume the heartbeat timer. -- Check if consumer exists (fetch consumer info). If consumer is not available, terminate `Consume()` execution with error. -This operation may have to be retried several times as JetStream may not be immediately available. +- Reset the heartbeat timer. - Publish a new pull request. +Clients should never terminate the `Consume()` call on disconnect and reconnect +events and should not check if consumer is still available after reconnect. + ###### Message processing algorithm Below is the algorithm for receiving and processing messages. 
@@ -339,7 +355,7 @@ was described in previous sections.
 6. Verify error type:
    - if message contains `Nats-Pending-Messages` and `Nats-Pending-Bytes` headers, go to #7
    - verify if error should be terminal based on [Status handling](#status-handling),
-     then issue a warning/error (if required) and terminate if necessary.
+     then issue a warning/error (if required) and conclude the call if necessary.
 7. Read the values of `Nats-Pending-Messages` and `Nats-Pending-Bytes` headers.
 8. Subtract the values from pending messages count and pending bytes count respectively.
 9. Go to #1.
diff --git a/adr/ADR-40.md b/adr/ADR-40.md
new file mode 100644
index 00000000..0c8918ae
--- /dev/null
+++ b/adr/ADR-40.md
@@ -0,0 +1,486 @@
+# NATS Connection
+
+| Metadata | Value                |
+| -------- | -------------------- |
+| Date     | 2023-10-12           |
+| Author   | @Jarema              |
+| Status   | Implemented          |
+| Tags     | client, server, spec |
+
+| Revision | Date       | Author   | Info                         |
+| -------- | ---------- | -------- | ---------------------------- |
+| 1        | 2023-10-12 | @Jarema  | Initial draft                |
+| 2        | 2024-06-24 | @aricart | Added protocol error section |
+
+## Summary
+
+This document describes how clients connect to the NATS server or NATS cluster.
+That includes topics like:
+
+- connection process
+- reconnect
+- tls
+- discoverability of other nodes in a cluster
+
+## Motivation
+
+Ensure a consistent way for clients to establish and maintain a connection with
+the NATS server, and provide consistent and predictable behaviour across the
+ecosystem.
+
+## Guide-level Explanation
+
+### Establishing the connection
+
+**TODO** Add WebSocket flow.
+
+#### Minimal example
+
+1. Client initiates a network connection to the Server.
+2. Server responds with [INFO][INFO] json.
+3. Client sends [CONNECT][CONNECT] json.
+4. Client and Server start to exchange PING/PONG messages to detect if the
+   connection is alive.
+
+**Note** If a client sets the `protocol` field in [CONNECT][CONNECT] to 1 or
+greater, the Server can send subsequent [INFO][INFO] messages on an ongoing
+connection. The client needs to handle them appropriately and update its server
+list and server info.
+
+#### Auth flow
+
+TODO
+
+#### TLS
+
+There are two flows available in the Server that enable TLS.
+
+##### Standard NATS TLS (Explicit TLS)
+
+This method is available in all NATS Server versions.
+
+1. Client initiates a network connection to the Server.
+2. Server responds with [INFO][INFO] json.
+3. If Server [INFO][INFO] contains `tls_required` set to `true`, or the client
+   has a TLS requirement set to `true`, the client performs a TLS upgrade.
+4. Client sends [CONNECT][CONNECT] json.
+5. Client and Server start to exchange PING/PONG messages to detect if the
+   connection is alive.
+
+##### TLS First (Implicit TLS)
+
+This method has been available since NATS Server 2.10.4.
+
+There are two prerequisites to use this method:
+
+1. Server config has enabled the `handshake_first` field in the `tls` block.
+2. The client has the `tls_first` option set to true.
+
+**handshake_first** has these possible values:
+
+- **`false`**: handshake first is disabled. Default value.
+- `true`: handshake first is enabled and enforced. Clients that do not use this
+  flow will fail to connect.
+- `duration` (i.e. 2s): a hybrid mode that will wait a given time, allowing the
+  client to follow the `tls_first` flow. After the duration has expired, `INFO`
+  is sent, enabling the standard client TLS flow.
+- `auto`: same as above, with some default value.
By default it waits 50ms for + TLS upgrade before sending the [INFO][INFO]. + +The flow itself is flipped. TLS is established before the Server sends INFO: + +1. Client initiate a network connection to the Server. +2. Client upgrades the connection to TLS. +3. Server sends [INFO][INFO] json. +4. Client sends [CONNECT][CONNECT] json. +5. Client and Server start to exchange PING/PONG messages to detect if the + connection is alive. + +### Servers discovery + +**Note**: Server will send back the info only + +When Server sends back [INFO][INFO]. It may contain additional URLs to which the +client can make connection attempts. The client should store those URLs and use +them in the Reconnection Strategy. + +A client should have an option to turn off using advertised URLs. By default, +those URLs are used. + +**TODO**: Add more in-depth explanation how topology discovery works. + +### Reconnection Strategies (In progress) + +#### On-Demand reconnect + +Client should have a way that allows users to force reconnection process. This +can be useful for refreshing auth or rebalancing clients. + +When triggered, client will drop connection to the current server and perform +standard reconnection process. That means that all subscriptions and consumers +should be resubscribed and their work resumed after successful reconnect where +all reconnect options are respected. + +For most clients, that means having a `reconnect` method on the +Client/Connection handle. + +#### Detecting disconnection + +There are two methods that clients should use to detect disconnections: + +1. Missing two consecutive PONGs from the Server (number of missing PONGs can be + configured). +2. Handling errors from network connection. + +#### Reconnect process + +When the client detects disconnection, it starts to reconnect attempts with the +following rules: + +1. Immediate reconnect attempt + - The client attempts to reconnect immediately after finding out it has been + disconnected. +2. Exponential backoff with jitter + - When the first reconnect fails, the backoff process should kick in. Default + Jitter should also be included to avoid thundering herd problems. +3. If the Server returned additional URLs, the client should try reconnecting in + random order to each Server on the list, unless randomization option is + disabled in the client [options](#Retain-servers-order). +4. Successful reconnect resets the timers +5. Upon reconnection, clients should resubscribe to all created subscriptions. + +If there is any change in the connection state - connected/disconnected, the +client should have some way of notifying the user about it. This can be a +callback function or any other idiomatic mechanism in a given language for +reporting asynchronous events. + +**Disconnect buffer** Most clients have a buffer that will aggregate messages on +the client side in case of disconnection. It will fill up the buffer and send +pending messages as soon as connection is restored. If buffer will be filled +before the connection is restored - publish attempts should return error noting +that fact. + +## Reference-level Explanation + +### Client options + +Although clients should provide sensible defaults for handling the connection, +in many cases, it requires some tweaking. The below list defines what can be +changed, what it means, and what the defaults are. + +#### Ping interval + +**default**: 2 minutes + +As the client or server might not know that the connection is severed, NATS has +Ping/Pong protocol. 
Client can set at what intervals it will send a PING to the +server, expecting PONG. If two consecutive PONGs are missed, connection is +marked as lost triggering reconnect attempt. + +It's worth noting that shorter PING intervals can improve responsiveness of the +client to network issues, but it also increases the load on the whole NATS +system and the network itself with each added client. + +#### Max Pings Out + +**default**: 2 + +Sets number of allowed outstanding PONG responses for the client PINGs before +marking client as disconnected and triggering reconnect. + +#### Retry on failed initial connect + +**default: false** + +By default, if a client makes a connection attempt, if it fails, `connect` +returns an error. In many scenarios, users might want to allow the first attempt +to fail as long as clients continue the efforts and notify the progress. + +When this option is enabled, the client should start the initial connection +process and return the standard NATS connection/client handle while in +background connection attempts are continued. + +The client should not wait for the first connection to succeed or fail, as in +some network scenarios, this can take much time. If the first attempt fails, a +standard [Reconnect process] should be performed. + +#### Max reconnects + +**default: 3 / none + +Specifies the number of consecutive reconnect attempts the client will make +before giving up. This is useful for preventing `zombie services` from endlessly +reaching the servers, but it can also be a footgun and surprise for users who do +not expect that the client can give up entirely. + +#### Connection timeout + +**default 5s** + +Specifies how long the client will wait for the network connection to be +established. In some languages, this can hang eternally, and timeout mechanics +might be necessary. In others, the network connection method might have a way to +configure its timeout. + +#### Custom reconnect delay + +**Default: none** + +If fine-grained control over reconnect attempts intervals is needed, this option +allows users to specify one. Implementation should make sense in a given +language. For example, it can be a callback +`fn reconnect(attempt: int) -> Duration`. + +#### Disconnect buffer + +If given client supports storing messages during disconnect periods, this option +allows to tweak the number of stored messages. It should also allow disable +buffering entirely. + +#### Tls required + +**default: false** If set, the client enforces the TLS, whether the Server also +requires it or not. + +If `tls://` scheme is used in the connection string, this also enforces tls. + +#### Ignore advertised servers + +**default: false** When connecting to the Server, it may send back a list of +other servers in the cluster of which it is aware. This can be very helpful for +discoverability and removes the need for the client to pass all servers in +`connect`, but it also may be unwanted if, for example, some servers URLs are +unreachable for a given client. + +#### Retain servers order + +**default: false** By default, if many server addresses are passed in the +connect string or array, the client will try to connect to them in random order. +This helps healthy connection distribution, but if in a specific case list +should be treated as a preference list, randomization may be turned off. + +This function can be expressed "enable retaining order" or "disable +randomization" depending on what is more idiomatic in given language. 
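
As a non-normative illustration, the sketch below shows how several of the options described above surface in the Go client (nats.go). Option names differ per client and per version; nothing in this document mandates these particular names.

```go
package main

import (
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Illustrative only: maps the options described above onto nats.go.
	nc, err := nats.Connect("nats://demo.nats.io:4222",
		nats.PingInterval(2*time.Minute),   // Ping interval
		nats.MaxPingsOutstanding(2),        // Max Pings Out
		nats.RetryOnFailedConnect(true),    // Retry on failed initial connect
		nats.MaxReconnects(-1),             // Max reconnects (-1 means unlimited here)
		nats.Timeout(5*time.Second),        // Connection timeout
		nats.ReconnectBufSize(8*1024*1024), // Disconnect buffer size in bytes
		nats.DontRandomize(),               // Retain servers order
		// Custom reconnect delay: full control over the backoff curve.
		nats.CustomReconnectDelay(func(attempts int) time.Duration {
			return time.Duration(attempts) * 500 * time.Millisecond
		}),
	)
	if err != nil {
		panic(err)
	}
	defer nc.Close()
}
```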
+ +### Protocol Commands and Grammar + +#### INFO + +[LINK][LINK] + +Send by the Server before or after establishing TLS, depending of flow used. It +contains information about the Server, the nonce, and other server URLs to which +the client can connect. + +#### CONNECT + +[CONNECT][CONNECT] + +Send by the client in response to INFO. Contains information about client, +including optional signature, client version and connection options. + +#### Ping Pong + +This is a mechanism to detect broken connections that may not be reported by the +network connection in a given language. + +If the Server sends `PING`, the client should answer with `PONG`. If the Client +sends `PING`, the Server should answer with `PONG`. + +If two (configurable) consecutive `PONGs are missed, the client should treat the +connection as broken, and it should start reconnect attempts. + +The default interval for PING is 2 minutes. + +### Error Handling (TODO) + +The `-ERR` protocol message is an important signal for clients about things that +are incorrect from the perspective of Permissions or Authorization. + +A note about implementation - the current format of the errors is simple, but +messages are not typed in a way that is simple for clients to understand what +should happen - in many cases the server will disconnect th client. In other +cases it is just a runtime error that an update in configuration at runtime may +re-enable the client to do what was rejected previously. However, the client has +no way to know whether the server will disconnect it or not. + +In cases where the error is surfaced during connection it creates the nuance +that it is difficult for the client to know if the error is recoverable (simply +attempt to reconnect later) or not. In some cases a client connection will never +resolve unless the number of maximum reconnect attempts is specified. + +#### Permissions Violation + +`Permissions Violation` means that the client tried to publish or subscribe on a +subject for which it has no permissions. This type of error can happen or +surface at any time, as changes to permissions intentionally or not can happen. +This means that even if the subscription has been working, it is possible that +it will not in the future if the permissions are altered. + +The message will include `/(Publish|Subscription) to (\S+)/` this will indicate +whether the error is related to a publish or subscription operation. Note that +you should be careful in how you write your matchers as the message could change +slightly or sport additional information (as you'll see below). + +For publish permission errors, it's hard to notify the client at the point of +failure unless the client is synchronous. But the standard async error +notification should be sufficient. In the case of request reply, since there's a +subscription handling the response, this means that you can search subscriptions +related to request and reply subjects, and notify them via the response +mechanism for the request depending on the type of operation that was rejected. + +For subscription errors, a second level parse for `/using queue "(\S+)"/` will +yield the `queue` if any that was used during the subscribe operation. This +means that a client may have permissions on a subscription, but not in a +specific queue or some other permutation of the subject/queue. 
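
A minimal, non-normative sketch of the two-level parse described above (operation and subject first, then the optional queue). The exact wording of server errors may evolve, so matchers should stay tolerant:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matchers taken from the patterns described above; the server may append
// additional detail to the message, so they only anchor on the known parts.
var (
	opRe    = regexp.MustCompile(`(Publish|Subscription) to (\S+)`)
	queueRe = regexp.MustCompile(`using queue "(\S+)"`)
)

// parsePermissionViolation extracts the operation, subject and optional
// queue group from a Permissions Violation error message.
func parsePermissionViolation(errMsg string) (op, subject, queue string, ok bool) {
	m := opRe.FindStringSubmatch(errMsg)
	if m == nil {
		return "", "", "", false
	}
	op = m[1]
	subject = strings.Trim(m[2], `"`) // the subject is usually quoted
	if q := queueRe.FindStringSubmatch(errMsg); q != nil {
		queue = q[1]
	}
	return op, subject, queue, true
}

func main() {
	msg := `Permissions Violation for Subscription to "orders.>" using queue "workers"`
	op, subj, queue, ok := parsePermissionViolation(msg)
	fmt.Println(ok, op, subj, queue)
}
```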
+ +The server unfortunately doesn't make it easy for the client to know the actual +subscription (SID) hosting the error but the logic for processing is simple: +notify the first subscription that matches the subject and queue name (this +assumes you track the subject and queue name in your internal subscription +representation) - the server will send multiple error protocol messages (one per +offense) so if multiple subscriptions, you will be able to notify all of them. + +For subscriptions, errors are _terminal_ for the subscription, as the server +cancels the clients interest. so the client will never get any messages on it. +It is very convenient for client user code to receive an error using some +mechanism associated with the subscription in question as this will simplify the +handling of the client code. + +It is also useful to have some sort of Promise/Future/etc that will get resolved +when a subscription closes (will not yield any more messages) - The +Promise/Future can resolve to an error or void (not thrown) which the client can +inspect for the reason if any why the subscription closed. Throwing an error is +discouraged, as this would create a possibility of crashing the client. Clients +can then use this information to perform their own error handling which may +require taking the service offline if the subscription is vital for its +operation. + +Note that regardless of a localized error handling mechanism, you should also +notify the async error handler as you don't know exactly where the client code +is looking for errors. + +#### Authorization Violation + +`Authorization Violation` is sent whenever the credentials for a client are not +accepted. This is followed by a server initiated disconnect. Clients will +normally reconnect (depending on their connection options). If the client +closes, this should be reported as the last error. + +#### User Authentication Expired + +`User Authentication Expired` protocol error happens whenever credentials for +the client expire while the client is connected to the server. It is followed by +a server disconnect. This error should be notified in the async handler. On +reconnect the client is going to be rejected with `Authorization Violation` and +follow its reconnect logic. + +#### Account Expiration + +`Account Authentication Expired` is sent whenever the account JWT expires and a +client for the account is connected. This will result in a disconnect initiated +by the server. On reconnect the client will be rejected with +`Authorization Violation` until the account configuration is refreshed on the +server. The client will follow its reconnect logic. + +#### Secure Connection - TLS Required + +`Secure Connection - TLS Required` is sent if the client is trying to connect on +a server that requires TLS. + +> [!IMPORTANT] +> The client should have done extensive ServerInfo investigation +> and determined that this would have been a failure when initiating the +> connection. + +#### Maximum Number of Connections + +`maximum connections exceeded` server limit on number of connections reached. +Server will send to the client the `-ERR maximum connections exceeded`, client +possibly go in reconnect loop. + +The server can also send +`Connection throttling is active. Please try again later.` when too many TLS +connections are in progress. This should be treated as +`maximum connections exceeded` or reworked on the server to send this error +instead. 
Note that this can happen if when the tls server option +[`connection_rate_limit`](https://github.com/nats-io/nats-server/blob/main/server/opts.go#L4557) +is set. + +#### Max Payload Violation + +`Maximum Payload Violation` is sent to the client if it attempts to publish more +data than it is allowed by `max_payload`. The server will disconnect the client +after sending the protocol error. Note that clients should test payload sizes +and fail publishes that exceed the server configuration, as this allow the error +to be localized when possible to the user code that caused the error. + +#### Maximum Subscriptions Exceeded + +`maximum subscriptions exceeded` is sent to the client if attempts to create +more subscriptions than it the account is allowed. The error is not terminal to +the connection. + +#### User Authentication Revoked + +`User Authentication Revoked` this is reported when an account is updated and +the user is revoked in the account. On connects where the user is already +revoked, it is just an `Authorization Error`. On actual experimentation, the +client never saw `User Authentication Revoked`, and instead was just +disconnected. Reconnect was greeted with a `Authorization Error`. + +#### Invalid Client Protocol + +`invalid client protocol` sent to the client if the protocol version from the +client doesn't match. Client is disconnected when this error is sent. + +> [!NOTE] +> Currently, this is not a concern since presumably, a server will be +> able to deal with protocol version 1 when protocol upgrades. + +#### No Responders Requires Headers + +`no responders requires headers support` sent if the client requests no +responder, but rejects headers. Client is disconnected when this error is sent. +Current clients hardcode `headers: true`, so this error shouldn't be seen by +clients. + +> [!IMPORTANT] +> `headers` connect option shouldn't be exposed by the clients - +> this is a holdover from when clients opted in to `headers`. + +#### Failed Account Registration + +`Failed Account Registration` an internal error while registering an account. +(Looking for reproducible test). + +#### Invalid Publish Subject + +`Invalid Publish Subject` (this requires the server in pedantic mode). Client is +not disconnected when this error is sent. Note that for subscribe operations, +depending on the separator (space) you may inadvertently specify a queue. In +such cases there will be no error, your subscription will simply be part of a +queue. If multiple spaces or some other variant, the server will treat it as a +protocol error. + +#### Unknown Protocol Operation + +`Unknown Protocol Operation` this error is sent if the server doesn't understand +a command. This is followed by a disconnect. + +#### Other Errors (not necessarily seen by the client) + +- `maximum account active connections exceeded` not notified to the client, the + client connecting will be disconnected (seen as a connection refused.) + +### Security Considerations + +Discuss any additional security considerations pertaining to the TLS +implementation and connection handling. + +## Future Possibilities + +Smart Reconnection could be a potential big improvement. 
+ +[INFO]: https://beta-docs.nats.io/ref/protocols/client#info +[CONNECT]: https://beta-docs.nats.io/ref/protocols/client#connect diff --git a/adr/ADR-41.md b/adr/ADR-41.md new file mode 100644 index 00000000..fd820811 --- /dev/null +++ b/adr/ADR-41.md @@ -0,0 +1,292 @@ +# NATS Message Path Tracing + + +| Metadata | Value | +|----------|-----------------------| +| Date | 2024-02-22 | +| Author | @ripienaar, @kozlovic | +| Status | Implemented | +| Tags | observability, server | + + +| Revision | Date | Author | Info | +|----------|------------|------------|----------------| +| 1 | 2024-02-22 | @ripienaar | Initial design | + +## Context and Problem Statement + +As NATS networks become more complex with Super Clusters, Leafnodes, multiple Accounts and JetStream knowing the path that messages take through the system is hard to predict. + +Further, when things go wrong, it's hard to know where messages can be lost, denied or delayed. + +This describes a feature of the NATS Server 2.11 that allows messages to be traced throughout the NATS network. + +## Prior Work + +NATS supports tracking latency of Request-Reply service interactions, this is documented in [ADR-3](adr/ADR-3.md). + +## Design + +When tracing is activated every subsystem that touches a message will produce Trace Events. These Events are aggregated per server and published to a destination subject. + +A single message published activating tracing will therefor result in potentially a number of Trace messages - one from each server a message traverse, each holding potentially multiple Trace Events. + +At present the following _Trace Types_ are supported + + * Ingress (`in`) - The first event that indicates how the message enters the Server, client connection, route, gateway etc + * Subject Mapping (`sm`) - Indicates the message got transformed using mappings that changed it's target subject + * Stream Export (`se`) - Indicates the message traversed a Stream Export to a different account + * Service Import (`si`) - Indicates the message traversed a Service Import to a different account + * JetStream (`js`) - Indicates the message reached a JetStream Stream + * Egress (`eg`) - The final event that indicates how the message leaves a Server + +## Activation + +Not all messages are traced and there is no flag to enable it on all messages. Activation is by adding Headers to the message. + +### Ad-hoc activation + +This mode of Activation allows headers to be added to any message that declares where to deliver the traces and inhibit delivery to the final application. + +| Header | Description | +|-------------------|------------------------------------------------------------------------------------------------------| +| `Nats-Trace-Dest` | A subject that will receive the Trace messages | +| `Nats-Trace-Only` | Prevents delivery to the final client, reports that it would have been delivered (`1`, `true`, `on`) | +| `Accept-Encoding` | Enables compression of trace payloads (`gzip`, `snappy`) | + +The `Nats-Trace-Only` header can be used to prevent sending badly formed messages to subscribers, the servers will trace the message to its final destination and report the client it would be delivered to without actually delivering it. Additionally when this is set messages will also not traverse any Route, Gateway or Leafnode that does not support the Tracing feature. 
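
A minimal sketch of ad-hoc activation from the Go client: subscribe to a trace destination, then publish a message carrying the headers above. `Nats-Trace-Only` is optional and shown only to suppress final delivery.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://demo.nats.io:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	// Receive the per-server trace messages on this subject.
	sub, err := nc.SubscribeSync("my.trace.dest")
	if err != nil {
		panic(err)
	}

	// Publish the message to be traced with the activation headers.
	msg := nats.NewMsg("orders.new")
	msg.Header.Set("Nats-Trace-Dest", "my.trace.dest")
	msg.Header.Set("Nats-Trace-Only", "true") // trace without delivering to subscribers
	msg.Data = []byte("hello")
	if err := nc.PublishMsg(msg); err != nil {
		panic(err)
	}

	// Each server on the path emits one trace message holding its events.
	for {
		tm, err := sub.NextMsg(2 * time.Second)
		if err != nil {
			break
		}
		fmt.Printf("trace from server: %s\n", tm.Data)
	}
}
```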
+ +### Trace Context activation + +Messages holding the standard `traceparent` header as defined by the [Trace Context](https://www.w3.org/TR/trace-context/) specification can trigger tracing based on the `sampled` flag. + +In this case no `Nats-Trace-Dest` header can be set to indicate where the messages will flow, it requires enabling using an account setting: + +``` +accounts { + A { + users: [{user: a, password: pwd}] + msg_trace: { + dest: "a.trace.subj" + sampling: "100%" + } + } +} +``` + +Here we set the `msg_trace` configuration for the `A` account, this enables support for Trace Context and will deliver all messages with the `traceparent` header and the `sampled` flag set to true. + +Note the `sampling`, here set to 100% which is the default, will further trigger only a % of traces that have the `sampled` value in the `traceparent` header. This allow you to specifically sample only a subset of messages traversing NATS while your micro services will sample all. + +When this feature is enabled any message holding the `Nats-Trace-Dest` header as in the previous section will behave as if the `traceparent` header was not set at all. In essence the Ad-Hoc mode has precedence. + +### Cross Account Tracing + +By default a trace will end at an account boundary when crossing an Import or Export. This is a security measure to restrict visibility into foreign accounts and require opt-in to allow. + +``` +accounts { + B { + exports = [ + // on a service the direction of flow is into the exporting + // account, so the exporter need to allow tracing + { service: "nats.add", allow_trace: true } + ] + + imports = [ + // on a stream import the direction of flow is from exporter into + // the importer, so the importer need to allow tracing + {stream: {account: A, subject: ticker}, allow_trace: true} + ] + } +} +``` + +## nats CLI + +The current `main` and nightly builds of `nats` includes the `nats trace` command that is built upon these features. + +This uses a helper package to receive, parse, sort and present a series of related Traces, the source can be found in [github.com/nats-io/jsm.go/api/server/tracing](https://github.com/nats-io/jsm.go/tree/main/api/server/tracing). + +``` +$ nats trace demo +Tracing message route to subject demo + +Client "NATS CLI Version development" cid:4219 cluster:"sfo" server:"n2-sfo" version:"2.11.0-dev" +==> Gateway "n1-lon" gid:727 + Gateway "n2-sfo" gid:735 cluster:"lon" server:"n1-lon" version:"2.11.0-dev" + ~~> Leafnode "leaf1" lid:5391 + Leafnode "n1-lon" lid:8 server:"leaf1" version:"2.11.0-tracing8" + --C Client "NATS CLI Version development" cid:10 subject:"demo" + +Legend: Client: --C Router: --> Gateway: ==> Leafnode: ~~> JetStream: --J Error: --X + +Egress Count: + + Gateway: 1 + Leafnode: 1 + Client: 1 +``` + +Here we can see: + +1. Message entered the `sfo` Cluster via a Client in the server `n2-sfo`. +2. The server `n2-sfo` published it to its Gateway called `n1-lon` using connection `gid:727` +3. The server `n1-lon` received from its Gateway called `n2-sfo` with connection `gid:735` +4. The server `n1-lon` published it to its Leafnode connection `leaf1` using connection `lid:5391` +5. The server `leaf1` received from its Leafnode called `leaf1` with connection `lid:8` +6. The server `leaf1` published the message to a Client with the connection name "NATS CLI Version development" over connection `cid:10` + +This is a set of 3 Trace messages holding 6 Trace Events in total. 
+ +## Trace Message Formats + +The full detail of the trace message types are best found in the NATS Server source code in [server/msgtrace.go](https://github.com/nats-io/nats-server/blob/main/server/msgtrace.go), here we'll call out a few specifics about these messages. + +Given this sample message - a `MsgTraceEvent`: + +```json +{ + "server": { + "name": "n3-lon", + "host": "n3-lon.js.devco.net", + "id": "NCCPZOHDJ4KZQ35FU7EDFNPWXILA7U2VJAUWPG7IFDWUADDANNVOWKRV", + "cluster": "lon", + "domain": "hub", + "ver": "2.11.0-dev", + "tags": [ + "lon" + ], + "jetstream": true, + "flags": 3, + "seq": 151861, + "time": "2024-02-22T13:18:48.400821739Z" + }, + "request": { + "header": { + "Nats-Trace-Dest": [ + "demo" + ] + }, + "msgsize": 36 + }, + "hops": 1, + "events": [ + { + "type": "in", + "ts": "2024-02-22T13:18:48.400595984Z", + "kind": 0, + "cid": 5390, + "name": "NATS CLI Version development", + "acc": "one", + "subj": "x" + }, + { + "type": "eg", + "ts": "2024-02-22T13:18:48.40062456Z", + "kind": 0, + "cid": 5376, + "name": "NATS CLI Version development", + "sub": "x" + }, + { + "type": "eg", + "ts": "2024-02-22T13:18:48.400649361Z", + "kind": 1, + "cid": 492, + "name": "n1-lon", + "hop": "1" + } + ] +} +``` + +Lets have a quick look at some key fields: + +|Key|Notes| +|`server`|This is a standard Server Info structure that you will see in many of our advisories, just indicates which server sent the event| +|`request`|Details about the message being traced, more details about `Nats-Trace-Hop` below| +|`hops`|How many remote destination (routes, gateways, Leafnodes) this server is sending the message to| +|`events`|A list of `MsgTraceEvents` that happened within the server related to this message, see below| + +### Sorting + +These events form a Directed Acyclic Graph that you can sort using the Hop information. The Server that sends the message to another Route, Gateway or Leafnode will indicate it will publish to a number of `hops` and each server that receives the message will have the `Nats-Trace-Hop` header set in its `request`. + +The origin server - the server that receives the initial message with the `Nats-Trace-Dest` header - will have a hops count indicating to how many other servers (routes, leafs, gateways) it has forwarded the message to. This is not necessarily the total number of servers that the message will traverse. + +Each time a server forwards a message to a remote server, it appends a number to its own `Nats-Trace-Hop` header value. Since the origin server does not have one, if it forwards a message to say a ROUTE, that remote server would receive the message with a new header `Nats-Trace-Hop` with the value of `1`, then the origin server forwards to a GATEWAY, and that server would receive the message with `Nats-Trace-Hop` value set to `2`. + +Each of these servers in turn, if forwarding a message to another server, would add an incremental number to their existing `Nats-Trace-Hop` value, which would result in `1.1`, and `2.1`, etc.. + +Take a simple example of the origin server that has a LEAF server, which in turn as another LEAF server (daisy chained). If a message with tracing is forwarded, the trace message emitted from the origin server would have hops to `1` (since the origin server forwarded directly only to the first LEAF remote server). The trace emitted from the first LEAF would have a `Nats-Trace-Hop` header with value `1`. It would also have hops set to `1` (assuming there was interest in the last LEAF server). 
Finally, the last LEAF server would have a `Nats-Trace-Hop` of `1.1` and would have no hops field (since value would be 0). + +You would associate message either by using a unique `Nats-Trace-Dest` subject or by parsing the `traceparent` to get the trace and span IDs. + +### Trace Events + +The `events` list contains all the events that happened in a given server. We see here that there are different types of event according to the table below: + +| Type | Server Data Type | Description | +|------|--------------------------|-----------------| +| `in` | `MsgTraceIngress` | Ingress | +| `sm` | `MsgTraceSubjectMapping` | Subject Mapping | +| `se` | `MsgTraceStreamExport` | Stream Export | +| `si` | `MsgTraceServiceImport` | Service Import | +| `js` | `MsgTraceJetStream` | JetStream | +| `eg` | `MsgTraceEgress` | Egress | + +We also see a `kind` field, this holds the NATS Servers client Kind, at present these are the values: + +| Kind | Server Constant | Description | +|------|--------------------|---------------------| +| `0` | `server.CLIENT` | Client Connection | +| `1` | `server.ROUTER` | Router Connection | +| `2` | `server.GATEWAY` | Gateway Connection | +| `3` | `server.SYSTEM` | The NATS System | +| `4` | `server.LEAF` | Leafnode Connection | +| `5` | `server.JETSTREAM` | JetStream | +| `6` | `server.ACCOUNT` | Account | + +You may not see all Kinds in traces but this is the complete current list. + +Using these Kinds and Types we can understand the JSON data above: + + * `Ingress` from `Client` connection `cid:5390` + * `Egress` to `Client` connection `cid:5376` + * `Egress` to `Router` connection `rid:492` called `n1-lon` + +This indicates a Client connected to this server (`n3-lon`) published the message it was received by a client connected to the same server and finally sent over a Route to a client on another server (`n1-lon`). + +### Specific Trace Types + +Most of the trace types are self explanatory but I'll call out a few specific here, the names match those in `msgtrace.go` in the Server source. + +#### MsgTraceEgress and MsgTraceIngress + +These messages indicates the message entering and leaving the server via a `kind` connection. It includes the `acc` it's being sent for, subscription (`sub`) and Queue Group (`queue`) on the Egress. + +**Note** that for non CLIENT egresses, the subscription and queue group will be omitted. This is because NATS Servers optimize and send a single message across servers, based on a known interest, and let the remote servers match local subscribers. So it would be misleading to report the first match that caused a server to forward a message across the route as the egress' subscription. + +Here in cases of ACLs denying a publish the `error` will be filled in: + +```json +{ + "events": [ + { + "type": "in", + "ts": "2024-02-22T13:44:18.326658996Z", + "kind": 0, + "cid": 5467, + "name": "NATS CLI Version development", + "acc": "one", + "subj": "deny.x", + "error": "Permissions Violation for Publish to \"deny.x\"" + } + ] +} +``` + +#### MsgTraceJetStream + +The values are quite obvious, the `nointerest` boolean indicates that an `Interest` type Stream did not persist the message because it had no interest. 
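
For tooling that consumes these trace messages directly, rather than via the jsm.go helper package mentioned earlier, a minimal decoding sketch is shown below. The struct fields mirror the sample payload in this document and are deliberately partial; the authoritative definitions live in `server/msgtrace.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Partial mirror of a trace event as seen in the sample payload above;
// see server/msgtrace.go for the authoritative types.
type traceEvent struct {
	Type  string `json:"type"` // in, sm, se, si, js, eg
	Kind  int    `json:"kind"` // client, router, gateway, ...
	CID   uint64 `json:"cid,omitempty"`
	Name  string `json:"name,omitempty"`
	Acc   string `json:"acc,omitempty"`
	Subj  string `json:"subj,omitempty"`
	Error string `json:"error,omitempty"`
}

type traceMsg struct {
	Server struct {
		Name    string `json:"name"`
		Cluster string `json:"cluster"`
	} `json:"server"`
	Hops   int          `json:"hops,omitempty"`
	Events []traceEvent `json:"events"`
}

func main() {
	payload := []byte(`{"server":{"name":"n3-lon","cluster":"lon"},"hops":1,
	  "events":[{"type":"in","kind":0,"cid":5390,"acc":"one","subj":"x"}]}`)

	var tm traceMsg
	if err := json.Unmarshal(payload, &tm); err != nil {
		panic(err)
	}
	for _, ev := range tm.Events {
		fmt.Printf("%s: %s event kind=%d subj=%s\n", tm.Server.Name, ev.Type, ev.Kind, ev.Subj)
	}
}
```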
\ No newline at end of file diff --git a/adr/ADR-42.md b/adr/ADR-42.md new file mode 100644 index 00000000..9a142b93 --- /dev/null +++ b/adr/ADR-42.md @@ -0,0 +1,239 @@ +# Pull Consumer Priority Groups + +| Metadata | Value | +|----------|-------------------| +| Date | 2024-05-14 | +| Author | @ripienaar | +| Status | Approved | +| Tags | jetstream, server | + + +| Revision | Date | Author | Info | +|----------|------------|------------|----------------| +| 1 | 2024-05-14 | @ripienaar | Initial design | +| 2 | 2024-10-15 | @jarema | Add client implementation details | + +## Context and Problem Statement + +We have a class of feature requests that all come down to adding behaviours on a consumer that affects delivery to groups +of clients who interact with the consumer. + +Some examples: + + * A single client should receive all messages from a consumer and if it fails another should take over + * Groups of related clients should be organised such that certain clients only receive messages when various conditions are met + * Groups of clients should access a Consumer but some should have higher priority than others for receiving messages + * A consumer should deliver messages to clients in a specific grouping such that related messages all go to the same client group + +The proposed feature here address some of these needs while paving the way for future needs to be addressed within this +framework and under this umbrella feature called `Pull Consumer Groups`. + +The current focus is around providing building blocks that can be used to solve higher order problems client-side. + +Related prior work: + + * Proposed [Consumer Groups](https://github.com/nats-io/nats-architecture-and-design/pull/36) + * Proposed [Partitioned consumer groups and exclusive consumer](https://github.com/nats-io/nats-architecture-and-design/pull/263) + * Proposed [Consumer Owner ID](https://github.com/nats-io/nats-server/pull/5157) + +## General Overview + +We introduce 2 settings on `ConsumerConfig` that activates these related features: + +```go +{ + PriorityGroups: ["", ...], + PriorityPolicy: "" +} +``` + +The presence of the `PriorityPolicy` set to a known policy activates the set of features documented here, `PriorityGroups` +require at least one entry. + +Technically in some of the proposed policies the `PriorityGroups` have no real meaning today, but we keep it for consistency +and allow for future features to be added that would be per-group without then requiring all clients to be updated. +Future message grouping features would require groups to be listed here. + +In the initial implementation we should limit `PriorityGroups` to one per consumer only and error should one be made with +multiple groups. In future iterations multiple groups will be supported along with dynamic partitioning of stream +data. + +This is only supported on Pull Consumers, configuring this on a Push consumer must raise an error. + +> [!NOTE] +> Some aspects of configuration is updatable, called out in the respective sections, but we cannot support updating a +> consumer from one with groups to one without and vice versa due to the internal state tracking that is required and +> the fact that generally clients need adjustment to use these features. We have identified some mitigation approaches +> to ease migration but will wait for user feedback. We also cannot switch between different policies. + +## Priority Policies + +### `overflow` policy + +Users want certain clients to only get messages when certain criteria are met. 
+ +Imagine jobs are best processed locally in `us-east-1` but at times there might be so many jobs in that region that +`us-west-1` region handling some overflow, while less optimal in terms of transit costs and latency, would be desirable +to ensure serving client needs as soon as possible. + +The `overflow` policy enables Pull requests to be served only if criteria like `num_pending` and `num_ack_pending` are +above a certain limit, otherwise those Pull requests will sit idle in the same way that they would if no messages were +available (receiving heartbeats etc). + +```go +{ + PriorityGroups: ["jobs"], + PriorityPolicy: "overflow", + AckPolicy: "explicit", + // ... other consumer options +} +``` + +Here we state that the Consumer has one group called `jobs`, it is operating on the `overflow` policy and requires `explicit` +Acks, any other Ack policy will produce an error. If we force this ack policy in normal use we should error in Pedantic mode. + +Pull requests will have the following additional fields: + + * `"group": "jobs"` - the group the pull belongs to, pulls not part of a valid group will result in an error + * `"min_pending": 1000` - only deliver messages when `num_pending` for the consumer is >= 1000 + * `"min_ack_pending: 1000` - only deliver messages when `ack_pending` for the consumer is >= 1000 + +If `min_pending` and `min_ack_pending` are both given either being satisfied will result in delivery (boolean OR). + +In the specific case where MaxAckPending is 1 and a pull is made using `min_pending: 1` this should only be served when +there are no other pulls waiting. This means we have to give priority to pulls without conditions over those with when +considering the next pull that will receive a message. + +Once multiple groups are supported consumer updates could add and remove groups. + +### `pinned_client` policy + +Users want to have a single client perform all the work for a consumer, but they also want to have a stand-by client that +can take over when the primary, aka `pinned` client, fails. + +**NOTE: We should not describe this in terms of exclusivity as there is no such guarantee, there will be times when one +client think it is pinned when it is not because the server switched.** + +The `pinned_client` policy provides server-side orchestration for the selection of the pinned client. + +```go +{ + PriorityGroups: ["jobs"], + PriorityPolicy: "pinned_client", + PriorityTimeout: 120*time.Second, + AckPolicy: "explicit", + // ... other consumer options +} +``` + +This configuration states: + + * We have 1 group defined and all pulls have to belong to this group + * The policy is `pinned_client` that activates these behaviors + * When a pinned client has not done any pulls in the last 120 seconds the server will switch to another client + * AckPolicy has to be `explicit`. If we force this ack policy in normal use we should error in Pedantic mode + +A pull request will have the following additional fields: + + * `"group": "jobs"` - the group the pull belongs to, pulls not part of a valid group will result in an error + * `"id": "xyz"` - the pinned client will have this ID set to the one the server last supplied (see below), otherwise + this field is absent + +After selecting a new pinned client, the first message that will be delivered to this client, and all future ones, will +include a Nats-Pin-Id: xyz header. The client that gets this message should at that point ensure that all future pull +requests have the same ID set. 
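
A non-normative sketch of the client-side bookkeeping this implies: remember the ID from the `Nats-Pin-Id` header and echo it on every subsequent pull request for the group. The pull request struct below is illustrative only, not a wire format definition.

```go
package main

import (
	"fmt"

	"github.com/nats-io/nats.go"
)

// Illustrative pull request fields for a priority-group consumer;
// the real request also carries batch, expires and so on.
type pullRequest struct {
	Group string `json:"group"`
	ID    string `json:"id,omitempty"`
}

// pinTracker remembers the pin ID handed out by the server.
type pinTracker struct {
	group string
	id    string
}

// onMessage records the pin ID whenever the header is present.
func (p *pinTracker) onMessage(m *nats.Msg) {
	if id := m.Header.Get("Nats-Pin-Id"); id != "" {
		p.id = id
	}
}

// nextRequest builds the next pull request, echoing the current pin ID.
func (p *pinTracker) nextRequest() pullRequest {
	return pullRequest{Group: p.group, ID: p.id}
}

// onPinLost clears the ID so the next pull is sent without one,
// e.g. after the server rejects a pull carrying a stale ID.
func (p *pinTracker) onPinLost() {
	p.id = ""
}

func main() {
	p := &pinTracker{group: "jobs"}

	// Simulate receiving the first pinned message from the server.
	m := nats.NewMsg("deliver.subject")
	m.Header.Set("Nats-Pin-Id", "xyz")
	p.onMessage(m)

	fmt.Printf("next pull: %+v\n", p.nextRequest()) // carries group and id

	p.onPinLost()
	fmt.Printf("after unpin: %+v\n", p.nextRequest()) // id cleared
}
```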
+ +When a new pinned client needs to be picked - after timeout, admin action, first delivery etc, this process is followed: + + 1. Stop delivering messages for this group, wait for all in-flight messages to be completed, continue to serve heartbeats + 2. Pick the new pinned client + 3. Store the new pinned `nuid` + 4. Deliver the message to the new pinned client with the ID set + 5. Create an advisory that a new pinned client was picked + 6. Respond with a 4xx header to any pulls, including waiting ones, that have a different ID set. Client that received this error will clear the ID and pull with no ID + +If no pulls from the pinned client is received within `PriorityTimeout` the server will switch again using the same flow as above. + +Future iterations of this feature would introduce the concept of a priority field so clients can self-organise but we decided +to deliver that in a future iteration. + +Clients can expose call-back notifications when they become pinned (first message with `Nats-Pin-Id` header is received) and +when they lose the pin (they receive the 4xx error when doing a pull with a old ID). + +A new API, `$JS.API.CONSUMER.UNPIN...`, can be called which will clear the ID and trigger a client switch as above. + +Consumer state to include a new field `PriorityGroups` of type `[]PriorityGroupState`: + +```go +type PriorityGroupState struct { + Group string `json:"name"` + PinnedClientId string `json:"pinned_id,omitempty"` + PinnedTs *time.Time `json:"pinned_ts,omitempty"` +} +``` + +Future iterations will include delivery stats per group. + +Once multiple groups are supported consumer updates could add and remove groups. Today only the `PriorityTimeout` +setting supports being updated. + +#### Advisories + +We will publish advisories when a switch is performed and when a pin is lost. + +```golang +const JSAdvisoryConsumerPinnedPre = "$JS.EVENT.ADVISORY.CONSUMER.PINNED" +const JSAdvisoryConsumerUnpinnedPre = "$JS.EVENT.ADVISORY.CONSUMER.UNPINNED" +const JSConsumerGroupPinnedAdvisoryType = "io.nats.jetstream.advisory.v1.consumer_group_pinned" + +// JSConsumerGroupPinnedAdvisory indicates that a group switched to a new pinned client +type JSConsumerGroupPinnedAdvisory struct { + TypedEvent + Account string `json:"account,omitempty"` + Stream string `json:"stream"` + Consumer string `json:"consumer"` + Domain string `json:"domain,omitempty"` + Group string `json:"group"` + PinnedClientId string `json:"pinned_id"` + Client *ClientInfo `json:"client"` // if available +} + +const JSConsumerGroupUnPinnedAdvisoryType = "io.nats.jetstream.advisory.v1.consumer_group_unpinned" + +// JSConsumerGroupUnPinnedAdvisory indicates that a pin was lost +type JSConsumerGroupUnPinnedAdvisory struct { + TypedEvent + Account string `json:"account,omitempty"` + Stream string `json:"stream"` + Consumer string `json:"consumer"` + Domain string `json:"domain,omitempty"` + Group string `json:"group"` + // one of "admin" or "timeout", could be an enum up to the implementor to decide + Reason string `json:"reason"` +} +``` + +# Client side implementation + +### Pull Requests changes + +To use either of the policies, a client needs to +expose a new options on `fetch`, `consume`, or any other method that are used for pull consumers. + +#### Groups +- `group` - mandatory field. If consumer has configured `PriorityGroups`, every Pull Request needs to provide it. + +#### Overflow +When Consumer is in `overflow` mode, user should be able to optionally specify thresholds for pending and ack pending messages. 
+ +- `min_pending` - when specified, this Pull request will only receive messages when the consumer has at least this many pending messages. +- `min_ack_pending` - when specified, this Pull request will only receive messages when the consumer has at least this many ack pending messages. + +#### Pinning +In pinning mode, user does not have to provide anything beyond `group`. +Client needs to properly handle the `id` sent by the server. That applies only to ` Consume`. Fetch should not be supported in this mode. +At least initially. + +1. When client receives the `id` from the server via `Nats-Pin-Id` header, it needs to store it and use it in every subsequent pull request for this group. +2. If client receives `423` Status error, it should clear the `id` and continue pulling without it. +3. Clients should implement the `unpin` method described in this ADR. diff --git a/adr/ADR-43.md b/adr/ADR-43.md new file mode 100644 index 00000000..ec795542 --- /dev/null +++ b/adr/ADR-43.md @@ -0,0 +1,85 @@ +# JetStream Per-Message TTL + +| Metadata | Value | +|----------|---------------------------| +| Date | 2024-07-11 | +| Author | @ripienaar | +| Status | Approved | +| Tags | jetstream, client, server | + +## Context and motivation + +Streams support a one-size-fits-all approach to message TTL based on the MaxAge setting. This causes any message in the +Stream to expire at that age. + +There are numerous uses for a per-message version of this limit, some listed below: + + * KV tombstones are a problem in that they forever clog up the buckets with noise, these could have a TTL to make them expire once not useful anymore + * Server-applied limits can result in tombstones with a short per message TTL so that consumers can be notified of limits being processed. Useful in KV watch scenarios being notified about TTL removals + * A stream may have a general MaxAge but some messages may have infinite retention, think a schema or type hints in a KV bucket that is forever while general keys have TTLs + +Related issues [#3268](https://github.com/nats-io/nats-server/issues/3268) + +## Per-Message TTL + +We will allow a message to supply a TTL using a header called `Nats-TTL` followed by the duration as seconds. + +The duration will be used by the server to calculate the deadline for removing the message based on its Stream +timestamp and the stated duration. + +The TTL may not exceed the Stream MaxAge. The shortest allowed TTL would be 1 second. When no specific TTL is given +the MaxAge will apply. + +Setting the header `Nats-No-Expire` to `1` will result in a message that will never be expired. + +A TTL of zero will be ignored, any other unparsable value will result in a error reported in the Pub Ack and the message +being discarded. + +## Limit Tombstones + +Several scenarios for server-created tombstones can be imagined, the most often requested one though is when MaxAge +removes last value (ie. the current value) for a Key. + +In this case when the server removes a message and the message is the last in the subject it would place a message +with a TTL matching the Stream configuration value. The following headers would be placed: + +``` +Nats-Applied-Limit: MaxAge +Nats-TTL: 1 +``` + +The `Nats-Limit-Applied` field is there to support future expansion of this feature. + +This behaviour is off by default unless opted in on the Stream Configuration. 
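
A minimal sketch of supplying a per-message TTL from the Go client, assuming a stream that has opted in to this feature. Only the headers are defined by this ADR; the publishes themselves are ordinary JetStream publishes.

```go
package main

import "github.com/nats-io/nats.go"

func main() {
	nc, err := nats.Connect("nats://demo.nats.io:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		panic(err)
	}

	// Message that expires 60 seconds after its stream timestamp.
	m := nats.NewMsg("kv.orders.tombstone")
	m.Header.Set("Nats-TTL", "60")
	m.Data = []byte("purged")
	if _, err := js.PublishMsg(m); err != nil {
		panic(err)
	}

	// Message that is never expired.
	m2 := nats.NewMsg("kv.orders.schema")
	m2.Header.Set("Nats-No-Expire", "1")
	m2.Data = []byte(`{"type":"order"}`)
	if _, err := js.PublishMsg(m2); err != nil {
		panic(err)
	}
}
```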
+ +## Publish Acknowledgements + +We could optionally extend the `PubAck` as follows: + +```golang +type PubAck struct { + MsgTTL uint64 `json:"msg_ttl,omitempty"` +} +``` + +This gives clients a chance to confirm, without Stream Info or should the Stream be edited after Info, if the TTL +got applied. + +## Stream Configuration + +Weather or not a stream support this behavior should be a configuration opt-in. We want clients to definitely know +when this is supported which the opt-in approach with a boolean on the configuration would make clear. + +We have to assume someone will want to create a replication topology where at some point in the topology these tombstone +type messages are retained for an audit trail. So a Stream with this feature enabled can replicate to one with it +disabled and all the messages that would have been TTLed will be retained. + +```golang +type StreamConfig struct { + // AllowMsgTTL allows header initiated per-message TTLs + AllowMsgTTL bool `json:"allow_msg_ttl"` + + // LimitsTTL activates writing of messages when limits are applied with a specific TTL + LimitsTTL time.Duration `json:"limits_ttl"` +} +``` diff --git a/adr/ADR-44.md b/adr/ADR-44.md new file mode 100644 index 00000000..c0cf30e2 --- /dev/null +++ b/adr/ADR-44.md @@ -0,0 +1,193 @@ +# Versioning for JetStream Assets + +| Metadata | Value | +|----------|-----------------------| +| Date | 2024-07-22 | +| Author | @ripienaar | +| Status | Partially Implemented | +| Tags | jetstream, server | + +# Context and Problem Statement + +As development of the JetStream feature progress there is a complex relationship between connected-server, JetStream +meta-leader and JetStream asset host versions that requires careful consideration to maintain compatibility. + + * The server a client connects to might be a Leafnode on a significantly older release and might not even have + JetStream enabled + * The meta-leader could be on a version that does not support a feature and so would not forward the API fields it + is unaware of to the assigned servers + * The server hosting an asset might not match the meta-leader and so can't honor the request for a certain + configuration and would silently drop fields + +In general our stance is to insist on homogenous cluster versions in the general case but it's not reasonable to expect this +for leafnode servers nor is it possible during upgrading and other maintenance windows. + +Our current approach is to check the connected-server version and project that it's representitive of the cluster as +a whole but this is a known incorrect approach especially for Leafnodes as mentioned above. + +We have considered approaches like fully versioning the API but this is unlikely to work for our case or be accepted +by the team. Versioning the API would anyway not alleviate many of the problems encountered when upgrading and +downgrading servers hosting long running assets. As a result we are evaluating some more unconventional approaches +that should still improve our overall stance. + +This ADR specifies a way for the servers to expose some versioning information to help clients improve the +compatability story. + +# Solution Overview + +In lieu of enabling fine grained API versioning we want to start thinking about asset versioning instead, the server +should report it's properties when creating an asset and it should report similar properties when hosting an asset. 
+ +Fine grained API versioning would not really fix the entire problem as our long-lived assets would have to be upgraded +over time between API versions as they get updated with new features. + +So we attempt to address both classes of problem here by utilizing Metadata and a few new server features and concepts. + +# API Support Level + +The first concept we wish to introduce is the concept of a number that indicates the API level of the servers +JetStream support. + +| Level | Versions | +|-------|----------| +| 0 | < 2.11.0 | +| 1 | 2.11.x | +| 2 | 2.12.x | + +While here it's shown incrementing at the major boundaries it's not strictly required, if we were to introduce a +critical new feature mid 2.11 that could cause a bump in support level mid release without it being an issue - we +do not require strict SemVer adherence. + +The server will calculate this for a Stream and Consumer configuration. Here is example for the 2.11.x. It's not +anticipated we would have to keep supporting versions for every asset for ever, it should be sufficient to support +the most recent ones corresponding to the actively supported server versions. + +```golang +func (s *Server) setConsumerAssetVersionMetadata(cfg *ConsumerConfig, create bool) { + if cfg.Metadata == nil { + cfg.Metadata = make(map[string]string) + } + + if create { + cfg.Metadata[JSCreatedVersionMetadataKey] = VERSION + cfg.Metadata[JSCreatedLevelMetadataKey] = JSFeatureLevel + } + + featureLevel := "0" + + // Added in 2.11, absent | zero is the feature is not used. + // one could be stricter and say even if its set but the time + // has already passed it is also not needed to restore the consumer + if cfg.PauseUntil != nil && !cfg.PauseUntil.IsZero() { + featureLevel = "1" + } + + cfg.Metadata[JSRequiredFeatureMetadataKey] = featureLevel +} +``` + +In this way we know per asset what server feature set it requires and only the server need this logic vs all the +clients if the client had to assert `needs a server of at least level 1`. + +We have to handle updates to asset configuration since an update might use a feature only found in newer servers. + +Servers would advertise their supported API level in `jsz`, `varz` and `$JS.API.INFO`. It should also be logged +at JetStream startup. + +## When to increment API Level + +Generally when adding any new feature/field to the `StreamConfig` or `ConsumerConfig`. Especially when the field was set +by a user and the asset should be loaded in offline mode when the feature is not supported by the server. + +If a new feature is added, the API level only needs to be incremented if another feature planned for the same release +didn't already increment the API level. Meaning if multiple new features are added within the same release cycle, the +API level only needs to be incremented once and not for every new feature. 
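
As a rough illustration of the table above, a client or tool that only has a server version string could derive the level as sketched below. This is an approximation only; the level reported by the server itself (via `jsz`, `varz` and `$JS.API.INFO`) is authoritative, since a level may also be bumped mid-release.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// apiLevel maps a server version to the JetStream API level from the
// table above. A sketch only: the server-reported level is authoritative.
func apiLevel(serverVersion string) int {
	// Strip any pre-release suffix, e.g. "2.11.0-dev".
	base := strings.SplitN(serverVersion, "-", 2)[0]
	parts := strings.Split(base, ".")
	if len(parts) < 2 {
		return 0
	}
	major, _ := strconv.Atoi(parts[0])
	minor, _ := strconv.Atoi(parts[1])

	switch {
	case major > 2 || (major == 2 && minor >= 12):
		return 2
	case major == 2 && minor == 11:
		return 1
	default:
		return 0 // < 2.11.0
	}
}

func main() {
	for _, v := range []string{"2.10.14", "2.11.0-dev", "2.12.1"} {
		fmt.Printf("%s -> level %d\n", v, apiLevel(v))
	}
}
```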
+ + +# Server-set metadata + +We'll store current server and asset related information in the existing `metadata` field allowing us to expand this in +time, today we propose the following: + +| Name | Description | +|----------------------------------|-----------------------------------------------------------| +| `_nats.server.version` | The current server version hosting an asset | +| `_nats.server.api_level` | The current server API level hosting an asset | +| `_nats.server.require.api_level` | The required API level to start an asset | +| `_nats.created.server.version` | The version of the server that first created this asset | +| `_nats.created.server.api_level` | The API level of the server that first created this asset | + +We intend to store some client hints in here to help us track what client language and version created assets. + +As some of these these are dynamic fields tools like Terraform and NACK will need to understand to ignore these fields +when doing their remediation loops. + +# Offline Assets + +Today when an asset cannot be loaded it's simply not loaded. But to improve compatability, user reporting and +discovery we want to support a mode where a stream is visible in Stream reports but marked as offline with a reason. + +An offline stream should still be reporting, responding to info and more but no messages should be accepted into it +and no messages should be delivered to any consumers, messages can't be deleted, configuration cannot be updated - +it is offline in every way that would result in a change. Likewise a compatible offline mode should exist for Consumers. + +The Stream and Consumer state should get new fields `Offline bool` and `OfflineReason string` that should be set for +such assets. + +When the server starts and determines it cannot start an asset for any reason, due to error or required API +Level, it should set this mode and fields. + +For starting incompatible streams in offline mode we would need to load the config in the current manner to figure out +which subjects Streams would listen on since even while Streams are offline we do need the protections of +overlapping subjects to be active to avoid issues later when the Stream can come online again. + +# Safe unmarshalling of JSON data + +The JetStream API and Meta-layer should start using the `DisallowUnknownFields` feature in the go json package and +detect when asked to load incompatible assets or serve incompatible API calls and should error in the case of the +API and start assets in Offline mode in the case of assets. + +This will prevent assets inadvertently reverting some settings and changing behaviour during downgrades. + +A POC branch against 2.11 main identified only 1 test failure after changing all JSON Unmarshall calls and this was a +legit bug in a test. + +One possible approach that can be introduced in 2.11 is to already perform strict unmarshalling in all cases but when a +issue is detected we would log the error and then do a normal unmarshal to remain compatible. A configuration option +should exist to turn this into a fatal error rather than a logged warning only. + +It could also be desirable to allow a header in the API requests to signal strict unmarshalling should be fatal for a +specific API call in order to facilitate testing. + +# Minimal supported API level for assets + +When the server loads assets it should detect incompatible features using a combination of `DisallowUnknownFields` +and comparing the server API levels to those required by the asset. 
+ +Incompatible assets should be loaded in Offline mode and an advisory should be published. + +# Concerns + +The problem with this approach is that we only know if something worked, or was compatible with the server, after the +asset was created. In cases where a feature adds new configuration fields this would be easily detected by the +Marshaling but on more subtle features like previous changes to `DiscardNew` the client would need to verify the +response to ensure the version requirements were met and then remove the bad asset, this is a pretty bad error +handling scenario. + +This can be slightly mitigated by including the current server version in the `$JS.API.INFO` response so at least +the meta leader version is known - but in reality one cannot really tell a lot from this version since it is not +guaranteed to be representative of the cluster and is particularly problematic in long running clients as the server +may have been upgraded since last `INFO` call. + +In the future we could have clients assert that a certain API call requires a certain API Level but it was felt that +today that would not be feasible to do given the amount of clients and their current state. + +# Implementation Targets + +For 2.11 we should: + + * calculate and report the api support level + * start reporting the metadata + * soft unmarshalling failures with an option to enable fatal failures + +For future versions we can round the feature set out with the offline features and more. diff --git a/adr/ADR-8.md b/adr/ADR-8.md index 61004479..bc432cd6 100644 --- a/adr/ADR-8.md +++ b/adr/ADR-8.md @@ -1,18 +1,30 @@ # JetStream based Key-Value Stores -|Metadata|Value| -|--------|-----| -|Date |2021-06-30| -|Author |@ripienaar| -|Status |Implemented| -|Tags |jetstream, client, kv| +| Metadata | Value | +|----------|-----------------------------| +| Date | 2021-06-30 | +| Author | @ripienaar | +| Status | Implemented | +| Tags | jetstream, client, kv, spec | + +## Release History + +| Revision | Date | Description | +|----------|------------|-----------------------------------------------------| +| 1 | 2021-12-15 | Initial stable release of version 1.0 specification | +| 2 | 2023-10-16 | Document NATS Server 2.10 sourced buckets | +| 2 | 2023-10-16 | Document read replica mirrors buckets | +| 2 | 2023-10-16 | Document consistency guarantees | +| 3 | 2023-10-19 | Formalize initial bucket topologies | +| 4 | 2023-10-25 | Support compression | +| 5 | 2024-06-05 | Add KV management | +| 6 | 2024-06-05 | Add Keys listers with filters | -## Context -This document describes a design and initial implementation of a JetStream backed key-value store. The initial implementation -is available in the CLI as `nats kv` with the reference client implementation being the `nats.go` repository. +## Context -This document aims to guide client developers in implementing this feature in the language clients we maintain. +This document describes a design and initial implementation of a JetStream backed key-value store. All tier-1 clients +support KV. ## Status and Roadmap @@ -38,19 +50,22 @@ additional behaviors will come during the 1.x cycle. 
 * Key starting with `_kv` is reserved for internal use
 * CLI tool to manage the system as part of `nats`, compatible with client implementations
 * Accept arbitrary application prefixes, as outlined in [ADR-19](https://github.com/nats-io/nats-architecture-and-design/blob/main/adr/ADR-19.md)
+ * Data Compression for NATS Server 2.10

### 1.1

- * Encoders and Decoders for keys and values
- * Additional Operation that indicates server limits management deleted messages
+ * Merged buckets using NATS Server 2.10 subject transforms
+ * Read replicas facilitated by Stream Mirrors
+ * Replica auto discovery for mirror based replicas
+ * Formalized Multi-cluster and Leafnode Topologies

### 1.2

- * Read replicas facilitated by Stream Mirrors
 * Read-only operation mode
- * Replica auto discovery
 * Read cache with replica support
 * Ranged operations
+ * Encoders and Decoders for keys and values
+ * Additional Operation that indicates server limits management deleted messages

### 1.3

@@ -110,9 +125,22 @@ type Status interface {
	// TTL is how long the bucket keeps values for
	TTL() time.Duration

-	// Keys return a list of all keys in the bucket - not possible now except in caches
+	// Keys returns a list of all keys in the bucket.
+	// Historically this method returned a complete slice of all keys in the bucket,
+	// however clients should return an iterable result.
	Keys() ([]string, error)

+	// KeysWithFilters returns a filtered list of keys in the bucket.
+	// Historically this method returned a complete slice of all keys in the bucket,
+	// however clients should return an iterable result.
+	// Languages can implement the list of filters in the most idiomatic way - as an iterator, variadic argument, slice, etc.
+	// When multiple filters are passed, the client library should check the `consumer info` from the `consumer create` method to confirm the filters were applied,
+	// as nats-server < 2.10 would ignore them.
+	KeysWithFilters(filter []string) ([]string, error)
+
+	// IsCompressed indicates if the data is compressed on disk
+	IsCompressed() bool
+
	// BackingStore is a name indicating the kind of backend
	BackingStore() string

@@ -127,14 +155,17 @@ Languages can choose to expose additional information about the bucket along wit
the `Status` interface is above but the `JetStream` specific implementation can be cast to gain access to `StreamInfo()`
for full access to JetStream state.

+The choice of `IsCompressed()` as a method name is idiomatic for Go; language maintainers can make a similarly idiomatic
+choice.
+
Other languages do not have a clear 1:1 match of the above idea so maintainers are free to do something idiomatic.

## RoKV

**NOTE:** Out of scope for version 1.0

-This is a read-only KV store handle, I call this out here to demonstrate that we need to be sure to support a read-only
-variant of the client. One that will only function against a read replica and cannot support `Put()` etc.
+This is a read-only KV store handle, I call this out here to demonstrate that we need to be sure to support a read-only
+variant of the client. One that will only function against a read replica and cannot support `Put()` etc.

That capability is important, how you implement this in your language is your choice. You can throw exceptions on `Put()`
when read-only or whatever you like.

@@ -165,6 +196,8 @@ type RoKV interface {
 }
 ```

+Regarding `Keys`, optionally the client can provide a method that provides the keys in an iterable or consumable form.
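As an illustration of that optional iterable form, the sketch below shows one possible shape; the `KeyLister` name and the channel-based signature are assumptions for this example only, not a required part of the interface.

```go
package kv

import "context"

// KeyLister is an illustrative, optional companion to Keys(): instead of
// buffering every key into one slice it delivers keys as they are read.
type KeyLister interface {
	// ListKeys sends each key name on the returned channel and closes the
	// channel once all keys in the bucket have been delivered.
	ListKeys(ctx context.Context) (<-chan string, error)
}
```

Languages without channels can expose an iterator, cursor or stream instead; the point is that consumers should not be forced to hold the entire key set in memory.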
+
## KV

This is the read-write KV store handle; every backend should implement a language equivalent interface. But note the comments
@@ -195,6 +228,48 @@ type KV interface {
 }
 ```
+## KV Management
+
+This is a set of operations on KV buckets from the JetStream context.
+
+```go
+// KeyValueManager is used to manage KeyValue buckets. It provides methods to
+// create, delete, and retrieve.
+type KeyValueManager interface {
+	// KeyValue will lookup and bind to an existing KeyValue bucket.
+	// Name can be `get_key_value`, or whatever name is idiomatic in a given language.
+	KeyValue(ctx context.Context, bucket string) (KeyValue, error)
+
+	// CreateKeyValue will create a KeyValue bucket with the given
+	// configuration.
+	CreateKeyValue(ctx context.Context, cfg KeyValueConfig) (KeyValue, error)
+
+	// UpdateKeyValue will update an existing KeyValue bucket with the given
+	// configuration.
+	UpdateKeyValue(ctx context.Context, cfg KeyValueConfig) (KeyValue, error)
+
+	// CreateOrUpdateKeyValue will create a KeyValue bucket if it does not
+	// exist or update an existing KeyValue bucket with the given
+	// configuration (if possible).
+	CreateOrUpdateKeyValue(ctx context.Context, cfg KeyValueConfig) (KeyValue, error)
+
+	// DeleteKeyValue will delete the given KeyValue bucket.
+	DeleteKeyValue(ctx context.Context, bucket string) error
+
+	// KeyValueBucketNames is used to retrieve a list of key value bucket
+	// names. The KeyValueNamesLister should behave in a similar fashion
+	// to the language implementation of Get Stream Names. If not already some sort of iterable,
+	// an iterable form of the API is acceptable as well.
+	KeyValueBucketNames(ctx context.Context) KeyValueNamesLister
+
+	// KeyValueBuckets is used to retrieve a list of key value bucket
+	// statuses. The KeyValueStatusLister should behave in a similar fashion
+	// to the language implementation of Get Stream Infos. If not already some sort of iterable,
+	// an iterable form of the API is acceptable as well.
+	KeyValueBuckets(ctx context.Context) KeyValueStatusLister
+}
+```
+
## Storage Backends

We do have plans to support, and provide, commercial KV as part of our NGS offering, however there will be value in an
@@ -210,19 +285,28 @@ later.
The features to support KV are in NATS Server 2.6.0.

+#### Consistency Guarantees
+
+By default, we do not provide read-after-write consistency. Reads are performed directly against any replica, including out
+of date ones. If those replicas do not catch up, multiple reads of the same key can give different values between
+reads. If the cluster is healthy and performing well most reads would result in consistent values, but this should not
+be relied on to be true.
+
+If `allow_direct` is disabled on a bucket configuration, read-after-write consistency is achieved at the expense of
+performance. It is then also not possible to use the mirror based read replicas.
+
#### Buckets

A bucket is a Stream with these properties:

 * The main write bucket must be called `KV_<Bucket Name>`
 * The 'ingest' subjects must be `$KV.<Bucket Name>.>`
- * The bucket history or 'max history per key' is achieved by setting `max_msgs_per_subject` to the desired history level.
+ * The bucket history or 'max history per key' is achieved by setting `max_msgs_per_subject` to the desired history level.
 * The maximum allowed size is 64.
- * The minimum allowed size is 1. When creating a stream, 1 should be used when the user does not supply a value.
+ * The minimum allowed size is 1.
When creating a stream, 1 should be used when the user does not supply a value. * Safe key purges that deletes history requires rollup to be enabled for the stream using `rollup_hdrs` * Write replicas are File backed and can have a varying R value * Key TTL is managed using the `max_age` key - * Duplicate window must be same as `max_age` when `max_age` is less than 2 minutes * Maximum value sizes can be capped using `max_msg_size` * Maximum number of keys cannot currently be limited * Overall bucket size can be limited using `max_bytes` @@ -232,8 +316,9 @@ A bucket is a Stream with these properties: * Allow Direct is always set to `true`. (It can be modified out-of-band only if desired, but not through KV bucket update.) * Placement is allowed * Republish is allowed + * If compression is requested in the configuration set `compression` to `s2` -Here is a full example of the `CONFIGURATION` bucket: +Here is a full example of the `CONFIGURATION` bucket with compression enabled: ```json { @@ -251,10 +336,10 @@ Here is a full example of the `CONFIGURATION` bucket: "storage": "file", "discard": "new", "num_replicas": 1, - "duplicate_window": 120000000000, "rollup_hdrs": true, "deny_delete": true, "allow_direct": true, + "compression": "s2", "placement": { "cluster": "clstr", "tags": ["tag1", "tag2"] @@ -267,6 +352,9 @@ Here is a full example of the `CONFIGURATION` bucket: } ``` +Note: Previous revisions of this document noted that "Duplicate window must be same as `max_age` when `max_age` is less than 2 minutes". +This behavior requires no code on the client. As long as `duplicate_window` is not supplied in the configuration, the server will supply this logic. + #### Storing Values Writing a key to the bucket is a basic JetStream request. @@ -274,11 +362,11 @@ Writing a key to the bucket is a basic JetStream request. The KV key `auth.username` in the `CONFIGURATION` bucket is written sent, using a request, to `$KV.CONFIGURATION.auth.username`. To implement the feature that would accept a write only if the revision of the current value of a key has a specific revision -we use the new `Nats-Expected-Last-Subject-Sequence` header. The special value `0` for this header would indicate that the message -should only be accepted if it's the first message on a subject. This is purge aware, ie. if a value is in and the subject is purged +we use the new `Nats-Expected-Last-Subject-Sequence` header. The special value `0` for this header would indicate that the message +should only be accepted if it's the first message on a subject. This is purge aware, ie. if a value is in and the subject is purged again a `0` value will be accepted. -This can be implemented as a `PutOption` ie. `Put("x.y", val, UpdatesRevision(10))`, `Put("x.y", val, MustCreate())` or +This can be implemented as a `PutOption` ie. `Put("x.y", val, UpdatesRevision(10))`, `Put("x.y", val, MustCreate())` or by adding the `Create()` and `Update()` helpers, or both. Other options might be `UpdatesEntry(e)`, language implementations can add what makes sense in addition. @@ -293,14 +381,21 @@ There are different situations where messages will be retrieved using different In all cases we return a generic `Entry` type. Deleted data - (see later section on deletes) - has the `KV-Operation` header set to `DEL` or `PURGE`, really any value other than unset -- a value received from either of these methods with this header set indicates the data has been deleted. 
A delete operation is turned
-into a `key not found` error in basic gets and into a `Entry` with the correct operation value set in watchers or history.
+- a value received from either of these methods with this header set indicates the data has been deleted. A delete operation is turned
+into a `key not found` error in basic gets and into an `Entry` with the correct operation value set in watchers or history.

##### Get Operation

+###### When allow_direct is false
+
We have extended the `io.nats.jetstream.api.v1.stream_msg_get_request` API to support loading the latest value for a specific
-subject. Thus a read for `CONFIGURATION.username` becomes a `io.nats.jetstream.api.v1.stream_msg_get_request` with the
-`last_by_subj` set to `$KV.CONFIGURATION.auth.username`.
+subject. Thus, a read for the key `auth.username` in the `CONFIGURATION` bucket becomes a `io.nats.jetstream.api.v1.stream_msg_get_request`
+with the `last_by_subj` set to `$KV.CONFIGURATION.auth.username`.
+
+###### When allow_direct is true
+
+We have introduced a new direct API that allows retrieving the last message for a subject via
+`$JS.API.DIRECT.GET.<stream>.<subject>`. This should be used for performing all gets on a bucket if direct is enabled.

##### History

@@ -310,9 +405,9 @@ and read the entire value list in using `deliver_all`. Use an Ordered Consumer t
JetStream will report the Pending count for each message, the latest value from the available history would have a pending
of `0`. When constructing historic values, dumping all values etc. we ensure to only return pending `0` messages as the final value.

-##### Watch
+##### Watch

-A watch, like History, is based on ephemeral consumers reading values using Ordered Consumers, but now we start with the new
+A watch, like History, is based on ephemeral consumers reading values using Ordered Consumers, but now we start with the new
`last_per_subject` initial start; this means we will get all matching latest values for all keys.

Watch can take options to allow including history, sending only new updates or sending headers only. Using a Watch end users
@@ -331,7 +426,7 @@ return no key error etc.
Purge is like delete but history is not preserved. This is achieved by publishing a message in the same manner as Delete using
the `KV-Operation: PURGE` header but adding the header `Nats-Rollup: sub` in addition.

-This will instruct the server to place the purge operation message in the stream and then delete all messages for that key up to
+This will instruct the server to place the purge operation message in the stream and then delete all messages for that key up to
before the delete operation.

#### List of known keys

@@ -349,26 +444,230 @@ Remove the stream entirely.
Watchers support sending received `PUT`, `DEL` and `PURGE` operations across a channel or language specific equivalent.

Watchers support accepting simple keys or ranges, for example watching on `auth.username` will get just operations on that key,
-but watching `auth.>` will get operations for everything below `auth.`, the entire bucket can be watched using an empty key or a
+but watching `auth.>` will get operations for everything below `auth.`; the entire bucket can be watched using an empty key or a
key with wildcard `>`.

We need to signal when we reach the end of the initial data set to facilitate use cases such as dumping a bucket, iterating keys
etc. Languages can implement an End Of Initial Data signal in a language idiomatic manner. Internal to the watcher you reach this state the
This signal must also be sent if no data is present - either by checking for messages using +first time any message has a `Pending==0`. This signal must also be sent if no data is present - either by checking for messages using `GetLastMsg()` on the watcher range or by inspecting the Pending+Delivered after creating the consumer. The signal must always be sent. Whatchers should support at least the following options. Languages can choose to support more models if they wish, as long as that is clearly indicated as a language specific extension. Names should be language idiomatic but close to these below. -|Name|Description| -|----|-----------| -|`IncludeHistory`|Send all available history rather than just the latest entries| -|`IgnoreDeletes`|Only sends `PUT` operation entries| -|`MetaOnly`|Does not send any values, only metadata about those values| -|`UpdatesOnly`|Only sends new updates made, no current or historical values are sent. The End Of Initial Data marker is sent as soon as the watch starts.| +| Name | Description | +|------------------|--------------------------------------------------------------------------------------------------------------------------------------------| +| `IncludeHistory` | Send all available history rather than just the latest entries | +| `IgnoreDeletes` | Only sends `PUT` operation entries | +| `MetaOnly` | Does not send any values, only metadata about those values | +| `UpdatesOnly` | Only sends new updates made, no current or historical values are sent. The End Of Initial Data marker is sent as soon as the watch starts. | The default behavior with no options set is to send all the `last_per_subject` values, including delete/purge operations. +#### Multi-Cluster and Leafnode topologies + +A bucket, being backed by a Stream, lives in one Cluster only. To make buckets available elsewhere we have to use +JetStream Sources and Mirrors. + +In KV we call these `Toplogies` and adding *Topology Buckets* require using different APIs than the main Bucket API +allowing us to codify patterns and options that we support at a higher level than the underlying Stream options. + +For example, we want to be able to expose a single boolean that says an Aggregate is read-only which would potentially +influence numerous options in the Stream Configuration. + +![KV Topologies](images/0008-topologies.png) + +To better communicate the intent than the word Source we will use `Aggregate` in KV terms: + + **Mirror**: Copy of exactly 1 other bucket. Used primarily for scaling out the `Get()` operations. + + * It is always Read-Only + * It can hold a filtered subset of keys + * Replicas are automatically picked using a RTT-nearest algorithm without any configuration + * Additional replicas can be added and removed at run-time without any re-configuration of already running KV clients + * Writes and Watchers are transparently sent to the origin bucket + * Can replicate buckets from other accounts and domains + +**Aggregate**: A `Source` that combines one or many buckets into 1 new bucket. Used to provide a full local copy of +other buckets that support watchers and gets locally in edge scenarios. 
+
+ * Requires being accessed specifically by its name used in a `KeyValue()` call
+ * Can be read-only or read-write
+ * It can hold a subset of keys from the origin buckets to limit data exposure or size
+ * Can host watchers
+ * Writes are not transparently sent to the origin Bucket as with Replicas; they either fail (default) or succeed and
+   modify the Aggregate (opt-in)
+ * Can combine buckets from multiple other accounts and domains into a single Aggregate
+ * Additional Sources can be added after initially creating the Aggregate
+
+Experiments:
+
+These are items we will add in future iterations of the Topology concept:
+
+ * Existing Sources can be removed from an Aggregate. Optionally, but by default, purge the data out of the bucket
+   for the Source being removed
+ * Watchers could be supported against a Replica and would support auto-discovery of the nearest replica but would
+   minimise the ability to add and remove Replicas at runtime
+
+*Implementation Note*: While this says Domains are supported, we might decide not to implement support for them at
+this point as we know we will revisit the concept of a domain. The existing domain based mirrors that are supported
+in KeyValueConfig will be deprecated but supported for the foreseeable future for those requiring domain support.
+
+#### Creation of Aggregates
+
+Since NATS Server 2.10 we support transforming messages as a stream configuration item. This allows us to source one
+bucket from another and rewrite the keys in the new bucket to have the correct name.
+
+We will model this using a few API functions and specific structures:
+
+```go
+// KVAggregateConfig configures an aggregate
+//
+// This one is quite complex because Aggregates are buckets in their own right and so inevitably need
+// to have all the options that are in buckets today (minus the deprecated ones).
+type KVAggregateConfig struct {
+	Bucket       string
+	Writable     bool
+	Description  string
+	Replicas     int
+	MaxValueSize int32
+	History      uint8
+	TTL          time.Duration
+	MaxBytes     int64
+	Storage      KVStorageType // a new kv specific storage struct, for now identical to normal one
+	Placement    *KVPlacement  // a new kv specific placement struct, for now identical to normal one
+	RePublish    *KVRePublish  // a new kv specific replacement struct, for now identical to normal one
+	Origins      []*KVAggregateOrigin
+}
+
+type KVAggregateOrigin struct {
+	Stream   string   // note this is Stream and not Bucket since the origin may be a mirror which may not be a bucket
+	Bucket   string   // in the case where we are aggregating from a mirror, we need to know the bucket name to construct mappings
+	Keys     []string // optional filter defaults to >
+	External *ExternalStream
+}
+
+// CreateAggregate creates a new read-only Aggregate bucket with one or more sources
+CreateAggregate(ctx context.Context, cfg KVAggregateConfig) (KeyValue, error) {}
+
+// AddAggregateOrigin updates bucket by adding a new origin cfg, errors if bucket is not an Aggregate
+AddAggregateOrigin(ctx context.Context, bucket KeyValue, cfg KVAggregateOrigin) error {}
+```
+
+To copy the keys `NEW.>` from bucket `ORDERS` into `NEW_ORDERS`:
+
+```go
+bucket, _ := CreateAggregate(ctx, KVAggregateConfig{
+    Bucket:   "NEW_ORDERS",
+    Writable: false,
+    Origins: []*KVAggregateOrigin{
+        {
+            Stream: "KV_ORDERS",
+            Keys:   []string{"NEW.>"},
+        },
+    },
+})
+```
+
+We create the new stream with the following partial config, the rest as per any other KV bucket:
+
+```json
+  "subjects": [],
+  "deny_delete": true,
+  "deny_purge": true,
+  "sources": [
+    {
+      "name": "KV_ORDERS",
+      "subject_transforms": [
+        {
+          "src": "$KV.ORDERS.NEW.>",
+          "dest": "$KV.NEW_ORDERS.>"
+        }
+      ]
+    }
+  ],
+```
+
+When writable, configure as normal and just add the sources.
+
+This results in all messages from the `ORDERS` keys `NEW.>` being copied into `NEW_ORDERS`, with the subjects rewritten on
+write to the new bucket so that an unmodified KV client on `NEW_ORDERS` would just work.
+
+#### Creation of Mirrors
+
+Replicas can be built using the standard mirror feature by setting `mirror_direct` to true as long as the origin bucket
+also has `allow_direct`. When adding a mirror it should be confirmed that the origin bucket has `allow_direct` set.
+
+We will model this using a few API functions and specific structures:
+
+```go
+type KVMirrorConfig struct {
+	Name        string // name, not bucket, as this may not be accessed as a bucket
+	Description string
+	Replicas    int
+	History     uint8
+	TTL         time.Duration
+	MaxBytes    int64
+	Storage     StorageType
+	Placement   *Placement
+
+	Keys         []string // mirrors only subsets of keys
+	OriginBucket string
+	External     *External
+}
+
+// CreateMirror creates a new read-only Mirror bucket from an origin bucket
+CreateMirror(ctx context.Context, cfg KVMirrorConfig) error {}
+```
+
+These mirrors are not called `Bucket` and may not have the `KV_` string name prefix as they are not buckets and cannot
+be used as buckets without significant changes in how a KV client constructs its key names etc. We have done this in
+the leafnode mode and decided it's not a good pattern.
+
+When creating a replica of `ORDERS` called `MIRROR_ORDERS_NYC` we do:
+
+```go
+err := CreateMirror(ctx, KVMirrorConfig{
+    Name: "MIRROR_ORDERS_NYC",
+    // ...
+    OriginBucket: "ORDERS",
+})
+```
+
+When a direct read is done the response will be from the rtt-nearest mirror.
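For illustration, the backing stream of such a replica could use roughly the following partial configuration. This is a sketch only, not a prescribed config: the essential parts are the `mirror` source pointing at the origin stream, `mirror_direct` to serve direct reads, and the deny flags that keep the replica read-only; a key filter can also be applied since mirrors support subject filters.

```json
{
  "name": "MIRROR_ORDERS_NYC",
  "deny_delete": true,
  "deny_purge": true,
  "mirror_direct": true,
  "mirror": {
    "name": "KV_ORDERS"
  }
}
```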
With a mirror added the `nats` command +can be used to verify that a alternative location is set: + +``` +$ nats s info KV_ORDERS +... +State: + + Alternates: MIRROR_ORDERS_NYC: Cluster: nyc Domain: hub + KV_ORDERS: Cluster: lon Domain: hub + +``` + +Here we see a RTT-sorted list of alternatives, the `MIRROR_ORDERS_NYC` is nearest to me in the RTT sorted list. + +When doing a direct get the headers will confirm the mirror served the request: + +``` +$ nats req '$JS.API.DIRECT.GET.KV_ORDERS.$KV.ORDERS.NEW.123' '' +13:26:06 Sending request on "JS.API.DIRECT.GET.KV_ORDERS.$KV.ORDERS.NEW.123" +13:26:06 Received with rtt 1.319085ms +13:26:06 Nats-Stream: MIRROR_ORDERS_NYC +13:26:06 Nats-Subject: $KV.ORDERS.NEW.123 +13:26:06 Nats-Sequence: 12 +13:26:06 Nats-Time-Stamp: 2023-10-16T12:54:19.409051084Z + +{......} +``` + +As mirrors support subject filters these replicas can hold region specific keys. + +As this is a `Mirror` this stream does not listen on a subject and so the only way to get data into it is via the origin +bucket. We should also set the options to deny deletes and purges. + #### API Design notes The API here represents a minimum, languages can add local flavour to the API - for example one can add `PutUint64()` and `GetUint64()` @@ -422,4 +721,3 @@ ditto for service registeries and so forth. On the name `Entry` for the returned result. `Value` seemed a bit generic and I didn't want to confuse matters mainly in the go client that has the unfortunate design of just shoving everything and the kitchen sink into a single package. `KVValue` is a stutter and so settled on `Entry`. - diff --git a/adr/images/0008-topologies.png b/adr/images/0008-topologies.png new file mode 100644 index 00000000..44f523fa Binary files /dev/null and b/adr/images/0008-topologies.png differ diff --git a/adr/images/stream-transform.png b/adr/images/stream-transform.png index 1f0722e1..a265cee6 100644 Binary files a/adr/images/stream-transform.png and b/adr/images/stream-transform.png differ diff --git a/go.mod b/go.mod index c79b940a..56181aad 100644 --- a/go.mod +++ b/go.mod @@ -1,5 +1,13 @@ module github.com/nats-io/nats-architecture-and-design -go 1.16 +go 1.22 -require gitlab.com/golang-commonmark/markdown v0.0.0-20191127184510-91b5b3c99c19 +require gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a + +require ( + gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181 // indirect + gitlab.com/golang-commonmark/linkify v0.0.0-20200225224916-64bca66f6ad3 // indirect + gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84 // indirect + gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f // indirect + golang.org/x/text v0.16.0 // indirect +) diff --git a/go.sum b/go.sum index 863a4813..cc92d7e7 100644 --- a/go.sum +++ b/go.sum @@ -1,19 +1,19 @@ -github.com/russross/blackfriday v2.0.0+incompatible h1:cBXrhZNUf9C+La9/YpS+UHpUT8YD6Td9ZMSU9APFcsk= -github.com/russross/blackfriday v2.0.0+incompatible/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g= -github.com/shurcooL/sanitized_anchor_name v1.0.0 h1:PdmoCO6wvbs+7yrJyMORt4/BmY5IYyJwS/kOiWx8mHo= -github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc= +github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk= +github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181 
h1:K+bMSIx9A7mLES1rtG+qKduLIXq40DAzYHtb0XuCukA= gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181/go.mod h1:dzYhVIwWCtzPAa4QP98wfB9+mzt33MSmM8wsKiMi2ow= -gitlab.com/golang-commonmark/linkify v0.0.0-20191026162114-a0c2df6c8f82 h1:oYrL81N608MLZhma3ruL8qTM4xcpYECGut8KSxRY59g= gitlab.com/golang-commonmark/linkify v0.0.0-20191026162114-a0c2df6c8f82/go.mod h1:Gn+LZmCrhPECMD3SOKlE+BOHwhOYD9j7WT9NUtkCrC8= -gitlab.com/golang-commonmark/markdown v0.0.0-20191127184510-91b5b3c99c19 h1:HsZm6XaTpEgZiZqcXZkUbG6BNtSZE3XyCTfo52YBoDY= -gitlab.com/golang-commonmark/markdown v0.0.0-20191127184510-91b5b3c99c19/go.mod h1:CRIzp0wh6PvKEAeEOtp9wEpNKJJ1VFTNfHO4+ToRgVA= +gitlab.com/golang-commonmark/linkify v0.0.0-20200225224916-64bca66f6ad3 h1:1Coh5BsUBlXoEJmIEaNzVAWrtg9k7/eJzailMQr1grw= +gitlab.com/golang-commonmark/linkify v0.0.0-20200225224916-64bca66f6ad3/go.mod h1:Gn+LZmCrhPECMD3SOKlE+BOHwhOYD9j7WT9NUtkCrC8= +gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a h1:O85GKETcmnCNAfv4Aym9tepU8OE0NmcZNqPlXcsBKBs= +gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a/go.mod h1:LaSIs30YPGs1H5jwGgPhLzc8vkNc/k0rDX/fEZqiU/M= gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84 h1:qqjvoVXdWIcZCLPMlzgA7P9FZWdPGPvP/l3ef8GzV6o= gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84/go.mod h1:IJZ+fdMvbW2qW6htJx7sLJ04FEs4Ldl/MDsJtMKywfw= gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f h1:Wku8eEdeJqIOFHtrfkYUByc4bCaTeA6fL0UJgfEiFMI= gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f/go.mod h1:Tiuhl+njh/JIg0uS/sOJVYi0x2HEa5rc1OAaVsb5tAs= gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638 h1:uPZaMiz6Sz0PZs3IZJWpU5qHKGNy///1pacZC9txiUI= gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638/go.mod h1:EGRJaqe2eO9XGmFtQCvV3Lm9NLico3UhFwUpCG/+mVU= -golang.org/x/text v0.3.2 h1:tW2bmiBqwgJj/UpqtC8EpXEZVYOwU0yG4iWbprSVAcs= golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk= +golang.org/x/text v0.16.0 h1:a94ExnEXNtEwYLGJSIUxnWoxoRz/ZcCsV63ROupILh4= +golang.org/x/text v0.16.0/go.mod h1:GhwF1Be+LQoKShO3cGOHzqOgRrGaYc9AvblQOmPVHnI= golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= diff --git a/main.go b/main.go index ec4b351b..1b1064cf 100644 --- a/main.go +++ b/main.go @@ -28,7 +28,7 @@ type ADR struct { } var ( - validStatus = []string{"Approved", "Partially Implemented", "Implemented"} + validStatus = []string{"Approved", "Partially Implemented", "Implemented", "Deprecated"} ) func parseCommaList(l string) []string { @@ -172,10 +172,18 @@ func renderIndexes(adrs []*ADR) error { } tagsList := []string{} + hasDeprecated := false for k := range tags { + if k == "deprecated" { + hasDeprecated = true + continue + } tagsList = append(tagsList, k) } sort.Strings(tagsList) + if hasDeprecated { + tagsList = append(tagsList, "deprecated") + } type tagAdrs struct { Tag string