clarify the meaning of earliest and latest (#2647)
Signed-off-by: Richard Chien <[email protected]>
stdrc authored Sep 30, 2024
1 parent 60bbfd7 commit c0a2fcc
Showing 3 changed files with 8 additions and 4 deletions.
2 changes: 1 addition & 1 deletion docs/ingest/ingest-from-kafka.md
@@ -70,7 +70,7 @@ For tables with primary key constraints, if a new data record with an existing k
|---|---|
|topic| Required. Address of the Kafka topic. One source can only correspond to one topic.|
|properties.bootstrap.server| Required. Address of the Kafka broker. Format: `'ip:port,ip:port'`. |
-|scan.startup.mode|Optional. The offset mode that RisingWave will use to consume data. The two supported modes are `earliest` (earliest offset) and `latest` (latest offset). If not specified, the default value `earliest` will be used.|
+|scan.startup.mode|Optional. The offset mode that RisingWave will use to consume data. The two supported modes are `earliest` (read from low watermark) and `latest` (read from high watermark). If not specified, the default value `earliest` will be used.|
|scan.startup.timestamp.millis|Optional. RisingWave will start to consume data from the specified UNIX timestamp (milliseconds). If this field is specified, the value for `scan.startup.mode` will be ignored.|
|group.id.prefix | Optional. Specify a custom group ID prefix for the source. The default prefix is `rw-consumer`. Each job (materialized view) will have a separate consumer group with a generated suffix in the group ID, so the format of the consumer group is `{group_id_prefix}-{fragment_id}`. This is used to monitor progress in external Kafka tools and for authorization purposes. RisingWave does not rely on committed offsets or join the consumer group. It only reports offsets to the group.|
|properties.sync.call.timeout | Optional. Specify the timeout. By default, the timeout is 5 seconds. |
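
As a minimal sketch of how the clarified `scan.startup.mode` values are used, the statement below creates a Kafka source that starts reading at the high watermark. The source name, columns, topic, and broker address are hypothetical placeholders, and the `FORMAT PLAIN ENCODE JSON` clause is an assumption about the payload format rather than something this change specifies.

```sql
-- Hypothetical example: the source name, columns, topic, and broker
-- address are placeholders, not values taken from the documentation.
CREATE SOURCE IF NOT EXISTS example_kafka_source (
    user_id INT,
    event_name VARCHAR
)
WITH (
    connector = 'kafka',
    topic = 'example_topic',
    properties.bootstrap.server = 'localhost:9092',
    -- 'latest' starts at the high watermark, so only new messages are read.
    -- Omitting this line falls back to the default 'earliest', which starts
    -- at the low watermark.
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
```

If `scan.startup.timestamp.millis` were also set, the table above states that the `scan.startup.mode` value would be ignored.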
6 changes: 4 additions & 2 deletions docs/ingest/ingest-from-kinesis.md
@@ -15,7 +15,7 @@ When creating a source, you can choose to persist the data from the source in Ri
## Syntax

```sql
-CREATE {TABLE | SOURCE} [ IF NOT EXISTS ] source_name
+CREATE {TABLE | SOURCE} [ IF NOT EXISTS ] source_name
[ schema_definition ]
[INCLUDE { header | key | offset | partition | timestamp } [AS <column_name>]]
WITH (
@@ -62,9 +62,11 @@ For a table with primary key constraints, if a new data record with an existing
|aws.credentials.session_token |Optional. The session token associated with the temporary security credentials. Using this field is not recommended as RisingWave contains long-running jobs and the token may expire. Creating a [new role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_aws-accounts.html) is preferred.|
|aws.credentials.role.arn |Optional. The Amazon Resource Name (ARN) of the role to assume.|
|aws.credentials.role.external_id|Optional. The [external id](https://aws.amazon.com/blogs/security/how-to-use-external-id-when-granting-access-to-your-aws-resources/) used to authorize access to third-party resources. |
-|scan.startup.mode |Optional. The startup mode for Kinesis consumer. Supported modes: `earliest` (starts from the earliest offset), `latest` (starts from the latest offset), and `timestamp` (starts from a specific timestamp, specified by `scan.startup.timestamp.millis`). The default mode is `earliest`.|
+|scan.startup.mode |Optional. The startup mode for Kinesis consumer. Supported modes: `earliest` (corresponding to [starting position] `TRIM_HORIZON`), `latest` (corresponding to [starting position] `LATEST`), and `timestamp` (starts from a specific timestamp specified by `scan.startup.timestamp.millis`, corresponding to [starting position] `AT_TIMESTAMP`). The default mode is `earliest`.|
|scan.startup.timestamp.millis |Optional. This field specifies the timestamp, represented in i64, to start consuming from. |

+[starting position]: https://docs.aws.amazon.com/kinesis/latest/APIReference/API_StartingPosition.html

### Other parameters

|Field| Notes|
4 changes: 3 additions & 1 deletion docs/ingest/ingest-from-nats.md
@@ -95,10 +95,12 @@ According to the [NATS documentation](https://docs.nats.io/running-a-nats-servic
|`connect_mode`|Required. Authentication mode for the connection. Allowed values: <ul><li>`plain`: No authentication. </li><li>`user_and_password`: Use user name and password for authentication. For this option, `username` and `password` must be specified.</li><li> `credential`: Use JSON Web Token (JWT) and NKeys for authentication. For this option, `jwt` and `nkey` must be specified.</li></ul> |
|`jwt` and `nkey`|JWT and NKEY for authentication. For details, see [JWT](https://docs.nats.io/running-a-nats-service/configuration/securing_nats/auth_intro/jwt) and [NKeys](https://docs.nats.io/running-a-nats-service/configuration/securing_nats/auth_intro/nkey_auth).|
|`username` and `password`| Conditional. The client user name and password. Required when `connect_mode` is `user_and_password`.|
-|`scan.startup.mode`|Optional. The offset mode that RisingWave will use to consume data. The supported modes are: <ul><li>`earliest`: Consume data from the earliest offset.</li><li>`latest`: Consume data from the latest offset.</li><li>`timestamp`: Consume data from a particular UNIX timestamp, which is specified via `scan.startup.timestamp.millis`.</li></ul>If not specified, the default value `earliest` will be used.|
+|`scan.startup.mode`|Optional. The offset mode that RisingWave will use to consume data. The supported modes are: <ul><li>`earliest`: Consume from the earliest available message, corresponding to [deliver policy] `DeliverAll`.</li><li>`latest`: Consume from the next message, corresponding to `DeliverNew` policy.</li><li>`timestamp`: Consume from a particular UNIX timestamp specified via `scan.startup.timestamp.millis`, corresponding to `DeliverByStartTime` policy.</li></ul>If not specified, the default value `earliest` will be used.|
|`scan.startup.timestamp.millis`|Conditional. Required when `scan.startup.mode` is `timestamp`. RisingWave will start to consume data from the specified UNIX timestamp.
|`data_encode`| Supported encodes: `JSON`, `PROTOBUF`, `BYTES`. |

+[deliver policy]: https://docs.nats.io/nats-concepts/jetstream/consumers#deliverpolicy

## Examples

The following SQL query creates a table that ingests data from a NATS JetStream source.