updates, summary
hariso committed May 16, 2024
1 parent b2a62fc commit 68479f6
Showing 1 changed file with 32 additions and 56 deletions.
88 changes: 32 additions & 56 deletions docs/design-documents/20240430-schema-support.md
@@ -5,8 +5,6 @@
* [Requirements](#requirements)

Check failures (GitHub Actions / markdownlint-cli2) in docs/design-documents/20240430-schema-support.md:
MD004/ul-style on lines 5-8, Unordered list style [Expected: dash; Actual: asterisk] https://github.com/DavidAnson/markdownlint/blob/v0.34.0/doc/md004.md
MD007/ul-indent on lines 5-7, Unordered list indentation [Expected: 0; Actual: 2] https://github.com/DavidAnson/markdownlint/blob/v0.34.0/doc/md007.md
* [Schema structure](#schema-structure)

* [Schema operations](#schema-operations)

* [Create](#create)
* [Fetch](#fetch)
* [Implementation](#implementation)

* [Schema storage](#schema-storage)
* [Option 1: Conduit itself hosts the schema registry](#option-1-conduit-itself-hosts-the-schema-registry)
@@ -24,9 +22,9 @@
* [Chosen option](#chosen-option-2)
* [Required changes](#required-changes)
* [Conduit](#conduit)
* [Conduit Commons](#conduit-commons)
* [Connector SDK](#connector-sdk)
* [Processor SDK](#processor-sdk)
* [How are requirements addressed](#how-are-requirements-addressed)
* [Summary](#summary)
* [Other considerations](#other-considerations)
<!-- TOC -->
@@ -107,18 +105,10 @@ optional/nullable is better developer experience.

## Schema operations

### Create
The required schema operations are:

**Request**: A request to create a schema contains the schema name and at least
one field.

**Response**: If successful, the new schema's ID is returned. If a schema with
the same name and field specs exists, no new schema is created, and the existing
schema's ID is returned.

### Fetch

The only type of fetch supported is fetch by ID.
1. create (using a name and list of fields)
2. fetch (using a schema ID)

## Implementation

@@ -365,65 +355,51 @@ for remote Conduit instances.

### Conduit

Conduit needs to expose a gRPC service as explained above. The gRPC service
exposes methods needed for connectors to work with schemas.
Conduit needs to expose a gRPC schema service as explained above. The gRPC
service exposes methods needed for connectors to work with schemas. Initially,
the service will use the Apicurio Registry to actually manage the schemas. Later
on, we will migrate to our own schema registry.

When starting or configuring a connector, Conduit needs to send it its gRPC
port.
The service's port will be random and Conduit will make it available to
connectors via an environment variable.
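On the connector side, reading the port could look roughly like the sketch below. The variable name `CONDUIT_SCHEMA_SERVICE_PORT` and the `localhost` host are assumptions made for illustration; the design does not name the variable or say which host a connector should dial.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// SchemaPortEnv is a hypothetical variable name; the design does not name it.
const SchemaPortEnv = "CONDUIT_SCHEMA_SERVICE_PORT"

// SchemaServiceAddr builds the schema service address from the environment.
// getenv is injected so the function can be exercised without touching the
// real process environment.
func SchemaServiceAddr(getenv func(string) string) (string, error) {
	raw := getenv(SchemaPortEnv)
	if raw == "" {
		return "", fmt.Errorf("%s is not set", SchemaPortEnv)
	}
	port, err := strconv.Atoi(raw)
	if err != nil || port < 1 || port > 65535 {
		return "", fmt.Errorf("%s: invalid port %q", SchemaPortEnv, raw)
	}
	// Assumption: the connector runs on the same host as Conduit.
	return net.JoinHostPort("localhost", strconv.Itoa(port)), nil
}

func main() {
	addr, err := SchemaServiceAddr(os.Getenv)
	if err != nil {
		fmt.Println("schema service unavailable:", err)
		return
	}
	fmt.Println("schema service at", addr)
}
```

Because the port is random, a connector should treat a missing or malformed value as a startup error rather than fall back to a default.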

Internally, Conduit will use the Apicurio Registry to work with schemas. In future,
we plan to migrate to our own schema registry so that the tech stack is kept
simple and Conduit can be run with a single binary.
### Conduit Commons

### Connector SDK

For the needs of source connectors, the Connector SDK needs to provide the
following functions:
Conduit Commons needs to provide the following functions that will be used by
multiple libraries (Connector SDK, Processor SDK):

1. A function/builder that builds an Avro schema.
2. A function that registers a schema.
3. A function that encodes a value using the built schema.
1. A function that creates an Avro schema
2. A function that encodes values using an Avro schema
3. A function that decodes a slice of bytes into a value, using an Avro schema

### Processor SDK

## How are requirements addressed

1. Records **should not** carry the whole schema.

Addressed by saving the schemas in a schema service and records keeping a
reference to the schema.
2. Sources and destinations need to be able to work with multiple schemas.
### Connector SDK

The schema service doesn't put any limitations on source collections.
3. A schema should be accessible across pipelines and Conduit instances.
The Connector SDK needs to provide the following functions:

Addressed through the usage of an external schema registry.
4. It should be possible for a schema to evolve.

There's no explicit support for schema versions, as they're not needed. A new
schema version can be created as a new schema object.
5. A source connector should be able to register a schema.
1. A function that registers a schema.
2. A function that fetches a schema.

The Connector SDK will provide a function to save a schema.
6. A destination connector should be able to fetch a specific schema.
### Processor SDK

The Connector SDK will provide a function to fetch a schema.
7. A destination connector needs a way to know that a schema changed.
The Processor SDK needs to provide the following functions:

A destination connector can compare the schema IDs of the records it received.
8. The Connector SDK should provide an API to work with the schemas.
9. The Connector SDK should cache the schemas.
1. A function that registers a schema.
2. A function that fetches a schema.
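The register and fetch functions in both SDKs could sit on top of a small caching wrapper, since the requirements list also asks the Connector SDK to cache schemas. The sketch below is illustrative only: `SchemaService`, `CachingClient`, and `fakeService` are hypothetical names, the integer ID is an assumption, and the real client would talk to Conduit's gRPC service.

```go
package main

import (
	"fmt"
	"sync"
)

// Schema and SchemaService are hypothetical; the real gRPC client will differ.
type Schema struct {
	ID   int
	Avro string // Avro schema as JSON text
}

type SchemaService interface {
	Register(name, avro string) (int, error)
	Fetch(id int) (Schema, error)
}

// CachingClient wraps a SchemaService and remembers schemas by ID.
type CachingClient struct {
	svc   SchemaService
	mu    sync.Mutex
	cache map[int]Schema
}

func NewCachingClient(svc SchemaService) *CachingClient {
	return &CachingClient{svc: svc, cache: map[int]Schema{}}
}

// Register forwards to the service and caches the schema under the new ID.
func (c *CachingClient) Register(name, avro string) (int, error) {
	id, err := c.svc.Register(name, avro)
	if err != nil {
		return 0, err
	}
	c.mu.Lock()
	c.cache[id] = Schema{ID: id, Avro: avro}
	c.mu.Unlock()
	return id, nil
}

// Fetch serves from the cache and only calls the service on a miss.
func (c *CachingClient) Fetch(id int) (Schema, error) {
	c.mu.Lock()
	s, ok := c.cache[id]
	c.mu.Unlock()
	if ok {
		return s, nil
	}
	s, err := c.svc.Fetch(id)
	if err != nil {
		return Schema{}, err
	}
	c.mu.Lock()
	c.cache[id] = s
	c.mu.Unlock()
	return s, nil
}

// fakeService is an in-memory stand-in that counts remote fetches.
type fakeService struct {
	schemas map[int]Schema
	fetches int
}

func (f *fakeService) Register(name, avro string) (int, error) {
	id := len(f.schemas) + 1
	f.schemas[id] = Schema{ID: id, Avro: avro}
	return id, nil
}

func (f *fakeService) Fetch(id int) (Schema, error) {
	f.fetches++
	s, ok := f.schemas[id]
	if !ok {
		return Schema{}, fmt.Errorf("schema %d not found", id)
	}
	return s, nil
}

func main() {
	svc := &fakeService{schemas: map[int]Schema{1: {ID: 1, Avro: `"string"`}}}
	client := NewCachingClient(svc)
	client.Fetch(1)
	client.Fetch(1)
	fmt.Println("remote fetches:", svc.fetches)
}
```

Caching by ID is safe in this design because a changed schema is stored as a new schema object with a new ID, so a cached entry never goes stale.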

## Summary

The following design is proposed:

Conduit exposes a gRPC service for managing schemas. When Conduit starts a
connector, it sends the connector its gRPC port. A connector creates and fetches
schemas through the gRPC service.
Records will reference schemas using IDs. All schemas will be in the Avro
format. In the future, we might add support for other formats too.

Connectors and processors will access the schemas through a gRPC service exposed
by Conduit. When Conduit starts a connector, it publishes the port through an
environment variable. A connector creates and fetches schemas through the
service.

Conduit's gRPC service is an abstraction/indirection for an external schema
registry (such as Apicurio Registry), that is accessed by multiple Conduit
registry (Apicurio Registry), that is accessed by multiple Conduit
instances.

## Other considerations