Conduit streams data between data stores. Kafka Connect replacement. No JVM required.

ConduitIO/conduit

Latest commit

d4dd111 · Jul 11, 2022

Conduit

Data Integration for Production Data Stores. 💫

Overview

Conduit is a data streaming tool written in Go. It aims to provide the best user experience for building and running real-time data pipelines. Conduit comes with batteries included: it provides a UI, common connectors, transforms, and observability data out of the box.

Conduit pipelines are built out of simple building blocks which run in their own goroutines and are connected using Go channels. This design lets pipelines run their stages in parallel and perform well on multi-core machines. Conduit guarantees that the order of received records won't change. It also takes care of consistency by propagating acknowledgments back to the start of the pipeline only after a record has been successfully processed by all destinations.

Conduit connectors are plugins that communicate with Conduit via a gRPC interface. This means that plugins can be written in any language as long as they conform to the required interface.

Conduit was created and open-sourced by Meroxa.

Installation guide

Download and run release

Download a pre-built binary from the latest release and simply run it!

./conduit

Once you see that the service is running, you can access a user-friendly web interface at http://localhost:8080/ui/. You can also interact with the Conduit API directly. We recommend navigating to http://localhost:8080/openapi/ and exploring the HTTP API through Swagger UI.

Conduit can be configured through command line parameters. To view the full list of available options, run ./conduit --help.

Build from source

Requirements:

git clone git@github.com:ConduitIO/conduit.git
cd conduit
make
./conduit

Note that you can also build Conduit with make build-server, which only compiles the server and skips the UI. This command requires only Go and builds the binary much faster. That makes it useful for development purposes or for running Conduit as a simple backend service.

Docker

Our Docker images are hosted on GitHub's Container Registry. To run the latest Conduit version, you should run the following command:

docker run -p 8080:8080 ghcr.io/conduitio/conduit:latest

The Docker image includes the UI. You can access it by navigating to http://localhost:8080/ui.

Connectors

Conduit ships with a number of built-in connectors:

  • File connector provides a source/destination to read/write a local file (useful for quickly trying out Conduit without additional setup).
  • Kafka connector provides a source/destination for Apache Kafka.
  • Postgres connector provides a source/destination for PostgreSQL.
  • S3 connector provides a source/destination for AWS S3.
  • Generator connector provides a source which generates random data (useful for testing).

Additionally, we have prepared a Kafka Connect wrapper that allows you to run any Apache Kafka Connect connector as part of a Conduit pipeline.

Conduit is also able to run standalone connectors. If you are interested in writing a connector yourself, have a look at our Go Connector SDK. Since standalone connectors communicate with Conduit through gRPC, they can be written in virtually any programming language, as long as the connector follows the Conduit Connector Protocol.

Testing

Conduit tests are split into two categories: unit tests and integration tests. Unit tests can be run without any additional setup, while integration tests require additional services to be running (e.g. Kafka or Postgres).

Unit tests can be run with make test.

Integration tests require Docker to be installed and running. They can be run with make test-integration. This command will handle starting and stopping Docker containers for you.

API

Conduit exposes a gRPC API and an HTTP API.

The gRPC API runs on port 8084 by default. You can define a custom address using the CLI flag -grpc.address. To learn more about the gRPC API, please have a look at the protobuf file.

The HTTP API runs on port 8080 by default. You can define a custom address using the CLI flag -http.address. It is generated using gRPC gateway and therefore provides the same functionality as the gRPC API. To learn more about the HTTP API, please have a look at the API documentation or the OpenAPI definition, or run Conduit and navigate to http://localhost:8080/openapi/ to open a Swagger UI, which makes it easy to try out.
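As a quick check that the HTTP address is what you expect, the sketch below probes the two paths mentioned in this README, /openapi/ and /ui/. It assumes Conduit is running locally on the default HTTP address; `endpointURL` and `probe` are hypothetical helper names, not part of Conduit.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// endpointURL builds a URL for a path served on Conduit's HTTP address.
// (Hypothetical helper for illustration.)
func endpointURL(httpAddress, path string) string {
	return "http://" + httpAddress + path
}

// probe returns the HTTP status code for url, or an error if the
// server cannot be reached within the timeout.
func probe(url string) (int, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// Assumption: Conduit runs locally with the default HTTP address.
	// Change this if you started Conduit with a custom -http.address.
	const httpAddress = "localhost:8080"

	for _, path := range []string{"/openapi/", "/ui/"} {
		url := endpointURL(httpAddress, path)
		status, err := probe(url)
		if err != nil {
			fmt.Printf("%s: not reachable (%v)\n", url, err)
			continue
		}
		fmt.Printf("%s: HTTP %d\n", url, status)
	}
}
```

If Conduit is not running, the program reports each URL as not reachable instead of failing.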

UI

Conduit comes with a web UI that makes building data pipelines a breeze. You can access it at http://localhost:8080/ui/. See the installation guide for instructions on how to build Conduit with the UI.

For more information about the UI refer to the Readme in /ui.

Documentation

To learn more about how to use Conduit, visit docs.conduit.io.

If you are interested in the internals of Conduit, we have prepared some technical documentation:

Known limitations

Conduit is currently in a pre-1.0 state. While Conduit is built on strong foundations and experience from running similar systems, it is not production ready at the moment. The following features are on the roadmap and have yet to be implemented. Once implemented, they will change the behavior of the system:

  1. Standard record format - we plan to have the records implement a single standard for CDC events. See PR.
  2. Delivery and ordering guarantees - from the experience we have so far, messages created internally are reliably delivered through Conduit (from source nodes, over processing nodes to destination nodes). However, we still need good end-to-end, full-scale tests to actually prove that.
  3. Performance guarantees (for the core) - reasons are identical to reasons for delivery guarantees.
  4. Dynamic loading of list of plugins - currently, the API cannot return the list of all available plugins and the available configuration parameters. Consequently, the UI has the plugin paths and configuration parameters hard-coded.

Contributing

For a complete guide to contributing to Conduit, see the Contribution Guide.

We welcome you to join the community and contribute to Conduit to make it better! When something does not work as intended, please check whether there is already an issue that describes your problem. If there is none, please open an issue and let us know. When you are not sure how to do something, please open a discussion or hit us up on Discord.

We also value contributions in the form of pull requests. When opening a PR, please ensure:

  • You have followed the Code Guidelines.
  • There is no other pull request for the same update/change.
  • You have written unit tests.
  • You have made sure that the PR is of reasonable size and can be easily reviewed.