ArkFlow

‼️Not production-ready, do not use in a production environment‼️

ArkFlow is a high-performance stream processing engine written in Rust. It supports multiple input and output sources and ships with a set of built-in processors.

Features

High Performance: Built on Rust and the Tokio async runtime for high throughput and low latency
Multiple Data Sources: Support for Kafka, MQTT, HTTP, files, and other input/output sources
Powerful Processing Capabilities: Built-in SQL queries, JSON processing, Protobuf encoding/decoding, batch processing, and other processors
Extensible: Modular design, easy to extend with new input, output, and processor components

Installation

Building from Source

# Clone the repository
git clone https://github.com/chenquan/arkflow.git
cd arkflow

# Build the project
cargo build --release

# Run tests
cargo test

Quick Start

Create a configuration file config.yaml:

logging:
  level: info
streams:
  - input:
      type: "generate"
      context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
      interval: 1s
      batch_size: 10

    pipeline:
      thread_num: 4
      processors:
        - type: "json_to_arrow"
        - type: "sql"
          query: "SELECT * FROM flow WHERE value >= 10"
        - type: "arrow_to_json"

    output:
      type: "stdout"

Run ArkFlow:

./target/release/arkflow --config config.yaml

Configuration Guide

ArkFlow is configured with YAML files. The main configuration items are:

Top-level Configuration

logging:
  level: info  # Log level: debug, info, warn, error

streams:       # Stream definition list
  - input:      # Input configuration
      # ...
    pipeline:   # Processing pipeline configuration
      # ...
    output:     # Output configuration
      # ...
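
The layout above can be sanity-checked before starting ArkFlow. The following is a minimal sketch in Python, not part of ArkFlow itself: `check_config` is a hypothetical helper that takes the configuration as an already-parsed dictionary. It assumes every stream needs all three sections (`input`, `pipeline`, `output`), which the examples in this README suggest but do not state, and it treats `logging` as optional because some examples omit it.

```python
# Hypothetical helper, not an ArkFlow API: checks the top-level shape only.
LOG_LEVELS = ("debug", "info", "warn", "error")      # levels listed in this README
STREAM_SECTIONS = ("input", "pipeline", "output")    # assumed required per stream


def check_config(config):
    """Return a list of problems found in a parsed configuration dict."""
    problems = []

    # `logging` is treated as optional; only validate the level if one is set.
    logging_cfg = config.get("logging") or {}
    level = logging_cfg.get("level")
    if level is not None and level not in LOG_LEVELS:
        problems.append(f"unknown log level: {level!r}")

    # `streams` is a list of stream definitions.
    streams = config.get("streams")
    if not isinstance(streams, list) or not streams:
        problems.append("'streams' must be a non-empty list")
        return problems

    for i, stream in enumerate(streams):
        if not isinstance(stream, dict):
            problems.append(f"streams[{i}] must be a mapping")
            continue
        for section in STREAM_SECTIONS:
            if section not in stream:
                problems.append(f"streams[{i}] is missing '{section}'")
    return problems


# The Quick Start configuration, written out as a Python dict.
quick_start = {
    "logging": {"level": "info"},
    "streams": [
        {
            "input": {"type": "generate", "interval": "1s", "batch_size": 10},
            "pipeline": {"thread_num": 4, "processors": [{"type": "sql"}]},
            "output": {"type": "stdout"},
        }
    ],
}
print(check_config(quick_start))
```

The helper only inspects structure. Component-specific fields such as `brokers` or `query` are left to ArkFlow to validate.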

Input Components

ArkFlow supports multiple input sources:

Kafka: Read data from Kafka topics
MQTT: Subscribe to messages from MQTT topics
HTTP: Receive data via HTTP
File: Read data from files
Generator: Generate test data
SQL: Query data from databases

Example:

input:
  type: kafka
  brokers:
    - localhost:9092
  topics:
    - test-topic
  consumer_group: test-group
  client_id: arkflow
  start_from_latest: true

Processors

ArkFlow provides multiple data processors:

JSON: JSON data processing and transformation
SQL: Process data using SQL queries
Protobuf: Protobuf encoding/decoding
Batch Processing: Process messages in batches

Example:

pipeline:
  thread_num: 4
  processors:
    - type: json_to_arrow
    - type: sql
      query: "SELECT * FROM flow WHERE value >= 10"
    - type: arrow_to_json
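
To make the data flow through these three processors concrete, here is a rough sketch in Python that mimics the chain on a single batch. It is not how ArkFlow is implemented: SQLite stands in for the SQL processor, a plain list of dicts stands in for the Arrow data, and `run_pipeline` is a hypothetical name. The table name `flow` and the query come from the example above; the second sample message is made up for illustration.

```python
import json
import sqlite3


def run_pipeline(messages, query):
    """Mimic json_to_arrow -> sql -> arrow_to_json on one batch of JSON strings."""
    # Stage 1 (stand-in for json_to_arrow): parse each JSON message into a row.
    rows = [json.loads(m) for m in messages]

    # Stage 2 (stand-in for sql): expose the batch as a table named `flow`
    # and run the query against it.
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE flow (timestamp INTEGER, value INTEGER, sensor TEXT)")
    conn.executemany("INSERT INTO flow VALUES (:timestamp, :value, :sensor)", rows)
    result = [dict(r) for r in conn.execute(query).fetchall()]
    conn.close()

    # Stage 3 (stand-in for arrow_to_json): serialize each result row to JSON.
    return [json.dumps(r) for r in result]


batch = [
    '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }',
    '{ "timestamp": 1625000000001, "value": 7, "sensor": "temp_2" }',  # made-up record
]
for line in run_pipeline(batch, "SELECT * FROM flow WHERE value >= 10"):
    print(line)
```

Only the `temp_1` record passes the filter, since `>=` keeps the boundary value 10 while the record with value 7 is dropped.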

Output Components

ArkFlow supports multiple output targets:

Kafka: Write data to Kafka topics
MQTT: Publish messages to MQTT topics
HTTP: Send data via HTTP
File: Write data to files
Standard Output: Output data to the console

Example:

output:
  type: kafka
  brokers:
    - localhost:9092
  topic: output-topic
  client_id: arkflow-producer

Examples

Kafka to Kafka Data Processing

streams:
  - input:
      type: kafka
      brokers:
        - localhost:9092
      topics:
        - test-topic
      consumer_group: test-group

    pipeline:
      thread_num: 4
      processors:
        - type: json_to_arrow
        - type: sql
          query: "SELECT * FROM flow WHERE value > 100"
        - type: arrow_to_json

    output:
      type: kafka
      brokers:
        - localhost:9092
      topic: processed-topic

Generate Test Data and Process

streams:
  - input:
      type: "generate"
      context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
      interval: 1ms
      batch_size: 10000

    pipeline:
      thread_num: 4
      processors:
        - type: "json_to_arrow"
        - type: "sql"
          query: "SELECT count(*) FROM flow WHERE value >= 10 group by sensor"
        - type: "arrow_to_json"

    output:
      type: "stdout"
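
The aggregation in this example can be reasoned about with a small stand-in. The sketch below uses Python's `sqlite3` module in place of ArkFlow's SQL processor and assumes the query is evaluated once per batch, with the batch exposed as the table `flow`; neither assumption is confirmed by this README. `count_by_sensor` is a hypothetical helper name.

```python
import sqlite3


def count_by_sensor(batch_size):
    """Run the example's aggregation over one batch of identical generated records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flow (timestamp INTEGER, value INTEGER, sensor TEXT)")

    # The generator input repeats the same record, so a batch is batch_size copies.
    record = (1625000000000, 10, "temp_1")
    conn.executemany("INSERT INTO flow VALUES (?, ?, ?)", [record] * batch_size)

    rows = conn.execute(
        "SELECT count(*) FROM flow WHERE value >= 10 GROUP BY sensor"
    ).fetchall()
    conn.close()
    return rows


print(count_by_sensor(10000))
```

Because every generated record has the same `sensor`, the result is a single row holding the batch size. The query selects only `count(*)`, so the output does not say which sensor a count belongs to; adding `sensor` to the select list would include it.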

License

ArkFlow is licensed under the Apache License 2.0.
