A Rust-based web agent with an embedded OpenAI-compatible inference server (supports Gemma models only). It is packaged and deployed as a container.
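As a rough sketch of that container workflow (the Dockerfile location, image tag, and published port below are placeholders rather than values defined by this project):
# Build the image from the repository root (Dockerfile location assumed)
docker build -t web-agent .
# Run it, publishing a placeholder port
docker run --rm -p 3000:3000 web-agent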
This project is organized as a Cargo workspace with the following crates and packages:
crates:
- agent-server: The main web agent server
- inference-engine: An embedded OpenAI-compatible inference server for Gemma models
packages:
- genaiscript: GenAIScript scripts
- genaiscript-rust-shim: A shim that bridges the GenAIScript tooling with the Rust crates
Special gratitude and thanks:
- OpenAI: For standards that offer consensus in chaos.
- The Rust community: For their excellent tools and libraries
- Google's Gemma team: For gemma-3-1b-it
- GenAIScript: Automatable GenAI Scripting
- axum: Web framework for building APIs
- tokio: Asynchronous runtime for efficient concurrency
- serde: Serialization/deserialization framework
- rmcp: Model Context Protocol SDK for agent communication
- sled: Embedded database for persistent storage
- tower-http: HTTP middleware components
- candle-core: ML framework for efficient tensor operations
- candle-transformers: Implementation of transformer models in Candle
- hf-hub: Client for downloading models from Hugging Face
- tokenizers: Fast text tokenization for ML models
- safetensors: Secure format for storing tensors
%% High‑fidelity architecture diagram – client‑ready
flowchart LR
%% ─────────────── Agent‑side ───────────────
subgraph AGENT_SERVER["Agent Server"]
direction TB
AS["Agent Server"]:::core -->|exposes| MCP[["Model Context Protocol API"]]:::api
AS -->|serves| UI["MCP Inspector UI"]:::ui
subgraph AGENTS["Agents"]
direction TB
A_SEARCH["Search Agent"] -->|uses| SEARX
A_NEWS["News Agent"] -->|uses| SEARX
A_SCRAPE["Web Scrape Agent"] -->|uses| BROWSER
A_IMG["Image Generator Agent"] -->|uses| EXTERNAL_API
A_RESEARCH["Deep Research Agent"] -->|leverages| SEARX
end
%% Individual fan‑out lines (no “&”)
MCP -->|routes| A_SEARCH
MCP -->|routes| A_NEWS
MCP -->|routes| A_SCRAPE
MCP -->|routes| A_IMG
MCP -->|routes| A_RESEARCH
end
%% ─────────────── Local inference ───────────────
subgraph INFERENCE["Inference Engine"]
direction TB
LIE["Inference Engine"]:::core -->|loads| MODELS["Gemma Models"]:::model
LIE -->|exposes| OPENAI_API["OpenAI‑compatible API"]:::api
MODELS -->|runs on| ACCEL
subgraph ACCEL["Hardware Acceleration"]
direction LR
METAL[Metal]
CUDA[CUDA]
CPU[CPU]
end
end
%% ─────────────── External bits ───────────────
subgraph EXTERNAL["External Components"]
direction TB
SEARX["SearXNG Search"]:::ext
BROWSER["Chromium Browser"]:::ext
EXTERNAL_API["Public OpenAI API"]:::ext
end
%% ─────────────── Clients ───────────────
subgraph CLIENTS["Client Applications"]
CLIENT["MCP‑aware Apps"]:::client
end
%% ─────────────── Interactions ───────────────
CLIENT -- "HTTPS / WebSocket" --> MCP
AS --> |"may call"| OPENAI_API
AS --> |"optional"| EXTERNAL_API
%% ─────────────── Styling ───────────────
classDef core fill:#A9CEF4,stroke:#36494E,stroke-width:2px,color:#000;
classDef api fill:#7EA0B7,stroke:#36494E,stroke-width:2px,color:#000;
classDef ui fill:#A9CEF4,stroke:#597081,stroke-dasharray:4 3,color:#000;
classDef model fill:#A9CEF4,stroke:#36494E,stroke-width:2px,color:#000;
classDef ext fill:#B5D999,stroke:#36494E,stroke-width:2px,color:#000;
classDef client fill:#FFE69A,stroke:#36494E,stroke-width:2px,color:#000;
- Clone the repository
- Copy the example environment file:
cp .env.example .env
- Install JavaScript dependencies:
bun i
- Start the SearXNG search engine:
docker compose up -d searxng
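To confirm the search container came up before moving on, the standard Compose subcommands work with the same service name:
docker compose ps searxng        # show container status
docker compose logs -f searxng   # follow startup logs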
To run the local inference engine:
cd crates/inference-engine
cargo run --release -- --server
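Once the engine is running, a quick smoke test is a standard OpenAI-style chat-completions request. This is only a sketch: the listen address and port below are assumptions (check the engine's startup output for the actual address), and the model id mirrors the gemma-3-1b-it checkpoint mentioned above.
# Assumed address; replace with the port the engine actually reports
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-3-1b-it",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'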
To run the agent server:
cargo run -p agent-server
For development with automatic reloading:
bun dev
To build all crates in the workspace:
cargo build
To build a specific crate:
cargo build -p agent-server
# or
cargo build -p inference-engine
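For optimized binaries, for example when producing the container image, the whole workspace can be built in release mode, and tests run the same way:
cargo build --release --workspace
cargo test --workspace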