Add Chonkie Integration Information to Chroma Docs #4619

Open: wants to merge 6 commits into base: main
@@ -37,6 +37,7 @@ We welcome pull requests to add new Integrations to the community.

| | Python | JS |
| --------------------------------------- | ------ | ------------ |
| [Chonkie](./frameworks/chonkie) | ✓ | Coming Soon! |
| [DeepEval](./frameworks/deepeval) | ✓ | - |
| [Langchain](./frameworks/langchain) | ✓ | ✓ |
| [LlamaIndex](./frameworks/llamaindex) | ✓ | ✓ |
@@ -0,0 +1,78 @@
---
id: chonkie
name: Chonkie
---

# Chonkie
[Chonkie](https://docs.chonkie.ai) is an open-source library for data ingestion.
It provides chunkers for splitting your data cleanly and embedding handlers for embedding the resulting chunks.

When building a retrieval-augmented generation (RAG) system, you can use Chonkie to chunk and embed your data, then store the chunks in Chroma using Chonkie's Chroma handshake.

{% Banner type="tip" %}
For more information on how to use Chonkie, see the [Chonkie Docs](https://docs.chonkie.ai).
{% /Banner %}


# Chonkie Chroma Handshake

Chonkie provides a Chroma handshake that can be used to embed and insert chunks into a Chroma collection.
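
As a rough mental model of "embed and insert", the sketch below walks through the same two steps using only the Python standard library. Every name in it (`toy_embed`, `ToyCollection`, `toy_write`) is hypothetical and exists only for illustration: it is not Chonkie's or Chroma's API, and the hash-based "embedding" carries no semantic meaning.

```python
import hashlib
import math
from dataclasses import dataclass, field


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for an embedding model: hash bytes scaled to a unit vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    raw = [b / 255.0 for b in digest[:dim]]
    norm = math.sqrt(sum(x * x for x in raw)) or 1.0
    return [x / norm for x in raw]


@dataclass
class ToyCollection:
    """Hypothetical in-memory store keyed by ID (not Chroma's collection class)."""
    records: dict = field(default_factory=dict)

    def upsert(self, ids, embeddings, documents):
        # Insert new records, or overwrite existing ones that share an ID
        for id_, emb, doc in zip(ids, embeddings, documents):
            self.records[id_] = {"embedding": emb, "document": doc}


def toy_write(chunks: list[str], collection: ToyCollection) -> int:
    """Embed each chunk and store it under an ID derived from its text."""
    ids = [hashlib.sha256(c.encode("utf-8")).hexdigest()[:16] for c in chunks]
    embeddings = [toy_embed(c) for c in chunks]
    collection.upsert(ids=ids, embeddings=embeddings, documents=chunks)
    return len(ids)


collection = ToyCollection()
written = toy_write(["Chonkie and Chroma", "Best Friends For Life!"], collection)
```

Deriving IDs from the chunk text is a design choice of this sketch: writing the same chunks twice overwrites the existing records instead of duplicating them. Whether the real handshake behaves this way is not stated here, so check the Chonkie documentation before relying on it.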

## Prerequisites

Install Chonkie with Chroma dependencies:
```bash
pip install "chonkie[chroma]"
```

## Usage

### Chunking With Chonkie
```python
from chonkie import SemanticChunker

text = "Chonkie and Chroma - Best Friends For Life!"
chunker = SemanticChunker() # See docs.chonkie.ai for more information on chunkers

# Chunk your data
chunks = chunker(text)
```

### Initialize Chroma Handshake
```python
from chonkie import ChromaHandshake

# Initialize with default settings (in-memory ChromaDB)
handshake = ChromaHandshake()

# Or specify a persistent storage path
handshake = ChromaHandshake(path="./chroma_db")

# Or use an existing Chroma client
import chromadb
client = chromadb.Client()
handshake = ChromaHandshake(client=client, collection_name="my_collection")

# Or select the embedding model to use
handshake = ChromaHandshake(embedding_model="text-embedding-ada-002")
```

### Writing Chunks to Chroma
```python
from chonkie import ChromaHandshake, SemanticChunker

handshake = ChromaHandshake() # Initializes a new Chroma client

text = "Chonkie and Chroma - Best Friends For Life!"
chunker = SemanticChunker()
chunks = chunker(text)

# Embed the chunks and insert them into the Chroma collection
handshake.write(chunks)
```

## Resources

- [Chonkie Documentation](https://docs.chonkie.ai)
- [Chonkie Chroma Handshake Documentation](https://docs.chonkie.ai/python-sdk/handshakes/chroma-handshake)
- [Chonkie Discord](https://discord.gg/6V5pqvqsCY)