GitHub - FlintSH/Disclone: Fine-tune OpenAI models with your Discord chat history

Banner generated with FLUX.1-Pro

Disclone - Fine-tune OpenAI models with your Discord chat history

As of June 2nd, 2025, this script no longer works, will fix.

Disclone is a little script I made that helps you create a fine-tuning dataset for OpenAI models based on Discord chat logs.

It allows you to generate an AI version of a specific Discord user by processing their chat messages into a JSONL file that can be plugged into OpenAI's fine-tuning API.

What it does

Processes a CSV file exported from Discord using DiscordChatExporter.
Creates training data from the chat logs, focusing on a specific user.
Applies content moderation to filter out inappropriate content - this is a must as OpenAI's fine-tuning API will reject datasets with a high density of inappropriate messages.

Content is moderated using the same API that OpenAI uses to moderate content on their platform, so it essentially guarantees that your dataset will be accepted.

Generates a JSONL file that can be simply plugged into OpenAI's fine-tuning API, and have a new model tuned on your data.

Prerequisites

Python 3.7 or higher
An OpenAI API key
A CSV file of your desired Discord chat exported using DiscordChatExporter.

Installation

Clone this repository:

git clone https://github.com/FlintSH/disclone.git
cd disclone

Install the required packages:
```
pip install -r requirements.txt
```
Set up your OpenAI API key as an environment variable:
```
export OPENAI_API_KEY='your-api-key-here'
```

Note: While this uses an OpenAI API key, this is just for the moderation API, and is not used to generate any content. OpenAI's moderation API is free to use, and your API key will not be charged.

Usage

Export your Discord chat logs:
- Use DiscordChatExporter to export a chat as a CSV file.
Run the Disclone script:
```
python main.py
```
Follow the prompts:
- Enter the path to your CSV file.
- Specify the Discord username of the target user you want to clone.
- Provide a system prompt to guide the AI's behavior (optional but recommended).
- Enter a start date if you want to process messages from a specific date onwards (optional).
- Set a limit on the number of conversations to include (optional).
Wait for the script to process the data. It will:
- Parse the CSV file
- Create conversations
- Moderate content
- Generate a JSONL file for fine-tuning
Once complete, you'll find a JSONL file named <username>_training_data.jsonl in the same directory.
Use this JSONL file to fine-tune an OpenAI model following the OpenAI fine-tuning guide.

Notes

The script uses OpenAI's content moderation API to filter out inappropriate content. This helps ensure the training data is suitable for fine-tuning.
The generated JSONL file follows OpenAI's required format for fine-tuning datasets.
Be mindful of Discord's terms of service and privacy considerations when exporting and using chat data - use at your own risk.

Troubleshooting

If you encounter rate limiting issues, you may need to adjust the RateLimiter parameters in the script based on your OpenAI API tier.
Ensure your CSV file is properly formatted and contains the required columns (ID, Author, Date, Content, Attachments, Reactions).

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Disclone - Fine-tune OpenAI models with your Discord chat history

As of June 2nd, 2025, this script no longer works, will fix.

What it does

Prerequisites

Installation

Usage

Notes

Troubleshooting

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

FlintSH/Disclone

Folders and files

Latest commit

History

Repository files navigation

Disclone - Fine-tune OpenAI models with your Discord chat history

As of June 2nd, 2025, this script no longer works, will fix.

What it does

Prerequisites

Installation

Usage

Notes

Troubleshooting

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages