Store the image moderation and text moderation logs #3478

Open
wants to merge 43 commits into base: operation-202407
Changes from all commits · 43 commits
92a6d1f
Chatbot Arena Category Classification Script (#3433)
CodingWithTim Jul 9, 2024
f67b6e4
Update sglang_worker.py to fix #3372 error launching SGLang worker (#…
vikrantrathore Jul 29, 2024
653f7c1
add a performant worker dash-infer which is specifically optimized fo…
yejunjin Jul 29, 2024
d32e370
Update README.md
infwinston Jul 30, 2024
1c95cc8
Update monitor.py
infwinston Jul 30, 2024
d2016dd
Update README.md
infwinston Jul 31, 2024
925fb82
Update README.md
infwinston Jul 31, 2024
d310369
added leaderboard for arena hard auto (#3437)
connorchenn Jul 31, 2024
76571d2
Arena hard auto leaderboard UI (#3457)
CodingWithTim Jul 31, 2024
a5c29e1
Update monitor.py (#3460)
infwinston Aug 1, 2024
5c0443e
Survey only (#3466)
lisadunlap Aug 6, 2024
cb4da0d
Store text and image moderation logs
BabyChouSr Aug 15, 2024
605add3
Update moderation
BabyChouSr Aug 16, 2024
4492299
Run formatter
BabyChouSr Aug 16, 2024
2723660
Show vote button
BabyChouSr Aug 16, 2024
51f9a0d
Fix pylint
BabyChouSr Aug 16, 2024
38a1360
Fix pylint
BabyChouSr Aug 16, 2024
e10d11b
Save bad images
BabyChouSr Aug 16, 2024
5159d3b
Address comments
BabyChouSr Aug 17, 2024
8708fd7
Add max-model-len argument to vllm worker (#3451)
aliasaria Aug 18, 2024
29fc8a0
Revert "Add max-model-len argument to vllm worker" (#3488)
vikrantrathore Aug 21, 2024
c7f9230
New leaderboard (#3465)
lisadunlap Aug 22, 2024
d8f411a
Update dataset_release.md (#3492)
merrymercy Aug 24, 2024
4c25b00
Update link
infwinston Aug 26, 2024
36f7807
Update monitor_md.py (#3494)
infwinston Aug 26, 2024
0b09cee
Version names for Command R/R+ (#3491)
sanderland Aug 26, 2024
282534b
Update preset images (#3493)
lisadunlap Aug 26, 2024
05b9305
Add Style Control to Chatbot Arena Leaderboard 🔥 (#3495)
CodingWithTim Aug 27, 2024
dba425f
Save moderation info per turn
BabyChouSr Aug 27, 2024
d289be9
Change states
BabyChouSr Aug 27, 2024
7911ecd
Clean up
BabyChouSr Aug 27, 2024
1527aac
Get rid of previous moderation response
BabyChouSr Aug 27, 2024
36c67da
Rename
BabyChouSr Aug 27, 2024
4e62d77
Added NewYorker images back in (#3499)
lisadunlap Aug 27, 2024
3e21ddc
Fix Style control Bootstrapping (#3500)
CodingWithTim Aug 28, 2024
93037a4
Add load test (#3496)
BabyChouSr Aug 30, 2024
8714da2
Fix load test (#3508)
BabyChouSr Aug 31, 2024
b11f710
Merge branch 'main' into moderation-log
BabyChouSr Aug 31, 2024
571f39e
Merge branch 'main' into moderation-log
BabyChouSr Aug 31, 2024
3555d01
Merge remote-tracking branch 'fastchat/operation-202407' into moderat…
BabyChouSr Aug 31, 2024
fe45c6f
Format
BabyChouSr Aug 31, 2024
a2200e4
Merge with unified vision arena
BabyChouSr Aug 31, 2024
c90b8fc
Fix edge case
BabyChouSr Aug 31, 2024
12 changes: 6 additions & 6 deletions README.md
@@ -1,9 +1,9 @@
# FastChat
| [**Demo**](https://chat.lmsys.org/) | [**Discord**](https://discord.gg/HSWAKCrnFx) | [**X**](https://x.com/lmsysorg) |
| [**Demo**](https://lmarena.ai/) | [**Discord**](https://discord.gg/HSWAKCrnFx) | [**X**](https://x.com/lmsysorg) |

FastChat is an open platform for training, serving, and evaluating large language model based chatbots.
- FastChat powers Chatbot Arena (https://chat.lmsys.org/), serving over 10 million chat requests for 70+ LLMs.
- Chatbot Arena has collected over 500K human votes from side-by-side LLM battles to compile an online [LLM Elo leaderboard](https://leaderboard.lmsys.org).
- FastChat powers Chatbot Arena ([lmarena.ai](https://lmarena.ai)), serving over 10 million chat requests for 70+ LLMs.
- Chatbot Arena has collected over 1.5M human votes from side-by-side LLM battles to compile an online [LLM Elo leaderboard](https://lmarena.ai/?leaderboard).

FastChat's core features include:
- The training and evaluation code for state-of-the-art models (e.g., Vicuna, MT-Bench).
@@ -26,7 +26,7 @@ FastChat's core features include:

</details>

<a href="https://chat.lmsys.org"><img src="assets/demo_narrow.gif" width="70%"></a>
<a href="https://lmarena.ai"><img src="assets/demo_narrow.gif" width="70%"></a>

## Contents
- [Install](#install)
@@ -97,7 +97,7 @@ You can use the commands below to chat with them. They will automatically downlo

## Inference with Command Line Interface

<a href="https://chat.lmsys.org"><img src="assets/screenshot_cli.png" width="70%"></a>
<a href="https://lmarena.ai"><img src="assets/screenshot_cli.png" width="70%"></a>

(Experimental Feature: You can specify `--style rich` to enable rich text output and better text streaming quality for some non-ASCII content. This may not work properly on certain terminals.)

@@ -202,7 +202,7 @@ export FASTCHAT_USE_MODELSCOPE=True

## Serving with Web GUI

<a href="https://chat.lmsys.org"><img src="assets/screenshot_gui.png" width="70%"></a>
<a href="https://lmarena.ai"><img src="assets/screenshot_gui.png" width="70%"></a>

To serve using the web UI, you need three main components: web servers that interface with users, model workers that host one or more models, and a controller to coordinate the webserver and model workers. You can learn more about the architecture [here](docs/server_arch.md).

2 changes: 1 addition & 1 deletion docs/arena.md
@@ -1,5 +1,5 @@
# Chatbot Arena
Chatbot Arena is an LLM benchmark platform featuring anonymous, randomized battles, available at https://chat.lmsys.org.
Chatbot Arena is an LLM benchmark platform featuring anonymous, randomized battles, available at https://lmarena.ai.
We invite the entire community to join this benchmarking effort by contributing your votes and models.

## How to add a new model
23 changes: 23 additions & 0 deletions docs/dashinfer_integration.md
@@ -0,0 +1,23 @@
# dash-infer Integration
[DashInfer](https://github.com/modelscope/dash-infer) is a high-performance inference engine optimized specifically for CPU environments. It accelerates a variety of models including Llama, Qwen, and ChatGLM, and shows significant speedups on both Intel x64 and ARMv9 processors, making it a performant FastChat worker for resource-constrained deployments or scenarios where CPU inference is preferred over GPU acceleration.

## Instructions
1. Install dash-infer.
```
pip install dashinfer
```

2. When you launch a model worker, replace the normal worker (`fastchat.serve.model_worker`) with the dash-infer worker (`fastchat.serve.dashinfer_worker`). All other components (the controller, the Gradio web server, and the OpenAI API server) are launched with the same commands as usual.
```
python3 -m fastchat.serve.dashinfer_worker --model-path qwen/Qwen-7B-Chat --revision=master /path/to/dashinfer-model-generation-config.json
```
Here is an example:
```
python3 -m fastchat.serve.dashinfer_worker --model-path qwen/Qwen-7B-Chat --revision=master dash-infer/examples/python/model_config/config_qwen_v10_7b.json
```

If you use an already downloaded model, replace `--model-path` with a local path and choose a conversation template via the `--conv-template` option:
```
python3 -m fastchat.serve.dashinfer_worker --model-path ~/.cache/modelscope/hub/qwen/Qwen-7B-Chat --conv-template qwen-7b-chat /path/to/dashinfer-model-generation-config.json
```
All available conversation templates are listed in [fastchat/conversation.py](../fastchat/conversation.py).
1 change: 1 addition & 0 deletions docs/dataset_release.md
@@ -2,5 +2,6 @@
We release the following datasets based on our projects and websites.

- [LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)
- [LMSYS-Human-Preference-55k](https://huggingface.co/datasets/lmsys/lmsys-arena-human-preference-55k)
- [Chatbot Arena Conversation Dataset](https://huggingface.co/datasets/lmsys/chatbot_arena_conversations)
- [MT-bench Human Annotation Dataset](https://huggingface.co/datasets/lmsys/mt_bench_human_judgments)
9 changes: 8 additions & 1 deletion fastchat/constants.py
@@ -7,6 +7,13 @@

REPO_PATH = os.path.dirname(os.path.dirname(__file__))

# Survey Link URL (to be removed)
SURVEY_LINK = """<div style='text-align: center; margin: 20px 0;'>
<div style='display: inline-block; border: 2px solid #DE3163; padding: 10px; border-radius: 5px;'>
<span style='color: #DE3163; font-weight: bold;'>We would love your feedback! Fill out <a href='https://docs.google.com/forms/d/e/1FAIpQLSfKSxwFOW6qD05phh4fwYjk8q0YV1VQe_bmK0_qOVTbC66_MA/viewform?usp=sf_link' style='color: #DE3163; text-decoration: underline;'>this short survey</a> to tell us what you like about the arena, what you don't like, and what you want to see in the future.</span>
</div>
</div>"""

##### For the gradio web server
SERVER_ERROR_MSG = (
"**NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**"
@@ -21,7 +28,7 @@
CONVERSATION_LIMIT_MSG = "YOU HAVE REACHED THE CONVERSATION LENGTH LIMIT. PLEASE CLEAR HISTORY AND START A NEW CONVERSATION."
INACTIVE_MSG = "THIS SESSION HAS BEEN INACTIVE FOR TOO LONG. PLEASE REFRESH THIS PAGE."
SLOW_MODEL_MSG = "⚠️ Both models will show the responses all at once. Please stay patient as it may take over 30 seconds."
RATE_LIMIT_MSG = "**RATE LIMIT OF THIS MODEL IS REACHED. PLEASE COME BACK LATER OR USE <span style='color: red; font-weight: bold;'>[BATTLE MODE](https://chat.lmsys.org)</span> (the 1st tab).**"
RATE_LIMIT_MSG = "**RATE LIMIT OF THIS MODEL IS REACHED. PLEASE COME BACK LATER OR USE <span style='color: red; font-weight: bold;'>[BATTLE MODE](https://lmarena.ai)</span> (the 1st tab).**"
# Maximum input length
INPUT_CHAR_LEN_LIMIT = int(os.getenv("FASTCHAT_INPUT_CHAR_LEN_LIMIT", 12000))
BLIND_MODE_INPUT_CHAR_LEN_LIMIT = int(
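The limit constants in this hunk follow one pattern: an integer read from an environment variable with a hard-coded fallback. A minimal sketch of that pattern — the helper name `int_limit_from_env` is hypothetical, only the `FASTCHAT_INPUT_CHAR_LEN_LIMIT` variable and its 12000 default come from the diff:

```python
import os


def int_limit_from_env(name, default):
    # Hypothetical helper: read an integer limit from the environment,
    # falling back to the hard-coded default when the variable is unset.
    return int(os.getenv(name, default))


# Mirrors the INPUT_CHAR_LEN_LIMIT line in the hunk above.
INPUT_CHAR_LEN_LIMIT = int_limit_from_env("FASTCHAT_INPUT_CHAR_LEN_LIMIT", 12000)
```

Because `os.getenv` returns the default unchanged when the variable is missing, `int()` accepts both the string from the environment and the integer fallback.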
6 changes: 5 additions & 1 deletion fastchat/conversation.py
@@ -582,7 +582,11 @@ def save_new_images(self, has_csam_images=False, use_remote_storage=False):
from fastchat.utils import load_image, upload_image_file_to_gcs
from PIL import Image

_, last_user_message = self.messages[-2]
last_user_message = None
for role, message in reversed(self.messages):
if role == "user":
last_user_message = message
break

if type(last_user_message) == tuple:
text, images = last_user_message[0], last_user_message[1]
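The hunk above replaces a fixed-index lookup (`self.messages[-2]`) with a backwards scan for the most recent user turn. A self-contained sketch of that pattern, assuming messages are simple `(role, message)` tuples as in the diff:

```python
def last_user_message(messages):
    # Scan the history backwards for the most recent user turn instead of
    # assuming it sits at a fixed index: messages[-2] breaks when the
    # history does not end with a strict user/assistant pair (the edge
    # case the "Fix edge case" commit addresses).
    for role, message in reversed(messages):
        if role == "user":
            return message
    return None
```

The reverse scan costs at most one pass over the history but is robust to trailing assistant-only or system entries.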
6 changes: 3 additions & 3 deletions fastchat/llm_judge/README.md
@@ -57,6 +57,8 @@ To make sure FastChat loads the correct prompt template, see the supported model

You can also specify `--num-gpus-per-model` for model parallelism (needed for large 65B models) and `--num-gpus-total` to parallelize answer generation with multiple GPUs.

> Note: if answer generation is slow, see the [Other Backends](#other-backends) section for inference engines that can speed it up by up to 20x.

#### Step 2. Generate GPT-4 judgments
There are several options to use GPT-4 as a judge, such as pairwise winrate and single-answer grading.
In MT-bench, we recommend single-answer grading as the default mode.
@@ -134,9 +136,7 @@ We can also use vLLM for answer generation, which can be faster for the models s

1. Launch a vLLM worker
```
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.vllm_worker --model-path [MODEL-PATH]
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
vllm serve [MODEL-PATH] --dtype auto
```
- Arguments:
- `[MODEL-PATH]` is the path to the weights, which can be a local folder or a Hugging Face repo ID.
2 changes: 1 addition & 1 deletion fastchat/model/model_adapter.py
@@ -2423,7 +2423,7 @@ def get_default_conv_template(self, model_path: str) -> Conversation:


class DBRXAdapter(BaseModelAdapter):
"""The model adapter for Cohere"""
"""The model adapter for Databricks"""

def match(self, model_path: str):
return model_path in ["dbrx-instruct"]
Expand Down
12 changes: 6 additions & 6 deletions fastchat/model/model_registry.py
@@ -195,17 +195,17 @@ def get_model_info(name: str) -> ModelInfo:
)

register_model_info(
["command-r-plus"],
"Command-R-Plus",
["command-r-plus", "command-r-plus-04-2024"],
"Command R+",
"https://txt.cohere.com/command-r-plus-microsoft-azure/",
"Command-R Plus by Cohere",
"Command R+ by Cohere",
)

register_model_info(
["command-r"],
"Command-R",
["command-r", "command-r-03-2024", "command-r-08-2024"],
"Command R",
"https://txt.cohere.com/command-r/",
"Command-R by Cohere",
"Command R by Cohere",
)

register_model_info(