
Store the image moderation and text moderation logs #3478

Open · BabyChouSr wants to merge 27 commits into main
Conversation

BabyChouSr (Collaborator):
Right now we don't store the text moderation and image moderation results, even though they can be very helpful.

@BabyChouSr BabyChouSr marked this pull request as ready for review August 16, 2024 05:14
infwinston (Member) left a comment:
Quick first pass. Overall looks good to me!

@@ -0,0 +1,167 @@
import datetime
infwinston (Member), Aug 16, 2024:

Super awesome to see this abstraction implemented!

fastchat/serve/gradio_block_arena_anony.py (outdated; resolved)
infwinston (Member) left a comment:
Thanks @BabyChouSr, left some comments.

fastchat/serve/gradio_block_arena_named.py (outdated; resolved)
-def flash_buttons():
+def flash_buttons(dont_show_vote_buttons: bool = False):
+    if dont_show_vote_buttons:
+        yield [no_change_btn] * 4 + [enable_btn] * 2
infwinston (Member):
return

BabyChouSr (Collaborator, author):
This actually ends up breaking the UI; we need to keep the yield + return pattern.
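For context, the yield + return pattern being discussed could look like the following minimal sketch. The button values are stand-ins (in the real code they are Gradio component updates); the key point is that a function containing `yield` is a generator, so an early exit must yield its UI update before returning, or the event emits no update at all:

```python
# Stand-in values; in FastChat these are Gradio button updates.
no_change_btn = "no_change"
enable_btn = "enable"

def flash_buttons(dont_show_vote_buttons: bool = False):
    if dont_show_vote_buttons:
        # Yield the final component state first, THEN stop the generator.
        # A bare `return` here without the yield would produce no UI update.
        yield [no_change_btn] * 4 + [enable_btn] * 2
        return
    # Simplified stand-in for the normal "flashing" loop.
    for _ in range(2):
        yield [enable_btn] * 6
```

This is why replacing the `yield` + `return` pair with a plain `return` breaks the UI: the generator finishes without ever emitting the button state for that event.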

fastchat/serve/gradio_block_arena_vision_anony.py (outdated; resolved)
Comment on lines 369 to 370
+ [disable_btn] * 4
+ [no_change_btn] * 3
infwinston (Member):

Why x4 + x3 vs. the earlier x7?

fastchat/serve/gradio_block_arena_vision_named.py (outdated; resolved)
fastchat/serve/gradio_web_server.py (outdated; resolved)
@BabyChouSr BabyChouSr changed the base branch from operation-202407 to main October 6, 2024 03:36
infwinston (Member) left a comment:

Thanks @BabyChouSr! This is super awesome. Took a pass and left some comments. I think the main discussion item is what data we should log; maybe we don't need to log the entire moderation output, which would take a lot of space.

@@ -342,18 +352,45 @@ def bot_response_multi(
    request: gr.Request,
):
    logger.info(f"bot_response_multi (anony). ip: {get_ip(request)}")
    states = [state0, state1]

    if states[0] is None or states[0].skip_next:
infwinston (Member):

Maybe we can use this variable:

states[0].content_moderator.text_flag
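A hedged sketch of what branching on that flag could look like. Only the attribute names `content_moderator` and `text_flag` come from the review; the classes here are hypothetical stand-ins for the real `State` object:

```python
# Hypothetical stand-ins for FastChat's State / ContentModerator classes;
# only the attribute names are taken from the review comment.
class ContentModerator:
    def __init__(self, text_flag: bool = False):
        self.text_flag = text_flag  # True if text moderation flagged the input

class State:
    def __init__(self, text_flag: bool = False):
        self.content_moderator = ContentModerator(text_flag)

def should_skip(states):
    # Skip generating bot responses when the first state is missing
    # or its last user input was flagged by text moderation.
    return states[0] is None or states[0].content_moderator.text_flag
```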

@@ -151,6 +154,7 @@ def dict(self):
    {
        "conv_id": self.conv_id,
        "model_name": self.model_name,
+       "moderation": self.content_moderator.conv_moderation_responses,
infwinston (Member):

I'm worried this would make our logs too huge... do we need to store the complete result, or just moderation: True | False?

The log entry grows roughly 3x, from

{"tstamp": 1728682280.9245, "type": "chat", "model": "chatgpt-4o-latest", "gen_params": {"temperature": 0.7, "top_p": 0.7, "max_new_tokens": 1024}, "start": 1728682277.5957, "finish": 1728682280.9245, "state": {"template_name": "gpt-4-turbo-2024-04-09", "system_message": "You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.\nKnowledge cutoff: 2023-11\nCurrent date: 2024-10-11\n\nImage input capabilities: Enabled\nPersonality: v2", "roles": ["user", "assistant"], "messages": [["user", "heyy"], ["assistant", "Hey! How\u2019s it going?"]], "offset": 0, "conv_id": "79aac635c1924185ad5e2d2c07d626a7", "model_name": "chatgpt-4o-latest"}

to

{"tstamp": 1728682280.9245, "type": "chat", "model": "chatgpt-4o-latest", "gen_params": {"temperature": 0.7, "top_p": 0.7, "max_new_tokens": 1024}, "start": 1728682277.5957, "finish": 1728682280.9245, "state": {"template_name": "gpt-4-turbo-2024-04-09", "system_message": "You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.\nKnowledge cutoff: 2023-11\nCurrent date: 2024-10-11\n\nImage input capabilities: Enabled\nPersonality: v2", "roles": ["user", "assistant"], "messages": [["user", "heyy"], ["assistant", "Hey! How\u2019s it going?"]], "offset": 0, "conv_id": "79aac635c1924185ad5e2d2c07d626a7", "model_name": "chatgpt-4o-latest", "moderation": [{"text_moderation": {"response": {"harassment": 1.1623426871665288e-05, "harassment_threatening": 4.7560683924530167e-07, "hate": 3.7377474200184224e-06, "hate_threatening": 3.018905303520114e-08, "illicit": null, "illicit_violent": null, "self_harm": 5.3270043281372637e-05, "self_harm_instructions": 3.869533247780055e-05, "self_harm_intent": 9.72322523011826e-05, "sexual": 0.0004021718923468143, "sexual_minors": 3.3023070500348695e-06, "violence": 1.4770392908758367e-06, "violence_graphic": 2.4935059173003538e-06, "self-harm": 5.3270043281372637e-05, "sexual/minors": 3.3023070500348695e-06, "hate/threatening": 3.018905303520114e-08, "violence/graphic": 2.4935059173003538e-06, "self-harm/intent": 9.72322523011826e-05, "self-harm/instructions": 3.869533247780055e-05, "harassment/threatening": 4.7560683924530167e-07}, "flagged": false}, "nsfw_moderation": {"flagged": false}, "csam_moderation": {"flagged": false}}, {"text_moderation": {"response": {"harassment": 3.410908902878873e-05, "harassment_threatening": 3.6979138258175226e-06, "hate": 2.1581441615126096e-05, "hate_threatening": 3.484581512225304e-08, "illicit": null, "illicit_violent": null, "self_harm": 4.098457338841399e-06, "self_harm_instructions": 5.236282163423311e-07, "self_harm_intent": 9.65906565397745e-07, "sexual": 
0.0002804335963446647, "sexual_minors": 7.323227464439697e-07, "violence": 5.0677666877163574e-05, "violence_graphic": 9.602602403901983e-06, "self-harm": 4.098457338841399e-06, "sexual/minors": 7.323227464439697e-07, "hate/threatening": 3.484581512225304e-08, "violence/graphic": 9.602602403901983e-06, "self-harm/intent": 9.65906565397745e-07, "self-harm/instructions": 5.236282163423311e-07, "harassment/threatening": 3.6979138258175226e-06}, "flagged": false}, "nsfw_moderation": {"flagged": false}, "csam_moderation": {"flagged": false}}]}, "ip": "76.102.1.74"}
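One way to implement the reviewer's suggestion would be to collapse each moderation response down to its flagged boolean before logging. This is a sketch, not the PR's code; the function name is invented, and the data shape is assumed from the log excerpt above (a list of per-turn dicts keyed by moderation type, each containing a `flagged` field):

```python
def compact_moderation(conv_moderation_responses):
    """Reduce full moderation responses to just their `flagged` booleans.

    The per-category scores account for most of the log growth; keeping
    only `flagged` per moderation type shrinks each turn's entry to a
    few bytes.
    """
    return [
        {kind: bool(result.get("flagged", False))
         for kind, result in turn.items()}
        for turn in conv_moderation_responses
    ]
```

Under this scheme, each turn in the excerpt above would log as something like {"text_moderation": false, "nsfw_moderation": false, "csam_moderation": false} instead of the full score dictionary.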

Comment on lines +94 to +105
"gpt-4o-2024-05-13": 4,
"gpt-4-turbo-2024-04-09": 4,
"claude-3-haiku-20240307": 4,
"claude-3-sonnet-20240229": 4,
"claude-3-5-sonnet-20240620": 4,
"claude-3-opus-20240229": 4,
"gemini-1.5-flash-api-0514": 4,
"gemini-1.5-pro-api-0514": 4,
"llava-v1.6-34b": 4,
"reka-core-20240501": 4,
"reka-flash-preview-20240611": 4,
"reka-flash": 4,
infwinston (Member), Oct 15, 2024:

Let's remove them and keep the dict empty? (People may not have access to these models.)

@@ -301,14 +346,19 @@ def bot_response_multi(
        break


-def flash_buttons():
+def flash_buttons(show_vote_buttons: bool = True):
infwinston (Member):

Sorry, could you say more about what this is for?

@@ -175,19 +178,27 @@ def add_text(
        no_change_btn,
    ]
    * 6
    + [True]
infwinston (Member):

Maybe use a variable, say show_vote_button? I see a few more bare [True] / [False] literals below.
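The suggestion amounts to naming the trailing flag instead of appending a bare literal; a small sketch (the helper and `show_vote_buttons` name are illustrative, and `no_change_btn` stands in for the real Gradio update):

```python
no_change_btn = "no_change"  # stand-in for the real Gradio button update

def add_text_outputs(flagged: bool):
    # Name the flag so readers know what the trailing element means,
    # instead of appending an opaque [True] / [False] literal.
    show_vote_buttons = not flagged
    return [no_change_btn] * 6 + [show_vote_buttons]
```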

@@ -884,6 +884,7 @@ def build_leaderboard_tab(
    elo_results_file,
    leaderboard_table_file,
    arena_hard_leaderboard,
    vision=True,
infwinston (Member):

Do we need this?

top_p,
max_new_tokens,
request,
)
get_remote_logger().log(data)
infwinston (Member), Oct 15, 2024:

I think this might cause a bug for

get_remote_logger().log(data)

since data is not constructed anymore?

)

logger = build_logger("gradio_web_server_multi", "gradio_web_server_multi.log")

num_sides = 2
enable_moderation = False
use_remote_storage = False
infwinston (Member), Oct 15, 2024:

Why does this need to be a global variable? Also, should it be globally False?
