Error following the huggingface hosting instructions. #8

Open
derekalia opened this issue Jan 22, 2025 · 6 comments

I followed the provided Notion doc for deploying on HF -> https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71

I used the 72B-DPO model for deployment -> https://huggingface.co/bytedance-research/UI-TARS-72B-DPO

I added all the correct configurations but always got back this error:

raise self._make_status_error_from_response(err.response) from None
openai.UnprocessableEntityError: Error code: 422 - {'error': 'Input validation error: inputs tokens + max_new_tokens must be <= 32768. Given: 880370 inputs tokens and 0 max_new_tokens', 'error_type': 'validation'}

Basically the input is too large and I need to be under the 32k token limit.

I also used the resize_image function to reduce the image size, but this didn't really affect my image, due to its already small size (1200 px by 800 px).
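For reference, the downscaling I mean is along these lines (a minimal Pillow sketch; the guide's actual resize_image may differ):

from PIL import Image

def resize_image(path: str, max_side: int = 1024) -> Image.Image:
    # Downscale so the longest side is at most max_side pixels; never enlarge.
    img = Image.open(path)
    scale = max_side / max(img.size)
    if scale < 1:
        img = img.resize((round(img.width * scale), round(img.height * scale)), Image.LANCZOS)
    return img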

I also tried using one of the provided images from the hosted space on HF -> https://huggingface.co/spaces/Aheader/gui_test_app/blob/main/examples/solitaire.png

This one is even larger than my image, and it didn't work either.

Any ideas why it's not working with your provided setup?

AHEADer (Collaborator) commented Jan 23, 2025

You may have too many images in your prompt. I suggest keeping at most 5 images, along with all the text, as the input.
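One way to do that is to drop all but the most recent screenshots from the message content before sending it (a minimal sketch, assuming chat.completions-style content parts; keep_last_n_images is a hypothetical helper):

def keep_last_n_images(content_parts, n=5):
    # content_parts: a list of {"type": "text" | "image_url", ...} dicts.
    image_idxs = [i for i, part in enumerate(content_parts) if part["type"] == "image_url"]
    drop = set(image_idxs[:-n])  # indexes of all but the last n images
    return [part for i, part in enumerate(content_parts) if i not in drop]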

derekalia (Author) commented Jan 23, 2025

I'm sending 1 image.

Also, I just tried to deploy the 7B model and that worked. Looks like there is an issue with the 72B model hosted on HF.

AHEADer (Collaborator) commented Jan 23, 2025

Can you paste your request script here so I can try to reproduce and fix it?

derekalia (Author) commented Jan 23, 2025

I'm just running the code that was provided in the example doc.

Also, can I ask what prompt you are giving your LLM? Your space gets good results, but every time I run it locally or on HF the results are not that good.

import base64

from openai import OpenAI


instruction = "find the search bar"
screenshot_path = "imgs/shoot.png"


client = OpenAI(base_url="my hf endpoint here", api_key="my key here")
model = "tgi"

prompt = "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nYou are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. \n\n    ## Output Format\n    ```\n    Action_Summary: ...\n    Action: ...\n    ```\n\n    ## Action Space\n    click(start_box=‘<|box_start|>(x1,y1)<|box_end|>’)\nlong_press(start_box=‘<|box_start|>(x1,y1)<|box_end|>’, time=‘’)\ntype(content=‘’)\nscroll(direction=‘down or up or right or left’)\nopen_app(app_name=‘’)\nnavigate_back()\nnavigate_home()\nWAIT()\nfinished() # Submit the task regardless of whether it succeeds or fails.\n\n    ## Note\n    - Use English in `Action_Summary` part.\n    \n\n    ## User Instruction\n"

# Base64-encode the screenshot so it can be sent inline as a data URL
with open(screenshot_path, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode("utf-8")

response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt + instruction},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{encoded_string}"}},

            ],
        },
    ],
)
print(response.choices[0].message.content)

AHEADer (Collaborator) commented Jan 23, 2025

Hi,
My space prompt is "Output only the coordinate of one box in your response." and it's just for grounding. I think you can remove the <|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n part from your prompt, since it will be auto-generated by the chat_template on the cloud platform. I've updated our README.
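Concretely, that amounts to stripping the prefix from the prompt in your script (a minimal sketch, assuming the same prompt variable as above):

# The chat completions endpoint applies the model's chat_template itself,
# so sending the template markers manually duplicates them.
chat_template_prefix = "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n"
if prompt.startswith(chat_template_prefix):
    prompt = prompt[len(chat_template_prefix):]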
By the way, I cannot find any quota on Hugging Face cloud to deploy a 72B model; I'll get back to you once I have quota.

AHEADer (Collaborator) commented Jan 24, 2025

Hi,

We have solved this problem by updating the model's config: we now use the config from an older version of transformers. It seems TGI and Inference Endpoints have some bugs reading the config produced by the newest transformers.
