VLLM Inference Outputs Not Matching #1

Open
ramaneswaran opened this issue Dec 24, 2024 · 1 comment

Comments

@ramaneswaran

Hi team,

First of all, thank you for releasing this benchmark and the sample inference code—it’s been incredibly helpful.

I’m currently trying to benchmark some methods on REPOCOD but am encountering difficulties reproducing the results. Specifically, I’m using VLLM for generation, and the outputs differ from those produced via direct inference using HuggingFace Transformers.
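
For context, the direct Transformers inference I'm comparing against looks roughly like the following (a minimal sketch; the generation settings simply mirror the vLLM SamplingParams below and are assumptions rather than an exact reproduction):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = 'deepseek-coder-6.7b-base'
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_path, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
    )

    # input_text is built the same way as in the vLLM snippet below
    inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        do_sample=True,
        temperature=1.0,
        top_k=50,
        top_p=1.0,
        max_new_tokens=4096,
    )
    # Decode only the newly generated tokens (generate() returns prompt + completion)
    completion = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)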

Here’s a snippet of the inference code I’m using with VLLM:

    import torch
    from transformers import AutoTokenizer
    from vllm import LLM, SamplingParams

    model_path = 'deepseek-coder-6.7b-base'
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # Stop generation at the closing code fence (assumes "```" encodes to a single token)
    custom_stop_token = tokenizer.encode("```", add_special_tokens=False)[0]
    sampling_params = SamplingParams(temperature=1.0, top_k=50, top_p=1.0, max_tokens=4096, stop_token_ids=[custom_stop_token])

    llm = LLM(model=model_path, dtype=torch.float16, max_model_len=16_384, enforce_eager=True)

    # Build the prompt from the REPOCOD sample and the current-file template
    current_file_prompt = current_file_template.format(sample['target_module_path'], prefix, suffix, sample['prompt'])
    input_text = f"{SYSTEM_PROMPT}\n{current_file_prompt}"

    output = llm.generate([input_text], sampling_params)

Could you please:

  1. Share the inference code you used with VLLM?
  2. Suggest any modifications or configurations I might have overlooked?
@shanchaoL
Collaborator

Hi ramaneswaran,

Thank you for your interest in REPOCOD and for sharing your inference setup.

In our experiments, we used greedy decoding for inference, which might explain the differences you're observing.
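
Since your snippet samples with temperature=1.0 and top_k=50, its outputs are not deterministic and will generally not match a greedy run. In vLLM, greedy decoding corresponds to temperature=0 in SamplingParams; a minimal sketch (not our exact script, and the max_tokens / stop values are illustrative):

    from vllm import SamplingParams

    # temperature=0 makes vLLM select the argmax token at each step, so the
    # output is deterministic; top_k / top_p have no effect in this mode.
    greedy_params = SamplingParams(
        temperature=0.0,
        max_tokens=4096,
        stop=["```"],  # stop at the closing code fence (illustrative)
    )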

As for the inference code, we plan to release it, along with the setup for our three retrieval methods, in an upcoming update to the repository. Thanks.
