qwen2: vllm and transformers inference results are not aligned #1147

Open
Qyijiu opened this issue Dec 25, 2024 · 2 comments

Qyijiu commented Dec 25, 2024

vLLM 0.6.5
transformers 4.41.2

vllm:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

tokenizer = AutoTokenizer.from_pretrained("/data/models/Qwen2-7B-Instruct")
sampling_params = SamplingParams(temperature=0.0, repetition_penalty=1.0, max_tokens=2048, best_of=1, top_k=-1, top_p=1)
llm = LLM(model="/data/models/Qwen2-7B-Instruct",
          dtype='float16',
          gpu_memory_utilization=0.9,
          enforce_eager=True,
          trust_remote_code=True)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

outputs = llm.generate([text], sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

hf:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_path = "/data/models/Qwen2-7B-Instruct"

def huggingface(messages):
    device = "cuda"  # the device to load the model onto
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype="float16",
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    print(model_inputs)
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=2048,
        do_sample=False,
        num_beams=1,
        temperature=0,
        repetition_penalty=1.0
    )
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
huggingface(messages)

vllm result:
A large language model is a type of artificial intelligence (AI) model that is trained on large amounts of text data and can understand and generate human-like language. These models are typically composed of multiple layers of interconnected artificial neurons that process input data and transform it into output predictions.\n\nThe training process involves feeding the model large datasets, such as books, articles, and web pages, so that it can learn the patterns and relationships in the text. Once trained, these models can generate coherent and contextually relevant text, making them useful for a variety of applications such as language translation, text summarization, and chatbot development. Some of the best-known large language models include GPT (Generative Pre-trained Transformer), BERT (Bidirectional Encoder Representations from Transformers), and T5 (Text-to-Text Transfer Transformer). These models have achieved impressive results on a variety of natural language processing (NLP) tasks and are widely used across industries such as finance, healthcare, marketing, and entertainment.

hf result:
A large language model is a type of artificial intelligence (AI) model designed specifically to understand and generate human language. These models are trained on large amounts of text data, enabling them to learn the patterns and structure of language and to generate text that resembles human-written text.
Large language models can be used for a variety of tasks, including language translation, text summarization, question answering, and text generation. They are commonly used in natural language processing (NLP) applications such as chatbots, virtual assistants, and language understanding systems.
One of the key characteristics of large language models is their ability to generate coherent and contextually relevant text. This is achieved through the use of deep learning algorithms, which allow the model to learn from large amounts of data and make predictions based on the patterns and relationships in that data.
Overall, large language models are powerful tools for understanding and generating human language, and they have the potential to fundamentally change the way we interact with technology and with each other.

How can I get them to produce consistent results?

jklj077 (Collaborator) commented Jan 3, 2025

there are many sources of randomness, e.g.:

  1. the random number generator (RNG): if you're using a pseudo-RNG, it can be controlled by using a fixed seed.
  2. differences in implementation: the implementations are not guaranteed to be the same across frameworks. always using the same framework could help.
  3. the accuracy limits of floating-point arithmetic: in particular, floating-point addition and multiplication are not necessarily associative, so if the order of execution varies, the results may differ. using higher precision (e.g., float32) or deterministic algorithms may help (https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms); see the sketch after this comment.

in general, these factors do not affect evaluation significantly.
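
As an illustration of points 1 and 3, below is a minimal sketch of the standard PyTorch reproducibility controls applied to the Hugging Face side, reusing the model path and prompt from the snippets above. These settings only pin down the transformers run; they cannot remove the cross-framework implementation differences from point 2, so exact agreement with vLLM is still not guaranteed.

import os
# must be set before any CUDA context is created, so cuBLAS uses
# deterministic GEMM workspaces (required by use_deterministic_algorithms)
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# point 1: pin every pseudo-RNG (CPU and all CUDA devices) to a fixed seed
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)

# point 3: refuse nondeterministic kernels and disable TF32 shortcuts;
# loading in float32 trades memory for less float16 rounding noise
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.benchmark = False
torch.backends.cuda.matmul.allow_tf32 = False

model_path = "/data/models/Qwen2-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float32,  # higher precision than the float16 runs above
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# greedy decoding: with do_sample=False the sampling knobs are irrelevant
generated_ids = model.generate(**model_inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(generated_ids[0][model_inputs.input_ids.shape[1]:], skip_special_tokens=True))

Even with identical greedy decoding, the two stacks can still pick different tokens at positions where the top two logits are nearly tied, so comparing per-token logits (or the first position where the outputs diverge) is a more informative check than comparing the final strings.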


github-actions bot commented Feb 2, 2025

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
