Error when running ppo_training.py on multiple GPUs #420

Open
cqray1990 opened this issue Sep 3, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@cqray1990

if self.is_encoder_decoder:
    input_ids = input_kwargs["decoder_input_ids"]
    attention_mask = input_kwargs["decoder_attention_mask"]
else:
    input_ids = input_kwargs["input_ids"]
    attention_mask = input_kwargs["attention_mask"]

logprobs = logprobs_from_logits(logits[:, :-1, :], input_ids[:, 1:])

/lib/python3.10/site-packages/trl/core.py", line 139, in logprobs_from_logits
logpy = torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! (when checking argument for argument index in method wrapper_CUDA_gather)
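
The traceback shows `torch.gather` receiving `logp` and `labels` on different devices. Below is a minimal sketch of a defensive variant that moves the labels onto the device of the logits before gathering. The name `logprobs_from_logits_same_device` is hypothetical and this is not TRL's implementation; it only mirrors the `gather` line from the traceback, and it runs on CPU so it can be tried without GPUs.

```python
import torch
import torch.nn.functional as F


def logprobs_from_logits_same_device(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: per-token log-probs, with labels moved to the logits' device."""
    # Align devices first; this is a no-op when both tensors already share a device.
    labels = labels.to(logits.device)
    logp = F.log_softmax(logits, dim=2)
    # Same gather pattern as the line shown in the traceback.
    return torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)


# Toy shapes (assumed for illustration): batch=2, seq_len=5, vocab=11.
logits = torch.randn(2, 5, 11)
input_ids = torch.randint(0, 11, (2, 5))
logprobs = logprobs_from_logits_same_device(logits[:, :-1, :], input_ids[:, 1:])
print(logprobs.shape)
```

Whether patching the call site like this is appropriate depends on where the tensors are meant to live; it copies the labels across devices on every call, so fixing the placement of the models is usually the cleaner route.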

The model's device_map is set to auto, while the reward model is placed on cuda:0 by default.
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    args.model_name_or_path,
    config=config,
    torch_dtype=torch_dtype,
    load_in_4bit=args.load_in_4bit,
    load_in_8bit=args.load_in_8bit,
    device_map=args.device_map,
    trust_remote_code=args.trust_remote_code,
    peft_config=peft_config if args.use_peft else None,
)

The reward model is placed on cuda:0 by default:
reward_model = AutoModelForSequenceClassification.from_pretrained(
    args.reward_model_name_or_path,
    config=reward_config,
    load_in_8bit=args.load_in_8bit,
    trust_remote_code=args.trust_remote_code,
)
reward_model.to(device)

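With the model's device_map set to auto and the reward model on cuda:0 by default, the tensors used in the computation end up on different GPUs. Has nobody else run into this? It seems bound to happen whenever multiple GPUs are used.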

@cqray1990 cqray1990 added the bug Something isn't working label Sep 3, 2024
@shibing624
Owner

Put the reward model on cuda:0 and the PPO model on cuda:1.
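
One way to read this suggestion is to give each model a single, explicit device instead of letting `device_map="auto"` spread the PPO model across all GPUs. The sketch below only picks the two device strings. `assign_devices` is a hypothetical helper, not part of the script, and the fallback for fewer than two GPUs is an assumption added for illustration.

```python
from typing import Tuple


def assign_devices(num_gpus: int) -> Tuple[str, str]:
    """Hypothetical helper: return (reward_device, policy_device) as device strings."""
    if num_gpus <= 0:
        # Assumed fallback: no GPU available, keep everything on CPU.
        return "cpu", "cpu"
    if num_gpus == 1:
        # Assumed fallback: only one GPU, both models must share it.
        return "cuda:0", "cuda:0"
    # Two or more GPUs: reward model on cuda:0, PPO model on cuda:1.
    return "cuda:0", "cuda:1"


# In the training script, num_gpus would typically come from torch.cuda.device_count().
reward_device, policy_device = assign_devices(8)
print(reward_device, policy_device)
```

With these values, the reward model could be moved with `reward_model.to(reward_device)`, and the PPO model could be loaded with a single-device map such as `device_map={"": policy_device}` rather than `auto`. Treat that last form as an assumption to check against the installed transformers and accelerate versions.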
