[bugs] LLaVA-Evaluation : RuntimeError: Expected all tensors to be on the same device #21

Open
JJJYmmm opened this issue Jan 31, 2024 · 2 comments · May be fixed by #22

Comments


JJJYmmm commented Jan 31, 2024

Problems

When testing LLaVA-v1.5 with eval.py, the following error occurs.

*** RuntimeError: Expected all tensors to be on the same device, but found at least two devices, 
cuda:0 and cuda:1! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

This happens because the model is loaded through Hugging Face with the default parameter device_map="auto", so it is spread across multiple GPUs (pipeline parallelism).

def load_pretrained_model(model_path, model_base, model_name, load_8bit=False,
                          load_4bit=False, device_map="auto", device="cuda", **kwargs):
    ...

In eval.py, however, .cuda() is then called on the wrapped model (MLLM_Tester), which moves all model parameters onto the default GPU.

model = build_model(args.model).cuda()

This conflicts with the AlignDevicesHook instances that were attached when the model was dispatched: in some layers the hook still moves the input data to another GPU, while all the parameters now sit on the default GPU, which triggers the error.
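One way to check this diagnosis is to count the model's parameters per device before and after the .cuda() call. The sketch below is only an illustration: the helper names parameter_devices and is_spread_across_devices are hypothetical and not part of this repository, and the code assumes nothing beyond an object that exposes .parameters() whose items have a .device attribute, as torch.nn.Module does.

```python
from collections import Counter


def parameter_devices(model):
    """Count parameters per device for any object exposing .parameters().

    Hypothetical helper for debugging; not part of the repository.
    """
    return Counter(str(p.device) for p in model.parameters())


def is_spread_across_devices(model):
    """Return True if the parameters live on more than one device."""
    return len(parameter_devices(model)) > 1
```

If the explanation above holds, a model loaded with device_map="auto" on a multi-GPU machine should report more than one device before .cuda() is called and a single device afterwards, while the hooks attached at load time are unaffected by the move.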

Solution

I think removing .cuda() here is fine, though I have only checked the LLaVA interface.

model = build_model(args.model).cuda()
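Since only the LLaVA interface was checked, a more conservative variant would skip .cuda() only for models that were already dispatched and keep it for everything else. The following is a minimal sketch under that assumption, not the change made in the linked pull request. The helpers is_dispatched and place_model are hypothetical names; the check relies on the hf_device_map attribute that transformers sets on models loaded with a device_map, and on .modules() as provided by torch.nn.Module.

```python
def is_dispatched(model):
    """Return True if the model, or any of its submodules, carries a device map."""
    # torch.nn.Module exposes .modules(); fall back to the object itself otherwise.
    modules = model.modules() if hasattr(model, "modules") else [model]
    return any(getattr(m, "hf_device_map", None) for m in modules)


def place_model(model):
    """Call .cuda() only when the model was not already placed via a device map."""
    if is_dispatched(model):
        # Leave the existing placement (and its hooks) untouched.
        return model
    return model.cuda()
```

In eval.py this would be used as model = place_model(build_model(args.model)), so interfaces that load their weights on the CPU are still moved to the GPU.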

JJJYmmm added a commit to JJJYmmm/SEED-Bench that referenced this issue Jan 31, 2024
@JJJYmmm JJJYmmm linked a pull request Jan 31, 2024 that will close this issue

YUECHE77 commented Dec 9, 2024

I encountered the same issue. Thank you for the solution!

By the way, do you know how to run inference on multiple GPUs?

Thanks!


JJJYmmm commented Dec 15, 2024

If you are referring to batch inference, I didn't implement it >_<.
