
Inference is only using CPU while GPU present? #1418

Closed
riebeb opened this issue Jun 26, 2023 · 4 comments

riebeb commented Jun 26, 2023

Hey!

Platform:
docker image nvcr.io/nvidia/pytorch:23.02-py3

Install:
```shell
git clone
pip install -e .[dev,testing]
pre-commit install
```

Using:
```python
# 1. visit hf.co/pyannote/speaker-diarization and accept user conditions
# 2. visit hf.co/pyannote/segmentation and accept user conditions
# 3. visit hf.co/settings/tokens to create an access token

import time

from pyannote.audio import Pipeline

# 4. instantiate pretrained speaker diarization pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization@2.1",
                                    use_auth_token="my_hf_token")

start = time.time()

# apply the pipeline to an audio file
diarization = pipeline("HufeisenPortraitLang-s.mp3")

# dump the diarization output to disk using RTTM format
with open("audio.rttm", "w") as rttm:
    diarization.write_rttm(rttm)

end = time.time()
print(end - start)
```

Getting:
```
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.4. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint --file ../root/.cache/torch/pyannote/models--pyannote--segmentation/snapshots/c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b/pytorch_model.bin

Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.1+cu117. Bad things might happen unless you revert torch to 1.x.
```

The 14 min MP3 takes approx. 700 sec to run inference on the CPU.

```python
>>> torch.cuda.is_available()
True
>>> torch.cuda.current_device()
0
```

Memory-Usage: 312MiB / 32508MiB
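
(A minimal sketch of how to double-check this, using only standard torch calls; if `torch.cuda.memory_allocated` stays near zero while the pipeline runs, the models never left the CPU, which matches the low Memory-Usage above:)

```python
import torch

# a visible CUDA device does not mean torch is actually using it
print(torch.cuda.is_available())      # True: a GPU is visible
print(torch.cuda.get_device_name(0))  # which GPU torch sees

# GPU memory allocated by this process through torch;
# staying near zero during inference means everything ran on the CPU
print(torch.cuda.memory_allocated(0))
```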

**What should I do? Would `enable_checkpointing=False` help?**

@github-actions

Thank you for your issue. You might want to check the FAQ if you haven't done so already.

Feel free to close this issue if you found an answer in the FAQ.

If your issue is a feature request, please read this first and update your request accordingly, if needed.

If your issue is a bug report, please provide a minimum reproducible example as a link to a self-contained Google Colab notebook containing everything needed to reproduce the bug:

  • installation
  • data preparation
  • model download
  • etc.

Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).

We also offer paid scientific consulting services around speaker diarization (and speech processing in general).

This is an automated reply, generated by FAQtory

@bipowerhcmcity

Hi,
You can try adding this line so the pipeline runs on the GPU:
`pipeline.to(torch.device("cuda"))`
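
For context, a minimal sketch of where that call fits in the snippet from the original post (the token and file names are the same placeholders used above, and the `@2.1` pin is assumed from the pipeline version shown there):

```python
import torch
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization@2.1",
                                    use_auth_token="my_hf_token")

# move all models inside the pipeline to the GPU before calling it
pipeline.to(torch.device("cuda"))

# inference now runs on the GPU instead of the CPU
diarization = pipeline("HufeisenPortraitLang-s.mp3")
```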

riebeb (Author) commented Jun 27, 2023

Ah, that easy... I had been reading about `to(torch.device("mps"))` and figured that explicitly defining a "to GPU" would make no sense... Should have tried!
Now it takes 97 sec for the 14 min audio.
Usage was 2700MiB / 32508MiB on a V100 in a DGX Station.
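
(Since `"mps"` came up: a hedged sketch of picking whichever backend is available, so the same script runs on a CUDA box, an Apple-Silicon Mac, or a CPU-only machine; `torch.backends.mps` needs torch >= 1.12, and whether every pyannote model actually runs on MPS is not verified here:)

```python
import torch

# prefer a CUDA GPU, then Apple-Silicon MPS, then fall back to the CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

pipeline.to(device)
```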


stale bot commented Dec 24, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Dec 24, 2023
@stale stale bot closed this as completed Jan 24, 2024