
Bug - Offline use of "speechbrain/spkrec-ecapa-voxceleb " does not work #1427

Closed
asusdisciple opened this issue Jul 5, 2023 · 4 comments

asusdisciple commented Jul 5, 2023

I am trying to use the speaker-diarization pipeline offline. The problem occurs when I try to load the model for speaker embeddings. I found out that the problem is that the model comes from SpeechBrain (https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb/tree/main), which is also referenced in the pyannote speaker-diarization config.yaml on Hugging Face:

pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: speechbrain/spkrec-ecapa-voxceleb
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: pyannote/[email protected]
    segmentation_batch_size: 32

I looked into the .pkl file and it seems that pyannote tags its own models with metadata that is not present in the SpeechBrain checkpoint, so the lookup pyannote uses to extract the module name, module_name: str = loaded_checkpoint["pyannote.audio"]["architecture"]["module"], fails.
So if I use a pyannote speaker embedding model, everything works fine (tested it), but if I try to run the speaker-diarization pipeline offline with the aforementioned SpeechBrain model, it does not work.
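
For illustration, this is my understanding of what happens under the hood; a minimal sketch that loads the local embedding checkpoint directly (models/emb_pya.ckpt is the path from my config further down, and torch.load stands in for the pl_load call pyannote uses):

import torch

# Load the checkpoint roughly the same way pyannote's Model.from_pretrained does.
# A native pyannote checkpoint carries a "pyannote.audio" metadata entry describing
# its architecture; the SpeechBrain ECAPA checkpoint does not, hence the KeyError.
loaded_checkpoint = torch.load("models/emb_pya.ckpt", map_location="cpu")

print("pyannote.audio" in loaded_checkpoint)  # False for the SpeechBrain checkpoint
module_name: str = loaded_checkpoint["pyannote.audio"]["architecture"]["module"]  # raises KeyError: 'pyannote.audio'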

Maybe you have an idea for a workaround? The error related to this issue is:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[164], line 9
      6 audio = Audio(sample_rate=16000, mono="downmix")
      8 #emb_model = Model.from_pretrained("config/config_pyannote.yaml")
----> 9 pipeline = Pipeline.from_pretrained("config/config_pyannote.yaml")
     11 # Diarization to Annotation object
     12 diarization = pipeline(file)

File ~/PycharmProjects/envs/diary/lib/python3.10/site-packages/pyannote/audio/core/pipeline.py:126, in Pipeline.from_pretrained(cls, checkpoint_path, hparams_file, use_auth_token, cache_dir)
    124 params = config["pipeline"].get("params", {})
    125 params.setdefault("use_auth_token", use_auth_token)
--> 126 pipeline = Klass(**params)
    128 # freeze  parameters
    129 if "freeze" in config:

File ~/PycharmProjects/envs/diary/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_diarization.py:163, in SpeakerDiarization.__init__(self, segmentation, segmentation_duration, segmentation_step, embedding, embedding_exclude_overlap, clustering, embedding_batch_size, segmentation_batch_size, der_variant, use_auth_token)
    160     metric = "not_applicable"
    162 else:
--> 163     self._embedding = PretrainedSpeakerEmbedding(
    164         self.embedding, device=emb_device, use_auth_token=use_auth_token
    165     )
    166     self._audio = Audio(sample_rate=self._embedding.sample_rate, mono=True)
    167     metric = self._embedding.metric

File ~/PycharmProjects/envs/diary/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_verification.py:471, in PretrainedSpeakerEmbedding(embedding, device, use_auth_token)
    468     return NeMoPretrainedSpeakerEmbedding(embedding, device=device)
    470 else:
--> 471     return PyannoteAudioPretrainedSpeakerEmbedding(
    472         embedding, device=device, use_auth_token=use_auth_token
    473     )

File ~/PycharmProjects/envs/diary/lib/python3.10/site-packages/pyannote/audio/pipelines/speaker_verification.py:391, in PyannoteAudioPretrainedSpeakerEmbedding.__init__(self, embedding, device, use_auth_token)
    388 self.embedding = embedding
    389 self.device = device
--> 391 self.model_: Model = get_model(self.embedding, use_auth_token=use_auth_token)
    392 self.model_.eval()
    393 self.model_.to(self.device)

File ~/PycharmProjects/envs/diary/lib/python3.10/site-packages/pyannote/audio/pipelines/utils/getter.py:75, in get_model(model, use_auth_token)
     72     pass
     74 elif isinstance(model, Text):
---> 75     model = Model.from_pretrained(
     76         model, use_auth_token=use_auth_token, strict=False
     77     )
     79 elif isinstance(model, Mapping):
     80     model.setdefault("use_auth_token", use_auth_token)

File ~/PycharmProjects/envs/diary/lib/python3.10/site-packages/pyannote/audio/core/model.py:853, in Model.from_pretrained(cls, checkpoint, map_location, hparams_file, strict, use_auth_token, cache_dir, **kwargs)
    851 # obtain model class from the checkpoint
    852 loaded_checkpoint = pl_load(path_for_pl, map_location=map_location)
--> 853 module_name: str = loaded_checkpoint["pyannote.audio"]["architecture"]["module"]
    854 module = import_module(module_name)
    855 class_name: str = loaded_checkpoint["pyannote.audio"]["architecture"]["class"]

KeyError: 'pyannote.audio'

My local .yaml file looks like this:

pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: models/emb_pya.ckpt
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: models/seg_pya.bin
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 15
    threshold: 0.7153814381597874
  segmentation:
    min_duration_off: 0.5817029604921046
    threshold: 0.4442333667381752

I call the pipeline with

pipeline = Pipeline.from_pretrained("config/config_pyannote.yaml")

All of the paths are correct and all models were downloaded from Hugging Face. Do you have any idea why this could happen?

github-actions bot commented Jul 5, 2023

Thank you for your issue. You might want to check the FAQ if you haven't done so already.

Feel free to close this issue if you found an answer in the FAQ.

If your issue is a feature request, please read this first and update your request accordingly, if needed.

If your issue is a bug report, please provide a minimum reproducible example as a link to a self-contained Google Colab notebook containing everything needed to reproduce the bug:

  • installation
  • data preparation
  • model download
  • etc.

Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).

We also offer paid scientific consulting services around speaker diarization (and speech processing in general).

This is an automated reply, generated by FAQtory

@asusdisciple asusdisciple changed the title Bug - Open model locally with config.yaml file Bug - Can not load speechbrain model offline, because of tags by pyannote Jul 6, 2023
@asusdisciple asusdisciple changed the title Bug - Can not load speechbrain model offline, because of tags by pyannote Bug - Offline use of "speechbrain/spkrec-ecapa-voxceleb " does not work Jul 6, 2023
@haiderasad

@asusdisciple any luck? I'm having the same error.

@haiderasad

Found the solution at #1294: the path given for embedding has to contain "speechbrain", e.g.
embedding: /home/haider/Documents/speechbrain/spkrec-ecapa-voxceleb
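
It seems pyannote decides which backend wrapper to use by checking whether the string "speechbrain" appears in the embedding value. Applied to the config from the original post, it would look roughly like this (the directory name models/speechbrain/spkrec-ecapa-voxceleb is just an example; the rest of the yaml stays unchanged):

pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: models/speechbrain/spkrec-ecapa-voxceleb  # local path must contain "speechbrain"
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: models/seg_pya.bin
    segmentation_batch_size: 32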


stale bot commented Feb 7, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Feb 7, 2024