Audio buffer is not finite everywhere #24

Open
William-N-Havard opened this issue Apr 28, 2022 · 4 comments

@William-N-Havard

I sometimes run into errors such as the following. I don't really understand why, since the file sounds normal when I listen to it and the VTC & VCM work perfectly on it.

  Traceback (most recent call last):
    File "/scratch2/whavard/PACKAGES/ALICE/SylNet/run_SylNet.py", line 104, in <module>
      X[i] = np.transpose(20*np.log10(librosa.feature.melspectrogram(y=y, sr=Fs, n_mels=24, n_fft=w_l, hop_length=w_h)))
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/feature/spectral.py", line 2004, in melspectrogram
      pad_mode=pad_mode,
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/core/spectrum.py", line 2519, in _spectrogram
      pad_mode=pad_mode,
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/core/spectrum.py", line 217, in stft
      util.valid_audio(y)
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/util/utils.py", line 310, in valid_audio
      raise ParameterError("Audio buffer is not finite everywhere")
  librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere
@orasanen
Owner

That's strange; I've never seen that before and I'm not sure how to reproduce it without the data. I did some searching, and there was at least some suggestion that Librosa might throw that kind of error for NaN inputs. Could you check whether the y in run_SylNet (the waveform that was read) contains only finite values, i.e., no NaNs or Infs? Another thing to try, which should not affect the results, is to add a very small white-noise floor to y (i.e., a vector of very small zero-mean random numbers). This might help if the signal has exact zeros at its onset/offset, which sometimes throws feature extractors off.
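
A minimal sketch of both checks, assuming `y` is a 1-D NumPy array as returned by the audio loader. The function names, the `scale` value, and the fixed `seed` are hypothetical and are not part of SylNet or Librosa. Note that a noise floor only addresses exact zeros; it cannot turn a NaN or Inf sample into a finite one, so the count is worth running first.

```python
import numpy as np


def count_non_finite(y):
    """Count NaN and Inf samples in a waveform (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    return {
        "nan": int(np.isnan(y).sum()),
        "inf": int(np.isinf(y).sum()),
        "total": int(y.size),
    }


def add_noise_floor(y, scale=1e-6, seed=0):
    """Add zero-mean Gaussian noise with a very small standard deviation.

    `scale` is an assumed value; it should sit far below the signal level.
    Non-finite samples stay non-finite after this step.
    """
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    return y + rng.normal(loc=0.0, scale=scale, size=y.shape)
```

In `run_SylNet.py` this could be applied to `y` just before the `librosa.feature.melspectrogram` call shown in the traceback, for example by printing `count_non_finite(y)` for each segment.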

@William-N-Havard
Author

The VTC (or rather pyannote-audio) also uses librosa, so I'm not sure why this only occurs when running ALICE. But indeed, there are some np.inf values in the audio files, and these raise the exception. I'll try adding some noise and see how it goes!

@orasanen
Owner

orasanen commented May 6, 2022

Also linked this issue on the SylNet repo side.

@orasanen
Owner

orasanen commented May 6, 2022

Another possibility: VTC is used as a front-end for SylNet (and other feature extraction) to split the long-form input data into "utterances". So, if VTC produces a segment that is too short for the Librosa feature extractor, that might cause an error. If this is the case, I could add a minimum duration threshold to the data splitting stage.
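
A rough sketch of what such a threshold could look like, assuming the segments are available as `(onset, offset)` pairs in seconds. The function name and the default of 0.05 s are placeholders for illustration, not values taken from ALICE or VTC.

```python
def filter_short_segments(segments, min_duration=0.05):
    """Keep only segments lasting at least `min_duration` seconds.

    `segments` is assumed to be an iterable of (onset, offset) pairs in
    seconds; the 0.05 s default is an arbitrary placeholder.
    """
    return [
        (onset, offset)
        for onset, offset in segments
        if (offset - onset) >= min_duration
    ]
```

For example, `filter_short_segments([(0.0, 0.01), (1.0, 1.5)])` would drop the 10 ms segment and keep the 500 ms one.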
