Audio buffer is not finite everywhere #24

William-N-Havard · 2022-04-28T07:42:33Z

I sometimes run into issues such as the following. I don't really understand why, since the file sounds normal when I listen to it and the VTC & VCM work perfectly on it.

  Traceback (most recent call last):
    File "/scratch2/whavard/PACKAGES/ALICE/SylNet/run_SylNet.py", line 104, in <module>
      X[i] = np.transpose(20*np.log10(librosa.feature.melspectrogram(y=y, sr=Fs, n_mels=24, n_fft=w_l, hop_length=w_h)))
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/feature/spectral.py", line 2004, in melspectrogram
      pad_mode=pad_mode,
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/core/spectrum.py", line 2519, in _spectrogram
      pad_mode=pad_mode,
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/core/spectrum.py", line 217, in stft
      util.valid_audio(y)
    File "/scratch2/whavard/.conda/envs/ALICE/lib/python3.6/site-packages/librosa/util/utils.py", line 310, in valid_audio
      raise ParameterError("Audio buffer is not finite everywhere")
  librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere

orasanen · 2022-04-28T13:01:49Z

That's strange. I've never seen that before and I'm not sure how to reproduce it without the data. I did some googling, and there was at least some suggestion that librosa might throw that kind of error for NaN inputs. Could you check whether the y in run_SylNet (the read waveform) contains only finite values, i.e., no NaNs or Infs? Another thing to try, which should not affect the results, is to add a very small white-noise floor to y (i.e., just add a vector of very small zero-mean random numbers). This might help if the signal has exact zeros at its onset/offset, which sometimes throws feature extractors off.
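A minimal sketch of the two checks suggested above, assuming y is a 1-D NumPy array holding the waveform. The helper names `describe_non_finite` and `add_noise_floor`, the noise `scale`, and the `seed` are illustrative assumptions and are not part of run_SylNet.py.

```python
import numpy as np


def describe_non_finite(y):
    """Count NaN and Inf samples in a waveform array."""
    y = np.asarray(y, dtype=np.float64)
    return {
        "nan": int(np.isnan(y).sum()),
        "inf": int(np.isinf(y).sum()),
        "all_finite": bool(np.isfinite(y).all()),
    }


def add_noise_floor(y, scale=1e-8, seed=0):
    """Add zero-mean Gaussian noise of very small amplitude to y.

    The scale is an assumed value; it should stay far below the signal level.
    Note that adding noise does not remove NaN or Inf samples.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=np.float64)
    return y + scale * rng.standard_normal(y.shape)


if __name__ == "__main__":
    # Toy waveform with one Inf and one NaN, standing in for the read audio.
    y = np.array([0.0, 0.5, np.inf, -0.25, np.nan])
    print(describe_non_finite(y))
```

Running the check before the call to librosa.feature.melspectrogram would show whether the exception comes from the samples themselves rather than from the feature extraction.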

William-N-Havard · 2022-05-05T15:06:22Z

The VTC (or rather pyannote-audio) also uses librosa, so I'm not sure why it only occurs when running ALICE. But indeed, there are some np.inf values in the audio files, and these raise the exception. I'll try adding some noise and see how it goes!

orasanen · 2022-05-06T06:11:58Z

Linked this issue also on SylNet repo side.

orasanen · 2022-05-06T06:36:52Z

Another possibility: VTC is used as a front-end for SylNet (and other feature extraction) to split the long-form input data into "utterances". So, if VTC produces a segment that is too short for the librosa feature extractor, that might cause an error. If this is the case, I could add a minimum duration threshold to the data splitting stage.
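A minimum duration threshold of this kind could look roughly like the sketch below. It assumes segments are available as (onset, offset) pairs in seconds; the function name `filter_short_segments` and the default `min_duration` of 0.1 s are hypothetical and do not reflect how ALICE actually represents or splits segments.

```python
def filter_short_segments(segments, min_duration=0.1):
    """Keep only (onset, offset) pairs, in seconds, lasting at least min_duration.

    min_duration is an assumed placeholder; a suitable value would depend on
    the analysis window used by the feature extractor.
    """
    kept = []
    for onset, offset in segments:
        if offset - onset >= min_duration:
            kept.append((onset, offset))
    return kept


if __name__ == "__main__":
    # Toy segment list: the first entry is shorter than the threshold.
    segments = [(0.0, 0.05), (1.0, 1.5), (2.0, 2.25)]
    print(filter_short_segments(segments))
```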

orasanen self-assigned this May 6, 2022

orasanen mentioned this issue May 6, 2022

Inf issue in context of ALICE shreyas253/SylNet#4

Open
