read to download sample wav.scp file(include pipe sox) #27
Comments
Thank you for using our tool. Could you show me the error log?
Thank you for your reply.
Maybe your wav file has a problem. kaldiio just uses scipy to load wav files, so you can check it as follows:
/usr/bin/sox /path/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample > out.wav
python
>>> import scipy.io.wavfile
>>> scipy.io.wavfile.read('out.wav')
Thanks for your reply.
Your wave file has incorrect file size information in its header, and scipy.io.wavfile doesn't support such wave files.
I changed to use
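As a rough illustration of what "incorrect file size information in the header" can mean, the sketch below compares the size field of a RIFF header with the actual number of bytes. It uses only the standard library; `riff_size_matches` and `make_wav_bytes` are hypothetical helper names, and the placeholder value `0x7FFFFFFF` is an arbitrary stand-in chosen for the example, not necessarily what any particular tool writes.

```python
import io
import struct
import wave


def riff_size_matches(data: bytes) -> bool:
    """Return True if the RIFF size field agrees with the real byte count."""
    if len(data) < 12:
        raise ValueError("too short to be a WAV file")
    chunk_id, declared_size, form = struct.unpack("<4sI4s", data[:12])
    if chunk_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    # The RIFF size field counts every byte after the first 8 header bytes.
    return declared_size == len(data) - 8


def make_wav_bytes(num_frames: int = 160) -> bytes:
    """Build a small, valid 16 kHz mono 16-bit WAV file in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


good = make_wav_bytes()
# Simulate a file whose size field was never filled in correctly
# (hypothetical placeholder value, for illustration only).
bad = good[:4] + struct.pack("<I", 0x7FFFFFFF) + good[8:]

print(riff_size_matches(good), riff_size_matches(bad))
```

To check a real file, one could pass `open('out.wav', 'rb').read()` to `riff_size_matches`. A mismatch only shows that the header disagrees with the data; it does not by itself say which reader will reject the file.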
Thank you, I upgraded the kaldiio library as you suggested.
Maybe, simple reading without Could you tell me more about your case: how long is each wave file, and how long are the segments in the wave files? If you can, attaching the scp file and the segments file would help me.
Thank you for your reply.
I created a test set that closely matches your corpus, but in my environment it doesn't take such a long time; it ran at the same speed as Kaldi itself. I was curious that your log included TensorFlow messages. Are you trying to extract features from the wav files inside the training script? In general, invoking a subprocess takes much longer when a large amount of memory is allocated. For example:
import numpy
import subprocess
import time

# Baseline: spawn a subprocess while little memory is allocated
t = time.time()
subprocess.run('echo hello', shell=True)
print(f'{time.time() - t} [s]')

# Allocate a large array (100 million float64 values, about 800 MB)
x = numpy.ones((100000000,))

# Takes much more time
t = time.time()
subprocess.run('echo hello', shell=True)
print(f'{time.time() - t} [s]')
This is not the fault of Python's subprocess module; it is a property of the fork() system call.
Thanks for your reply. I'm going to check the code somewhere else.
Hi all,
I want to use the kaldiio library to read wav.scp and segments files, but the wav.scp file contains pipe commands like the following:
ui23faz_0101 /usr/bin/sox /path/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample |
The kaldiio reader is not working. Does kaldiio not support this kind of wav.scp?
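For readers unfamiliar with the format: in Kaldi scp files, an entry whose value ends with `|` is a command whose standard output is read in place of a file. The sketch below shows one way to tell the two kinds of entry apart. It is not kaldiio's actual implementation; `parse_wav_scp_line` is a hypothetical helper and the sample entries are made up.

```python
from typing import Tuple


def parse_wav_scp_line(line: str) -> Tuple[str, str, bool]:
    """Split a wav.scp line into (utterance id, path or command, is_pipe)."""
    # The first whitespace-separated token is the utterance id;
    # everything after it is either a file path or a piped command.
    utt_id, rest = line.strip().split(None, 1)
    rest = rest.strip()
    if rest.endswith("|"):
        # A trailing "|" marks a command whose stdout is the wav data.
        return utt_id, rest[:-1].strip(), True
    return utt_id, rest, False


# Made-up entries for illustration only.
piped = parse_wav_scp_line("utt1 /usr/bin/sox in.wav -r 16000 -t wav - |")
plain = parse_wav_scp_line("utt2 /data/utt2.wav")

print(piped)
print(plain)
```

A reader built on this would run the command for piped entries and open the path directly otherwise.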