read to download sample wav.scp file(include pipe sox) #27

Open
shanguanma opened this issue May 9, 2019 · 10 comments

Comments

@shanguanma

Hi all,
I want to use the kaldiio library to read a wav.scp file together with a segments file, but my wav.scp contains pipe commands like the following:
ui23faz_0101 /usr/bin/sox /path/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample |
The kaldiio reader fails on it. Does kaldiio not support this kind of wav.scp?

@nttcslab-sp-admin
Contributor

Thank you for using our tool. Could you show me the error log?

@shanguanma
Author

Thank you for your reply.
This is my error log:
Colocations handled automatically by placer.
2019-05-08 19:51:53.221278: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-05-08 19:51:53.225237: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000125000 Hz
2019-05-08 19:51:53.225357: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5555599691e0 executing computations on platform Host. Devices:
2019-05-08 19:51:53.225375: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2019-05-08 19:51:53.225462: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
/usr/bin/sox WARN wav: Premature EOF on .wav input file
/usr/bin/sox WARN rate: rate clipped 17 samples; decrease volume?
/usr/bin/sox WARN dither: dither clipped 12 samples; decrease volume?
/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/utils.py:320: UserWarning: An error happens at loading "/usr/bin/sox /home4/md510/w2018/original_seame/wavdata/interview/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample |"
'An error happens at loading "{}"'.format(ark_name))
Traceback (most recent call last):
File "local/compute-fbank-feats.py", line 93, in <module>
main()
File "local/compute-fbank-feats.py", line 81, in main
for utt_id, (rate, array) in reader:
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/highlevel.py", line 128, in __iter__
k, v = next(self.generator)
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/matio.py", line 115, in load_scp_sequential
segments=segments).generator():
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/matio.py", line 162, in generator
cached[recodeid] = self.wav_loader[recodeid]
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/utils.py", line 317, in __getitem__
return self._loader(ark_name)
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/matio.py", line 205, in load_mat
use_scipy_wav=offset is None)
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/matio.py", line 265, in _load_mat
array = read_kaldi(fd, endian, use_scipy_wav=use_scipy_wav)
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/matio.py", line 334, in read_kaldi
array, size = read_wav_scipy(fd, return_size=True)
File "/home3/md510/anaconda3/lib/python3.7/site-packages/kaldiio/wavio.py", line 44, in read_wav_scipy
rate, array = wavfile.read(fd)
File "/home3/md510/anaconda3/lib/python3.7/site-packages/scipy/io/wavfile.py", line 246, in read
raise ValueError("Unexpected end of file.")
ValueError: Unexpected end of file.

@nttcslab-sp-admin
Contributor

Your wav file may be the problem. kaldiio just uses scipy to load wav files, so you can check it as follows:

/usr/bin/sox /path/ui23faz_0102/ui23faz_0102.wav -r 16000 -c 1 -b 16 -t wav - downsample > out.wav
python
>>> import scipy.io.wavfile
>>> scipy.io.wavfile.read('out.wav')

@shanguanma
Author

Thanks for your reply.
I tested with your method, and my wav file has no problem.
The test results are as follows:
/usr/bin/sox /home4/md510/w2018/original_seame/wavdata/interview/ui23faz_0101/ui23faz_0101.wav -r 16000 -c 1 -b 16 -t wav - downsample > out.wav
/usr/bin/sox WARN rate: rate clipped 17 samples; decrease volume?
/usr/bin/sox WARN dither: dither clipped 17 samples; decrease volume?
python3
>>> import scipy.io.wavfile
>>> scipy.io.wavfile.read('out.wav')
(16000, array([ -1, 1, -1, ..., -17, -5, 4], dtype=int16))

@nttcslab-sp-admin
Contributor

Your wave file has incorrect size information in its header, and scipy.io.wavfile doesn't support such files, as this warning in your log indicates:

 /usr/bin/sox WARN wav: Premature EOF on .wav input file

The new version of kaldiio uses the wave module instead. Try pip install -U kaldiio.
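
To illustrate the kind of file described above, here is a minimal sketch (not kaldiio's actual code) that builds a WAV in memory, cuts off its tail so that the header declares more data than is present, and reads it back with Python's standard wave module. The helper names make_truncated_wav and read_available_frames are hypothetical and exist only for this example.

```python
import io
import wave


def make_truncated_wav(n_frames=1000, drop_frames=100, rate=16000):
    """Return the bytes of a 16-bit mono WAV whose header overstates its length."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    data = buf.getvalue()
    # Drop the last frames but keep the header, which still claims n_frames.
    return data[: len(data) - 2 * drop_frames]


def read_available_frames(wav_bytes):
    """Return (frames declared in the header, frames actually readable)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        declared = w.getnframes()
        frames = w.readframes(declared)  # returns only the bytes that exist
        actual = len(frames) // (w.getsampwidth() * w.getnchannels())
    return declared, actual


if __name__ == "__main__":
    print(read_available_frames(make_truncated_wav()))
```

With the default arguments the header declares 1000 frames while only 900 can be read, and the wave module returns the available frames rather than raising.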

@shanguanma
Author

Thank you, I upgraded kaldiio as you suggested.
I have another question. I generated mel-fbank features for a small 6-hour data set and wrote them in Kaldi's ark and scp format; with 10 processes this took one hour and four minutes. I then switched to a larger data set (96 hours) with 32 processes, and the program has still not finished after 30 hours. Does kaldiio's read/write speed degrade over time?

@nttcslab-sp-admin
Contributor

Maybe. Simple reading without a segments file should not be much slower than Kaldi, because it just uses subprocess to invoke the commands and scipy or Python's wave module to read the result, but I haven't optimized it for segments.

Could you tell me more about your case: how long is each wave file, and how long are the segments within the wave files? If you can, attaching the scp file and the segments file would help me.
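
The approach mentioned above (run the command with subprocess, then parse its output as WAV) can be sketched roughly as follows. This is an illustration using only the standard library, not kaldiio's implementation; the function name read_wav_scp_entry is hypothetical, and the code assumes each scp line is "<utt-id> <path>" or "<utt-id> <command> |".

```python
import io
import subprocess
import wave


def read_wav_scp_entry(line):
    """Read one wav.scp line and return (utt_id, sample_rate, raw_frame_bytes)."""
    utt_id, rest = line.strip().split(None, 1)
    rest = rest.strip()
    if rest.endswith("|"):
        # Piped entry: run the command and capture the WAV bytes from stdout.
        # shell=True executes arbitrary text, so only use trusted scp files.
        result = subprocess.run(
            rest[:-1], shell=True, stdout=subprocess.PIPE, check=True
        )
        source = io.BytesIO(result.stdout)
    else:
        # Plain entry: the rest of the line is a file path.
        source = rest
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    return utt_id, rate, frames
```

Every piped entry costs one subprocess launch, which is why the time spent spawning processes matters for large scp files.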

@shanguanma
Author

Thank you for your reply.
I used this 96-hour data set and it worked well in Kaldi, but with kaldiio's matrix read/write interface the job ran for three days without finishing feature extraction. As you asked, here are the details of my data set: each wave file is about 1-2 hours long, and each segment is about 2-7 seconds long.
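
With recordings this long and segments this short, each recording is referenced by many segments, so one would normally want to decode a recording once and slice it repeatedly. A minimal sketch of that idea, assuming the usual Kaldi segments layout "<segment-id> <recording-id> <start> <end>" with times in seconds; the function names are hypothetical and this is not kaldiio's code:

```python
from collections import defaultdict


def parse_segments(lines):
    """Group segment entries by recording id."""
    by_recording = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        seg_id, rec_id, start, end = parts
        by_recording[rec_id].append((seg_id, float(start), float(end)))
    return dict(by_recording)


def cut_segments(samples, rate, segments):
    """Yield (segment_id, samples_slice) for one already-decoded recording."""
    for seg_id, start, end in segments:
        first = int(round(start * rate))  # seconds -> sample index
        last = int(round(end * rate))
        yield seg_id, samples[first:last]
```

Here samples can be any sliceable sequence, such as a list or a NumPy array, holding one channel of the decoded recording.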

@nttcslab-sp-admin
Contributor

I created a test set that roughly matches your corpus, but in my environment it doesn't take nearly that long. It ran at the same speed as Kaldi itself.

I noticed that your log includes TensorFlow messages. Are you extracting the features from the wav files inside your training script?

In general, invoking a subprocess takes much longer when the parent process has allocated a large amount of memory.

For example,

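import subprocess
import time

import numpy

# Baseline: spawn a subprocess before allocating much memory.
t = time.time()
subprocess.run('echo hello', shell=True)
print(f'{time.time() - t} [s]')

# Allocate about 800 MB (1e8 float64 values) in the parent process.
x = numpy.ones((100000000,))
t = time.time()
# Spawning now takes much more time
subprocess.run('echo hello', shell=True)
print(f'{time.time() - t} [s]')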

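This is not the fault of Python's subprocess module; it is a property of the fork() system call, which has to copy the parent's page tables, so its cost grows with the amount of memory the parent has allocated.
So if you invoke sox via wav.scp, take care to allocate as little extra memory as possible in that process.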
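
One possible way to act on this advice, offered as a sketch rather than as something kaldiio provides, is to run the piped commands once in a small, separate script and write their output to plain files, so that the memory-heavy process never has to spawn sox. The function name materialize_wav_scp and the output layout are assumptions made for this example.

```python
import subprocess
from pathlib import Path


def materialize_wav_scp(scp_lines, out_dir):
    """Run piped wav.scp entries once and return scp lines that point to files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    new_lines = []
    for line in scp_lines:
        utt_id, rest = line.strip().split(None, 1)
        rest = rest.strip()
        if rest.endswith("|"):
            out_path = out_dir / f"{utt_id}.wav"
            with open(out_path, "wb") as f:
                # shell=True executes arbitrary text: trusted scp files only.
                subprocess.run(rest[:-1], shell=True, stdout=f, check=True)
            rest = str(out_path)
        new_lines.append(f"{utt_id} {rest}")
    return new_lines
```

The returned lines can be written to a new wav.scp and read later without any subprocess calls, at the cost of extra disk space for the converted audio.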

@shanguanma
Author

Thanks for your reply. I'm going to check the rest of my code.
