
Level 0 generation fails to generate past input audio #3

Open
camjac251 opened this issue May 11, 2020 · 4 comments

@camjac251

I thought the first time I ran this it could've just been an error on my machine, but it happened twice.

I tried to run my own primed audio through this software with the following options:

```
--model=5b_lyrics --name=sample_5b_prompted --levels=3 --mode=primed --audio_file=myfullsong.wav --prompt_length_in_seconds=12 --sample_length_in_seconds=90 --total_sample_length_in_seconds=193 --sr=44100 --n_samples=1 --hop_fraction=0.5,0.5,0.125
```

When it reaches sampling level 0, it seems to exit without any sampling actually happening. Both level 2 and level 1 sample normally, but level 0 does not; it only generates the prompt length, nothing further.

I noticed that when running the default 20-second sample length, it would generate 30 seconds for both level 2 and level 1, but only 19 seconds for level 0. Could this be related?

@andorxornot

Yeah, the same issue :(

@andorxornot

andorxornot commented May 11, 2020

My command is:

```
python jukebox/sample.py --model=1b_lyrics --name=sample_5b --levels=3 --sample_length_in_seconds=180 --total_sample_length_in_seconds=180 --sr=44100 --n_samples=6 --hop_fraction=0.5,0.5,0.125
```

I've noticed that it works for short lengths, but apparently not for anything over 60 seconds.

@andorxornot

```
Traceback (most recent call last):
  File "jukebox/sample.py", line 270, in <module>
    fire.Fire(run)
  File "D:\Anaconda3\lib\site-packages\fire\core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "D:\Anaconda3\lib\site-packages\fire\core.py", line 366, in _Fire
    component, remaining_args)
  File "D:\Anaconda3\lib\site-packages\fire\core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/sample.py", line 267, in run
    save_samples(model, device, hps, sample_hps)
  File "jukebox/sample.py", line 235, in save_samples
    ancestral_sample(labels, sampling_kwargs, priors, hps)
  File "jukebox/sample.py", line 130, in ancestral_sample
    zs = _sample(zs, labels, sampling_kwargs, priors, sample_levels, hps)
  File "jukebox/sample.py", line 114, in _sample
    x = prior.decode(zs[level:], start_level=level, bs_chunks=zs[level].shape[0])
  File "D:\Developer\Python\audio\jukebox-master\jukebox\prior\prior.py", line 221, in decode
    x_out = self.decoder(zs, start_level=start_level, end_level=end_level, bs_chunks=bs_chunks)
  File "D:\Developer\Python\audio\jukebox-master\jukebox\vqvae\vqvae.py", line 118, in decode
    x_out = self._decode(zs_i, start_level=start_level, end_level=end_level)
  File "D:\Developer\Python\audio\jukebox-master\jukebox\vqvae\vqvae.py", line 109, in _decode
    x_out = decoder(x_quantised, all_levels=False)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Developer\Python\audio\jukebox-master\jukebox\vqvae\encdec.py", line 124, in forward
    x = level_block(x)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Developer\Python\audio\jukebox-master\jukebox\vqvae\encdec.py", line 46, in forward
    return self.model(x)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
    input = module(input)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 202, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). Kernel size can't be greater than actual input size
```

@HelixNGC7293

HelixNGC7293 commented Sep 29, 2020

I figured this out! It's actually because of make_model.py line 142:

```python
rescale = lambda z_shape: (z_shape[0] * hps.n_ctx // vqvae.z_shapes[hps.level][0],)
```

Here `z_shape[0] * hps.n_ctx` exceeds the int32 maximum (2,147,483,647) and becomes a negative number once `sample_length_in_seconds` goes past about 54 s.

Solution: I changed line 53 in vqvae.py from

```python
self.hop_lengths = np.cumprod(self.downsamples)
```

to

```python
self.hop_lengths = np.cumprod(self.downsamples, dtype=np.int64)
```

This works for me! Just use the int64 type for z_shapes.
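For reference, here's a minimal standalone sketch of the overflow. The numbers are made-up stand-ins, not Jukebox's actual values; the point is only that NumPy int32 arithmetic wraps silently, which is why the computed length turns negative, and that promoting to int64 (as the `dtype=np.int64` fix above does) avoids it:

```python
import numpy as np

# Hypothetical values chosen only to cross the int32 limit (2,147,483,647).
hop_length = np.int32(128)         # e.g. a cumulative downsampling factor
n_samples = np.int32(20_000_000)   # audio length in samples for a long clip

bad = hop_length * n_samples       # int32 * int32 stays int32 and wraps
print(bad)                         # -1734967296 (NumPy may warn about overflow)

# Forcing 64-bit integers, as np.cumprod(..., dtype=np.int64) does in the
# fix above, keeps the product correct:
good = np.int64(hop_length) * np.int64(n_samples)
print(good)                        # 2560000000
```

A negative or nonsensical length feeding into the level-0 token shape would also plausibly explain the decoder's "Kernel size can't be greater than actual input size" error in the traceback above.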
