Issue with training on my own dataset. #20

Open
rishabhjain16 opened this issue Feb 15, 2022 · 0 comments

I have been trying to train the FragmentVC model on my own dataset. Training works fine with the VCTK dataset, but with my own data I get the error below. It may have something to do with my dataset or its structure. My dataset is non-native English speech, and I want to find out whether I can do VC from, say, LibriSpeech to non-native English and vice versa. I am not sure how to fix the error.

root@06089af1684b:/workspace/vc/FragmentVC# CUDA_VISIBLE_DEVICES=1 python train.py features_myst --save_dir ./ckpts_myst --batch_size 16 --preload
100% 17163/17163 [00:18<00:00, 913.63it/s]
Train:   0% 0/1000 [00:00<?, ? step/s]
Traceback (most recent call last):
  File "train.py", line 247, in <module>
    main(**parse_args())
  File "train.py", line 166, in main
    batch = next(train_iterator)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 272, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/workspace/vc/FragmentVC/data/intra_speaker_dataset.py", line 73, in __getitem__
    for sampled_id in random.sample(utterance_indices, self.n_samples):
  File "/opt/conda/lib/python3.8/random.py", line 363, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

Train:   0% 0/1000 [00:01<?, ? step/s]
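
The last frame of the traceback shows `random.sample` being asked for more items than `utterance_indices` contains. The sketch below reproduces that condition outside FragmentVC; the helper name `try_sample` and the numbers (2 utterances, 5 samples) are made up for illustration and are not taken from the repository.

```python
import random


def try_sample(utterance_indices, n_samples):
    """Return a random sample, or the error text if sampling is impossible."""
    try:
        return random.sample(utterance_indices, n_samples)
    except ValueError as err:
        # Raised when n_samples exceeds len(utterance_indices) or is negative.
        return str(err)


# Hypothetical speaker with only 2 utterances, asked for 5 samples.
too_few = try_sample([0, 1], 5)
print(too_few)

# Hypothetical speaker with 10 utterances: sampling 5 succeeds.
enough = try_sample(list(range(10)), 5)
print(len(enough))
```

If this is the cause, at least one speaker in `features_myst` has fewer utterances than the dataset's `n_samples` setting.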

I think it most probably has something to do with the structure of my dataset, or possibly with the length of the audio files. I looked around but didn't find a working solution. Any help is appreciated. Thanks in advance.
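
One way to test the dataset-structure theory is to count utterances per speaker and list the speakers that cannot supply enough samples. This is a rough sketch, not FragmentVC code: the function name, the `{speaker: [utterances]}` layout, and the value of `n_samples` are all assumptions, since the issue does not show the actual metadata format. The `exclude_target` flag covers the case where the target utterance is removed from the pool before sampling, which is also an assumption.

```python
def speakers_below_minimum(utterances_by_speaker, n_samples, exclude_target=True):
    """Return {speaker: utterance_count} for speakers with too few utterances.

    utterances_by_speaker: hypothetical mapping of speaker name to a list of
    utterance identifiers; the real metadata layout may differ.
    exclude_target: if True, assume the target utterance is not part of the
    sampling pool, so each speaker needs n_samples + 1 utterances.
    """
    required = n_samples + 1 if exclude_target else n_samples
    return {
        speaker: len(utterances)
        for speaker, utterances in utterances_by_speaker.items()
        if len(utterances) < required
    }


# Made-up example data; n_samples=5 is only illustrative.
example = {
    "spk_a": ["a1", "a2", "a3", "a4", "a5", "a6", "a7"],
    "spk_b": ["b1", "b2"],
}
flagged = speakers_below_minimum(example, n_samples=5)
print(flagged)
```

Speakers reported by such a check could be dropped before preprocessing, or `n_samples` lowered, if the training script exposes that setting.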
