Error Training on Commonvoice Spanish #165

Closed
hulsmeier opened this issue Sep 7, 2023 · 0 comments

hulsmeier commented Sep 7, 2023

python3.10 bin/trainer.py --max-duration 40 --filter-min-duration 0.5 --filter-max-duration 14 --train-stage 1 --num-buckets 6 --dtype bfloat16 --save-every-n 2500 --valid-interval 2500 --model-name valle --share-embedding true --norm-first true --add-prenet false --decoder-dim 1024 --nhead 16 --num-decoder-layers 12 --prefix-mode 1 --base-lr 0.05 --warmup-steps 200 --average-period 0 --num-epochs 70 --start-epoch 1 --start-batch 0 --accumulate-grad-steps 4 --keep-last-k 40 --exp-dir exp/valle --manifest-dir data/tokenized --text-tokens data/tokenized/unique_text_tokens.k2symbols --oom-check false --dataset commonvoice --world-size 1

2023-09-07 20:44:33,335 INFO [trainer.py:1092] Saving batch to exp/valle/batch-bdd640fb-0667-1ad1-1c80-317fa3b1799d.pt
Traceback (most recent call last):
  File "/home/ubuntu/vall-e/egs/commonvoice/bin/trainer.py", line 1161, in <module>
    main()
  File "/home/ubuntu/vall-e/egs/commonvoice/bin/trainer.py", line 1154, in main
    run(rank=0, world_size=1, args=args)
  File "/home/ubuntu/vall-e/egs/commonvoice/bin/trainer.py", line 1043, in run
    train_one_epoch(
  File "/home/ubuntu/vall-e/egs/commonvoice/bin/trainer.py", line 660, in train_one_epoch
    _, loss, loss_info = compute_loss(
  File "/home/ubuntu/vall-e/egs/commonvoice/bin/trainer.py", line 525, in compute_loss
    predicts, loss, metrics = model(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/vall-e/valle/models/valle.py", line 813, in forward
    y, targets = self.pad_y_eos(
  File "/home/ubuntu/vall-e/valle/models/valle.py", line 325, in pad_y_eos
    targets = F.pad(y, (0, 1), value=0) + eos_id * F.pad(
RuntimeError: The size of tensor a (716) must match the size of tensor b (2) at non-singleton dimension 1
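The last line of the traceback is a broadcasting failure: the two operands of the `+` in `pad_y_eos` have sizes 716 and 2 along dimension 1, and neither of them is 1. The sketch below restates the broadcasting rule in plain Python to show why that combination is rejected. It is not code from the repository; `broadcast_shape` is a hypothetical helper, and the batch size of 4 in the usage note is an assumption for illustration only.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Shapes are compared from the trailing dimension backwards. Two sizes
    are compatible when they are equal or when one of them is 1.
    """
    result = []
    rank = max(len(a), len(b))
    pairs = zip_longest(reversed(a), reversed(b), fillvalue=1)
    for offset, (x, y) in enumerate(pairs):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            # Report the index counted from the front, as PyTorch does.
            dim = rank - 1 - offset
            raise ValueError(
                f"size {x} must match size {y} at non-singleton dimension {dim}"
            )
    return tuple(reversed(result))
```

With the assumed batch size, `broadcast_shape((4, 716), (4, 716))` succeeds, while `broadcast_shape((4, 716), (4, 2))` raises and names dimension 1, mirroring the message above. This suggests the second operand reaches `pad_y_eos` with a much shorter time axis than `y`, so its shape is worth checking before the addition.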

I'm training only on the Spanish dataset, on a single A10 GPU.

I'm using the code from PR #111 by @RuntimeRacer.
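The log shows that the trainer saved the failing batch to `exp/valle/batch-bdd640fb-0667-1ad1-1c80-317fa3b1799d.pt`. Below is a minimal sketch for inspecting that file, assuming it is a plain dictionary written with `torch.save`. The helper names `describe_batch` and `load_batch` are hypothetical, and no particular key names are assumed.

```python
def describe_batch(batch):
    """Map each key to its value's shape, or to its type name if it has no shape."""
    summary = {}
    for key, value in batch.items():
        shape = getattr(value, "shape", None)
        summary[key] = tuple(shape) if shape is not None else type(value).__name__
    return summary


def load_batch(path):
    """Load a saved batch onto the CPU (assumes the file holds a dict)."""
    import torch  # imported lazily so describe_batch works without PyTorch

    return torch.load(path, map_location="cpu")
```

Running `describe_batch(load_batch("exp/valle/batch-bdd640fb-0667-1ad1-1c80-317fa3b1799d.pt"))` and comparing the shapes of the audio features with those of their length tensors may show which input ends up with a size of 1 where a full sequence length is expected.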
