training error on colab #43

Open
@hb0313

Description

My setup on Colab for training completed successfully. However, when I run

!python tools/train.py --cfg configs/CONFIG_FILE.yaml

I get this error:

Found 20210 training images.
Found 2000 validation images.
Epoch: [1/500] Iter: [0/2526] LR: 0.00100000 Loss: 0.00000000: 0% 0/2526 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "tools/train.py", line 128, in <module>
    main(cfg, gpu, save_dir)
  File "tools/train.py", line 69, in main
    for iter, (img, lbl) in pbar:
  File "/usr/local/lib/python3.7/dist-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 461, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/semantic-segmentation/semseg/datasets/ade20k.py", line 73, in __getitem__
    image, label = self.transform(image, label)
  File "/content/semantic-segmentation/semseg/augmentations.py", line 20, in __call__
    img, mask = transform(img, mask)
  File "/content/semantic-segmentation/semseg/augmentations.py", line 329, in __call__
    mask = TF.pad(mask, padding, fill=self.seg_fill)
  File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional.py", line 481, in pad
    return F_t.pad(img, padding=padding, fill=fill, padding_mode=padding_mode)
  File "/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_tensor.py", line 418, in pad
    img = torch_pad(img, p, mode=padding_mode, value=float(fill))
RuntimeError: value cannot be converted to type uint8_t without overflow
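
Note on the last frames: the mask being padded appears to be a uint8 tensor, and the message suggests that the fill value (self.seg_fill) falls outside 0..255, for example a negative ignore label. That is an assumption; the actual value is not shown in the log. A minimal sketch of one possible workaround is to widen the mask to a signed integer type before padding. The helper name pad_mask and the toy mask below are hypothetical, not code from the repository, and the sketch calls torch.nn.functional.pad directly instead of TF.pad.

```python
import torch
import torch.nn.functional as F


def pad_mask(mask: torch.Tensor, padding, fill: int) -> torch.Tensor:
    """Pad a label mask with `fill`, widening the dtype first if needed.

    `padding` uses the torch.nn.functional.pad order for the last two
    dimensions: (left, right, top, bottom).
    """
    if mask.dtype == torch.uint8 and not 0 <= fill <= 255:
        # uint8 cannot represent this fill value (e.g. -1),
        # so switch to a signed 64-bit integer mask before padding.
        mask = mask.to(torch.int64)
    return F.pad(mask, padding, mode="constant", value=fill)


# Hypothetical 1x2x2 uint8 mask, padded by one pixel on the right and bottom.
mask = torch.zeros((1, 2, 2), dtype=torch.uint8)
padded = pad_mask(mask, (0, 1, 0, 1), fill=-1)
print(padded.dtype, tuple(padded.shape))
```

Widening to int64 also matches the label dtype that torch.nn.CrossEntropyLoss expects for class-index targets. If the fill value is meant to stay within uint8, setting the ignore label to a value such as 255 in the config would be an alternative, provided the loss is configured to ignore the same value.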
