Errors for processing waymo infos: #29

Open
Z-Lee-corder opened this issue Sep 7, 2023 · 13 comments
@Z-Lee-corder

Z-Lee-corder commented Sep 7, 2023

Hello, I would like to run the code on the Waymo dataset. However, when I run the following two commands:

  1. python -m al3d_det.datasets.waymo.waymo_preprocess --cfg_file tools/cfgs/det_dataset_cfgs/waymo_xxx_sweeps_mm.yaml --func create_waymo_infos
  2. python -m al3d_det.datasets.waymo.waymo_preprocess --cfg_file tools/cfgs/det_dataset_cfgs/waymo_xxxx_sweeps_mm.yaml --func create_waymo_database

The following error occurred:
2023-09-07 21:03:29.162028: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/media/lizheng/Samsung/codes/LoGoNet/detection/al3d_det/datasets/waymo/waymo_preprocess.py", line 355, in
create_waymo_database(
File "/media/lizheng/Samsung/codes/LoGoNet/detection/al3d_det/datasets/waymo/waymo_preprocess.py", line 304, in create_waymo_database
dataset = WaymoTrainingDataset(
File "/media/lizheng/Samsung/codes/LoGoNet/detection/al3d_det/datasets/waymo/waymo_dataset.py", line 51, in init
from petrel_client.client import Client
ModuleNotFoundError: No module named 'petrel_client'

When I remove "OSS_PATH: 'cluster2:s3://dataset/waymo" in "waymo_one_sweep_mm.yaml", a new error occurred:
Traceback (most recent call last):
File "/media/lizheng/Samsung/codes/LoGoNet/detection/al3d_det/datasets/waymo/waymo_preprocess.py", line 38, in get_infos_worker
sequence_infos = list(tqdm(executor.map(process_single_sequence, sample_sequence_file_list),
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/tqdm/std.py", line 1182, in iter
for obj in iterable:
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
yield fs.pop().result()
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/home/lizheng/anaconda3/envs/open-mmlab/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/media/lizheng/Samsung/codes/LoGoNet/detection/al3d_det/datasets/waymo/waymo_utils.py", line 218, in process_single_sequence_and_save
if pkl_file.exists():
AttributeError: 'str' object has no attribute 'exists'

May I ask what I should do?

@CSautier

Replace if pkl_file.exists(): with if os.path.exists(pkl_file).
Later on you might also have to comment out the line info_path = self.check_sequence_name_with_all_version(info_path) and replace
sequence_file_tfrecord = sequence_file[:-9] + '_with_camera_labels.tfrecord'
with
sequence_file_tfrecord = sequence_file[:-9] + '.tfrecord'.
These changes may only apply if you are not using Ceph.
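
For reference, a minimal sketch of those two edits, reusing the variable names from the traceback above (pkl_file, sequence_file); the exact locations may differ between checkouts, and the example values below are only placeholders:

import os

# Sketch only: pkl_file and sequence_file stand in for the variables already
# present in waymo_utils.py; the values here are placeholders.
pkl_file = 'example_sequence.pkl'
sequence_file = 'segment-0000000000000000000_0000_000_0000_000.tfrecord'

# use os.path.exists because pkl_file is a plain string when not reading from Ceph
if os.path.exists(pkl_file):
    pass  # keep the original reuse/early-return logic here

# local tfrecords may not carry the '_with_camera_labels' suffix
sequence_file_tfrecord = sequence_file[:-9] + '.tfrecord'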

@Z-Lee-corder

(quoting @CSautier's fix above)

Thank you for your reply. After making your modifications, the error is gone. But when I run the command "python -m al3d_det.datasets.waymo.waymo_preprocess --cfg_file tools/cfgs/det_dataset_cfgs/waymo_one_sweep_mm.yaml --func create_waymo_infos", my CPU memory (64 GB) is not enough.

May I know how to operate the codes properly?

@Z-Lee-corder

(quoting @CSautier's fix above)

Previously, I processed the Waymo data with the official OpenPCDet project. However, that processed data does not contain image information. Does this mean I cannot reuse those generated data files in this project (LoGoNet)?

@CSautier

Yes, I've also found that the Waymo preprocessing costs a lot of memory. A partial solution is to remove the multiprocessing by replacing

with futures.ThreadPoolExecutor(num_workers) as executor:
    sequence_infos = list(tqdm(executor.map(process_single_sequence, sample_sequence_file_list),
                               total=len(sample_sequence_file_list)))

with
sequence_infos = list([process_single_sequence(sample_sequence_file) for sample_sequence_file in tqdm(sample_sequence_file_list)])
but be aware that this makes the process even slower.
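
If you want to keep some parallelism while bounding memory, a possible middle ground (my own sketch, not code from the repository; chunk_size and the worker count are just tuning knobs) is to feed the executor small chunks of the sequence list:

from concurrent import futures
from tqdm import tqdm

chunk_size, num_workers = 8, 2   # tuning knobs, pick values that fit your RAM
sequence_infos = []
for i in tqdm(range(0, len(sample_sequence_file_list), chunk_size)):
    chunk = sample_sequence_file_list[i:i + chunk_size]
    with futures.ThreadPoolExecutor(num_workers) as executor:
        # only chunk_size sequences are in flight at any time
        sequence_infos.extend(executor.map(process_single_sequence, chunk))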

Also, if at some point you get it running, make absolutely sure it actually saves the png files, as for me it didn't at first. You can, for instance, replace in waymo_utils the line
cv2.imwrite(image_path, all_images[cam_i])
with

if not cv2.imwrite(image_path, all_images[cam_i]):
    os.makedirs(os.path.join(cur_save_dir, 'image_{}'.format(cam_i)), exist_ok=True)
    cv2.imwrite(image_path, all_images[cam_i])
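
(For context: cv2.imwrite returns False rather than raising when the output directory does not exist, which is why the failure is silent; the retry after os.makedirs covers exactly that case.)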

@CSautier

As for using the OpenPCDet preprocessing, I have no idea. I'm not affiliated with the authors of the code; I'm just trying to get it running as well.

@Z-Lee-corder

(quoting @CSautier's memory and cv2.imwrite suggestions above)

Thank you very much for your patient answers. With your help, I can now process the data normally. But I found that the processed data is very large. Is at least 5 TB of storage space necessary? Each frame now has its additional camera images saved alongside it, and my storage capacity is only 3 TB, which is probably not enough.

@CSautier

I can't tell for sure, but the waymo_one_sweep_mm.yaml config seems to use a bit less than 3 TB. Maybe start with KITTI, as it is much lighter and probably easier to set up.

@SISTMrL

SISTMrL commented Sep 15, 2023

(quoting @CSautier's reply above)

Hello, could you please tell me how long the processing of the Waymo infos takes? The program's log output has been stuck on this screen for a long time:

[screenshot: preprocessing log output]

and the GPU memory I am using is shown below:

[screenshot: GPU memory usage]

@CSautier

The pre-processing lasts about 150 hours on my hardware, with no multiprocessing. I'm not sure why it uses any GPU memory; as far as I can tell the pre-processing is CPU-only. It seems to open each sequence, parse it, convert the range view into point clouds, and save the point cloud, images and annotations for each frame individually.
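
In rough pseudocode, my understanding of the per-sequence work is something like the following; the two helpers (range_images_to_point_cloud, save_frame) are placeholders, not the repository's actual function names:

import tensorflow as tf
from waymo_open_dataset import dataset_pb2

def process_sequence(tfrecord_path, save_dir):
    # iterate over every frame stored in one Waymo tfrecord sequence
    dataset = tf.data.TFRecordDataset(tfrecord_path, compression_type='')
    for idx, data in enumerate(dataset):
        frame = dataset_pb2.Frame()
        frame.ParseFromString(bytearray(data.numpy()))
        points = range_images_to_point_cloud(frame)   # range view -> xyz point cloud (placeholder helper)
        save_frame(save_dir, idx, points, frame.images, frame.laser_labels)  # placeholder helper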

@reynerliu

@CSautier Thanks, I was stuck on this for a whole week. I appreciate your contribution!

@SiHengHeHSH

SiHengHeHSH commented Mar 15, 2024

assert img_file.exists()
AttributeError: 'NoneType' object has no attribute 'exists'

This happens when I run 'python -m al3d_det.datasets.waymo.waymo_preprocess --cfg_file tools/cfgs/det_dataset_cfgs/waymo_xxx_sweeps_mm.yaml --func create_waymo_infos' and 'python -m al3d_det.datasets.waymo.waymo_preprocess --cfg_file tools/cfgs/det_dataset_cfgs/waymo_xxxx_sweeps_mm.yaml --func create_waymo_database'. The paths '../data/waymo/waymo_processed_data_v4/segment-9509506420470671704_4049_100_4069_100_with_camera_labels/image_0/0034.png' and '../data/waymo/waymo_processed_data_v4/segment-9509506420470671704_4049_100_4069_100_with_camera_labels/image*' do not exist. What should I do? Thank you for your answer. @CSautier The image files of the processed Waymo data are missing.
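
A quick way to see which processed sequences are missing their camera image folders could be something like the following sketch, assuming the ../data/waymo/waymo_processed_data_v4 layout from the error above:

import glob
import os

# Sketch only: list processed sequences that have no image_* folders saved,
# assuming the directory layout shown in the error message above.
root = '../data/waymo/waymo_processed_data_v4'
for seq_dir in sorted(glob.glob(os.path.join(root, 'segment-*'))):
    image_dirs = glob.glob(os.path.join(seq_dir, 'image_*'))
    if not image_dirs:
        print('no image folders saved for', os.path.basename(seq_dir))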

@kikiki-cloud

(quoting @CSautier's memory and cv2.imwrite suggestions above)

Hello, I want to ask why my KITTI dataset reported the following error during training.
Traceback (most recent call last):
File "detection/tools/train.py", line 204, in
main()
File "detection/tools/train.py", line 153, in main
last_epoch=last_epoch, optim_cfg=cfg.OPTIMIZATION
File "/home/linux/guorong/qinhao/LoGoNet/utils/al3d_utils/optimize_utils/init.py", line 52, in build_scheduler
optimizer, total_steps, last_step, optim_cfg.LR, list(optim_cfg.MOMS), optim_cfg.DIV_FACTOR, optim_cfg.PCT_START
File "/home/linux/guorong/qinhao/LoGoNet/utils/al3d_utils/optimize_utils/learning_schedules_fastai.py", line 85, in init
super().init(fai_optimizer, total_step, last_step, lr_phases, mom_phases)
File "/home/linux/guorong/qinhao/LoGoNet/utils/al3d_utils/optimize_utils/learning_schedules_fastai.py", line 45, in init
self.step()
File "/home/linux/guorong/qinhao/LoGoNet/utils/al3d_utils/optimize_utils/learning_schedules_fastai.py", line 58, in step
self.update_lr()
File "/home/linux/guorong/qinhao/LoGoNet/utils/al3d_utils/optimize_utils/learning_schedules_fastai.py", line 51, in update_lr
self.optimizer.lr = func((step - start) / (end - start))
ZeroDivisionError: division by zero
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 3213650) of binary: /home/linux/anaconda3/envs/logonet/bin/python
Traceback (most recent call last):
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/linux/anaconda3/envs/logonet/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
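
From the last frame of the traceback, the failing expression is (step - start) / (end - start) in update_lr, so one of the learning-rate phases seems to have zero length. My guess is that total_steps comes out as 0 when the dataloader is empty, for example if the KITTI infos were never generated, roughly like this (an illustration of the guess only, not code from the repository):

# Illustration of the guess, not repository code: with an empty dataloader,
# total_steps is 0, every phase boundary collapses onto step 0, and
# (end - start) in update_lr becomes 0.
len_dataloader, total_epochs, pct_start = 0, 80, 0.4   # hypothetical values
total_steps = len_dataloader * total_epochs            # 0 when no samples were found
start, end = 0, int(total_steps * pct_start)           # both 0 -> ZeroDivisionError in update_lr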

@fangweicheng6

(quoting @CSautier's suggestions and @kikiki-cloud's question and traceback above)

Hello, have you solved this? I ran into the same problem.
