Infinite loop while loading the lidar folders! #280

Open
amanysh99 opened this issue Feb 27, 2025 · 11 comments

Comments

@amanysh99

During the execution of the code, after switching to CPU processing and selecting a subset of the dataset (specifically, the 'rr_dataset_23_11' folder), I encountered an issue where the program appears to enter an infinite loop while attempting to load 4545 lidar files from three folders. I am currently unable to determine the cause of this issue.

Image

@Kait0
Collaborator

Kait0 commented Feb 27, 2025

Can you debug and report in which line of code the training enters an infinite loop?
Have not seen this before.

@amanysh99
Author

While debugging, the code enters an infinite loop here:

Image

@amanysh99
Author

Here is the train.py and model.py after switching to CPU: https://github.com/amanysh99/Carla

@Kait0
Collaborator

Kait0 commented Feb 27, 2025

Hm? The main function is not very explicit.

Maybe it's the multiprocessing strategy.
You could try changing the start method to
mp.set_start_method('spawn')
or
mp.set_start_method('forkserver')

fork is the default on Linux but can sometimes deadlock. You can try the other two and see if it helps.
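
A minimal sketch of switching the start method, assuming the standard-library `multiprocessing` module (the `mp` alias in the thread may instead refer to `torch.multiprocessing`, which exposes the same call). `pick_start_method` is a hypothetical helper, not part of the repository. It checks what the platform supports first, because Windows only offers `spawn` and requesting an unsupported method raises `ValueError`.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method that this platform supports.

    Hypothetical helper for illustration; falls back to the first
    method the platform reports if none of the preferred ones exist.
    """
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    return available[0]


if __name__ == "__main__":
    # Set the start method once, before any DataLoader workers are created.
    method = pick_start_method()
    mp.set_start_method(method, force=True)
    print("using start method:", mp.get_start_method())
```

The call belongs under the `if __name__ == "__main__":` guard, since `spawn` and `forkserver` re-import the main module in each worker.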

@amanysh99
Author

amanysh99 commented Feb 28, 2025

Unfortunately, I tried both options, but all in vain :(

@Kait0
Collaborator

Kait0 commented Feb 28, 2025

dataloader_train = DataLoader(train_set, sampler=sampler_train, batch_size=args.batch_size, worker_init_fn=seed_worker, generator=g_cuda, num_workers=8, pin_memory=True)

I would set num_workers in the DataLoaders to 0.
That turns off parallelism and makes debugging easier.
Then try debugging again; you should get a better indication of where the code enters the infinite loop.
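
With parallelism off, one further way to narrow things down is to bypass the DataLoader and index the dataset directly in the main process. The sketch below is an assumption about how one might do that: `time_first_samples` is a hypothetical helper that works with any object supporting `len()` and integer indexing, such as the `train_set` passed to the DataLoader above.

```python
import time


def time_first_samples(dataset, n=5):
    """Index the first n samples one at a time and report how long each took.

    Hypothetical debugging helper: if one index never returns, the hang
    is inside the dataset's __getitem__ rather than in the DataLoader.
    """
    timings = []
    for i in range(min(n, len(dataset))):
        start = time.perf_counter()
        _ = dataset[i]  # runs dataset.__getitem__(i) in the main process
        elapsed = time.perf_counter() - start
        print(f"sample {i}: {elapsed:.3f}s", flush=True)
        timings.append(elapsed)
    return timings
```

If the program hangs before this function can even be called, the problem is more likely in the dataset's constructor (for example, while it scans the data folders) than in sample loading.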

@amanysh99
Author

It enters an infinite loop again without displaying any obvious error message!

@Kait0
Collaborator

Kait0 commented Feb 28, 2025

And you can't debug?
If debugging doesn't work, you can go the traditional way and place print() statements everywhere to see where the code hangs.
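
As a complement to print statements, Python's standard `faulthandler` module can dump the current stack trace of every thread if the program is still running after a timeout. The sketch below is only an illustration; `hang_watchdog` is a hypothetical name, and the 60-second default is an arbitrary assumption.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(timeout_s=60.0, stream=None):
    """Dump all thread tracebacks every timeout_s seconds while the block runs.

    Hypothetical helper. `stream` must be a real file with a file
    descriptor; it defaults to sys.stderr.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)
    try:
        yield
    finally:
        # Stop the watchdog once the guarded block finishes normally.
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    # Wrap the suspect call (for example, dataset construction) like this.
    with hang_watchdog(timeout_s=30.0):
        total = sum(range(1000))
    print(total)
```

The traceback printed after the timeout shows the file and line that is executing at that moment, which is usually enough to locate a loop that never terminates.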

@Kait0
Collaborator

Kait0 commented Mar 1, 2025

Aren't you mixing Windows-style paths with Linux-style paths?

Also, I would avoid putting spaces in paths; they should generally work, but sometimes they don't.

r'D:\Lab 6\transfuser\team_code_transfuser\data\rr_dataset_23_11'
should work; Python can handle Windows paths.
Does the folder actually exist if you paste this path into Explorer?
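
The same existence check can be done from Python before training starts. This is a minimal sketch using the standard `pathlib` module; `describe_dataset_root` is a hypothetical helper, and the path is the one quoted above.

```python
from pathlib import Path


def describe_dataset_root(root):
    """Report whether root is an existing directory and list its subfolders.

    Hypothetical helper for illustration only.
    """
    root = Path(root)
    if not root.is_dir():
        return {"exists": False, "subfolders": []}
    subfolders = sorted(p.name for p in root.iterdir() if p.is_dir())
    return {"exists": True, "subfolders": subfolders}


if __name__ == "__main__":
    # The raw string keeps the backslashes from being read as escapes.
    info = describe_dataset_root(
        r'D:\Lab 6\transfuser\team_code_transfuser\data\rr_dataset_23_11'
    )
    print(info)
```

An empty `subfolders` list for a folder that exists would suggest the loader is pointed one level too high or too low in the directory tree.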

@amanysh99
Author

Yes, that was exactly my mistake: I mixed Windows- and Linux-style paths. I fixed that and confirmed that the folder exists, but it still enters an infinite loop :(

@amanysh99
Author

Now I am trying to debug the code on only one data folder, "lr_dataset_23_11", with just one scenario inside it, "Routes_clipped_Town03_lr_Seed0", and one route, Route_0000. The path is therefore "D:\Lab6\transfuser\team_code_transfuser\data\lr_dataset_23_11\Routes_clipped_Town03_lr_Seed0\Route_0000", so the code can now loop over the "rgb, lidar, semantic, depth, ..." folders. I have made sure that the code reads the path correctly, but it still enters an infinite loop, and I really can't figure out why.
DeepSeek suggests that there is a conflict in the data structure!

Image
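
One way to test the "conflict in the data structure" idea is to count the files in each sensor folder of a single route and look for a folder that disagrees with the rest. The sketch below is hypothetical: both helper names are invented, and whether the loader actually requires equal counts across the rgb, lidar, semantic, and depth folders is an assumption, not something confirmed in this thread.

```python
from pathlib import Path


def count_files_per_sensor(route_dir):
    """Map each immediate subfolder of route_dir to the number of files in it."""
    route_dir = Path(route_dir)
    counts = {}
    for sub in sorted(route_dir.iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(1 for f in sub.iterdir() if f.is_file())
    return counts


def mismatched_folders(counts):
    """Return the folders whose file count differs from the most common count."""
    if not counts:
        return {}
    values = list(counts.values())
    expected = max(set(values), key=values.count)
    return {name: n for name, n in counts.items() if n != expected}
```

Calling `mismatched_folders(count_files_per_sensor(route_path))` on the Route_0000 folder would return an empty dictionary when every sensor folder holds the same number of files.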
