
Custom Image Preprocessing #126

Open
soobin508 opened this issue Jan 30, 2024 · 36 comments

@soobin508

Hi, I need help implementing my own image preprocessing techniques before training and inference. For training, I only found the get_train_aug function, which consists of Albumentations transforms and is passed directly to create_train_dataset. However, my techniques will be in cv2 format.

Please give me advice on where I can modify the code. Thank you!

My sample techniques will be like this:
import os
import cv2

img = cv2.imread(os.path.join(img_dir, image))  # read image in BGR order
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)      # convert BGR to LAB color space
l_channel, a, b = cv2.split(lab)                # split lightness and color channels

@sovit-123
Owner

Hi. The best place to make these changes would be in the datasets.py file. Also, none of the augmentations will be applied by default unless you pass the --use-train-aug argument to the train.py file.
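To keep cv2-based steps separate from the Albumentations pipeline, one option is to apply a custom callable right after the image is read, before any augmentation runs. A minimal, library-free sketch of that pattern (the class and all names here are illustrative, not the actual datasets.py code):

```python
class DetectionDataset:
    """Illustrative stand-in for a dataset class: applies a custom
    preprocessing callable before the (optional) augmentation pipeline."""

    def __init__(self, samples, preprocess=None, augment=None):
        self.samples = samples          # raw images in any representation
        self.preprocess = preprocess    # e.g. a cv2-based function
        self.augment = augment          # e.g. an Albumentations pipeline

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        image = self.samples[idx]
        if self.preprocess is not None:
            image = self.preprocess(image)   # custom cv2 steps go here
        if self.augment is not None:
            image = self.augment(image)      # existing augmentations run after
        return image


# Usage: any callable that takes and returns an image works as preprocess.
dataset = DetectionDataset([[1, 2], [3, 4]],
                           preprocess=lambda img: [x * 2 for x in img])
```

The same idea carries over to the real datasets.py: the preprocessing hook runs on the raw cv2 image inside `__getitem__`, so the rest of the pipeline is untouched.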

@soobin508
Author

Okay, thank you! I have another question. If I want to achieve high accuracy and FPS for traffic sign recognition, which pretrained backbone is best? From my observation, ResNet50 achieves high accuracy, but the FPS is not as expected.

@sovit-123
Owner

You can try the Faster RCNN MobileNet V3 model.

@soobin508
Author

Okay, thanks! By any chance, do you know what's wrong with this? I checked online, and it was mentioned that it's something related to the gradient/learning rate and optimizer.

Epoch: [6] [2700/7446] eta: 0:08:01 lr: 0.001000 loss: 0.2247 (0.2497) loss_classifier: 0.0683 (0.0744) loss_box_reg: 0.1534 (0.1739) loss_objectness: 0.0001 (0.0002) loss_rpn_box_reg: 0.0011 (0.0013) time: 0.0961 data: 0.0086 max mem: 1620
Epoch: [6] [2800/7446] eta: 0:07:50 lr: 0.001000 loss: 0.2471 (0.2496) loss_classifier: 0.0691 (0.0745) loss_box_reg: 0.1500 (0.1736) loss_objectness: 0.0001 (0.0002) loss_rpn_box_reg: 0.0012 (0.0013) time: 0.0953 data: 0.0031 max mem: 1620
Loss is nan, stopping training
{'loss_classifier': tensor(nan, device='cuda:0', grad_fn=), 'loss_box_reg': tensor(nan, device='cuda:0', grad_fn=), 'loss_objectness': tensor(nan, device='cuda:0', grad_fn=), 'loss_rpn_box_reg': tensor(nan, device='cuda:0', grad_fn=)}

@sovit-123
Owner

Try lowering the initial learning rate to 0.00001
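Besides lowering the learning rate, clipping the gradient norm is a common guard against exploding gradients and NaN losses. A library-free sketch of what `torch.nn.utils.clip_grad_norm_` does (the function and names here are illustrative):

```python
import math

def clip_grad_norm(grads, max_norm):
    """Scale a flat list of gradient values so their global L2 norm
    does not exceed max_norm; returns (clipped grads, original norm)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-6)   # small eps avoids div-by-zero
        grads = [g * scale for g in grads]
    return grads, total_norm
```

In PyTorch itself the equivalent call is `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)`, placed between `loss.backward()` and `optimizer.step()` in the training loop; whether that helps here would need testing.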

@soobin508
Author

I tried with the new learning rate, and it shows the same error.

Epoch: [6] [2800/7446] eta: 0:07:49 lr: 0.000010 loss: 0.3495 (0.4442) loss_classifier: 0.1362 (0.2124) loss_box_reg: 0.2011 (0.2273) loss_objectness: 0.0000 (0.0003) loss_rpn_box_reg: 0.0039 (0.0043) time: 0.0961 data: 0.0044 max mem: 1622
Loss is nan, stopping training
{'loss_classifier': tensor(nan, device='cuda:0', grad_fn=), 'loss_box_reg': tensor(nan, device='cuda:0', grad_fn=), 'loss_objectness': tensor(nan, device='cuda:0', grad_fn=), 'loss_rpn_box_reg': tensor(nan, device='cuda:0', grad_fn=)}

@sovit-123
Owner

Hi, this generally does not happen. This is one of the first times I have seen this issue. Were you able to solve the issue?

@soobin508
Author

Unfortunately no, but this error only happens with MobileNet. I tried other backbones with the same parameters, and they all work fine.

@sovit-123
Owner

Ok. May I know your PyTorch version? I will try to debug deeper this weekend. Also, if possible, is there a link to your dataset that I can use for debugging?

@soobin508
Author

My PyTorch version is 2.1.0 and torchvision is 0.16.0. The dataset used is GTSRB. I created the dataset using the XML creation script you used in your previous example.

@soobin508
Author

Hi, may I ask what kind of hardware you used to run the inference? I have tried an NVIDIA Jetson Xavier to run the ONNX inference code, and the FPS only reaches a maximum of 7. The backbone I used is Mini DarkNet with a nano head. Also, I used cv2.VideoCapture() for live detection; is that a factor affecting the FPS?

@sovit-123
Owner

I compute the FPS on the forward pass only. So, the cv2 functions won't affect it. Faster RCNN, although accurate, struggles a bit to give high FPS on edge devices. I am trying to optimize the pipeline even more.
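For reference, timing only the forward pass looks roughly like this (the helper and the dummy workload are illustrative; warm-up iterations are discarded so device spin-up does not skew the number):

```python
import time

def measure_fps(forward_fn, num_iters=50, warmup=5):
    """Return the average FPS of forward_fn, excluding warm-up iterations."""
    for _ in range(warmup):
        forward_fn()                     # warm-up runs: not timed
    start = time.perf_counter()
    for _ in range(num_iters):
        forward_fn()                     # only the forward pass is timed
    elapsed = time.perf_counter() - start
    return num_iters / elapsed

# Usage with a dummy "model" standing in for the real forward pass:
fps = measure_fps(lambda: sum(range(1000)))
```

Capture and display time with cv2 would sit outside the timed region, which is why cv2.VideoCapture() does not affect the reported FPS.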

@soobin508
Author

Alright, thank you! Can I ask how to improve the training accuracy? Currently I tested on GTSDB with MobileNet, but the accuracy is only 23% after 250 epochs...

@sovit-123
Owner

I think you can use the Faster RCNN ResNet50 FPN V2 model. It works very well for small objects.

@soobin508
Author

Hi, I tried with ResNet50 FPN V2; the best mAP only reaches 51% @0.5:0.95 and 65% @0.5. May I know what else I can do to increase the accuracy? So far I have tried applying the image preprocessing techniques, but the improvement is not significant.

@sovit-123
Owner

May I know the dataset?

@soobin508
Author

GTSDB

@sovit-123
Owner

Have you tried with the default code in the repository without additional changes to the image processing techniques? If so, can you please let me know the result?

Also, for GTSDB, training with higher resolution images will help a lot.

@soobin508
Author

Yes, I tried with the default code without any additional changes; the best result is 51% @0.5:0.95 and 65% @0.5. My aim is to reach 90%+ accuracy. Is that possible?

Does that mean I have to modify the resolution of the GTSDB images using online tools? I just downloaded the dataset from the website as-is.

@sovit-123
Owner

You can use the --imgsz command line argument to control the image size.

@soobin508
Author

Besides --imgsz, are there other factors affecting the accuracy?

@sovit-123
Owner

There are a lot, actually. I recommend going through all the command line arguments in the train.py file, especially the ones that handle augmentations:
--mosaic
--use-train-aug

@soobin508
Author

Hi Sovit. I would like to implement F1-score and recall as the metrics. May I ask where I can make the modification?

@sovit-123
Owner

I think creating a separate metrics.py inside the utils directory will be the best approach.

@soobin508
Author

Does that mean metrics.py has to replace the evaluate function in train.py?

@sovit-123
Owner

Oh, if you wish to add that to the evaluation part after training only, then you can directly modify the eval.py script in the root directory.

@soobin508
Author

I tried to add torchmetrics Precision inside eval.py. However, there are many errors associated with it.

  1. I added prec = Precision(preds, target). It said AttributeError: 'list' object has no attribute 'replace'.
  2. I made a list for "labels" and then converted it back to a Tensor, but it said AttributeError: 'Tensor' object has no attribute 'replace'.

May I know how to solve this issue?

@sovit-123
Owner

It is basically saying that the list and tensor data types do not have a replace attribute. String data types have replace, but that may not be an ideal data type for solving this.
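One likely cause (an interpretation, not confirmed here): torchmetrics metrics are constructed first and then called on the data, e.g. `metric = Precision(task="multiclass", num_classes=N)` followed by `metric(preds, target)`; passing the tensors into the constructor sends them where the task string is expected. Independent of any library, recall and F1 can also be computed directly from detection match counts. A minimal, library-free sketch (the counts would come from whatever IoU-matching step you use):

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from detection match counts,
    returning 0.0 for any ratio whose denominator is zero."""
    detected = true_positives + false_positives     # everything the model predicted
    actual = true_positives + false_negatives       # everything in the ground truth
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

A helper like this could live in the suggested utils/metrics.py and be called from eval.py once per class or over all classes.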

@soobin508
Author

What would be the best way to include the metrics? I'm using the same target and preds you have been using.

@sovit-123
Owner

I need to take a look at how to structure the additional changes.

@soobin508
Author

Hi Sovit, is it possible to train and evaluate a model with multiple datasets? I would like to test their consistency.

@sovit-123
Owner

Hi. Can you please explain what you mean by multiple datasets?

@soobin508
Author

For example, GTSRB and CTSRD, so that the model can detect more types of traffic signs.

@sovit-123
Owner

So, if you can keep the images and labels in the same directories and adjust the YAML file to have the correct classes, then it is certainly possible.
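As a hypothetical sketch of what the merged config might look like (every key, path, and class name below is illustrative; match them to the repository's actual YAML format):

```yaml
# combined_signs.yaml -- hypothetical merged-dataset config
TRAIN_DIR_IMAGES: data/combined/train/images   # GTSRB + CTSRD images copied together
TRAIN_DIR_LABELS: data/combined/train/labels
VALID_DIR_IMAGES: data/combined/valid/images
VALID_DIR_LABELS: data/combined/valid/labels
# Union of both datasets' classes, with the background class first.
CLASSES: ['__background__', 'speed_limit', 'stop', 'yield', 'no_entry']
NC: 5
```

The key constraint is that class indices stay consistent across the merged label files, so labels from both datasets must be remapped onto the combined class list before training.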

@Najamulhassan3383

Najamulhassan3383 commented Mar 27, 2024

Hello! @sovit-123
I am new to deep learning, so sorry if my questions are too basic, but they are important to me. I want to ask how to handle data imbalance for object detection. Second, can I pass class weights to the loss function, or use focal loss, for a 4-class object detection problem?

@sovit-123
Owner

@Najamulhassan3383
Right now, the code base does not support applying weighted loss functions. For object detection, if your dataset is imbalanced, it is highly recommended that you gather more samples for the classes that are under-represented.
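For reference, the focal loss idea (Lin et al.) down-weights easy, well-classified examples so rare classes contribute more. A minimal scalar sketch, not wired into this repository (torchvision also ships `torchvision.ops.sigmoid_focal_loss` if you decide to modify the loss yourself):

```python
import math

def binary_focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Focal loss for a single binary prediction p in (0, 1).
    With gamma=0 and alpha=1 it reduces to plain cross-entropy."""
    p_t = p if target == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

The `(1 - p_t) ** gamma` factor is what shrinks the loss for confident correct predictions, leaving hard examples to dominate the gradient.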
