Cannot reproduce the results with pretrained demf model that is provided #8

guthasaibharathchandra · 2022-12-31T13:10:38Z

Hi, I used the pre-trained model you provided and the following script to evaluate on SUNRGBD, and I get the following result, which differs from the mAP@0.25 and mAP@0.5 reported in the paper.
I'm using DeMF with the VoteNet backbone, and the command I ran is as follows (single-GPU test):

python eval.py configs/demf/demf_votenet.py pretrained_models/demf-epoch_36.pth --eval mAP

I downloaded the provided demf-epoch_36.pth. Could you please tell me if I'm missing something? The state_dict of demf-epoch_36.pth contains the weights for the entire model, right? That is, for both img_backbone and pts_backbone.
+-------------+---------+---------+---------+---------+
| classes | AP_0.25 | AR_0.25 | AP_0.50 | AR_0.50 |
+-------------+---------+---------+---------+---------+
| bed | 0.8614 | 0.9437 | 0.3699 | 0.5786 |
| table | 0.4739 | 0.8267 | 0.1377 | 0.3454 |
| sofa | 0.6361 | 0.8756 | 0.0765 | 0.3190 |
| chair | 0.8093 | 0.8994 | 0.6202 | 0.7177 |
| toilet | 0.9132 | 0.9862 | 0.3488 | 0.5517 |
| desk | 0.2490 | 0.7630 | 0.0271 | 0.2301 |
| dresser | 0.4240 | 0.8119 | 0.0467 | 0.2339 |
| night_stand | 0.6688 | 0.8902 | 0.4241 | 0.6392 |
| bookshelf | 0.1172 | 0.4539 | 0.0065 | 0.0603 |
| bathtub | 0.7807 | 0.8980 | 0.1729 | 0.4286 |
+-------------+---------+---------+---------+---------+
| Overall | 0.5934 | 0.8349 | 0.2230 | 0.4104 |
+-------------+---------+---------+---------+---------+
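One way to check the state_dict question above is to count the checkpoint entries per top-level module name. The sketch below works on any mapping of parameter names; the example keys are hypothetical and are not taken from demf-epoch_36.pth.

```python
from collections import Counter


def count_params_by_prefix(state_dict):
    """Count entries per top-level module name (the part before the first dot).

    `state_dict` is any mapping from parameter names to values, for example
    the 'state_dict' entry of a loaded checkpoint.
    """
    return Counter(key.split(".", 1)[0] for key in state_dict)


# Hypothetical parameter names, for illustration only; the real names in
# demf-epoch_36.pth may differ.
example = {
    "img_backbone.conv1.weight": None,
    "img_backbone.bn1.weight": None,
    "pts_backbone.SA_modules.0.weight": None,
    "bbox_head.conv_pred.weight": None,
}

counts = count_params_by_prefix(example)
for prefix, n in sorted(counts.items()):
    print(prefix, n)
```

In practice the mapping would come from something like `torch.load("pretrained_models/demf-epoch_36.pth", map_location="cpu")["state_dict"]`, assuming the checkpoint stores its weights under a `state_dict` key. If `img_backbone` or `pts_backbone` is missing from the counts, that branch was not saved in the file.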

chenshi3 · 2023-01-21T05:20:35Z

I trained the fcaf-based model, and it achieves the reported results. Maybe you can try the fcaf-based model.

guthasaibharathchandra · 2023-02-03T13:57:55Z

Hi, later versions of mmdet3d (i.e. > 1.0.0) keep all points in the SUNRGBD point clouds during processing, while older mmdet3d versions sample only 50000 points (see the NOTE in the README at https://github.com/open-mmlab/mmdetection3d/tree/master/data/sunrgbd). The results above came from data generated with the new version, which keeps all the points. When I evaluated on SUNRGBD data generated with the old mmdet3d version, I was able to reproduce the results you mentioned. I noticed, however, that your train_pipeline samples only 20000 points, so ideally there shouldn't be much difference, but that doesn't seem to be the case. It may be worth investigating, so I'm sharing this here!
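To make the sampling point concrete, here is a minimal sketch of fixed-size point sampling in plain Python. It is not mmdet3d's IndoorPointSample implementation; `sample_points` and the toy cloud are hypothetical, and the sizes are scaled down (1000 points for the full cloud, 500 for the pre-sampled one, 200 for the pipeline sample).

```python
import random


def sample_points(points, num_points, seed=None):
    """Return exactly `num_points` points drawn from a non-empty `points` list.

    Samples without replacement when enough points exist; otherwise keeps
    every point and pads with random repeats.
    """
    rng = random.Random(seed)
    if len(points) >= num_points:
        return rng.sample(points, num_points)
    return list(points) + rng.choices(points, k=num_points - len(points))


# Toy cloud standing in for a full SUNRGBD point cloud.
cloud = [(float(i), 0.0, 0.0) for i in range(1000)]

# Route A: sample the pipeline-sized subset directly from the full cloud.
direct = sample_points(cloud, 200, seed=0)

# Route B: pre-sample when generating the dataset, then sample again.
pre_sampled = sample_points(cloud, 500, seed=1)
two_stage = sample_points(pre_sampled, 200, seed=0)

print(len(direct), len(two_stage))
print("shared points:", len(set(direct) & set(two_stage)))
```

Both routes end with the same number of points, but the model sees different subsets of the scene. Whether that alone explains the gap reported here is not something this sketch can show.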

chenshi3 · 2023-02-05T13:47:56Z

The axis convention in later versions of mmdet3d (i.e. > 1.0.0) is different from ours. Be careful, as this may cause the problem.

guthasaibharathchandra · 2023-02-05T14:41:40Z

I'm only using the SUNRGBD data generated with mmdet3d > 1.0.0, and I run the pretrained model on it with the mmdet3d version you specified. I think the SUNRGBD point clouds are in the depth coordinate system by default, irrespective of the mmdet3d version. Am I missing something?

chenshi3 · 2023-02-05T15:31:01Z

In our experiments, we generated the SUNRGBD dataset with 100000 points. I don't think the number of points is the main reason. As for the coordinate system, I checked the code and did not find any clues. I recommend using an mmdet3d version below 1.0.
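A small guard can make the version recommendation explicit before generating data or running evaluation. This is only a sketch: `is_pre_1_0` is a hypothetical helper that looks at the leading major number of a version string.

```python
import re


def major_version(version):
    """Return the leading integer of a version string such as '0.15.0'."""
    match = re.match(r"(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1))


def is_pre_1_0(version):
    """True when the version is below 1.0, as recommended in this thread."""
    return major_version(version) < 1


for candidate in ("0.15.0", "1.0.0rc4", "1.1.0"):
    print(candidate, is_pre_1_0(candidate))
```

The installed version could be read with `importlib.metadata.version("mmdet3d")`, assuming the package was installed under that distribution name.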

LIZECHUAN · 2023-11-06T08:25:46Z

> I trained the fcaf-based model, and it achieves the reported results. Maybe you can try the fcaf-based model.

Hi, I trained the fcaf-based model and got the following result, which differs from the mAP reported in the paper. Could you please tell me if I'm missing something?
+-------------+---------+---------+---------+---------+
| classes | AP_0.25 | AR_0.25 | AP_0.50 | AR_0.50 |
+-------------+---------+---------+---------+---------+
| bed | 0.8811 | 0.9767 | 0.6398 | 0.7359 |
| table | 0.4980 | 0.9059 | 0.2817 | 0.5988 |
| sofa | 0.7217 | 0.9490 | 0.5002 | 0.7161 |
| chair | 0.8150 | 0.9016 | 0.6704 | 0.7695 |
| toilet | 0.9287 | 0.9862 | 0.7106 | 0.8000 |
| desk | 0.3208 | 0.8379 | 0.0992 | 0.4299 |
| dresser | 0.4735 | 0.8991 | 0.2514 | 0.5963 |
| night_stand | 0.7013 | 0.9490 | 0.5532 | 0.7569 |
| bookshelf | 0.2982 | 0.7340 | 0.0587 | 0.2305 |
| bathtub | 0.8098 | 0.9592 | 0.4944 | 0.6939 |
+-------------+---------+---------+---------+---------+
| Overall | 0.6448 | 0.9099 | 0.4259 | 0.6328 |
+-------------+---------+---------+---------+---------+
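As a sanity check on the table above, the Overall row should be the unweighted mean of the per-class AP values. The sketch below recomputes it from the printed numbers; because the table entries are already rounded to four decimals, the last digit can differ slightly.

```python
# Per-class AP values copied from the table above.
ap_025 = {
    "bed": 0.8811, "table": 0.4980, "sofa": 0.7217, "chair": 0.8150,
    "toilet": 0.9287, "desk": 0.3208, "dresser": 0.4735,
    "night_stand": 0.7013, "bookshelf": 0.2982, "bathtub": 0.8098,
}
ap_050 = {
    "bed": 0.6398, "table": 0.2817, "sofa": 0.5002, "chair": 0.6704,
    "toilet": 0.7106, "desk": 0.0992, "dresser": 0.2514,
    "night_stand": 0.5532, "bookshelf": 0.0587, "bathtub": 0.4944,
}


def mean_ap(per_class):
    """Unweighted mean over classes."""
    return sum(per_class.values()) / len(per_class)


print(f"mAP@0.25 = {mean_ap(ap_025):.4f}")
print(f"mAP@0.50 = {mean_ap(ap_050):.4f}")
```

The recomputed means agree with the Overall row (0.6448 and 0.4259) to within rounding, so the table is internally consistent and the gap to the paper is not a reporting artifact.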

n_points = 100000
dataset_type = 'SUNRGBDDataset'
data_root = '/home/hy/ssd1/lzc/DeMF/sunrgbd/'
class_names = ('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
               'night_stand', 'bookshelf', 'bathtub')
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations3D'),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.0),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='IndoorPointSample', num_points=100000),
    dict(type='RandomFlip3D', sync_2d=False, flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.523599, 0.523599],
        scale_ratio_range=[0.85, 1.15],
        translation_std=[0.1, 0.1, 0.1],
        shift_height=False),
    dict(
        type='DefaultFormatBundle3D',
        class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
                     'dresser', 'night_stand', 'bookshelf', 'bathtub')),
    dict(
        type='Collect3D',
        keys=['points', 'gt_bboxes_3d', 'gt_labels_3d', 'img'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip', flip_ratio=0.0),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1.0, 1.0],
                translation_std=[0, 0, 0]),
            dict(
                type='RandomFlip3D',
                sync_2d=False,
                flip_ratio_bev_horizontal=0.5,
                flip_ratio_bev_vertical=0.5),
            dict(type='IndoorPointSample', num_points=100000),
            dict(
                type='DefaultFormatBundle3D',
                class_names=('bed', 'table', 'sofa', 'chair', 'toilet',
                             'desk', 'dresser', 'night_stand', 'bookshelf',
                             'bathtub'),
                with_label=False),
            dict(type='Collect3D', keys=['points', 'img'])
        ])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='SUNRGBDDataset',
            modality=dict(use_camera=True, use_lidar=True),
            data_root='/home/hy/ssd1/lzc/DeMF/sunrgbd/',
            ann_file='/home/hy/ssd1/lzc/DeMF/sunrgbd/sunrgbd_infos_train.pkl',
            pipeline=[
                dict(
                    type='LoadPointsFromFile',
                    coord_type='DEPTH',
                    shift_height=False,
                    load_dim=6,
                    use_dim=[0, 1, 2, 3, 4, 5]),
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations3D'),
                dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.0),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='LoadAnnotations', with_bbox=True),
                dict(type='IndoorPointSample', num_points=100000),
                dict(
                    type='RandomFlip3D',
                    sync_2d=False,
                    flip_ratio_bev_horizontal=0.5),
                dict(
                    type='GlobalRotScaleTrans',
                    rot_range=[-0.523599, 0.523599],
                    scale_ratio_range=[0.85, 1.15],
                    translation_std=[0.1, 0.1, 0.1],
                    shift_height=False),
                dict(
                    type='DefaultFormatBundle3D',
                    class_names=('bed', 'table', 'sofa', 'chair', 'toilet',
                                 'desk', 'dresser', 'night_stand',
                                 'bookshelf', 'bathtub')),
                dict(
                    type='Collect3D',
                    keys=['points', 'gt_bboxes_3d', 'gt_labels_3d', 'img'])
            ],
            filter_empty_gt=True,
            classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
                     'dresser', 'night_stand', 'bookshelf', 'bathtub'),
            box_type_3d='Depth')),
    val=dict(
        type='SUNRGBDDataset',
        modality=dict(use_camera=True, use_lidar=True),
        data_root='/home/hy/ssd1/lzc/DeMF/sunrgbd/',
        ann_file='/home/hy/ssd1/lzc/DeMF/sunrgbd/sunrgbd_infos_val.pkl',
        pipeline=[
            dict(
                type='LoadPointsFromFile',
                coord_type='DEPTH',
                shift_height=False,
                load_dim=6,
                use_dim=[0, 1, 2, 3, 4, 5]),
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug3D',
                img_scale=(1333, 800),
                pts_scale_ratio=1,
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip', flip_ratio=0.0),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(
                        type='GlobalRotScaleTrans',
                        rot_range=[0, 0],
                        scale_ratio_range=[1.0, 1.0],
                        translation_std=[0, 0, 0]),
                    dict(
                        type='RandomFlip3D',
                        sync_2d=False,
                        flip_ratio_bev_horizontal=0.5,
                        flip_ratio_bev_vertical=0.5),
                    dict(type='IndoorPointSample', num_points=100000),
                    dict(
                        type='DefaultFormatBundle3D',
                        class_names=('bed', 'table', 'sofa', 'chair',
                                     'toilet', 'desk', 'dresser',
                                     'night_stand', 'bookshelf', 'bathtub'),
                        with_label=False),
                    dict(type='Collect3D', keys=['points', 'img'])
                ])
        ],
        classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
                 'night_stand', 'bookshelf', 'bathtub'),
        test_mode=True,
        box_type_3d='Depth'),
    test=dict(
        type='SUNRGBDDataset',
        modality=dict(use_camera=True, use_lidar=True),
        data_root='/home/hy/ssd1/lzc/DeMF/sunrgbd/',
        ann_file='/home/hy/ssd1/lzc/DeMF/sunrgbd/sunrgbd_infos_val.pkl',
        pipeline=[
            dict(
                type='LoadPointsFromFile',
                coord_type='DEPTH',
                shift_height=False,
                load_dim=6,
                use_dim=[0, 1, 2, 3, 4, 5]),
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug3D',
                img_scale=(1333, 800),
                pts_scale_ratio=1,
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip', flip_ratio=0.0),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(
                        type='GlobalRotScaleTrans',
                        rot_range=[0, 0],
                        scale_ratio_range=[1.0, 1.0],
                        translation_std=[0, 0, 0]),
                    dict(
                        type='RandomFlip3D',
                        sync_2d=False,
                        flip_ratio_bev_horizontal=0.5,
                        flip_ratio_bev_vertical=0.5),
                    dict(type='IndoorPointSample', num_points=100000),
                    dict(
                        type='DefaultFormatBundle3D',
                        class_names=('bed', 'table', 'sofa', 'chair',
                                     'toilet', 'desk', 'dresser',
                                     'night_stand', 'bookshelf', 'bathtub'),
                        with_label=False),
                    dict(type='Collect3D', keys=['points', 'img'])
                ])
        ],
        classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
                 'night_stand', 'bookshelf', 'bathtub'),
        test_mode=True,
        box_type_3d='Depth'))
voxel_size = 0.01
model = dict(
    type='TwoStageSparse3DDetector',
    voxel_size=0.01,
    backbone=dict(type='MEResNet3D', in_channels=3, depth=34),
    neck_with_head=dict(
        type='Fcaf3DNeckWithHead_my',
        in_channels=(64, 128, 256, 512),
        out_channels=128,
        pts_threshold=100000,
        n_classes=10,
        n_reg_outs=8,
        voxel_size=0.01,
        assigner=dict(type='Fcaf3DAssigner', limit=27, topk=18, n_scales=4),
        loss_bbox=dict(type='IoU3DLoss', loss_weight=1.0)),
    train_cfg=dict(),
    test_cfg=dict(
        nms_pre=1000, iou_thr=0.5, score_thr=0.01, ensemble_stages=[2]),
    img_encoder=dict(
        type='DeformableDetrEncoder',
        encoder=dict(
            type='DetrTransformerEncoder',
            num_layers=6,
            transformerlayers=dict(
                type='BaseTransformerLayer',
                attn_cfgs=dict(
                    type='MultiScaleDeformableAttention', embed_dims=256),
                feedforward_channels=1024,
                ffn_dropout=0.1,
                operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
        positional_encoding=dict(
            type='SinePositionalEncoding',
            num_feats=128,
            normalize=True,
            offset=-0.5),
        num_feature_levels=4,
        embed_dims=256),
    img_backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='pytorch'),
    img_neck=dict(
        type='ChannelMapper',
        in_channels=[512, 1024, 2048],
        kernel_size=1,
        out_channels=256,
        act_cfg=None,
        norm_cfg=dict(type='GN', num_groups=32),
        num_outs=4),
    stage2_head=dict(
        type='CAHeadIter',
        decoder=dict(
            type='TransformerDecoderLayerWithPos',
            num_layers=1,
            transformerlayers=dict(
                type='DetrTransformerDecoderLayer',
                attn_cfgs=[
                    dict(
                        type='MultiheadAttention',
                        embed_dims=256,
                        num_heads=8,
                        dropout=0.1),
                    dict(type='MultiScaleDeformableAttention', embed_dims=256)
                ],
                feedforward_channels=1024,
                ffn_dropout=0.1,
                operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                                 'ffn', 'norm')),
            posembed=dict(input_channel=9, num_pos_feats=256))),
    freeze_img_branch=True)
find_unused_parameters = True
optimizer = dict(
    type='AdamW',
    lr=0.001,
    weight_decay=0.0001,
    paramwise_cfg=dict(
        custom_keys=dict(decoder=dict(lr_mult=0.05, decay_mult=1.0))))
optimizer_config = dict(grad_clip=dict(max_norm=10, norm_type=2))
lr_config = dict(policy='step', warmup=None, step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
custom_hooks = [dict(type='EmptyCacheHook', after_iter=True)]
checkpoint_config = dict(interval=1, max_keep_ckpts=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = '1105/raw/base'
load_from = '/home/hy/ssd1/lzc/DeMF/deform_detr-epoch_10.pth'
resume_from = None
workflow = [('train', 1)]
lr = 0.001
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
evaluation = dict(interval=1)
gpu_ids = range(0, 4)
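For readers comparing training setups, the schedule implied by `lr = 0.001`, `step=[8, 11]` and `max_epochs=12` in the config above can be worked out by hand. The sketch below assumes a decay factor of 0.1 per step (the config does not set one, so this is an assumption) and 0-based epoch counting; `step_lr` is a hypothetical helper, and the `lr_mult=0.05` applied to decoder parameters is ignored.

```python
def step_lr(base_lr, steps, epoch, gamma=0.1):
    """Learning rate at a 0-based `epoch` under a step decay policy.

    The rate is multiplied by `gamma` once for every milestone in `steps`
    that the epoch has reached. gamma=0.1 is an assumed value.
    """
    num_decays = sum(1 for milestone in steps if epoch >= milestone)
    return base_lr * gamma ** num_decays


base_lr = 0.001      # from the config above
steps = [8, 11]      # from lr_config
max_epochs = 12      # from runner

for epoch in range(max_epochs):
    print(epoch, f"{step_lr(base_lr, steps, epoch):.0e}")
```

Under these assumptions the base rate holds for epochs 0 to 7, drops by a factor of ten for epochs 8 to 10, and drops again for the final epoch.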
