DSVT-P training on KITTI Dataset #64

dinvincible98 · 2023-10-23T18:06:41Z

Hi,

I tried to train a DSVT-pillar model on the KITTI dataset. Below is my config:

CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']

DATA_CONFIG: 
    _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml
    POINT_CLOUD_RANGE: [0, -40, -3, 70.4, 40, 1]

    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: gt_sampling
              USE_ROAD_PLANE: True
              DB_INFO_PATH:
                  - kitti_dbinfos_train.pkl
              PREPARE: {
                 filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                 filter_by_difficulty: [-1],
              }

              SAMPLE_GROUPS: ['Car:15','Pedestrian:15', 'Cyclist:15']
              NUM_POINT_FEATURES: 4
              REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
              LIMIT_WHOLE_SCENE: True

            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x','y']

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]
            - NAME: random_world_translation
              NOISE_TRANSLATE_STD: [0.5, 0.5, 0.5]
    DATA_PROCESSOR:
      -   NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True
      -   NAME: shuffle_points
          SHUFFLE_ENABLED: {
            'train': True,
            'test': False
          }
      -   NAME: transform_points_to_voxels_placeholder
          VOXEL_SIZE: [ 0.1505, 0.1709, 4 ]

MODEL:
  NAME: CenterPoint

  VFE:
    NAME: DynPillarVFE3D
    WITH_DISTANCE: False
    USE_ABSLOTE_XYZ: True
    USE_NORM: True
    NUM_FILTERS: [192, 192]

  BACKBONE_3D:
    NAME: DSVT
    INPUT_LAYER:
      sparse_shape: [468, 468, 1]
      downsample_stride: []
      d_model: [192]
      set_info: [[36, 4]]
      window_shape: [[12, 12, 1]]
      hybrid_factor: [2, 2, 1] # x, y, z
      shifts_list: [[[0, 0, 0], [6, 6, 0]]]
      normalize_pos: False
    
    block_name: ['DSVTBlock']
    set_info: [[36, 4]]
    d_model: [192]
    nhead: [8]
    dim_feedforward: [384]
    
    
    dropout: 0.0 
    activation: gelu
    reduction_type: 'attention'
    output_shape: [468, 468]
    conv_out_channel: 192
    # ues_checkpoint: True

  MAP_TO_BEV:
    NAME: PointPillarScatter3d
    INPUT_SHAPE: [468, 468, 1]
    NUM_BEV_FEATURES: 192

  BACKBONE_2D:
    NAME: BaseBEVResBackbone
    LAYER_NUMS: [ 1, 2, 2 ]
    LAYER_STRIDES: [ 1, 2, 2 ]
    NUM_FILTERS: [ 128, 128, 256 ]
    UPSAMPLE_STRIDES: [ 1, 2, 4 ]
    NUM_UPSAMPLE_FILTERS: [ 128, 128, 128 ]

  DENSE_HEAD:
    NAME: CenterHead
    CLASS_AGNOSTIC: False

    CLASS_NAMES_EACH_HEAD: [
      ['Car', 'Pedestrian', 'Cyclist']
    ]

    SHARED_CONV_CHANNEL: 64
    USE_BIAS_BEFORE_NORM: False
    NUM_HM_CONV: 2

    BN_EPS: 0.001
    BN_MOM: 0.01
    SEPARATE_HEAD_CFG:
      HEAD_ORDER: ['center', 'center_z', 'dim', 'rot']
      HEAD_DICT: {
        'center': {'out_channels': 2, 'num_conv': 2},
        'center_z': {'out_channels': 1, 'num_conv': 2},
        'dim': {'out_channels': 3, 'num_conv': 2},
        'rot': {'out_channels': 2, 'num_conv': 2},
        'iou': {'out_channels': 1, 'num_conv': 2},
      }

    TARGET_ASSIGNER_CONFIG:
      FEATURE_MAP_STRIDE: 1
      NUM_MAX_OBJS: 500
      GAUSSIAN_OVERLAP: 0.1
      MIN_RADIUS: 2

    IOU_REG_LOSS: True

    LOSS_CONFIG:
      LOSS_WEIGHTS: {
        'cls_weight': 1.0,
        'loc_weight': 2.0,
        'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
      }

    POST_PROCESSING:
      SCORE_THRESH: 0.5
      POST_CENTER_LIMIT_RANGE: [-80, -80, -10.0, 80, 80, 10.0]
      MAX_OBJ_PER_SAMPLE: 500

      USE_IOU_TO_RECTIFY_SCORE: True
      IOU_RECTIFIER: [0.68, 0.71, 0.65]


      NMS_CONFIG:
        # NMS_TYPE: multi_class_nms  # only for centerhead， use mmdet3d version nms
        # NMS_THRESH: [0.7, 0.6, 0.55]
        # NMS_PRE_MAXSIZE: [4096, 4096, 4096]
        # NMS_POST_MAXSIZE: [500, 500, 500]
        
        NMS_TYPE: nms_gpu 
        NMS_THRESH: 0.1
        NMS_PRE_MAXSIZE: 4096
        NMS_POST_MAXSIZE: 500

  POST_PROCESSING:
    RECALL_THRESH_LIST: [0.3, 0.5, 0.7]

    EVAL_METRIC: kitti

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 1
    NUM_EPOCHS: 20

    OPTIMIZER: adam_onecycle
    LR: 0.001
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1
    
    GRAD_NORM_CLIP: 10
    LOSS_SCALE_FP16: 32.0

HOOK:
  DisableAugmentationHook:
    DISABLE_AUG_LIST: ['gt_sampling','random_world_flip','random_world_rotation','random_world_scaling', 'random_world_translation']
    NUM_LAST_EPOCHS: 1

I only modified the point cloud range to match the KITTI settings and the voxel size to match the default sparse shape [468, 468, 1], but I keep getting this error:

RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.

I traced the error to the DynamicPillarVFE3D module, where batch_dict['points'] is often an empty tensor. However, when I used the default point cloud range from the Waymo settings, [-74.88, -74.88, -2, 74.88, 74.88, 4.0], the error disappeared. Can you give me some guidance?

Thank you!
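
For readers checking their own settings: the sparse shape mentioned above appears to be the point cloud extent divided by the voxel size, rounded to the nearest integer. The sketch below checks that relation for the values in this post. The helper name `grid_shape` is hypothetical and not part of OpenPCDet or DSVT, and the rounding rule is an assumption.

```python
from typing import List, Sequence


def grid_shape(pc_range: Sequence[float], voxel_size: Sequence[float]) -> List[int]:
    """Return the number of voxels along x, y, z.

    pc_range is [x_min, y_min, z_min, x_max, y_max, z_max].
    Assumption: the grid size is the extent divided by the voxel size,
    rounded to the nearest integer.
    """
    mins, maxs = pc_range[:3], pc_range[3:]
    return [round((hi - lo) / size) for lo, hi, size in zip(mins, maxs, voxel_size)]


# Values taken from the config in this post.
shape = grid_shape([0, -40, -3, 70.4, 40, 1], [0.1505, 0.1709, 4])
print(shape)  # [468, 468, 1]
```

Under that assumption the posted range and voxel size do reproduce [468, 468, 1], which suggests the empty-tensor error has a different cause than a shape mismatch.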


xifen523 · 2023-10-31T16:05:13Z

Did you train successfully on the kitti? How did it turn out?

dinvincible98 · 2023-10-31T17:31:52Z

> Did you train successfully on the kitti? How did it turn out?

No, I always get a NaN or Inf error during training. I guess there are some hyperparameter issues.

xifen523 · 2023-11-01T14:46:16Z

> > Did you train successfully on the kitti? How did it turn out?
>
> No, I always get a NaN or Inf error during training. I guess there are some hyperparameter issues.

I will try to run this code on the KITTI in the future when I have some free time.

Haiyang-W · 2023-11-01T17:12:37Z

Very sorry for the late reply, I'm rushing to meet some deadlines. We haven't tried the KITTI dataset. You can check whether issue59 is helpful.

dinvincible98 · 2023-11-01T17:42:34Z

> > > Did you train successfully on the kitti? How did it turn out?
> >
> > No, I always get a NaN or Inf error during training. I guess there are some hyperparameter issues.
>
> I will try to run this code on the KITTI in the future when I have some free time.

There are some data augmentor issues. Here is the modified config; you can try it to see whether you get the NaN or Inf error:

    CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']
    
    DATA_CONFIG: 
        _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml
        POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
        DATA_PROCESSOR:
            - NAME: mask_points_and_boxes_outside_range
              REMOVE_OUTSIDE_BOXES: True
    
            - NAME: shuffle_points
              SHUFFLE_ENABLED: {
                'train': True,
                'test': False
              }
    
            - NAME: transform_points_to_voxels
              VOXEL_SIZE: [0.1477, 0.1696, 4]
              MAX_POINTS_PER_VOXEL: 32
              MAX_NUMBER_OF_VOXELS: {
                'train': 16000,
                'test': 40000
              }
        DATA_AUGMENTOR:
            DISABLE_AUG_LIST: ['placeholder']
            AUG_CONFIG_LIST:
                - NAME: gt_sampling
                  USE_ROAD_PLANE: True
                  DB_INFO_PATH:
                      - kitti_dbinfos_train.pkl
                  PREPARE: {
                     filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                     filter_by_difficulty: [-1],
                  }
    
                  SAMPLE_GROUPS: ['Car:15','Pedestrian:15', 'Cyclist:15']
                  NUM_POINT_FEATURES: 4
                  DATABASE_WITH_FAKELIDAR: False
                  REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
                  LIMIT_WHOLE_SCENE: False
    
                - NAME: random_world_flip
                  ALONG_AXIS_LIST: ['x']
    
                - NAME: random_world_rotation
                  WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
    
                - NAME: random_world_scaling
                  WORLD_SCALE_RANGE: [0.95, 1.05]
    MODEL:
      NAME: CenterPoint
    
      VFE:
        NAME: DynPillarVFE3D
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [ 192, 192 ]
    
      BACKBONE_3D:
        NAME: DSVT
        INPUT_LAYER:
          sparse_shape: [468, 468, 1]
          downsample_stride: []
          d_model: [192]
          set_info: [[36, 4]]
          window_shape: [[12, 12, 1]]
          hybrid_factor: [2, 2, 1] # x, y, z
          shifts_list: [[[0, 0, 0], [6, 6, 0]]]
          normalize_pos: False
    
        block_name: ['DSVTBlock']
        set_info: [[36, 4]]
        d_model: [192]
        nhead: [8]
        dim_feedforward: [384]
        dropout: 0.0
        activation: gelu
        output_shape: [468, 468]
        conv_out_channel: 192
        # ues_checkpoint: True
    
      MAP_TO_BEV:
        NAME: PointPillarScatter3d
        INPUT_SHAPE: [468, 468, 1]
        NUM_BEV_FEATURES: 192
    
      BACKBONE_2D:
        NAME: BaseBEVResBackbone
        LAYER_NUMS: [ 1, 2, 2 ]
        LAYER_STRIDES: [ 1, 2, 2 ]
        NUM_FILTERS: [ 128, 128, 256 ]
        UPSAMPLE_STRIDES: [ 1, 2, 4 ]
        NUM_UPSAMPLE_FILTERS: [ 128, 128, 128 ]
    
      DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False
    
        CLASS_NAMES_EACH_HEAD: [
          ['Car', 'Pedestrian', 'Cyclist']
        ]
    
        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: True
        NUM_HM_CONV: 2
    
        BN_EPS: 0.001
        BN_MOM: 0.01
        SEPARATE_HEAD_CFG:
          HEAD_ORDER: ['center', 'center_z', 'dim', 'rot']
          HEAD_DICT: {
            'center': {'out_channels': 2, 'num_conv': 2},
            'center_z': {'out_channels': 1, 'num_conv': 2},
            'dim': {'out_channels': 3, 'num_conv': 2},
            'rot': {'out_channels': 2, 'num_conv': 2},
            'iou': {'out_channels': 1, 'num_conv': 2},
          }
    
        TARGET_ASSIGNER_CONFIG:
          FEATURE_MAP_STRIDE: 1
          NUM_MAX_OBJS: 500
          GAUSSIAN_OVERLAP: 0.1
          MIN_RADIUS: 2
    
        IOU_REG_LOSS: True
    
        LOSS_CONFIG:
          LOSS_WEIGHTS: {
            'cls_weight': 1.0,
            'loc_weight': 2.0,
            'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
          }
    
    
    
        POST_PROCESSING:
            RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
            SCORE_THRESH: 0.1
            OUTPUT_RAW_SCORE: False
            POST_CENTER_LIMIT_RANGE: [0, -40, -3, 75, 40, 1]
            MAX_OBJ_PER_SAMPLE: 500
    
            EVAL_METRIC: kitti
    
            NMS_CONFIG:
                MULTI_CLASSES_NMS: False
                NMS_TYPE: nms_gpu
                NMS_THRESH: 0.01
                NMS_PRE_MAXSIZE: 4096
                NMS_POST_MAXSIZE: 500
    
    
    OPTIMIZATION:
        BATCH_SIZE_PER_GPU: 1
        NUM_EPOCHS: 40
    
        OPTIMIZER: adam_onecycle
        LR: 0.001
        WEIGHT_DECAY: 0.01
        MOMENTUM: 0.9
    
        MOMS: [0.95, 0.85]
        PCT_START: 0.4
        DIV_FACTOR: 10
        DECAY_STEP_LIST: [35, 45]
        LR_DECAY: 0.1
        LR_CLIP: 0.0000001
    
        LR_WARMUP: False
        WARMUP_EPOCH: 1
    
        GRAD_NORM_CLIP: 10
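
One possible reading of the augmentor issue (an assumption; the thread does not confirm it): this config reduces `ALONG_AXIS_LIST` from `['x','y']` to `['x']`. If a flip along the y axis negates the x coordinate, as I understand the OpenPCDet convention, every point moves to x <= 0. The KITTI range starts at x = 0, so `mask_points_and_boxes_outside_range` would then discard the whole frame, while the symmetric Waymo range would keep it. The helpers below are hypothetical, pure-Python stand-ins that only illustrate the geometry:

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def flip_along_y(points: Sequence[Point]) -> List[Point]:
    """Hypothetical stand-in: assumes a flip along the y axis negates x."""
    return [(-x, y, z) for x, y, z in points]


def mask_points_outside_range(points: Sequence[Point],
                              pc_range: Sequence[float]) -> List[Point]:
    """Keep only points inside [x_min, y_min, z_min, x_max, y_max, z_max]."""
    x_min, y_min, z_min, x_max, y_max, z_max = pc_range
    return [
        (x, y, z) for x, y, z in points
        if x_min <= x <= x_max and y_min <= y <= y_max and z_min <= z <= z_max
    ]


KITTI_RANGE = [0, -40, -3, 70.4, 40, 1]
WAYMO_RANGE = [-74.88, -74.88, -2, 74.88, 74.88, 4.0]

# Three made-up points in front of the sensor (x > 0), for illustration only.
points = [(5.0, 1.0, -1.0), (20.0, -3.0, -0.5), (60.0, 10.0, 0.2)]
flipped = flip_along_y(points)

kept_kitti = mask_points_outside_range(flipped, KITTI_RANGE)
kept_waymo = mask_points_outside_range(flipped, WAYMO_RANGE)
print(len(kept_kitti), len(kept_waymo))  # 0 3
```

If this is what happens, an empty `batch_dict['points']` would reach the VFE whenever that flip is sampled, which would match the `max()` error on an empty tensor reported at the top of the thread.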

dinvincible98 · 2023-11-01T17:46:56Z

> Very sorry for the late reply, I'm rushing to meet some deadlines. We haven't tried the KITTI dataset. You can check whether issue59 is helpful.

Yes, I checked that issue and recalculated the voxel size. The sparse shape now matches the pillar settings, but training throws a NaN or Inf error after several epochs.
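
The recalculated voxel size in the modified config seems to follow from dividing the range extent by the target sparse shape. A minimal sketch of that calculation, assuming this is how the values were obtained (`voxel_size_for_shape` is a hypothetical helper, and rounding to four decimals is a choice made to match the posted numbers):

```python
from typing import List, Sequence


def voxel_size_for_shape(pc_range: Sequence[float],
                         sparse_shape: Sequence[int],
                         ndigits: int = 4) -> List[float]:
    """Voxel edge lengths that split pc_range into sparse_shape cells.

    pc_range is [x_min, y_min, z_min, x_max, y_max, z_max].
    """
    mins, maxs = pc_range[:3], pc_range[3:]
    return [round((hi - lo) / n, ndigits)
            for lo, hi, n in zip(mins, maxs, sparse_shape)]


# Range and sparse shape from the modified config in this thread.
size = voxel_size_for_shape([0, -39.68, -3, 69.12, 39.68, 1], [468, 468, 1])
print(size)  # [0.1477, 0.1696, 4.0]
```

Because the sizes are rounded, 468 voxels cover slightly more than the stated range; whether that matters depends on how the framework rounds the grid size.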

evil-master · 2023-11-02T04:11:51Z

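> > Very sorry for the late reply, I'm rushing to meet some deadlines. We haven't tried the KITTI dataset. You can check whether issue59 is helpful.
>
> Yes, I checked that issue and recalculated the voxel size. The sparse shape now matches the pillar settings, but training throws a NaN or Inf error after several epochs.

May I ask whether you adapted to the KITTI dataset just by modifying the config file, without modifying the network? Also, why doesn't your BACKBONE_3D need downsampling?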

Haiyang-W · 2023-11-02T05:08:32Z

If anyone has succeeded on KITTI by modifying the config, please share the corresponding config and experimental results in this issue. We would be very grateful for your contribution to the community. :)

After the CVPR deadline, I'll take a look when I have time. I don't expect this to be a very difficult problem.

Haiyang-W · 2023-11-02T05:11:02Z

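> > > Very sorry for the late reply, I'm rushing to meet some deadlines. We haven't tried the KITTI dataset. You can check whether issue59 is helpful.
> >
> > Yes, I checked that issue and recalculated the voxel size. The sparse shape now matches the pillar settings, but training throws a NaN or Inf error after several epochs.
>
> May I ask whether you adapted to the KITTI dataset just by modifying the config file, without modifying the network? Also, why doesn't your BACKBONE_3D need downsampling?

I guess he is using the DSVT-pillar version.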

123susu · 2023-11-07T07:59:14Z

> > Very sorry for the late reply, I'm rushing to meet some deadlines. We haven't tried the KITTI dataset. You can check whether issue59 is helpful.
>
> Yes, I checked that issue and recalculated the voxel size. The sparse shape now matches the pillar settings, but training throws a NaN or Inf error after several epochs.

Did you complete the KITTI config? I really need this, thanks!

Haiyang-W · 2023-12-08T13:01:32Z

Any update?

dinvincible98 · 2023-12-08T19:19:43Z

I have a working config for training on the KITTI dataset:

    CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']
    DATA_CONFIG:
        _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml
        POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]

        DATA_PROCESSOR:
            - NAME: mask_points_and_boxes_outside_range
              REMOVE_OUTSIDE_BOXES: True

            - NAME: shuffle_points
              SHUFFLE_ENABLED: {
                'train': True,
                'test': False
              }

            - NAME: transform_points_to_voxels_placeholder
              VOXEL_SIZE: [0.1477, 0.1696, 4]
              MAX_POINTS_PER_VOXEL: 32
              MAX_NUMBER_OF_VOXELS: {
                'train': 16000,
                'test': 40000
              }

        DATA_AUGMENTOR:
            DISABLE_AUG_LIST: ['placeholder']
            AUG_CONFIG_LIST:
                - NAME: gt_sampling
                  USE_ROAD_PLANE: True
                  DB_INFO_PATH:
                      - kitti_dbinfos_train.pkl
                  PREPARE: {
                     filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                     filter_by_difficulty: [-1],
                  }

                  SAMPLE_GROUPS: ['Car:15','Pedestrian:15', 'Cyclist:15']
                  NUM_POINT_FEATURES: 4
                  DATABASE_WITH_FAKELIDAR: False
                  REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
                  LIMIT_WHOLE_SCENE: False

                - NAME: random_world_flip
                  ALONG_AXIS_LIST: ['x']

                - NAME: random_world_rotation
                  WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

                - NAME: random_world_scaling
                  WORLD_SCALE_RANGE: [0.95, 1.05]

                - NAME: random_local_pyramid_aug
                  DROP_PROB: 0.25
                  SPARSIFY_PROB: 0.05
                  SPARSIFY_MAX_NUM: 50
                  SWAP_PROB: 0.1
                  SWAP_MAX_NUM: 50
    MODEL:
      NAME: CenterPoint

      VFE:
        NAME: DynPillarVFE3D
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [ 192, 192 ]

      BACKBONE_3D:
        NAME: DSVT
        INPUT_LAYER:
          sparse_shape: [468, 468, 1]
          downsample_stride: []
          d_model: [192]
          set_info: [[36, 4]]
          window_shape: [[12, 12, 1]]
          hybrid_factor: [2, 2, 1] # x, y, z
          shifts_list: [[[0, 0, 0], [6, 6, 0]]]
          normalize_pos: False

        block_name: ['DSVTBlock']
        set_info: [[36, 4]]
        d_model: [192]
        nhead: [8]
        dim_feedforward: [384]
        dropout: 0.0
        activation: gelu
        output_shape: [468, 468]
        conv_out_channel: 192
        ues_checkpoint: True

      MAP_TO_BEV:
        NAME: PointPillarScatter3d
        INPUT_SHAPE: [468, 468, 1]
        NUM_BEV_FEATURES: 192

      BACKBONE_2D:
        NAME: BaseBEVResBackbone
        LAYER_NUMS: [ 1, 2, 2 ]
        LAYER_STRIDES: [ 1, 2, 2 ]
        NUM_FILTERS: [ 128, 128, 256 ]
        UPSAMPLE_STRIDES: [ 1, 2, 4 ]
        NUM_UPSAMPLE_FILTERS: [ 128, 128, 128 ]

      DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False

        CLASS_NAMES_EACH_HEAD: [
          ['Car', 'Pedestrian', 'Cyclist']
        ]

        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: False
        NUM_HM_CONV: 2

        BN_EPS: 0.001
        BN_MOM: 0.01
        SEPARATE_HEAD_CFG:
          HEAD_ORDER: ['center', 'center_z', 'dim', 'rot']
          HEAD_DICT: {
            'center': {'out_channels': 2, 'num_conv': 2},
            'center_z': {'out_channels': 1, 'num_conv': 2},
            'dim': {'out_channels': 3, 'num_conv': 2},
            'rot': {'out_channels': 2, 'num_conv': 2},
            'iou': {'out_channels': 1, 'num_conv': 2},
          }

        TARGET_ASSIGNER_CONFIG:
          FEATURE_MAP_STRIDE: 1
          NUM_MAX_OBJS: 500
          GAUSSIAN_OVERLAP: 0.1
          MIN_RADIUS: 2
          # BOX_CODER: ResidualCoder

        IOU_REG_LOSS: True

        LOSS_CONFIG:
          LOSS_WEIGHTS: {
            'cls_weight': 1.0,
            'loc_weight': 2.0,
            'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
          }

        POST_PROCESSING:
          # RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
          SCORE_THRESH: 0.1
          OUTPUT_RAW_SCORE: False
          POST_CENTER_LIMIT_RANGE: [0, -40, -3, 80, 40, 1]
          MAX_OBJ_PER_SAMPLE: 500

          # USE_IOU_TO_RECTIFY_SCORE: True
          # IOU_RECTIFIER: [0.5, 0.71, 0.65]

          NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

      POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]

        EVAL_METRIC: kitti

    OPTIMIZATION:
        BATCH_SIZE_PER_GPU: 2
        NUM_EPOCHS: 80

        OPTIMIZER: adam_onecycle
        LR: 0.001
        WEIGHT_DECAY: 0.01
        MOMENTUM: 0.9

        MOMS: [0.95, 0.85]
        PCT_START: 0.4
        DIV_FACTOR: 10
        DECAY_STEP_LIST: [35, 45]
        LR_DECAY: 0.1
        LR_CLIP: 0.0000001

        LR_WARMUP: False
        WARMUP_EPOCH: 1

        GRAD_NORM_CLIP: 10
        LOSS_SCALE_FP16: 32.0

And I got results below:

Generate label finished(sec_per_example: 0.0629 second).
recall_roi_0.3: 0.000000
recall_rcnn_0.3: 0.939800
recall_roi_0.5: 0.000000
recall_rcnn_0.5: 0.888598
recall_roi_0.7: 0.000000
recall_rcnn_0.7: 0.669268
Average predicted number of objects(3769 samples): 12.283

Car AP@0.70, 0.70, 0.70:
bbox AP:95.2198, 89.4526, 88.8744
bev  AP:89.3524, 87.4650, 86.4434
3d   AP:87.1649, 77.6970, 76.9884
aos  AP:95.20, 89.33, 88.69
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:97.0173, 94.1119, 91.8572
bev  AP:92.0481, 88.3356, 87.8141
3d   AP:87.8954, 80.9305, 78.6656
aos  AP:97.00, 93.96, 91.65
Car AP@0.70, 0.50, 0.50:
bbox AP:95.2198, 89.4526, 88.8744
bev  AP:95.2554, 89.6240, 89.1787
3d   AP:95.1996, 89.5775, 89.0910
aos  AP:95.20, 89.33, 88.69
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:97.0173, 94.1119, 91.8572
bev  AP:97.2790, 94.6162, 94.2047
3d   AP:97.2439, 94.5121, 94.0162
aos  AP:97.00, 93.96, 91.65
Pedestrian AP@0.50, 0.50, 0.50:
bbox AP:68.9796, 66.8907, 64.9734
bev  AP:58.0059, 55.0643, 52.5829
3d   AP:52.8691, 51.4204, 47.9159
aos  AP:64.75, 62.16, 59.99
Pedestrian AP_R40@0.50, 0.50, 0.50:
bbox AP:69.8867, 67.2526, 64.8893
bev  AP:56.3069, 53.6334, 50.8015
3d   AP:51.9532, 49.1637, 45.9747
aos  AP:65.05, 62.00, 59.43
Pedestrian AP@0.50, 0.25, 0.25:
bbox AP:68.9796, 66.8907, 64.9734
bev  AP:75.4927, 73.8959, 71.9835
3d   AP:74.6756, 72.9769, 71.1328
aos  AP:64.75, 62.16, 59.99
Pedestrian AP_R40@0.50, 0.25, 0.25:
bbox AP:69.8867, 67.2526, 64.8893
bev  AP:76.3137, 74.6665, 72.3919
3d   AP:75.3678, 73.5785, 71.5206
aos  AP:65.05, 62.00, 59.43
Cyclist AP@0.50, 0.50, 0.50:
bbox AP:88.9667, 77.5716, 74.3384
bev  AP:86.9765, 71.5262, 67.4459
3d   AP:85.9338, 69.3215, 66.2503
aos  AP:88.85, 77.08, 73.75
Cyclist AP_R40@0.50, 0.50, 0.50:
bbox AP:93.4071, 78.7487, 75.0949
bev  AP:91.3305, 71.9404, 67.9715
3d   AP:88.3232, 69.5222, 66.1419
aos  AP:93.27, 78.19, 74.49
Cyclist AP@0.50, 0.25, 0.25:
bbox AP:88.9667, 77.5716, 74.3384
bev  AP:87.2510, 74.5043, 70.9506
3d   AP:87.2510, 74.5037, 70.9506
aos  AP:88.85, 77.08, 73.75
Cyclist AP_R40@0.50, 0.25, 0.25:
bbox AP:93.4071, 78.7487, 75.0949
bev  AP:91.5098, 75.4892, 71.6936
3d   AP:91.5098, 75.4891, 71.6934
aos  AP:93.27, 78.19, 74.49

Haiyang-W · 2023-12-08T19:55:45Z

Thanks for your contribution! Very nice!

But I am not familiar with KITTI. May I ask whether this performance is acceptable?
Thanks! Looking forward to your reply.

Haiyang-W · 2023-12-08T20:01:37Z

If this result turns out to be good, I will tag this issue to make it more accessible for those interested in running DSVT on KITTI.
Many thanks!

dinvincible98 · 2023-12-08T20:08:13Z

I adopted the PointPillar settings, and it performs slightly better than PointPillar. I trained the model on a single GPU, so I guess the performance could improve further with multi-GPU training.

Haiyang-W · 2023-12-08T22:02:58Z

> I adopted the PointPillar settings, and it performs slightly better than PointPillar. I trained the model on a single GPU, so I guess the performance could improve further with multi-GPU training.

Perhaps some further adjustments can be made; DSVT performs much better on Waymo and NuScenes compared to PointPillar. At least, its performance on KITTI should be close to that of MsSVT.

Haiyang-W · 2023-12-09T13:23:54Z

Thanks @dinvincible98, it seems that this issue has been resolved to some extent. The issue will be closed.

Thank you all for your contributions and discussions. :)

evil-master · 2024-06-05T08:42:34Z

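Hello author, I successfully set up the environment and trained on the KITTI data, but I ran into a problem when converting the model to ONNX. Is the path that needs to be filled in deploy.py the pkl file produced when I generated the dataset? The files I generated are pkl files.

    ####### read input #######
    batch_dict = torch.load("path to batch_dict.pth", map_location="cuda")
    inputs = batch_dict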

evil-master · 2024-06-05T08:57:32Z

> Hello author, I successfully set up the environment and trained on the KITTI data, but I ran into a problem when converting the model to ONNX. Is the path that needs to be filled in deploy.py the pkl file produced when I generated the dataset? The files I generated are pkl files.
>
>     ####### read input #######
>     batch_dict = torch.load("path to batch_dict.pth", map_location="cuda")
>     inputs = batch_dict

I found the download link for inputdict.pth in the README and loaded the weights I trained on KITTI, but I get this error:

    File "deploy.py", line 134, in
      inputs = model.vfe(inputs)
    File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
      return forward_call(*input, **kwargs)
    File "/home/user/cjg/DSVT/pcdet/models/backbones_3d/vfe/dynamic_pillar_vfe.py", line 219, in forward
      features = pfn(features, unq_inv)
    File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
      return forward_call(*input, **kwargs)
    File "/home/user/cjg/DSVT/pcdet/models/backbones_3d/vfe/dynamic_pillar_vfe.py", line 37, in forward
      x = self.linear(inputs)
    File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
      return forward_call(*input, **kwargs)
    File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
      return F.linear(input, self.weight, self.bias)
    RuntimeError: mat1 and mat2 shapes cannot be multiplied (61800x11 and 10x96)

evil-master · 2024-06-05T09:28:46Z

> > Hello author, I successfully set up the environment and trained on the KITTI data, but I ran into a problem when converting the model to ONNX. Is the path that needs to be filled in deploy.py the pkl file produced when I generated the dataset? The files I generated are pkl files.
> >
> >     ####### read input #######
> >     batch_dict = torch.load("path to batch_dict.pth", map_location="cuda")
> >     inputs = batch_dict
>
> I found the download link for inputdict.pth in the README and loaded the weights I trained on KITTI, but I get this error:
>
>     File "deploy.py", line 134, in
>       inputs = model.vfe(inputs)
>     File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
>       return forward_call(*input, **kwargs)
>     File "/home/user/cjg/DSVT/pcdet/models/backbones_3d/vfe/dynamic_pillar_vfe.py", line 219, in forward
>       features = pfn(features, unq_inv)
>     File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
>       return forward_call(*input, **kwargs)
>     File "/home/user/cjg/DSVT/pcdet/models/backbones_3d/vfe/dynamic_pillar_vfe.py", line 37, in forward
>       x = self.linear(inputs)
>     File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
>       return forward_call(*input, **kwargs)
>     File "/home/user/anaconda3/envs/dsvt/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
>       return F.linear(input, self.weight, self.bias)
>     RuntimeError: mat1 and mat2 shapes cannot be multiplied (61800x11 and 10x96)

I found the problem: the provided point cloud has 6 parameters, while the KITTI data has only 5, so removing the last dimension fixes it.
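
The 11-versus-10 mismatch in the error is consistent with that diagnosis. The sketch below shows the arithmetic and the column drop on plain Python lists. It rests on assumptions the thread does not state: that the first column of each row is the batch index, and that the pillar VFE appends 6 offset channels when absolute xyz is used. Both helper names are hypothetical.

```python
from typing import List, Sequence


def vfe_input_dim(num_point_features: int, use_absolute_xyz: bool = True) -> int:
    """Assumed input width of the first VFE linear layer.

    Assumption: the pillar VFE appends 3 cluster-offset and 3 pillar-center
    channels, and keeps the raw xyz when use_absolute_xyz is True.
    """
    return num_point_features + (6 if use_absolute_xyz else 3)


def drop_last_column(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Remove the last value of every point row."""
    return [list(row[:-1]) for row in rows]


# Made-up rows with 6 values each, assumed to be
# (batch_idx, x, y, z, intensity, extra channel).
sample_rows = [
    [0, 1.0, 2.0, 0.5, 0.3, 0.1],
    [0, 4.0, -1.0, 0.2, 0.7, 0.0],
]
kitti_like_rows = drop_last_column(sample_rows)

# Point features exclude the batch index column.
dim_before = vfe_input_dim(len(sample_rows[0]) - 1)
dim_after = vfe_input_dim(len(kitti_like_rows[0]) - 1)
print(dim_before, dim_after)  # 11 10
```

With a tensor input, the equivalent step would presumably be slicing `batch_dict['points']` down to its first 5 columns before calling the model.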

Haiyang-W added the help wanted label Nov 2, 2023

Haiyang-W added the good first issue label and removed the help wanted label Dec 9, 2023

Haiyang-W closed this as completed Dec 9, 2023

Haiyang-W pinned this issue Jan 8, 2024
