Feature type?
Algorithm request
A proposal draft (if any)
The proposal is to extend the quantization support in MMRAZOR to cover MMPose models, expanding both TensorRT and NNCF quantization support to MMPose. The key challenge is that certain components, such as the RTMOHead used in one-stage real-time multi-person pose estimation (RTMO), cannot be directly converted into an FXGraph, which is required for FX-based quantization. Addressing this would require modifying the model architecture or implementing custom layers that are compatible with FXGraph-based quantization frameworks.
Inspiration can be drawn from a similar effort to support quantization in MMDET using MMRAZOR, as seen here. This integration will allow MMPose models to be deployed efficiently on edge devices at lower precision (e.g., INT8), reducing inference latency and memory footprint while maintaining high accuracy.
To ensure the quantization process yields tangible benefits, performance benchmarks need to be established that demonstrate reductions in model size and inference latency, and quantify any accuracy loss post-quantization.
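As a starting point for the size part of such a benchmark, the expected storage reduction from FP32 to INT8 can be estimated analytically before any engine-specific measurement. The sketch below is illustrative only: the parameter count and number of quantized weight tensors are hypothetical placeholders, not actual RTMO figures, and the per-tensor overhead model (one FP32 scale plus one INT32 zero-point) is a common but simplified assumption.

```python
def quantized_size_bytes(num_params: int, num_tensors: int, q_bytes: int = 1) -> int:
    """Estimate INT8 model size: quantized weights plus per-tensor
    quantization metadata (assumed: fp32 scale + int32 zero-point)."""
    per_tensor_overhead = num_tensors * (4 + 4)
    return num_params * q_bytes + per_tensor_overhead

# Hypothetical parameter/tensor counts, for illustration only.
params = 9_000_000
fp32_size = params * 4  # 4 bytes per FP32 parameter
int8_size = quantized_size_bytes(params, num_tensors=300)
print(f"FP32: {fp32_size / 1e6:.1f} MB, INT8: {int8_size / 1e6:.1f} MB, "
      f"ratio: {fp32_size / int8_size:.2f}x")  # roughly a 4x reduction
```

Latency and accuracy need empirical measurement on the target engines, but a size estimate like this gives a sanity check that the quantized artifact is in the expected ballpark.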
The quantized model must be deployable via the NNCF (CPU) and TensorRT (GPU) inference engines.
Key issues:
Some components like RTMOHead cannot be directly converted into FXGraph.
Model-specific layers or operations may need custom handling for quantization.
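The FXGraph issue above can be reproduced in miniature. FX symbolic tracing fails whenever a branch depends on tensor values, which is the class of problem typically hit by detection/pose heads with filtering logic; the toy `DynamicHead` below is an assumed stand-in for that pattern, not RTMOHead's actual code.

```python
import torch
import torch.fx as fx


class StaticHead(torch.nn.Module):
    """Purely structural forward pass: traceable by torch.fx."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=1)

    def forward(self, x):
        return self.conv(x)


class DynamicHead(torch.nn.Module):
    """Hypothetical head with data-dependent control flow,
    mimicking the kind of logic that blocks FX tracing."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=1)

    def forward(self, x):
        y = self.conv(x)
        if y.mean() > 0:  # branch on a tensor value -> not symbolically traceable
            return y
        return y * 2


static_graph = fx.symbolic_trace(StaticHead())  # succeeds

try:
    fx.symbolic_trace(DynamicHead())
    traced_ok = True
except Exception:
    # torch.fx raises a TraceError: traced values cannot drive control flow
    traced_ok = False
print(traced_ok)
```

Resolving this for RTMOHead would mean either rewriting such branches in a trace-friendly form or isolating them from the traced graph (e.g., via `torch.fx.wrap` or custom tracer leaf modules), which is the custom handling the proposal refers to.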
Additional context
Quantization support for MMPose is crucial for applications that require real-time pose estimation on low-power devices, such as mobile apps, robotics, and AR/VR systems. The goal is to enhance the usability of MMPose models in these scenarios by providing lightweight, high-performance models through MMRAZOR's quantization workflow. Benchmarking will include a comparison of model accuracy, size, and speed on various edge devices, showing the performance gains of the quantized models.