speed gpu #50

zzyy520 · 2024-03-30T08:04:53Z

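Hello, I have a few questions and findings. I ran speed tests on RepViT models of different sizes on a 2080ti GPU, and they do not show higher speed than MobileOne-s2, MobileOne-s1, or FastViT-t8. Both throughput and FPS are lower than those of comparable models at the same accuracy. (I compare against these models mainly because they all use structural re-parameterization.)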

jameslahm · 2024-03-30T08:23:31Z

Thanks for your interest. The benchmark results on our 2080ti device are below:

Model	Input	Throughput (bs=1024)
RepViT-M0.9	224	2870
FastViT-T8	256	2379 (bs=768 because OOM when bs=1024)
MobileOne-S1	224	2745

Could you provide more details about your benchmark results?

zzyy520 · 2024-03-30T08:52:23Z

Thanks for your reply. The benchmark results on our 2080ti GPU are below:
Model Input Throughput(bs=512)
MobileOne-s2 160 4152
MobileOne-s1 160 5523
RepViT-M1 160 5522
RepViT-M2 160 4708

(if bs=1)

MobileOne-s2 160 479
....-s1 160 429
RepViT-M1 160 200
RepViT-M2 160 182
FastVit-T8 160 325

Does this mean that the model is difficult to apply to single-image inference (one frame transmitted and inferred at a time, bs=1) with a high-speed camera?
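To make the question concrete, here is a rough sketch that converts the bs=1 throughput figures above into per-frame latency and compares them with a camera's frame interval. The helper names `latency_ms` and `keeps_up` are hypothetical, the 300 fps camera rate is only an illustrative assumption, and the arithmetic assumes frames are processed strictly one after another with no pipelining or transfer overhead. The row with the truncated model name is left out.

```python
# bs=1 throughput (images/s) copied from the table above.
BS1_THROUGHPUT = {
    "MobileOne-s2": 479,
    "RepViT-M1": 200,
    "RepViT-M2": 182,
    "FastVit-T8": 325,
}


def latency_ms(images_per_second):
    """Per-image latency in milliseconds, assuming sequential bs=1 inference."""
    return 1000.0 / images_per_second


def keeps_up(images_per_second, camera_fps):
    """True if one inference fits inside one camera frame interval."""
    frame_interval_ms = 1000.0 / camera_fps
    return latency_ms(images_per_second) <= frame_interval_ms


if __name__ == "__main__":
    camera_fps = 300  # assumed frame rate, not from the thread
    for name, ips in BS1_THROUGHPUT.items():
        verdict = "ok" if keeps_up(ips, camera_fps) else "too slow"
        print(f"{name}: {latency_ms(ips):.2f} ms/frame, {verdict} at {camera_fps} fps")
```

Under these assumptions, a model keeps up only if its bs=1 throughput is at least the camera's frame rate, so the answer depends on the frame rate actually required.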

jameslahm · 2024-03-31T12:57:27Z

Thanks. We think this depends on the device. For example, RepViT-M0.9 runs as fast as MobileOne-S1 on iPhone 12 with bs=1. On the 2080Ti with bs=1, we suggest profiling the model to locate the inference bottleneck. For example, the SE layer may add noticeable latency at bs=1 on the 2080Ti, unlike on the iPhone. We also suggest trying TensorRT to improve performance on the 2080Ti. We will also try to improve the performance of RepViT in this case.
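Since the numbers in this thread were measured with different batch sizes and input resolutions, a shared timing harness may help when comparing results. The following is a minimal sketch, not the script used for either set of results above: `measure_throughput` and its parameters are hypothetical names, and `run_batch` stands for any callable that runs one forward pass on a fixed batch.

```python
import time


def measure_throughput(run_batch, batch_size, warmup=10, iters=50, sync=None):
    """Return images per second for repeated calls to run_batch.

    run_batch: callable that runs one forward pass on a batch of batch_size.
    sync: optional callable that blocks until queued device work has finished.
          GPU kernels launch asynchronously, so timing without it can
          under-report the real elapsed time.
    """
    if sync is None:
        sync = lambda: None  # nothing to wait for on a synchronous backend

    # Warm-up iterations are excluded from timing (allocator, autotuning, caches).
    for _ in range(warmup):
        run_batch()
    sync()

    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    sync()  # wait for the last batch before stopping the clock
    elapsed = time.perf_counter() - start

    return batch_size * iters / elapsed


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU or a model.
    def fake_batch():
        time.sleep(0.001)

    print(f"{measure_throughput(fake_batch, batch_size=1):.0f} images/s")
```

With PyTorch, one would typically wrap the model call in `torch.no_grad()` inside `run_batch` and pass `torch.cuda.synchronize` as `sync`. Running the same harness at bs=1 and at a large batch size, with the same input resolution for every model, should make the results easier to compare.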
