I tried training the RWKV6 block on 256x256 images, but found almost no GPU memory reduction relative to ViT. So what is the advantage of Vision-RWKV6 in this setting?
Hi, thanks for your attention to VRWKV!
The GPU memory reduction of VRWKV/VRWKV6 at low resolution is minimal (VRWKV may even use more VRAM or run slower, since standard attention has been heavily optimized over many versions in common DL frameworks). VRWKV shows its advantage in higher-resolution scenarios. In low-resolution cases, we show that VRWKV achieves performance comparable to ViT, so it has the potential to replace ViT in most current tasks while demonstrating its advantage at high resolutions.
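To make the scaling argument concrete, here is a minimal sketch, not the VRWKV implementation: it contrasts the O(N²) peak memory of softmax attention, which materializes the full N×N attention matrix, with an O(N) linear-attention-style mixer of the family RWKV belongs to. The module names, the ReLU feature map, and the benchmark loop are all illustrative assumptions; it requires a CUDA GPU.

```python
# Sketch only: compares peak GPU memory of softmax attention (O(N^2))
# with a linear-attention-style mixer (O(N)) as the token count N grows.
# This is NOT the VRWKV code; shapes and the feature map are illustrative.
import torch

def softmax_attention(q, k, v):
    # Materializes the full N x N attention matrix -> O(N^2) memory.
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

def linear_mixer(q, k, v):
    # Associativity trick: phi(q) @ (phi(k)^T @ v) instead of
    # (phi(q) @ phi(k)^T) @ v. A d x d state replaces the N x N
    # matrix -> O(N) memory, the same asymptotics as RWKV.
    q, k = torch.relu(q), torch.relu(k)   # simple positive feature map
    kv = k.transpose(-2, -1) @ v          # (d, d) state, independent of N
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + 1e-6
    return (q @ kv) / z

if __name__ == "__main__":
    device, d = "cuda", 64
    # N=256 matches a 256x256 image with 16x16 patches; larger N
    # stands in for higher input resolutions.
    for n in (256, 1024, 4096):
        q = torch.randn(1, n, d, device=device)
        k = torch.randn(1, n, d, device=device)
        v = torch.randn(1, n, d, device=device)
        for name, fn in (("softmax", softmax_attention), ("linear", linear_mixer)):
            torch.cuda.reset_peak_memory_stats()
            fn(q, k, v)
            peak = torch.cuda.max_memory_allocated() / 2**20
            print(f"N={n:5d}  {name:8s}  peak {peak:8.1f} MiB")
```

Running this should show the softmax variant's peak memory growing roughly quadratically with N while the linear mixer's grows roughly linearly, which is why the gap is negligible at 256x256 but widens as resolution increases.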