[rwkv-x] v5 model memory scaling benchmark #175
PicoCreator started this conversation in Ideas
First, the status quo: at L24/D2048, V5 has outperformed all existing V4 and V5+wavenet architectures at retaining memory. The attached shows the high-level performance.
The goal is to figure out how this scales with layer count, embedding size, head size, etc.
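As a rough starting point, a sweep over those three dimensions could be enumerated as below. This is only a sketch: the specific layer counts, embedding sizes, and head sizes are hypothetical placeholders (not values from this post), `build_sweep` is an illustrative helper rather than part of any RWKV codebase, and the rule that the embedding size must divide evenly by the head size is an assumption.

```python
from itertools import product

# Hypothetical sweep values, chosen only for illustration (not from the post).
LAYER_COUNTS = [6, 12, 24]
EMBEDDING_SIZES = [512, 1024, 2048]
HEAD_SIZES = [32, 64, 128]


def build_sweep(layer_counts, embedding_sizes, head_sizes):
    """Return one config dict per valid (layers, embedding, head size) combination."""
    configs = []
    for n_layer, n_embd, head_size in product(layer_counts, embedding_sizes, head_sizes):
        # Assumption: the embedding size must split evenly into heads.
        if n_embd % head_size != 0:
            continue
        configs.append({
            # Naming follows the L<layers>/D<embedding> shorthand used above,
            # with a hypothetical H<head size> suffix.
            "name": f"L{n_layer}-D{n_embd}-H{head_size}",
            "n_layer": n_layer,
            "n_embd": n_embd,
            "head_size": head_size,
        })
    return configs


sweep = build_sweep(LAYER_COUNTS, EMBEDDING_SIZES, HEAD_SIZES)
for cfg in sweep:
    print(cfg["name"])
```

Each resulting config would then be trained and run through the same memory benchmark, so that results can be compared along one axis at a time while the other two are held fixed.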
Follow along in this Discord thread: https://discord.com/channels/992359628979568762/1142397705314906182/1142397998853279744