Just a quick question: if Q and KV have different lengths, T_q and T_kv, then the number of chunks differs too. I guess it should loop over the KV chunks with j in range(0, T_kv // C) and then over the Q chunks with i in range(0, T_q // C), but I am new to Triton and not sure about the pointers and the i loop.
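To make the two loops concrete, here is a rough NumPy sketch, not the library's Triton kernel. It assumes a non-causal cross-attention reading of GLA: the state follows S_t = diag(exp(g_t)) S_{t-1} + k_t^T v_t over the KV sequence, and every query reads the final state. The function name `chunked_cross_gla`, the helper `cdiv`, and the log-gate input `g` are all hypothetical names chosen for illustration.

```python
import numpy as np


def cdiv(a, b):
    """Ceiling division, so a trailing partial chunk is not dropped."""
    return (a + b - 1) // b


def chunked_cross_gla(q, k, v, g, chunk_size):
    """Hypothetical reference, assuming queries read the final KV state.

    q: (T_q, d_k)   queries
    k: (T_kv, d_k)  keys
    v: (T_kv, d_v)  values
    g: (T_kv, d_k)  log decay gates, expected to be <= 0
    """
    T_q, d_k = q.shape
    T_kv, d_v = v.shape

    # Pass 1: loop over KV chunks j and accumulate the (d_k, d_v) state.
    S = np.zeros((d_k, d_v))
    for j in range(cdiv(T_kv, chunk_size)):
        s, e = j * chunk_size, min((j + 1) * chunk_size, T_kv)
        cum = np.cumsum(g[s:e], axis=0)  # inclusive cumulative log decay
        total = cum[-1]                  # log decay across the whole chunk
        # Each key is decayed by the gates that come after it in the chunk.
        k_dec = k[s:e] * np.exp(total - cum)
        S = np.exp(total)[:, None] * S + k_dec.T @ v[s:e]

    # Pass 2: loop over Q chunks i; each chunk only reads the finished state.
    o = np.empty((T_q, d_v))
    for i in range(cdiv(T_q, chunk_size)):
        s, e = i * chunk_size, min((i + 1) * chunk_size, T_q)
        o[s:e] = q[s:e] @ S
    return o
```

Under this assumption the two loops are independent, so T_q and T_kv never have to match, and the chunk counts use ceiling division rather than `//` so lengths that are not a multiple of C still work. A causal or position-aligned variant would need a rule for which KV state each query chunk reads, which this sketch does not cover.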
Hi, I was researching this and wondering about your efficient gated attention: could extending it to asymmetric Q/KV lengths help bridge modalities too? This small architectural shift could enable time series/LLM alignment research or even other cross-modality areas. Honestly, I've tried this with GLA on my own but found it difficult to debug, which is why I'd like to request it. LOL
Feature Request
Allow Q and K to have different sequence lengths, so that I can do cross-modality alignment with GLA.
Motivation
I tried to change it on my own, but it requires a lot of background knowledge and the modules are tangled together, so it is difficult to fix.
Your Contribution
If I can get some info on where to make the changes, I think I can help enable seq_q and seq_kv.