Hello,

Thanks for the great paper! If I understand correctly, the Mamba model is similar to a unidirectional LSTM. Is there a way to implement it in a non-causal, bidirectional way, so the model can see information from both ends of the sequence? I guess that would be similar to the BERT encoder architecture in that sense.
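One common workaround is to wrap two Mamba blocks: one scans the sequence left-to-right, the other scans the reversed sequence, and the two outputs are combined. Below is a minimal sketch assuming the `mamba_ssm` package's `Mamba` block; the `BiMambaBlock` name, the use of separate (unshared) parameters for each direction, and the summation used to merge the outputs are illustrative choices, not an official API or the authors' method.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumes the mamba_ssm package is installed


class BiMambaBlock(nn.Module):
    """Hypothetical bidirectional wrapper: one Mamba block scans
    left-to-right, a second scans the flipped sequence, and the
    two outputs are summed."""

    def __init__(self, d_model, d_state=16, d_conv=4, expand=2):
        super().__init__()
        self.fwd = Mamba(d_model=d_model, d_state=d_state, d_conv=d_conv, expand=expand)
        self.bwd = Mamba(d_model=d_model, d_state=d_state, d_conv=d_conv, expand=expand)

    def forward(self, x):  # x: (batch, length, d_model)
        out_fwd = self.fwd(x)
        # Reverse along the sequence dimension, scan, then reverse back
        out_bwd = torch.flip(self.bwd(torch.flip(x, dims=[1])), dims=[1])
        return out_fwd + out_bwd


# Usage (shapes follow the repo README example)
x = torch.randn(2, 64, 16).to("cuda")
block = BiMambaBlock(d_model=16).to("cuda")
y = block(x)
assert y.shape == x.shape
```

Other combinations (concatenation plus a projection, averaging, or sharing parameters between directions) are possible; the summation above is just the simplest option.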
Hi! Thank you for your awesome work! I am also interested in using Mamba with different bidirectional methods. Besides PR #52, are there any new features related to the more advanced methods you mentioned? Thanks!