I am using mamba.py to build an end-to-end model with multiple stacked Mamba blocks. After training, when I run the model in inference mode, the first time step has a very large inference time (about 1 s), while from the second time step onward each step takes on the order of 0.002 s. As a result, on the same task the Mamba model turns out to be slower than an S4D model.
What could I be doing wrong in my implementation?
I tracked down the timing and found that, for the first time step, this line accounts for most of the computation:
y = selective_state_update(ssm_state, x, dt, A, B, C, self.D, z=z, dt_bias=self.dt_proj.bias, dt_softplus=True)
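For what it's worth, a one-time cost on the very first call often comes from lazy kernel loading/compilation rather than from the model itself, so it is common to run one or more warm-up calls and exclude them when benchmarking. Below is a minimal, hedged timing sketch (the `step` callable and the `warmup` count are placeholders for illustration, not part of mamba.py; for real GPU code you would also need to synchronize the device around each measurement, e.g. `torch.cuda.synchronize()`):

```python
import time

def time_steps(step, n_steps, warmup=1):
    """Time repeated calls to `step`, discarding the first `warmup` calls.

    The discarded calls absorb one-time setup cost (e.g. kernel
    compilation/loading on the first invocation), so the returned
    timings reflect steady-state per-step latency.
    NOTE: for CUDA code, call torch.cuda.synchronize() before reading
    the clock on both sides of `step()`; this sketch omits that.
    """
    timings = []
    for i in range(n_steps):
        t0 = time.perf_counter()
        step()  # one inference step, e.g. one call into the Mamba block
        elapsed = time.perf_counter() - t0
        if i >= warmup:
            timings.append(elapsed)
    return timings

# Example: a trivial stand-in step; in practice `step` would wrap the
# hypothetical single-step inference call of the model.
steady = time_steps(lambda: sum(range(1000)), n_steps=5, warmup=1)
print(len(steady))  # 4 measured steps (first call discarded)
```

If the steady-state timings are already fast (as in your 0.002 s per step), the 1 s first step is likely one-time setup cost rather than an implementation bug.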
Thanks