Releases · lucidrains/block-recurrent-transformer-pytorch
0.2.1
take care of removing padding from the blocked keys and values of the…
0.2.0
update the state block by block, rather than one segment at a time, t…
0.1.4
actually use the recurrent states instead of always initial state, th…
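A minimal sketch of what carrying the recurrent states forward looks like, as opposed to resetting to the initial state on every segment. The training-loop shape and the `model(segment, states=...)` signature are assumptions for illustration, not this repository's exact API:

```python
import torch

def run_segments(model, token_segments):
    # carry the recurrent states forward rather than re-initializing each segment
    states = None                                        # None -> model falls back to its initial state
    outputs = []
    for segment in token_segments:
        logits, states = model(segment, states=states)   # assumed signature: returns (logits, new_states)
        states = [s.detach() for s in states]            # truncate backprop between segments
        outputs.append(logits)
    return torch.cat(outputs, dim=1)
```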
0.1.2
cache causal mask and rotary positional embeddings
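A rough sketch of the caching pattern described in this release, shown here for the causal mask (the rotary frequencies can be cached the same way). The class and buffer names are illustrative, not the library's actual code:

```python
import torch
from torch import nn

class CachedCausalMask(nn.Module):
    def __init__(self):
        super().__init__()
        # non-persistent buffer holding the largest mask built so far
        self.register_buffer('mask', None, persistent=False)

    def forward(self, seq_len, device):
        # reuse the cached mask whenever it already covers the requested length
        if self.mask is not None and self.mask.shape[-1] >= seq_len:
            return self.mask[:seq_len, :seq_len]
        mask = torch.ones((seq_len, seq_len), device=device).triu(1).bool()
        self.mask = mask
        return mask
```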
0.1.1
bug with xpos for rotary embeddings
0.1.0
it turns out flash attention in pytorch 2.0 is not handling causal co…
0.0.19
switch to rotary positional embedding with xpos, to prepare for flash…
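A hedged sketch of rotary embeddings with the xpos length-extrapolation scaling, written from the general technique rather than this repository's source; constants such as `scale_base` are illustrative defaults:

```python
import torch

def rotate_half(x):
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def rotary_with_xpos(q, k, scale_base=512):
    # q, k: (batch, heads, seq, dim) with an even head dimension
    seq_len, dim = q.shape[-2], q.shape[-1]
    device = q.device

    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2, device=device).float() / dim))
    t = torch.arange(seq_len, device=device).float()
    freqs = torch.einsum('i,j->ij', t, inv_freq)
    freqs = torch.cat((freqs, freqs), dim=-1)                 # (seq, dim)

    # xpos: queries are scaled up with position, keys scaled down,
    # so attention decays smoothly with relative distance
    power = (t - seq_len // 2) / scale_base
    scale = (torch.arange(0, dim, 2, device=device).float() + 0.4 * dim) / (1.4 * dim)
    scale = torch.cat((scale, scale), dim=-1)                 # (dim,)
    scale = scale ** power.unsqueeze(-1)                      # (seq, dim)

    q_out = (q * freqs.cos() + rotate_half(q) * freqs.sin()) * scale
    k_out = (k * freqs.cos() + rotate_half(k) * freqs.sin()) / scale
    return q_out, k_out
```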
0.0.18
allow for returning memories and states during training, for potentia…
0.0.17
0.0.16
add single headed kv
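A minimal sketch of single-headed key/values (multi-query-style attention): queries keep multiple heads while keys and values are projected once and shared across all query heads. The module below is an assumed illustration, not this repository's implementation:

```python
import torch
from torch import nn
from torch.nn import functional as F

class SingleHeadedKVAttention(nn.Module):
    def __init__(self, dim, heads=8, dim_head=64):
        super().__init__()
        self.heads = heads
        self.dim_head = dim_head
        inner = heads * dim_head
        self.to_q = nn.Linear(dim, inner, bias=False)
        self.to_kv = nn.Linear(dim, dim_head * 2, bias=False)  # a single shared kv head
        self.to_out = nn.Linear(inner, dim, bias=False)

    def forward(self, x):
        b, n, _ = x.shape
        q = self.to_q(x).view(b, n, self.heads, self.dim_head).transpose(1, 2)
        k, v = self.to_kv(x).chunk(2, dim=-1)                   # (b, n, dim_head) each
        k = k.unsqueeze(1).expand(-1, self.heads, -1, -1)       # broadcast across query heads
        v = v.unsqueeze(1).expand(-1, self.heads, -1, -1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        out = out.transpose(1, 2).reshape(b, n, self.heads * self.dim_head)
        return self.to_out(out)

# usage: SingleHeadedKVAttention(dim=512)(torch.randn(1, 1024, 512))
```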