Hi lucidrains,
Thank you for your excellent code.
I am curious about generation. Could you explain how to generate text with the compressive transformer? Since it keeps a compressive memory, I am not sure whether we can simply feed the most recently predicted token as the input for the next step (input length == 1). Alternatively, if the prompt has 100 tokens and we use tokens[0:100], tokens[1:101], tokens[2:102], ... as the inputs at successive timesteps, then tokens[1:100] would overlap with the memory, since the memory already contains the hidden states for tokens[1:100].
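
For reference, here is a minimal sketch of what I have in mind: feed the prompt once to build the memory, then feed one new token at a time while carrying the returned memories forward, so no token is ever fed (or stored in memory) twice. This assumes the forward pass accepts a `memories` keyword and returns `(logits, memories, aux_loss)` as in the README's training example; the hyperparameters below are placeholders, and I may be misreading the API:

```python
import torch
from compressive_transformer_pytorch import CompressiveTransformer

# placeholder hyperparameters, not a recommended configuration
model = CompressiveTransformer(
    num_tokens = 20000,
    dim = 512,
    depth = 6,
    seq_len = 512,
    mem_len = 512
)
model.eval()

@torch.no_grad()
def generate(prompt, steps):
    # prompt: (1, n) tensor of token ids, assumed n <= seq_len
    out = prompt

    # feed the whole prompt once; this populates the memory
    logits, memories, _ = model(prompt)

    for _ in range(steps):
        # greedy decoding for simplicity
        next_token = logits[:, -1].argmax(dim = -1, keepdim = True)
        out = torch.cat((out, next_token), dim = -1)

        # feed only the newly generated token (input length == 1);
        # the past context lives in `memories`, so earlier tokens
        # are never re-fed and never duplicated in memory
        logits, memories, _ = model(next_token, memories = memories)

    return out
```

If the prompt were longer than `seq_len`, I imagine it could be fed in `seq_len`-sized chunks the same way, carrying the memories between chunks. Is this the intended usage?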
I would greatly appreciate it if you could provide a generation script!
Thank you