Merge pull request #44 from RWKV/full-v5-R4-rewrite

Full v5 r4 rewrite
RWKV · Nov 15, 2023 · 1e287e9
2 parents d205164 + b2e5342
commit 1e287e9

Showing 24 changed files with 1,360 additions and 170,658 deletions.
diff --git a/README.md b/README.md
@@ -1,7 +1,5 @@
 # RWKV Infinite Context trainer
 
-**IMPORTANT NOTE: The infctx trainer is ~~broken~~ very slow for the current v5 r4 changes:** (which is what the 1b5 model is trained on)
-
 > If you are new to RWKV, it would be better to find out more about us via our wiki first here: https://wiki.rwkv.com/
 
 RWKV trainer with
@@ -23,6 +21,8 @@ Remember to modify the configuration for your own need.
 
 See [RWKV-v4neo/config-example.yaml](./RWKV-v4neo/config-example.yaml) for documentation on the various options
 
+> NOTE: Due to current incomplete implementation, without state gradient, bptt_truncate is forced to be true
+
 ## Environment setup
 
 > Note: There is a known issue with CUDA 12.0 and multi-gpu at this point of writing. Upgrade to CUDA 12.1 or 12.2 atleast Or downgrade to 11.8
@@ -82,7 +82,6 @@ python3 -m pip install -r requirements.txt
 - Start the training process `python3 lightning_trainer.py fit -c {your_config}.yaml`
 - Export the checkpoint after training is complete with `python3 export_checkpoint.py ../path/to/checkpoint/last.ckpt/ ../path/to/export/model.pth`
 - optional, run the dragon prompt as a quick sanity check `python3 dragon_test.py ../path/to/export/model.pth`
-- You should probably convert this to an fp16 model (todo script)
 
 In summary with code, from the trainer directory (eg. RWKV-v4neo)