Commit

Merge pull request #44 from RWKV/full-v5-R4-rewrite
Full v5 r4 rewrite
PicoCreator authored Nov 15, 2023
2 parents d205164 + b2e5342 commit 1e287e9
Showing 24 changed files with 1,360 additions and 170,658 deletions.
5 changes: 2 additions & 3 deletions README.md
@@ -1,7 +1,5 @@
# RWKV Infinite Context trainer

**IMPORTANT NOTE: The infctx trainer is ~~broken~~ very slow for the current v5 r4 changes:** (which is what the 1b5 model is trained on)

> If you are new to RWKV, it would be better to find out more about us via our wiki first here: https://wiki.rwkv.com/
RWKV trainer with
@@ -23,6 +21,8 @@ Remember to modify the configuration for your own need.

See [RWKV-v4neo/config-example.yaml](./RWKV-v4neo/config-example.yaml) for documentation on the various options

> NOTE: Due to current incomplete implementation, without state gradient, bptt_truncate is forced to be true
## Environment setup

> Note: There is a known issue with CUDA 12.0 and multi-GPU at the time of writing. Upgrade to at least CUDA 12.1 or 12.2, or downgrade to 11.8.
@@ -82,7 +82,6 @@ python3 -m pip install -r requirements.txt
- Start the training process `python3 lightning_trainer.py fit -c {your_config}.yaml`
- Export the checkpoint after training is complete with `python3 export_checkpoint.py ../path/to/checkpoint/last.ckpt/ ../path/to/export/model.pth`
- optional, run the dragon prompt as a quick sanity check `python3 dragon_test.py ../path/to/export/model.pth`
- You should probably convert this to an fp16 model (todo script)

In summary with code, from the trainer directory (eg. RWKV-v4neo)
