Releases: TransformerLensOrg/TransformerLens
v2.0.0
TransformerLens officially has a 2.0! The HookedSAETransformer has been removed from TransformerLens in favor of the implementation in SAELens. Along with that, a number of cumulative changes are included, and TransformerLens now has its first official development roadmap. For full details, please see the release announcement, which covers the development roadmap, changes coming for contributors, and a few more notes on this release.
What's Changed
- Refactor components by @bryce13950 in #563
- added convenience function for unwrapping config to replace commonly … by @bryce13950 in #571
- unwrapped config by @bryce13950 in #577
- Refactor integration tests by @bryce13950 in #576
- Add Mistral 7B v0.2 Instruct by @fakerybakery in #579
- Add support for Phi-3 by @slash3g in #573
- Revert "Add Mistral 7B v0.2 Instruct" by @bryce13950 in #586
- Interactive neuroscope ci by @bryce13950 in #589
- removed Hooked SAE by @bryce13950 in #600
- Release 1.18 by @bryce13950 in #602
- More pytest fixtures by @bmillwood in #609
- (v3) Draft PR: add Pyright static typing to hook_points.py #590 by @starship006 in #607
- v1.19 by @bryce13950 in #614
- add n k v heads to model properties table by @anthonyduong9 in #610
- fixed format by @bryce13950 in #616
- Add tests for hook point add hook by @anthonyduong9 in #617
- added release blog by @bryce13950 in #618
- Fix llama demos by @bryce13950 in #619
- added news link by @bryce13950 in #620
- Release 2.0 by @bryce13950 in #582
New Contributors
- @fakerybakery made their first contribution in #579
- @slash3g made their first contribution in #573
- @bmillwood made their first contribution in #609
- @starship006 made their first contribution in #607
- @anthonyduong9 made their first contribution in #610
Full Changelog: v1.19.0...v2.0.0
v1.19.0
A small release that fixes a reported bug and adds support for ai-forever models.
What's Changed
- Add support for ai-forever/mGPT model by @SeuperHakkerJa in #606
- moved enable hook functionality to separate functions and tested new functions by @bryce13950 in #613
Full Changelog: v1.18.0...v1.19.0
v1.18.0
Very important release for those using Gemma models. A recent upstream change caused the TransformerLens implementation to become outdated; this release fixes that issue and includes a number of cumulative changes and bug fixes. The only API change in this release is that you can now override trust_remote_code in from_pretrained. Thanks to all who contributed to this release!
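The override works by forwarding extra keyword arguments from the TransformerLens loader down to the underlying Hugging Face loader. A minimal sketch of that pattern, with illustrative stand-in functions (not the actual TransformerLens code from PR #597):

```python
def load_hf_model(name, trust_remote_code=False):
    # Stand-in for the Hugging Face loader; just records what it received.
    return {"name": name, "trust_remote_code": trust_remote_code}

def from_pretrained(model_name, **from_pretrained_kwargs):
    # Forward any caller kwargs (e.g. trust_remote_code=True) straight
    # through to the underlying loader. Sketch only, not the real code.
    return load_hf_model(model_name, **from_pretrained_kwargs)

model = from_pretrained("ai-forever/mGPT", trust_remote_code=True)
assert model["trust_remote_code"] is True
```

In practice this means a call like `from_pretrained(name, trust_remote_code=True)` can now load Hub repositories that ship custom model code, without TransformerLens needing a dedicated parameter for it.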
What's Changed
- reworked CI to publish code coverage report by @bryce13950 in #559
- Resolve SAE CI Test failures by @bryce13950 in #560
- Ci coverage location by @bryce13950 in #561
- Ci full coverage by @bryce13950 in #562
- moved coverage report download by @bryce13950 in #564
- Revert "moved coverage report download (#564)" by @bryce13950 in #565
- Othello ci by @bryce13950 in #567
- moved report to static section by @bryce13950 in #566
- Fix broken HookedSAETransformer demo links by @ckkissane in #572
- Fix Pos Slice Issue by @hannamw in #578
- Hf secret by @bryce13950 in #552
- updated pull request template to account for new dev branch by @bryce13950 in #581
- updated PR template to add a note about merging from different branches by @bryce13950 in #583
- updated repo URL throughout the project by @bryce13950 in #580
- Fix docs badge in README by @ArthurConmy in #585
- added debug step by @bryce13950 in #568
- Update Gemma to reflect upstream HF changes by @cmathw in #596
- allow user to force trust_remote_code=true via from_pretrained kwargs by @Butanium in #597
Full Changelog: v1.17.0...v1.18.0
v1.17.0
New feature: HookedSAETransformer!
v1.16.0
Lots of feature additions (thanks @joelburget for Llama 3 support, and @sheikheddy for Llama-2-70b-chat-hf support!), plus a very helpful bugfix from @wesg52. Thanks to all contributors, especially new contributors!
What's Changed
- Add support for Llama-2-70b-chat-hf by @sheikheddy in #525
- Update loading_from_pretrained.py by @jbloomAus in #529
- Bugfix: pytest import by @tkukurin in #532
- Remove non-existing parameter from decompose_resid() documentation by @VasilGeorgiev39 in #504
- Add @overload to FactoredMatrix.__{,r}matmul__ by @JasonGross in #512
- Improve documentation for abstract attribute by @Felhof in #508
- Add pos_slice to run_with_cache by @VasilGeorgiev39 in #465
- Add Support for Yi-6B and Yi-34B by @collingray in #494
- updated docs to account for additional test suites by @bryce13950 in #533
- Bugfix: remove redundant assert checks by @tkukurin in #534
- Speed up !pip install transformer-lens in colab by @pavanyellow in #510
- Add Xavier and Kaiming Initializations by @Chanlaw in #537
- chore: fixing type errors and enabling mypy by @chanind in #516
- Add Mixtral by @collingray in #521
- Standardize black line length to 100, in line with other project settings by @Chanlaw in #538
- Refactor hook_points by @VasilGeorgiev39 in #505
- Fix split_qkv_input for grouped query attention by @wesg52 in #520
- locked attribution patching to 1.1.1 by @bryce13950 in #541
- Demo no position fix by @bryce13950 in #544
- Othello colab fix by @bryce13950 in #545
- Fixed Santa Coder demo by @bryce13950 in #546
- Hf token auth by @bryce13950 in #550
- Fixed device being set to cpu:0 instead of cpu by @Butanium in #551
- Add support for Llama 3 (and Llama-2-70b-hf) by @joelburget in #549
- Loading of huggingface 4-bit quantized Llama by @coolvision in #486
- removed duplicate rearrange block by @bryce13950 in #555
- Bert demo ci by @bryce13950 in #556
New Contributors
- @sheikheddy made their first contribution in #525
- @tkukurin made their first contribution in #532
- @VasilGeorgiev39 made their first contribution in #504
- @JasonGross made their first contribution in #512
- @pavanyellow made their first contribution in #510
- @Chanlaw made their first contribution in #537
- @chanind made their first contribution in #516
- @wesg52 made their first contribution in #520
- @Butanium made their first contribution in #551
- @coolvision made their first contribution in #486
Full Changelog: v1.15.0...v1.16.0
v1.15.0: Gemma, Qwen, Phi!
What's Changed
- Support Phi Models by @cmathw in #484
- Remove redundant MLP bias assignment by @adamkarvonen in #485
- add qwen1.5 models by @andyrdt in #507
- Support Gemma Models by @cmathw in #511
- make tests pass mps by @jbloomAus in #528
Full Changelog: v1.14.0...v1.15.0
v1.14.0
What's Changed
- Implement RMS Layer Norm folding by @collingray in #489
- Cap Mistral's context length at 2k by @collingray in #495
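The RMS layer norm folding in #489 rests on a simple identity: the elementwise scale gamma of RMSNorm can be absorbed into whatever weight matrix follows the norm, leaving a parameter-free norm behind. A small numpy sketch of the identity (illustrative only, not the TransformerLens implementation):

```python
import numpy as np

def rms_norm(x, gamma, eps=1e-6):
    # RMSNorm: divide x by its root-mean-square, then scale by gamma.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gamma

rng = np.random.default_rng(0)
d_model, d_out = 8, 4
x = rng.normal(size=(3, d_model))
gamma = rng.normal(size=d_model)
W = rng.normal(size=(d_model, d_out))

# Folding: absorb gamma into the rows of the following weight matrix,
# so the norm itself can use a scale of all ones.
W_folded = gamma[:, None] * W
out_original = rms_norm(x, gamma) @ W
out_folded = rms_norm(x, np.ones(d_model)) @ W_folded

assert np.allclose(out_original, out_folded)
```

This works because `(v * gamma) @ W == v @ (gamma[:, None] * W)` for any row vector `v`, so the folded model computes identical outputs while exposing a cleaner, scale-free normalization to interpretability tooling.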
New Contributors
- @collingray made their first contribution in #489
Full Changelog: v1.13.0...v1.14.0
v1.13.0
What's Changed
- Add support for CodeLlama-7b by @YuhengHuang42 in #469
- Make LLaMA 2 loadable directly from HF by @andyrdt in #458
- Fixes #371: LLAMA load on CUDA. Expected all tensors to be on the sam… by @artkpv in #461
- Extending Support for Additional Bloom Models (up to 7b) by @SeuperHakkerJa in #447
- Support mistral 7 b by @Felhof in #443
New Contributors
- @YuhengHuang42 made their first contribution in #469
- @andyrdt made their first contribution in #458
- @artkpv made their first contribution in #461
Full Changelog: v1.12.1...v1.13.0
v1.12.1
Adds Qwen. Thanks @Aaquib111 and @andyrdt!
What's Changed
- Closes #478: Adding the Qwen family of models by @Aaquib111 in #477
- Add a function to convert nanogpt weights by @adamkarvonen in #475
New Contributors
- @Aaquib111 made their first contribution in #477
- @adamkarvonen made their first contribution in #475
Full Changelog: v1.12.0...v1.12.1