v0.3.1 (Llama 3.2 Vision patch)
Overview
We've added full support for Llama 3.2 shortly after it was announced. This includes full/LoRA fine-tuning of the Llama3.2-1B and Llama3.2-3B base and instruct text models, and of the Llama3.2-11B-Vision base and instruct models. This means we now support end-to-end development of VLMs: fine-tuning, inference, and eval! We've also packed plenty of other goodies into a few short weeks (a quick-start sketch follows the list below):
- Llama 3.2 1B/3B/11B Vision configs for full/LoRA fine-tuning
- Updated recipes to support VLMs
- Multimodal eval via EleutherAI
- Support for torch.compile for VLMs
- Revamped generation utilities for multimodal support + batched inference for text only
- New knowledge distillation recipe with configs for Llama3.2 and Qwen2
- Llama 3.1 405B QLoRA fine-tuning on 8xA100s
- MPS support (beta) - you can now use torchtune on Mac!
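As a quick illustration of the new end-to-end VLM workflow, here is a minimal sketch of downloading and LoRA fine-tuning Llama3.2-11B-Vision with the tune CLI. The config name is an assumption based on torchtune's usual naming and may differ from the one shipped in this release; run `tune ls` to see the exact recipes and configs available.

```bash
# Download the weights from Hugging Face (requires a token with Llama 3.2 access).
tune download meta-llama/Llama-3.2-11B-Vision-Instruct \
  --output-dir /tmp/Llama-3.2-11B-Vision-Instruct \
  --hf-token <HF_TOKEN>

# LoRA fine-tune on a single device. The config name below follows torchtune's
# usual <family>/<size>_<method> pattern but is an assumption -- verify with `tune ls`.
tune run lora_finetune_single_device \
  --config llama3_2_vision/11B_lora_single_device
```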
New Features
Models
Multimodal
- Update recipes for multimodal support (#1548, #1628)
- Multimodal eval via EleutherAI (#1669, #1660)
- Multimodal compile support (#1670)
- Exportable multimodal models (#1541)
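With the EleutherAI integration, evaluating a fine-tuned multimodal checkpoint can be kicked off from the CLI. A minimal sketch, assuming the stock eleuther_eval recipe and eleuther_evaluation config are pointed at your checkpoint; the task name is illustrative and depends on your installed lm-eval version:

```bash
# Evaluate a multimodal checkpoint through the EleutherAI eval harness.
# The task name below is illustrative -- check your lm-eval version for the
# exact multimodal task identifiers.
tune run eleuther_eval --config eleuther_evaluation \
  tasks='["mmmu_val"]'
```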
Generation
- Revamped generate recipe with multimodal support (#1559, #1563, #1674, #1686)
- Batched inference for text-only models (#1424, #1449, #1603, #1622)
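Both paths are exposed through the generate recipe. A minimal sketch for text-only generation, assuming the stock generation config and its usual prompt/sampling fields (the exact keys may differ in your checkout):

```bash
# Text generation with the revamped generate recipe. The override keys
# (prompt, max_new_tokens, temperature, top_k) are assumptions based on the
# stock generation config -- open the config for the exact field names.
tune run generate --config generation \
  prompt="Tell me a joke." \
  max_new_tokens=200 \
  temperature=0.8 \
  top_k=300
```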
Knowledge Distillation
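The new knowledge distillation recipe mentioned in the overview launches like any other recipe. A sketch, where both the recipe and config names are assumptions based on torchtune's naming conventions (verify with `tune ls`):

```bash
# Distill a larger teacher model into a smaller student (e.g. Qwen2).
# Recipe and config names are assumptions -- verify with `tune ls`.
tune run knowledge_distillation_single_device \
  --config qwen2/knowledge_distillation_single_device
```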
Memory and Performance
- Compile FFT FSDP (#1573)
- Apply rope on k earlier for efficiency (#1558)
- Streaming offloading in (Q)LoRA single-device recipes (#1443)
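These memory and performance features are typically toggled through config flags. A hedged sketch of enabling compile plus streaming activation offloading on a single-device LoRA run; both flag names are assumptions, so check the shipped configs for the exact keys:

```bash
# Enable compile and streaming activation offloading via CLI overrides.
# Both flag names are assumptions -- check the config for the exact keys.
tune run lora_finetune_single_device \
  --config llama3_2/1B_lora_single_device \
  compile=True \
  enable_activation_offloading=True
```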
Quantization
- Update quantization to use tensor subclasses (#1403)
- Add int4 weight-only QAT flow targeting tinygemm kernel (#1570)
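QAT is still driven through the qat_distributed recipe, with the quantizer selected via the quantizer component of the config. A minimal launch sketch; the config name matches earlier releases but should be verified with `tune ls`:

```bash
# Quantization-aware training on 8 GPUs with the distributed QAT recipe.
# Verify the config name with `tune ls`; the new int4 weight-only QAT flow
# is selected through the quantizer section of the config.
tune run --nnodes 1 --nproc_per_node 8 qat_distributed \
  --config llama3/8B_qat_full
```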
RLHF
- Add generic preference dataset builder (#1623)
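The generic builder should make it possible to point a DPO-style recipe at custom preference data straight from the config. A hedged sketch, assuming the builder is exposed as torchtune.datasets.preference_dataset and forwards Hugging Face load_dataset-style arguments (both assumptions; see the preference dataset docs for the real interface):

```bash
# Point a DPO recipe at a local preference dataset via CLI overrides.
# The component path and argument names are assumptions -- see the
# preference dataset docs for the exact interface.
tune run lora_dpo_single_device \
  --config llama2/7B_lora_dpo_single_device \
  dataset._component_=torchtune.datasets.preference_dataset \
  dataset.source=json \
  dataset.data_files=my_preference_data.json
```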
Miscellaneous
Documentation
- Nits in memory optimizations doc (#1585)
- Tokenizer and prompt template docs (#1567)
- LaTeXify IPOLoss docs (#1589)
- Modules doc updates (#1588)
- More doc nits (#1611)
- Update docs (#1602)
- Update llama3 chat tutorial (#1608)
- Instruct and chat datasets docs (#1571)
- Preference dataset docs (#1636)
- Messages and message transforms docs (#1574)
- README updates (#1664)
- Model transform docs (#1665)
- Multimodal dataset builder + docs (#1667)
- Datasets overview docs (#1668)
- Update README.md (#1676)
- README updates for Llama 3.2 (#1680)
- Add 3.2 models to README (#1683)
- Knowledge distillation tutorial (#1698)
- Text completion dataset docs (#1696)
Quality-of-Life Improvements
- Log possible resolutions at debug level, not info (#1560)
- Remove `TiedEmbeddingTransformerDecoder` from Qwen (#1547)
- Make Gemma use regular `TransformerDecoder` (#1553)
- Llama 3.1: instantiate positional embedding only once (#1554)
- Run unit tests against PyTorch nightlies as part of our nightly CI (#1569)
- Support `load_dataset` kwargs in other dataset builders (#1584)
- Add `fused=True` to Adam, except PagedAdam (#1575)
- Move RLHF out of modules (#1591)
- Make logger only log on rank0 for Phi3 loading errors (#1599)
- Move RLHF tests out of modules (#1592)
- Update PR template (#1614)
- Update `get_unmasked_sequence_lengths` example for release (#1613)
- Remove IPO loss + small fixes (#1615)
- Fix dora configs (#1618)
- Remove unused var in generate (#1612)
- Remove deprecated message (#1619)
- Fix qwen2 config (#1620)
- Proper names for dataset types (#1625)
- Make `q` optional in `sample` (#1637)
- Rename `JSONToMessages` to `OpenAIToMessages` (#1643)
- Update Gemma to ignore GGUF (#1655)
- Add Pillow >= 9.4 requirement (#1671)
- Guard import (#1684)
- Add `--upgrade` to pip command (#1687)
- Do not run CI on forked repos (#1681)
Bug Fixes
- Fix flex attention test (#1568)
- Add `eom_id` to Llama3 Tokenizer (#1586)
- Only merge model weights in LoRA recipe when `save_adapter_weights_only=False` (#1476)
- Hotfix eval recipe (#1594)
- Fix typo in PPO recipe (#1607)
- Fix lora_dpo_distributed recipe (#1609)
- Fixes for MM Masking and Collation (#1601)
- Delete duplicate LoRA dropout fields in DPO configs (#1583)
- Fix tune download command in PPO config (#1593)
- Fix tune run not identifying custom components (#1617)
- Fix compile error in `get_causal_mask_from_padding_mask` (#1627)
- Fix eval recipe bug for group tasks (#1642)
- Fix basic tokenizer no special tokens (#1640)
- Add `BlockMask` support to `batch_to_device` (#1651)
- Fix PACK_TYPE import in collate (#1659)
- Fix llava_instruct_dataset (#1658)
- Convert RGBA to RGB (#1678)
New Contributors (auto-generated by GitHub)
- @dvorjackz made their first contribution (#1558)
Full Changelog: v0.3.0...v0.3.1