Add Comet integration #1 (Draft)

wants to merge 42 commits into base: main

Commits on Jun 11, 2024

  1. Commit: 52db752
  2. Commit: 463aad9

Commits on Jun 13, 2024

  1. Commit: 804cbcc
  2. Commit: 108f583
  3. Add debugging logs

    Lothiraldan committed Jun 13, 2024
    Commit: 90ef7ae
  4. Fix typo

    Lothiraldan committed Jun 13, 2024
    Commit: 12e064b
  5. Fix typo

    Lothiraldan committed Jun 13, 2024
    Commit: d0b68d7

Commits on Jun 19, 2024

  1. fix python version and pytest install (EleutherAI#1234)

    * fix python version and pytest install
    
    * Update NeoXArgs docs automatically
    
    * python3
    
    * Update NeoXArgs docs automatically
    
    * pip not pip3
    
    * Update NeoXArgs docs automatically
    
    * python3 pip
    
    * Update NeoXArgs docs automatically
    
    * python3 -m pip
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * add docker setup to workflow
    
    * Update NeoXArgs docs automatically
    
    * python setup
    
    * Update NeoXArgs docs automatically
    
    * python setup v2
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * python setup v3
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Add hash back to deep speed version
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Jun 19, 2024
    Commit: 2608972

Commits on Jun 25, 2024

  1. Add a chat data preprocessing script (EleutherAI#1239)

    * Add a chat data preprocessing script
    
    * add EOT at end of a chat
    
    * update README.md
    
    * apply pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    dmahan93 and Quentin-Anthony authored Jun 25, 2024
    Commit: 0e5f6db
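The chat preprocessing commit above adds a single EOT token at the end of a chat rather than after every turn. A minimal sketch of that idea, assuming a simple `role: content` template and an `<|endoftext|>` marker (both are illustrative guesses, not the script's actual output format):

```python
EOT = "<|endoftext|>"

def render_chat(messages):
    """Flatten a list of {role, content} turns into one training string."""
    parts = [f"{m['role']}: {m['content']}" for m in messages]
    # The EOT marker is appended once, after the final turn of the chat.
    return "\n".join(parts) + EOT

chat = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
]
text = render_chat(chat)
```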

Commits on Jun 28, 2024

  1. Commit: 1cee5b7

Commits on Aug 6, 2024

  1. Add hf llama to neox conversion (EleutherAI#1247)

    * - Add conversion of HF llama models to NeoX
    
    * - Add conversion of HF llama models to NeoX
    
    * - minor fix
    
    * pre-commit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    dmahan93 and Quentin-Anthony authored Aug 6, 2024
    Commit: c1ea2a1

Commits on Aug 15, 2024

  1. bugfix: chat turns instead of repeating the conversation in preprocess_data_with_chat_template.py (EleutherAI#1258)
    
    * bugfix: chat turns instead of repeating the conversation
    
    * pre-commit
    dmahan93 authored Aug 15, 2024
    Commit: 0ef2c07
  2. Conversion for CI from self-hosted hardware (EleutherAI#1245)

    * changing from self-hosted runners to Github's ubuntu-22.04 runner environment
    
    * adding warning about not using 'self-hosted' runner labels and using Github runners instead
    
    * updated some guidance in comments for coverity scan CI
    
    * moving CPU tests to workflow_dispatch only
    jaimemcc-intel authored Aug 15, 2024
    Commit: f8c9e68

Commits on Aug 23, 2024

  1. Megatron-LM style Sequence Parallel (EleutherAI#1257)

    * first draft (shape errors occurring)
    
    * training works (but poor convergence)
    
    * debugging progress: current commit works if we do regular TP via impl-ing AR in rowparallel as RS then AG
    
    * Update NeoXArgs docs automatically
    
    * push most recent code (updated mark_norms fn, back to 'real' sequence parallel)
    
    * Update NeoXArgs docs automatically
    
    * Fix LayerNorm all reduce gradient hook
    
    * Sum instead of average for LayerNorm gradient all reduce
    
    * Update NeoXArgs docs automatically
    
    * Update NeoXArgs docs automatically
    
    * Fix gather and reduce scatter ops on sequence dimension
    
    * Fix sequence parallel with tied weight embeddings
    
    * Update NeoXArgs docs automatically
    
    * cleanup pass + add MoE arguments.py guard
    
    * pre-commit and clean up comments
    
    * remove vestigial debug code
    
    * remove unused debugging code
    
    * remove dummy test config
    
    * update fp32_allreduce to handle fp16 ; don't cast to fp32 for gathers
    
    * run linter on the rest of the files
    
    * Improve performance of sequence parallel gather, scatter, and reduce
    
    * Add comment
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Brandon Yang <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    4 people authored Aug 23, 2024
    Commit: 8b43196
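The sequence-parallel PR above rests on the collective identity that tensor parallelism's all-reduce can be decomposed into a reduce-scatter (entering the sequence-sharded region) followed by an all-gather (leaving it). A toy, single-process illustration with plain lists standing in for per-rank tensors (not the Megatron/NeoX kernels):

```python
def all_reduce(per_rank):
    # Every rank ends up with the full elementwise sum.
    summed = [sum(vals) for vals in zip(*per_rank)]
    return [summed[:] for _ in per_rank]

def reduce_scatter(per_rank):
    # Each rank ends up with only its contiguous shard of the sum.
    n = len(per_rank)
    summed = [sum(vals) for vals in zip(*per_rank)]
    chunk = len(summed) // n
    return [summed[r * chunk:(r + 1) * chunk] for r in range(n)]

def all_gather(shards):
    # Every rank reassembles the full vector from all shards.
    full = [x for shard in shards for x in shard]
    return [full[:] for _ in shards]

per_rank = [[1, 2, 3, 4], [10, 20, 30, 40]]  # two "ranks", sequence length 4
decomposed = all_gather(reduce_scatter(per_rank))
```

Between the reduce-scatter and the all-gather, each rank only holds (and computes on) its shard of the sequence, which is where the memory and communication savings come from.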

Commits on Aug 24, 2024

  1. Add new cites (EleutherAI#1255)

    * Update README.md
    
    I added new models that have come out trained with the GPT-NeoX library. The library itself is sufficiently well-used that simply listing all citing papers is rapidly becoming non-viable. I'm currently leaning towards providing a curated list of "exciting" papers? I haven't looked at other libraries to see what they do yet.
    
    * Update NeoXArgs docs automatically
    
    ---------
    
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    3 people authored Aug 24, 2024
    Commit: e7c0182

Commits on Aug 27, 2024

  1. mamba fixes and cleaning (EleutherAI#1262)

    * mamba fixes and cleaning
    
    * space
    
    * revert assertion change for now
    
    ---------
    
    Co-authored-by: Jacob Hatef <[email protected]>
    jahatef and Jacob Hatef authored Aug 27, 2024
    Commit: 591563d
  2. SFT improvements (labeling fixes, different packing implementations) (EleutherAI#1240)
    
    * - add different packing impl (Unpacked, packing until overflow)
    - fix labels to also have valid/test implementations
    - fix label masking in _get_batch to also include anything from get_ltor_masks_and_position_ids
    
    * Update arguments.py to use train_label_data_paths instead of label_data_paths
    
    * - fix precommit
    dmahan93 authored Aug 27, 2024
    Commit: c786367
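The "packing until overflow" mode mentioned above can be read as greedy bin filling: keep appending sequences to the current pack until the next one would exceed the context length. An illustrative guess at those semantics, not the repo's implementation (note a lone sequence longer than `max_len` still gets its own pack here):

```python
def pack_until_overflow(lengths, max_len):
    """Group sequence indices into packs whose total length stays <= max_len."""
    packs, current, used = [], [], 0
    for i, n in enumerate(lengths):
        if current and used + n > max_len:
            packs.append(current)  # current pack would overflow: close it
            current, used = [], 0
        current.append(i)
        used += n
    if current:
        packs.append(current)
    return packs

groups = pack_until_overflow([5, 3, 4, 6, 2], max_len=8)  # → [[0, 1], [2], [3, 4]]
```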

Commits on Sep 3, 2024

  1. Commit: 6a2053b

Commits on Sep 5, 2024

  1. Commit: 7548a8b

Commits on Sep 7, 2024

  1. Add intermediate_size to GPT-NeoX models (EleutherAI#1212)

    * Update transformer.py -> Add `intermediate_size`
    
    * add support for rwkv and mamba and add todos about swiglu
    
    * refactor activations and mlps
    
    * change llama config to swiglu
    
    * fixes gelu fusion
    
    * pre-commit run
    
    * add assert message to mamba linear
    
    * Update 1-3B.yml
    
    revert accidental change
    
    * Update 1-3B.yml
    
    * fixes various issues
    
    * add back swiglu check
    
    ---------
    
    Co-authored-by: jahatef <[email protected]>
    Co-authored-by: Quentin Anthony <[email protected]>
    Co-authored-by: Jacob Hatef <[email protected]>
    4 people authored Sep 7, 2024
    Commit: 0d4bdb9
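The `intermediate_size` commit above lets configs override the MLP width directly. A sketch of the conventional sizing it interacts with, assuming the common defaults (4·h for a plain GELU MLP; roughly 8/3·h rounded up to a multiple for SwiGLU, as in Llama) rather than NeoX's exact formula:

```python
def mlp_hidden_dim(hidden_size, intermediate_size=None, swiglu=False, multiple_of=256):
    """An explicit intermediate_size wins; otherwise fall back to conventional defaults."""
    if intermediate_size is not None:
        return intermediate_size
    if swiglu:
        # SwiGLU uses ~2/3 of the 4*h budget (it has two input projections),
        # rounded up to a hardware-friendly multiple.
        raw = int(8 * hidden_size / 3)
        return multiple_of * ((raw + multiple_of - 1) // multiple_of)
    return 4 * hidden_size
```

For `hidden_size=4096` this gives 16384 for the plain MLP and 11008 for SwiGLU, the familiar Llama-7B width.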

Commits on Sep 8, 2024

  1. Commit: ec82c05
  2. Add DPO training (EleutherAI#1242)

    * Add a chat data preprocessing script
    
    * add EOT at end of a chat
    
    * - add different packing impl (Unpacked, packing until overflow)
    - fix labels to also have valid/test implementations
    - fix label masking in _get_batch to also include anything from get_ltor_masks_and_position_ids
    
    * update README.md
    
    * - Add metrics to forward step to add DPO specific metrics that are useful (accuracy, etc)
    - Add reference model setup for DPO
    - Add pairwise dataset for positive/negative pairs
    - Add DPO loss
    
    * Update arguments.py to use train_label_data_paths instead of label_data_paths
    
    * - Bugfixes from upstreaming....
    
    * - add precompute logprobs...
    
    * - Finishing up precompute logprobs...
    
    * - update readme for DPO...
    
    * fix varname
    
    * Fix pipeline parallelism and incorrect neox_args name
    
    * apply precommit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    dmahan93 and Quentin-Anthony authored Sep 8, 2024
    Commit: 77e8158
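The DPO loss the commit message refers to has a standard scalar form: the negative log-sigmoid of the policy-vs-reference log-probability margin between the chosen and rejected responses. A minimal per-example sketch (β=0.1 is an assumed default; the repo's batched implementation will differ):

```python
import math

def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """-log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))."""
    margin = beta * ((pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At zero margin (policy identical to reference) the loss is log 2, and it falls as the policy raises the chosen response's log-probability relative to the reference, which is why the precomputed reference logprobs mentioned above are needed.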

Commits on Sep 9, 2024

  1. LayerNorm Refactor (EleutherAI#1269)

    * Add TE skeleton
    
    * Update NeoXArgs docs automatically
    
    * added option for te version of norms
    
    * import TERMSNorm
    
    * add te norm options to norm arg
    
    * add TE objects in weight decay function
    
    * reformat
    
    * add TERMSNorm and TELayerNorm
    
    * Update NeoXArgs docs automatically
    
    * - add Fused RMS Norm from apex
    
    * - make it consistent with how layernorm looks
    
    * Merged transformer engine and apex fused layernorm branches
    
    * Added assertion if TE is used
    
    * Removed unnecessary transformer-engine import
    
    * Changed importerror text for TE
    
    * Added requirements/requirements-transformerengine.txt
    
    * update comments
    
    * precommit
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    Co-authored-by: github-actions <[email protected]>
    Co-authored-by: lintangsutawika <lintang@stella-ord-0.stella-ord.tenant-eleutherai.svc.tenant.chi.local>
    Co-authored-by: lintangsutawika <[email protected]>
    Co-authored-by: dmahan93 <[email protected]>
    Co-authored-by: aurelion-source <[email protected]>
    Co-authored-by: aurelion-source <[email protected]>
    8 people authored Sep 9, 2024
    Commit: 836aefa
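The refactor above adds RMSNorm variants (TERMSNorm, fused Apex RMSNorm) alongside LayerNorm. The two differ only in mean-centering; a toy list version without the learnable gain and bias, just to show the relationship (not the fused kernels):

```python
import math

def layer_norm(xs, eps=1e-5):
    # Subtract the mean, then divide by the standard deviation.
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return [(x - mu) / math.sqrt(var + eps) for x in xs]

def rms_norm(xs, eps=1e-5):
    # No mean subtraction: normalize by the root-mean-square alone.
    rms = math.sqrt(sum(x * x for x in xs) / len(xs) + eps)
    return [x / rms for x in xs]

zs = [2.0, -2.0]  # zero-mean input: the two norms coincide
```

On zero-mean activations the two coincide, which is part of why RMSNorm works as a cheaper drop-in for pre-norm transformers.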
  2. Commit: 01e74f4
  3. TE Import Hotfix (EleutherAI#1272)

    * fix the te import
    
    * refactor get_params_for_weight_decay_optimization
    
    * remove incorrect type hint and dead imports
    Quentin-Anthony authored Sep 9, 2024
    Commit: 61a3daa
  4. Add Reward Model training (EleutherAI#1246)

    * Add a chat data preprocessing script
    
    * add EOT at end of a chat
    
    * - add different packing impl (Unpacked, packing until overflow)
    - fix labels to also have valid/test implementations
    - fix label masking in _get_batch to also include anything from get_ltor_masks_and_position_ids
    
    * update README.md
    
    * - Add metrics to forward step to add DPO specific metrics that are useful (accuracy, etc)
    - Add reference model setup for DPO
    - Add pairwise dataset for positive/negative pairs
    - Add DPO loss
    
    * Update arguments.py to use train_label_data_paths instead of label_data_paths
    
    * - Bugfixes from upstreaming....
    
    * - add precompute logprobs...
    
    * - Finishing up precompute logprobs...
    
    * - update readme for DPO...
    
    * - Add RM training
    
    * add comment on why row-parallel for RMs
    
    * fix var name
    
    ---------
    
    Co-authored-by: Quentin Anthony <[email protected]>
    dmahan93 and Quentin-Anthony authored Sep 9, 2024
    Commit: 1c72742
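Reward-model training on pairwise data typically minimizes a Bradley-Terry style objective: the negative log-sigmoid of the chosen-minus-rejected reward gap. A scalar sketch of that objective (not the repo's batched, row-parallel reward head):

```python
import math

def pairwise_rm_loss(reward_chosen, reward_rejected):
    """-log sigmoid(r_chosen - r_rejected): pushes the chosen reward above the rejected one."""
    gap = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-gap)))
```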
  5. Commit: bf8e78c
  6. Commit: 227967f
  7. Commit: 2a3513c
  8. Commit: 7609829
  9. Commit: 39f9142
  10. Add debugging logs

    Lothiraldan authored and Quentin-Anthony committed Sep 9, 2024
    Commit: 43ed6e8
  11. Fix typo

    Lothiraldan authored and Quentin-Anthony committed Sep 9, 2024
    Commit: 90c499a
  12. Fix typo

    Lothiraldan authored and Quentin-Anthony committed Sep 9, 2024
    Commit: 913f877
  13. Commit: a6bddd6
  14. Commit: 0468dae
  15. Commit: ef32d69
  16. precommit

    Quentin-Anthony committed Sep 9, 2024
    Commit: 976cd5d
  17. add comet config

    Quentin-Anthony committed Sep 9, 2024
    Commit: 962314e
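The `add comet config` commit presumably wires Comet settings into a NeoX YAML config file. A hypothetical fragment; every key name here is a guess and should be checked against the generated `neox_arguments.md` docs:

```yml
{
  # Hypothetical Comet logging section (key names unverified)
  "use_comet": true,
  "comet_workspace": "my-workspace",
  "comet_project": "neox-experiments",
  "comet_experiment_name": "comet-integration-test",
}
```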
  18. Commit: f0a4b70
  19. Commit: c6681b5
  20. Commit: 4f76e0d