FA3 kvcache + split kv + gqa parallelization #1236

Merged 106 commits on Oct 15, 2024

Commits on Sep 30, 2024

  1. Adding the flash3 kv cache API. Just compiling for now.
    KV cache functionality not added yet.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    9dbd114
  2. 7ee8ee4
  3. added cache_batch_idx.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    3fdf7ee
  4. adding python interface.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    84e31c2
  5. add test_kvcache.py.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    e16053a
  6. be0e36d
  7. 38ad0ac
  8. 57de4da
  9. add comparision with fa2.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    435f86d
  10. 74f160b
  11. 13bad55
  12. ccf5b9b
  13. fc8f704
  14. d2f049c
  15. fix o strides
    jayhshah committed Sep 30, 2024
    c6311e4
  16. 535b827
  17. fix some errors
    jayhshah committed Sep 30, 2024
    5704a1f
  18. add causal logic
    jayhshah committed Sep 30, 2024
    64a9cfb
  19. add to kv cache api
    jayhshah committed Sep 30, 2024
    a06f1f9
  20. 0c4cea9
  21. refactor for split kv
    jayhshah committed Sep 30, 2024
    0a1a0c2
  22. re-enable fp16/bf16 fwd
    jayhshah committed Sep 30, 2024
    1135dbd
  23. 68ff3f7
  24. 23bf5b0
  25. delete unused files
    jayhshah committed Sep 30, 2024
    ac19795
  26. add hid=64.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    1a5e40a
  27. change flash api for rebase
    jayhshah committed Sep 30, 2024
    c75c243
  28. e9db102
  29. change Element to OutputType for template param in combine kernel. Only matters for fp8 support
    jayhshah committed Sep 30, 2024
    9250969
  30. ec0130f
  31. revert OutputType change
    jayhshah committed Sep 30, 2024
    68b4bb9
  32. 9c97808
  33. added num_split_heuristics.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    ecc5c49
  34. update parameters
    jayhshah committed Sep 30, 2024
    f3e5bd4
  35. remove unused code
    jayhshah committed Sep 30, 2024
    78736b4
  36. add num_split_heuristics.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    e8c7b2e
  37. 75a6ce2
  38. cf5bd5c
  39. ac96c37
  40. add gqa decoding logic.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    099ca28
  41. f53703b
  42. recent version.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    a30863f
  43. more refactoring.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    ffa48eb
  44. 4a77193
  45. 3615696
  46. add reference from python.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    24b4b4f
  47. Adding another test case.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    c516d63
  48. add variable seqlen case.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    9a4941c
  49. all cases passed.
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    70ff847
  50. e36e004
  51. set correct tolerance limit
    ganeshcolfax authored and jayhshah committed Sep 30, 2024
    cd55fb3
  52. 2472e5e
  53. b5cac6d
  54. 6111666
  55. add Is_local back in
    jayhshah committed Sep 30, 2024
    be481ca
  56. prune unused code
    jayhshah committed Sep 30, 2024
    81d4024
  57. enable Is_local with fp8
    jayhshah committed Sep 30, 2024
    64a0a91
  58. update composable kernel
    jayhshah committed Sep 30, 2024
    0f560b7
  59. cffef15
  60. 2b840ef
  61. 5df67d2
  62. aa45d75
  63. Merge branch 'fa3-kvcache-gqa' of github.com:Dao-AILab/flash-attention into fa3-kvcache-gqa
    jayhshah committed Sep 30, 2024
    33f20a3
  64. f77d9f7

Commits on Oct 1, 2024

  1. 5e3864f
  2. remove deprecated fp8 code
    jayhshah committed Oct 1, 2024
    16eb1e5
  3. correct indent
    jayhshah committed Oct 1, 2024
    7940377
  4. 31c71e0
  5. 6bb1092

Commits on Oct 2, 2024

  1. add fp8 test case.
    ganeshcolfax committed Oct 2, 2024
    0085f04
  2. fix submodule
    jayhshah committed Oct 2, 2024
    eaf8898
  3. re-commiting.
    ganeshcolfax committed Oct 2, 2024
    a44596f
  4. Revert "re-commiting."
    This reverts commit a44596f.
    ganeshcolfax committed Oct 2, 2024
    c0c58ee
  5. b8f9dc2
  6. lower rtol for fp8 a bit
    jayhshah committed Oct 2, 2024
    49f1849
  7. separate gqa compilation
    jayhshah committed Oct 2, 2024
    bb230b8
  8. 03200a7

Commits on Oct 3, 2024

  1. 930c8ca
  2. add crude hdim 64 heuristic
    jayhshah committed Oct 3, 2024
    bc4b872

Commits on Oct 4, 2024

  1. fff4b5c
  2. aa0e699

Commits on Oct 7, 2024

  1. fix bug with fp8 q layout
    jayhshah committed Oct 7, 2024
    785d978

Commits on Oct 8, 2024

  1. 8fbefa8
  2. f0b4946

Commits on Oct 9, 2024

  1. 8f45a8c
  2. move IsRegToGmem
    jayhshah committed Oct 9, 2024
    4a4dbd2

Commits on Oct 10, 2024

  1. a075e76
  2. dc2c952
  3. e49cb5f

Commits on Oct 11, 2024

  1. d437d3d

Commits on Oct 12, 2024

  1. ab5d336
  2. unify rmem -> gmem methods
    jayhshah committed Oct 12, 2024
    7169b23
  3. uniform notation
    jayhshah committed Oct 12, 2024
    551b91f

Commits on Oct 14, 2024

  1. add rmem -> gmem for fp8
    jayhshah committed Oct 14, 2024
    eb9c0ee
  2. b0f067e
  3. refactor names
    jayhshah committed Oct 14, 2024
    35f3542
  4. remove test code
    jayhshah committed Oct 14, 2024
    8374e1f
  5. 1ecf821

Commits on Oct 15, 2024

  1. remove Is_batch_dynamic from seqlen traits and handle fp8 perf regression using smem boolean
    jayhshah committed Oct 15, 2024
    7c1473e
  2. c06cc0b
  3. a7cce59
  4. remove commented out code
    jayhshah committed Oct 15, 2024
    8efb953
  5. prune more dead code
    jayhshah committed Oct 15, 2024
    b3d60fa
  6. 50cb90a
  7. dec7dee
  8. remove some debug code
    jayhshah committed Oct 15, 2024
    9b6cba1