Pull requests: codeplaysoftware/cutlass-sycl

Pull requests list

Add flash attention prefill shapes to benchmarks
#330 opened Apr 25, 2025 by t4c1
Add BMG workflow
#328 opened Apr 24, 2025 by aacostadiaz Draft
Input alignment
#323 opened Apr 22, 2025 by t4c1
add gemm with rmsnorm
#321 opened Apr 22, 2025 by yuankuns
Rename cutlass components from PVC to Xe
#320 opened Apr 21, 2025 by aacostadiaz
add int8/tf32 transpose A copy traits
#319 opened Apr 21, 2025 by taozha2
Extend FlashAttention Prefill with KV cache
#318 opened Apr 19, 2025 by min-jean-cho
Implement Flash Decode for Xe hardware
#317 opened Apr 17, 2025 by muhammad-tanvir-1211
Define benchmarking input file for April release
#316 opened Apr 17, 2025 by joeatodd
generalize collective builder across more tile shapes
#315 opened Apr 17, 2025 by t4c1
Pure FP8 (W8A8) GEMM support (draft)
#306 opened Apr 14, 2025 by jiyang1011
Use Collective builder for benchmarks
#302 opened Apr 10, 2025 by FMarno
Update ICD
#298 opened Apr 8, 2025 by aacostadiaz
Add base framework for FlashAttention unit tests
#295 opened Apr 7, 2025 by aacostadiaz
Split cuda workflow
#292 opened Apr 3, 2025 by aacostadiaz Draft
Enable SM90 via sycl-cuda-compat
#276 opened Mar 24, 2025 by FMarno
Enable batch tests for streamK
#258 opened Mar 12, 2025 by aacostadiaz