Pull requests: codeplaysoftware/cutlass-sycl

Pull requests list

Add BMG workflow
#328 opened Apr 24, 2025 by aacostadiaz Draft
Fix IGC issue
#326 opened Apr 24, 2025 by aacostadiaz Draft
Fix failing unit tests on BMG
#324 opened Apr 22, 2025 by joeatodd
Input alignment
#323 opened Apr 22, 2025 by t4c1
add gemm with rmsnorm
#321 opened Apr 22, 2025 by yuankuns
Rename cutlass components from PVC to Xe
#320 opened Apr 21, 2025 by aacostadiaz
add int8/tf32 transpose A copy traits
#319 opened Apr 21, 2025 by taozha2
Extend FlashAttention Prefill with KV cache
#318 opened Apr 19, 2025 by min-jean-cho
Implement Flash Decode for Xe hardware
#317 opened Apr 17, 2025 by muhammad-tanvir-1211
Define benchmarking input file for April release
#316 opened Apr 17, 2025 by joeatodd
generalize collective builder across more tile shapes
#315 opened Apr 17, 2025 by t4c1
Pure FP8 (W8A8) GEMM support (draft)
#306 opened Apr 14, 2025 by jiyang1011
Use Collective builder for benchmarks
#302 opened Apr 10, 2025 by FMarno
Update ICD
#298 opened Apr 8, 2025 by aacostadiaz
Add base framework for FlashAttention unit tests
#295 opened Apr 7, 2025 by aacostadiaz
Split cuda workflow
#292 opened Apr 3, 2025 by aacostadiaz Draft
Enable SM90 via sycl-cuda-compat
#276 opened Mar 24, 2025 by FMarno
Enable batch tests for streamK
#258 opened Mar 12, 2025 by aacostadiaz