feat: llamafile_sgemm bias support #111

chenghuaWang · 2024-08-06T03:01:37Z

Tested on x86 and Arm devices using the Qwen1.5 model.

To further optimise or tidy the GEMM code, we could append a column of ones as the last column of $$X$$ and append $$\text{Bias}$$ as the last row of $$W$$ when allocating the tensors, provided they have the same precision. The bias add then becomes part of the matrix multiplication itself.
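
As a rough illustration of that idea (not the code in this PR), the sketch below checks on a tiny example that $$[X \mid \mathbf{1}]$$ times $$W$$ with $$\text{Bias}$$ stacked as its last row gives the same result as $$XW$$ followed by a separate bias add. It uses plain Python lists rather than the project's Tensor type, and the helper names `matmul`, `matmul_with_bias`, and `augment` are hypothetical. It also assumes $$W$$ is laid out as K x N, which may differ from the layout `llamafile_sgemm` expects.

```python
def matmul(a, b):
    """Plain matrix product of a (m x k) and b (k x n), both lists of rows."""
    k = len(b)
    n = len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def matmul_with_bias(x, w, bias):
    """Reference path: X @ W, then a separate per-column bias add."""
    return [[v + bias[j] for j, v in enumerate(row)] for row in matmul(x, w)]


def augment(x, w, bias):
    """Append a column of ones to X and the bias as the last row of W."""
    x_aug = [list(row) + [1.0] for row in x]
    w_aug = [list(row) for row in w] + [list(bias)]
    return x_aug, w_aug


# Small hand-picked inputs (illustrative values only).
x = [[1.0, 2.0], [3.0, 4.0]]                  # 2 x 2
w = [[0.5, -1.0, 2.0], [1.5, 0.0, 1.0]]       # 2 x 3
bias = [0.25, 0.5, -1.0]                      # one entry per output column

x_aug, w_aug = augment(x, w, bias)            # 2 x 3 and 3 x 3
fused = matmul(x_aug, w_aug)                  # bias folded into the GEMM
reference = matmul_with_bias(x, w, bias)      # GEMM plus explicit bias add
```

The trade-off is one extra column in $$X$$ and one extra row in $$W$$ at allocation time, in exchange for a GEMM kernel that needs no dedicated bias path. As the description notes, this only works cleanly when the bias and the weights share the same precision.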

chenghuaWang and others added 2 commits August 6, 2024 10:49

feat: llamafile_sgemm bias support

f1c519f

Merge branch 'main' into main

1ecb1df

yirongjie approved these changes Aug 6, 2024

yirongjie merged commit a1b451e into UbiquitousLearning:main Aug 6, 2024
1 check passed
