Commit
[auto] Sync version 2304290124.0.0+llamacpp-release.master-7fc50c0
== Relevant log messages from source repo:

commit 7fc50c051ae8a78e9643fdf172d12e20f2dd9b6c
Author: slaren <[email protected]>
Date:   Sat Apr 29 02:04:18 2023 +0200

    cuBLAS: use host pinned memory and dequantize while copying (#1207)

    * cuBLAS: dequantize simultaneously while copying memory
    * cuBLAS: use host pinned memory
    * cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory
    * cuBLAS: also pin kv cache
    * fix rebase

commit b1ee8f59b4101b46999a0995d9a34506f7285466
Author: Henri Vasserman <[email protected]>
Date:   Sat Apr 29 02:31:56 2023 +0300

    cuBLAS: non-contiguous tensor support (#1215)

    * Cuda: non-contiguous tensor support
    * remove extra stuff
    * rename
    * fix error
    * more fixes, now OpenBLAS and CLBlast build too
    * now then?

commit 36d19a603b221d1bd7897fcb10e823e2103b052d
Author: Stephan Walter <[email protected]>
Date:   Fri Apr 28 23:10:43 2023 +0000

    Remove Q4_3 which is no better than Q5 (#1218)