Comparing changes
base repository: ggml-org/llama.cpp
base: dae06c06e5c6232ae2be4d567dd5101e1e96c814
head repository: ggml-org/llama.cpp
compare: 64e64aa2557d97490b2fe1262b313e2f4a1607e3
Commits on Nov 20, 2023
- 40a34fe speculative : fix prompt tokenization in speculative example (#4025)
  * Support special tokens and not adding BOS to prompt in speculative
  * Adapt to new should_add_bos function
  * Ensure tgt and dft have same add_bos setting
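Why the add_bos check matters: if the draft model prepends BOS but the target does not (or vice versa), every drafted position is shifted by one and verification always fails. A minimal sketch of the tokenization step, assuming the helper names of that era (llama_should_add_bos_token, the llama_tokenize overload with a `special` flag); verify them against your headers:

```cpp
// Sketch: tokenize the prompt once, with a BOS setting both models agree on.
#include "common.h"
#include "llama.h"

#include <string>
#include <vector>

static std::vector<llama_token> tokenize_prompt(
        llama_context * ctx_tgt, llama_context * ctx_dft, const std::string & prompt) {
    const bool add_bos_tgt = llama_should_add_bos_token(llama_get_model(ctx_tgt));
    const bool add_bos_dft = llama_should_add_bos_token(llama_get_model(ctx_dft));

    // if the two vocabularies disagree on BOS, every drafted token would be
    // offset by one position relative to the target sequence
    GGML_ASSERT(add_bos_tgt == add_bos_dft && "tgt and dft models must agree on add_bos");

    // final argument enables parsing of special tokens embedded in the prompt
    return ::llama_tokenize(ctx_tgt, prompt, add_bos_tgt, true);
}
```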
- f23c035 ci : add flake8 to github actions (python linting) (#4129)
  Disabled rules:
  * E203 Whitespace before ':' - we often use a 'C' style where values are aligned
  * E211 Whitespace before '(' - we often use a 'C' style where values are aligned
  * E221 Multiple spaces before operator - we often use a 'C' style where values are aligned
  * E225 Missing whitespace around operator - broken so often it is effectively a standard
  * E231 Missing whitespace after ',', ';', or ':' - we often use a 'C' style where values are aligned
  * E241 Multiple spaces after ',' - we often use a 'C' style where values are aligned
  * E251 Unexpected spaces around keyword / parameter equals - broken so often it is effectively a standard
  * E261 At least two spaces before inline comment - broken so often it is effectively a standard
  * E266 Too many leading '#' for block comment - sometimes used as a "section" separator
  * E501 Line too long - broken so often it is effectively a standard
  * E701 Multiple statements on one line (colon) - only in convert.py when defining abstract methods (we can use # noqa instead)
  * E704 Multiple statements on one line - only in convert.py when defining abstract methods (we can use # noqa instead)
- 881800d main : Add ChatML functionality to main example (#4046)
  Co-authored-by: Sebastian Cramond <sebby37@users.noreply.github.com>
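For context, ChatML wraps each conversation turn in <|im_start|>/<|im_end|> sentinels. A small illustrative helper (not code from the commit) showing the prompt layout the new mode targets:

```cpp
// Illustrative only: how a ChatML prompt is assembled.
#include <string>

static std::string chatml_turn(const std::string & role, const std::string & content) {
    return "<|im_start|>" + role + "\n" + content + "<|im_end|>\n";
}

// usage: build the prompt, leaving the assistant turn open for generation
// std::string prompt = chatml_turn("system", "You are a helpful assistant.")
//                    + chatml_turn("user",   "Hello!")
//                    + "<|im_start|>assistant\n";
```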
- dfc7cd4 readme : update ROCm Windows instructions (#4122)
  Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
- 0b871f1
Commits on Nov 21, 2023
- 8e672ef
Commits on Nov 23, 2023
- ff8238f
- 9d5949f examples : fix typo in parallel example doc comment (#4181)
  Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
- d103d93
- 6b0a742 llama : KV cache view API + better KV cache management (#4170)
  * llama : keep track of used KV cells + better KV cache management
  * llama : zero the KV cache "used" count upon clear
  * llama : allow exporting a view of the KV cache (#4180)
    - allow dumping the sequences per cell in common
    - track the max contiguous cells value and position as well
    - fix the max contiguous empty cells index calculation
    - make the dump functions handle lengths or sequence counts > 10 better
    - fix an off-by-one error in dump_kv_cache_view
    - add doc comments for the KV cache view functions
    - eliminate the cell sequence struct; use llama_seq_id directly
  * common : add -dkvc arg for enabling KV cache dumps
  Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
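A short sketch of using the new view API (function and field names as introduced in llama.h by this change; treat them as assumptions against your revision):

```cpp
// Take a snapshot of KV cache occupancy and print a summary.
#include "llama.h"

#include <cstdio>

static void inspect_kv_cache(llama_context * ctx) {
    // track up to 4 sequence ids per cell
    llama_kv_cache_view view = llama_kv_cache_view_init(ctx, 4);

    llama_kv_cache_view_update(ctx, &view); // refresh the snapshot

    printf("cells: %d, used: %d, tokens: %d, max contiguous free run: %d (at %d)\n",
           view.n_cells, view.used_cells, view.token_count,
           view.max_contiguous, view.max_contiguous_idx);

    llama_kv_cache_view_free(&view);
}
```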
- 55978ce Fix incorrect format strings and uninitialized variables (#4133)
  * Fix incorrect format strings and uninitialized variables
  * Address comments
  * Add the missing include statement
Commits on Nov 24, 2023
- b35f3d0 readme : use PATH for Windows ROCm (#4195)
  * Update README.md to use PATH for Windows ROCm
- 2568a4b main.swift : fix eos checking (#4197)
  llama_token_eos(const struct llama_model *) was being passed a variable of type struct llama_context rather than the model.
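The fix in C terms (a minimal illustration; llama_get_model is the standard accessor for the model behind a context):

```cpp
#include "llama.h"

static llama_token get_eos(llama_context * ctx) {
    // wrong: llama_token_eos(ctx) - the function expects the model
    // right: fetch the model from the context first
    return llama_token_eos(llama_get_model(ctx));
}
```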
- 189d684
- 8a052c1 ggml-cuda : support stablelm rope (#4156)
  * ggml-cuda : support stablelm rope
  * remove unused freq_base kernel parameter
  * add n_dims parameter to llm_build_k_shift, default to n_rot via overload
  * llama : fix llm_build_k_shift args
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
- e9c13ff
Commits on Nov 25, 2023
- af19d35 server : OAI API compatibility (#4198)
  * Add an OpenAI-compatible POST /v1/chat/completions API endpoint to the server example
  * Improve the server README.md
  * Various code style fixes and cleanups from review
  * server : enable special tokens during tokenization by default
  * server : change the random string generator
  * Add a straightforward /v1/models endpoint
  Co-authored-by: kir-gadjello <111190790+kir-gadjello@users.noreply.github.com>
  Co-authored-by: Tobi Lütke <tobi@Tobis-MacBook-Pro.local>
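A quick way to exercise the new endpoint from C++ (a sketch using libcurl; the address and port assume the server example's defaults, and the JSON body is a minimal OpenAI-style request):

```cpp
// POST a chat completion request to a locally running server example.
#include <curl/curl.h>

int main() {
    CURL * curl = curl_easy_init();
    if (!curl) return 1;

    const char * body =
        "{ \"model\": \"default\","
        "  \"messages\": [ { \"role\": \"user\", \"content\": \"Hello!\" } ] }";

    struct curl_slist * hdrs = nullptr;
    hdrs = curl_slist_append(hdrs, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080/v1/chat/completions");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);

    // with no write callback set, libcurl prints the response to stdout
    const CURLcode res = curl_easy_perform(curl);

    curl_slist_free_all(hdrs);
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : 1;
}
```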
- 04814e7
- 3014b54
- f837c3a llama : grammar : reserve space in decode_utf8 (#4210)
  * reserve space for codepoints
  * improvement for the appended 0
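The function in question converts the grammar's UTF-8 input into a vector of codepoints; the win is a single up-front reserve sized to the worst case. A close sketch (illustrative, not the verbatim source):

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

static std::vector<uint32_t> decode_utf8(const char * src) {
    // sequence length by high nibble of the first byte
    static const int lookup[] = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4 };

    std::vector<uint32_t> result;
    // worst case: one codepoint per byte, plus the appended 0 sentinel -
    // this reserve is the optimization the commit adds
    result.reserve(strlen(src) + 1);

    const char * pos = src;
    while (*pos != 0) {
        const uint8_t first = static_cast<uint8_t>(*pos);
        const int     len   = lookup[first >> 4];
        uint32_t value = first & ((1 << (8 - len)) - 1);
        const char * end = pos + len;
        while (++pos < end && *pos != 0) {
            value = (value << 6) + (static_cast<uint8_t>(*pos) & 0x3F);
        }
        result.push_back(value);
    }
    result.push_back(0); // the grammar code expects a trailing 0

    return result;
}
```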
- 1ddb52e scripts : Use mmap in torch load (#4202)
  * Use mmap in torch load, prefer .bin files when loading
  * Revert .bin > .safetensors preference
Commits on Nov 26, 2023
- 22da055
- 922754a lookahead : add example for lookahead decoding (#4207)
  * lookahead : init
  * lookahead : generate and store n-grams
  * lookahead : use a loop instead of recursion to generate n-grams
  * lookahead : initial working implementation
  * lookahead : filter repeating n-grams
  * lookahead : use deterministic init
  * lookahead : add to Makefile
  * lookahead : fix a bug in the seq_id of the lookahead tokens
  * lookahead : add comments
  Co-authored-by: slaren <slarengh@gmail.com>
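To make the "generate and store n-grams" step concrete, here is a conceptual sketch of one way to pool candidate n-grams keyed by their first token, with the duplicate filtering the commit mentions (the types and names are hypothetical, not the example's actual code):

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using token_t = int32_t;

struct ngram_pool {
    // first token -> observed continuations (each n-1 tokens long)
    std::unordered_map<token_t, std::vector<std::vector<token_t>>> pool;

    void add(const std::vector<token_t> & ngram) {
        if (ngram.size() < 2) {
            return;
        }
        auto & conts = pool[ngram[0]];
        std::vector<token_t> cont(ngram.begin() + 1, ngram.end());
        for (const auto & c : conts) {
            if (c == cont) {
                return; // filter repeating n-grams
            }
        }
        conts.push_back(std::move(cont));
    }
};
```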
- 9656026
- 3e73d31
Commits on Nov 27, 2023
- f3b2698
- bb03290 examples : iOS example with swift ui (#4159)
  * copy to llama.cpp as subdir
  * attempt enabling metal, fails
  * ggml metal compiles!
  * Update README.md
  * initial conversion to new format, utf8 errors?
  * bug fixes, but now has an invalid memory access :(
  * added O3, now has insufficient memory access
  * begin sync with master
  * update to match latest code, new errors
  * fixed it!
  * fix for loop conditionals, increase result size
  * fix current workflow errors
  * attempt a llama.swiftui workflow
  * Update .github/workflows/build.yml
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
- 0dab8cd
- b38a16d cmake : fix issue with version info not getting baked into LlamaConfig.cmake (#3970)
  * Split CPP generation from build-info query
  * Remove blank lines
  * Add BUILD_SHARED_LIBS option
Commits on Nov 28, 2023
- 8406b09 ggml : re-enable BLAS for CPU when src0 != F32 + remove redundant full offload checks in llama.cpp (#4240)
  * ggml : use BLAS even if src0 is not F32
  * llama : use n_threads_batch only when n_tokens >= 32
  * llama : revert the n_threads_batch logic
- 64e64aa