Add plamo-2-1b model #1283

Open
mitmul wants to merge 15 commits into main from mitmul/add-plamo2-1b-support

Conversation

mitmul (Contributor) commented on Feb 13, 2025

This PR adds support for PLaMo-2-1B, the latest small language model (SLM) from Preferred Networks.

Currently, the model generates correct responses when I run the following code:

import mlx.core as mx
from mlx_lm.utils import load
from mlx_lm.models import cache

# trust_remote_code is required because the PLaMo-2 tokenizer and model
# definitions are shipped with the Hugging Face repository.
model, tokenizer = load(
    "../plamo-2-1b-mlx",
    tokenizer_config={
        "trust_remote_code": True,
    },
    model_config={
        "trust_remote_code": True,
    },
)

# "Let me introduce a recipe for making delicious curry."
user_input = "美味しいカレーの作り方のレシピを紹介します。"

input_ids = tokenizer.encode(user_input)
input_ids = mx.array(input_ids, dtype=mx.int64)

y = input_ids
prompt_cache = cache.make_prompt_cache(model)

# Greedy decoding: take the last position's logits, pick the argmax token,
# print it, and append it to the running sequence.
for _ in range(100):
    logits = model(inputs=y[None], cache=prompt_cache)
    logits = logits[:, -1, :]
    logprobs = logits - mx.logsumexp(logits, axis=-1, keepdims=True)
    next_y = mx.argmax(logprobs, axis=-1)

    response = tokenizer.decode(next_y.item())
    print(response, end="")
    y = mx.concatenate([y, next_y])

However, when I run the following command from my terminal:

python -m mlx_lm.generate \
--model plamo-2-1b-mlx \
--prompt '美味しいカレーの作り方のレシピを紹介します。' \
--verbose true \
--max-tokens 128 \
--ignore-chat-template

the outputs are totally corrupted.
I'm now investigating why mlx_lm.generate doesn't work for this model; my guess is that it is related to the cache implementation in this code.
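For reference, mlx_lm.generate decodes incrementally: the prompt goes through the model once, and every subsequent step feeds only the newly sampled token, so the single-token cache-update path is what gets exercised. A rough, simplified sketch of that pattern follows (not the actual generate_step implementation; the model path and configs are assumed from the snippet above):

import mlx.core as mx
from mlx_lm.utils import load
from mlx_lm.models import cache

# Load as in the snippet above.
model, tokenizer = load(
    "../plamo-2-1b-mlx",
    tokenizer_config={"trust_remote_code": True},
    model_config={"trust_remote_code": True},
)

prompt = mx.array(tokenizer.encode("美味しいカレーの作り方のレシピを紹介します。"))
prompt_cache = cache.make_prompt_cache(model)

# The prompt is consumed in one forward pass; after that, only the newly
# sampled token is fed, so each cached layer (attention KV, Mamba/conv state)
# is updated through its one-token path on every step.
y = prompt
for _ in range(128):
    logits = model(inputs=y[None], cache=prompt_cache)[:, -1, :]
    y = mx.argmax(logits, axis=-1)
    print(tokenizer.decode(y.item()), end="", flush=True)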

mitmul (Contributor, Author) commented on Feb 14, 2025

I found a bug in my code that performs the causal-conv1d update, related to the difference between the channel-first conv1d in torch and the channel-last conv1d in MLX. I fixed it and confirmed that the model now runs as expected, so I changed this PR's state to ready for review 🙏
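For context on the layout difference: PyTorch's conv1d is channel-first, operating on (batch, channels, length) tensors, while MLX's conv1d is channel-last, operating on (batch, length, channels), so any cached convolution state has to be kept and updated in the transposed layout. Below is a minimal sketch of a single-step causal-conv1d cache update in the channel-last convention; the helper name, shapes, and sizes are illustrative, not the PR's actual code:

import mlx.core as mx

def causal_conv1d_update(x_t, conv_state, weight, bias=None):
    """One decoding step of a depthwise causal conv1d in channel-last (MLX) layout.

    x_t:        (B, 1, C)  features of the new token
    conv_state: (B, K, C)  the last K inputs (a torch port would keep (B, C, K))
    weight:     (K, C)     per-channel (depthwise) kernel
    """
    # Shift the rolling window by one step and append the new input.
    conv_state = mx.concatenate([conv_state[:, 1:, :], x_t], axis=1)
    # A depthwise conv evaluated at a single position is a weighted sum over the window.
    y_t = (conv_state * weight[None]).sum(axis=1, keepdims=True)  # (B, 1, C)
    if bias is not None:
        y_t = y_t + bias
    return y_t, conv_state

# Tiny usage example with hypothetical sizes.
B, K, C = 1, 4, 8
state = mx.zeros((B, K, C))
w = mx.ones((K, C)) / K
y, state = causal_conv1d_update(mx.ones((B, 1, C)), state, w)
print(y.shape, state.shape)  # (1, 1, 8) (1, 4, 8)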

@mitmul mitmul marked this pull request as ready for review February 14, 2025 14:32
@mitmul mitmul changed the title [WIP] Add plamo-2-1b model Add plamo-2-1b model Feb 14, 2025
@mitmul mitmul force-pushed the mitmul/add-plamo2-1b-support branch from d3f9be0 to 1e75bf1 on February 15, 2025 01:12