Add plamo-2-1b model #1283

Open
mitmul wants to merge 15 commits into main from mitmul/add-plamo2-1b-support

Conversation

mitmul (Contributor) commented on Feb 13, 2025

This PR adds support for PLaMo-2-1B, the latest small language model (SLM) from Preferred Networks.

Currently, the model generates correct responses when I run the following code:

import mlx.core as mx
from mlx_lm.utils import load
from mlx_lm.models import cache

# trust_remote_code is required because the PLaMo-2 tokenizer and model
# definitions are shipped with the Hugging Face repository.
model, tokenizer = load(
    "../plamo-2-1b-mlx",
    tokenizer_config={
        "trust_remote_code": True,
    },
    model_config={
        "trust_remote_code": True,
    },
)

# "Let me introduce a recipe for making delicious curry."
user_input = "美味しいカレーの作り方のレシピを紹介します。"

input_ids = tokenizer.encode(user_input)
input_ids = mx.array(input_ids, dtype=mx.int64)

y = input_ids
prompt_cache = cache.make_prompt_cache(model)

# Greedy decoding: take the last position's logits, pick the argmax token,
# print it, and append it to the running sequence.
for _ in range(100):
    logits = model(inputs=y[None], cache=prompt_cache)
    logits = logits[:, -1, :]
    logprobs = logits - mx.logsumexp(logits, axis=-1, keepdims=True)
    next_y = mx.argmax(logprobs, axis=-1)

    response = tokenizer.decode(next_y.item())
    print(response, end="")
    y = mx.concatenate([y, next_y])

However, when I run the following command from my terminal:

python -m mlx_lm.generate \
--model plamo-2-1b-mlx \
--prompt '美味しいカレーの作り方のレシピを紹介します。' \
--verbose true \
--max-tokens 128 \
--ignore-chat-template

the outputs are totally corrupted.
I'm now investigating why mlx_lm.generate doesn't work for this model; my guess is that it is related to the cache implementation in this code.
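For reference, mlx_lm.generate decodes incrementally: the prompt goes through the model once, and every subsequent step feeds only the newly sampled token, so the single-token cache-update path is what gets exercised. A rough, simplified sketch of that pattern follows (not the actual generate_step implementation; the model path and configs are assumed from the snippet above):

import mlx.core as mx
from mlx_lm.utils import load
from mlx_lm.models import cache

# Load as in the snippet above.
model, tokenizer = load(
    "../plamo-2-1b-mlx",
    tokenizer_config={"trust_remote_code": True},
    model_config={"trust_remote_code": True},
)

prompt = mx.array(tokenizer.encode("美味しいカレーの作り方のレシピを紹介します。"))
prompt_cache = cache.make_prompt_cache(model)

# The prompt is consumed in one forward pass; after that, only the newly
# sampled token is fed, so each cached layer (attention KV, Mamba/conv state)
# is updated through its one-token path on every step.
y = prompt
for _ in range(128):
    logits = model(inputs=y[None], cache=prompt_cache)[:, -1, :]
    y = mx.argmax(logits, axis=-1)
    print(tokenizer.decode(y.item()), end="", flush=True)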

mitmul (Contributor, Author) commented on Feb 14, 2025

I found a bug in my code that performs the causal-conv1d update, related to the difference between the channel-first conv1d in torch and the channel-last conv1d in MLX. I fixed it and confirmed that the model now runs as expected, so I changed this PR's state to ready for review 🙏
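For context on the layout difference: PyTorch's conv1d is channel-first, operating on (batch, channels, length) tensors, while MLX's conv1d is channel-last, operating on (batch, length, channels), so any cached convolution state has to be kept and updated in the transposed layout. Below is a minimal sketch of a single-step causal-conv1d cache update in the channel-last convention; the helper name, shapes, and sizes are illustrative, not the PR's actual code:

import mlx.core as mx

def causal_conv1d_update(x_t, conv_state, weight, bias=None):
    """One decoding step of a depthwise causal conv1d in channel-last (MLX) layout.

    x_t:        (B, 1, C)  features of the new token
    conv_state: (B, K, C)  the last K inputs (a torch port would keep (B, C, K))
    weight:     (K, C)     per-channel (depthwise) kernel
    """
    # Shift the rolling window by one step and append the new input.
    conv_state = mx.concatenate([conv_state[:, 1:, :], x_t], axis=1)
    # A depthwise conv evaluated at a single position is a weighted sum over the window.
    y_t = (conv_state * weight[None]).sum(axis=1, keepdims=True)  # (B, 1, C)
    if bias is not None:
        y_t = y_t + bias
    return y_t, conv_state

# Tiny usage example with hypothetical sizes.
B, K, C = 1, 4, 8
state = mx.zeros((B, K, C))
w = mx.ones((K, C)) / K
y, state = causal_conv1d_update(mx.ones((B, 1, C)), state, w)
print(y.shape, state.shape)  # (1, 1, 8) (1, 4, 8)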

@mitmul mitmul marked this pull request as ready for review February 14, 2025 14:32
@mitmul mitmul changed the title [WIP] Add plamo-2-1b model Add plamo-2-1b model Feb 14, 2025
@mitmul mitmul force-pushed the mitmul/add-plamo2-1b-support branch from d3f9be0 to 1e75bf1 on February 15, 2025 01:12