
Works great on command line, but unable to use via python #15

Open
regstuff opened this issue Mar 2, 2024 · 3 comments

Comments

regstuff commented Mar 2, 2024

Hi, thanks for the great work. I hope this gets merged into llama.cpp, but until then I'm able to get things working on the command line. However, when running the Python example, I get this error:

FileNotFoundError: Shared library with base name "bert" not found

Am I missing a package? I did the pip install requirements step, so I'm not sure what I'm getting wrong.

EDIT 1: Just noticed this has been merged into llama.cpp. For some reason I get an error when loading the model there:

llama_model_load: error loading model: error loading model hyperparameters: key not found in model: bert.context_length

This GGUF was converted using bert.cpp. Does the original model have to be converted through llama.cpp instead?

EDIT 2: I see there's an issue with the embeddings implementation in llama.cpp

I also tried converting the model using llama.cpp's convert.py, but got this error:

Loading model file /home/sravanth/vecsearch/UAE-Large-V1/model.safetensors
Traceback (most recent call last):
  File "/home/sravanth/llama.cpp/convert.py", line 1483, in <module>
    main()
  File "/home/sravanth/llama.cpp/convert.py", line 1430, in main
    params = Params.load(model_plus)
  File "/home/sravanth/llama.cpp/convert.py", line 317, in load
    params = Params.loadHFTransformerJson(model_plus.model, hf_config_path)
  File "/home/sravanth/llama.cpp/convert.py", line 256, in loadHFTransformerJson
    f_norm_eps        = config["rms_norm_eps"],
KeyError: 'rms_norm_eps'
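The KeyError likely comes from convert.py assuming a LLaMA-style config: it indexes the rms_norm_eps key directly, while BERT-family configs (like UAE-Large-V1's) carry layer_norm_eps instead. A minimal sketch of the mismatch (the config values here are made up for illustration):

```python
# Illustrative configs (values are made up): a LLaMA-style config carries
# "rms_norm_eps", while BERT-family configs carry "layer_norm_eps" instead.
llama_config = {"rms_norm_eps": 1e-05}
bert_config = {"layer_norm_eps": 1e-12}

def load_norm_eps(config):
    # convert.py does the equivalent of config["rms_norm_eps"], so a
    # BERT config raises KeyError: 'rms_norm_eps'
    return config["rms_norm_eps"]

print(load_norm_eps(llama_config))  # 1e-05
try:
    load_norm_eps(bert_config)
except KeyError as err:
    print("KeyError:", err)         # KeyError: 'rms_norm_eps'
```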
iamlemec (Owner) commented Mar 2, 2024

Hi @regstuff! You'll want to use the convert-hf-to-gguf.py script in llama.cpp for embedding models. Sorry, this isn't well documented at the moment. Even then, there's one issue with UAE-Large-V1 that it can't currently handle: the model doesn't specify a default pooling strategy. From the repo, it looks like you're meant to choose the pooling type at runtime, and the paper tries out several types.

I'll try to get that conversion script working soon, but in the meantime, in BertModel, you can replace the set_gguf_parameters function with:

def set_gguf_parameters(self):
    super().set_gguf_parameters()
    self.gguf_writer.add_causal_attention(False)

    # locate the sentence-transformers pooling module, if any
    pooling_path = None
    if (self.dir_model / "modules.json").is_file():
        with open(self.dir_model / "modules.json", encoding="utf-8") as f:
            modules = json.load(f)
        for mod in modules:
            if mod["type"] == "sentence_transformers.models.Pooling":
                pooling_path = mod["path"]
                break

    # map the pooling config to a gguf pooling type (default: CLS)
    if pooling_path is not None:
        with open(self.dir_model / pooling_path / "config.json", encoding="utf-8") as f:
            pooling = json.load(f)
        if pooling["pooling_mode_mean_tokens"]:
            pooling_type = gguf.PoolingType.MEAN
        elif pooling["pooling_mode_cls_token"]:
            pooling_type = gguf.PoolingType.CLS
        else:
            raise NotImplementedError("Only MEAN and CLS pooling types supported")
    else:
        pooling_type = gguf.PoolingType.CLS

    self.gguf_writer.add_pooling_type(pooling_type.value)

This will default to CLS pooling when no pooling config is found (see the third-last line above). Hopefully we'll get the ability to choose a pooling strategy at runtime in llama.cpp soon.

regstuff (Author) commented Mar 3, 2024

Hi @iamlemec
Thanks for the reply. Will give this a shot with llama.cpp.
Any clue where I'm going wrong with my first point? Pasting it here for your convenience: when running the Python example from the README in this repo, I get this error:

FileNotFoundError: Shared library with base name "bert" not found

Am I missing a package? I cloned the repo and pip-installed the requirements, so I'm not sure what I'm getting wrong.
It works fine on the command line, by the way.

arlyle commented Mar 6, 2024

EDIT: Apologies, this solution applies to an outdated commit (47cb93d); the project structure has since changed.

@regstuff, follow these steps: (1) Build the project to get the lib file. On macOS it's build/libbert.dylib (libbert.so on Linux). (2) Place build/libbert.dylib in the same directory as bert.py. (3) Update the bert.py script:

-LIB_PATH = os.path.join(LIB_DIR, 'build/libbert.so')
+LIB_PATH = os.path.join(LIB_DIR, 'libbert.dylib')
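The failing lookup can be sketched like this (a hypothetical helper, not the repo's actual loader): the script searches for platform-specific filenames built from the base name "bert", which is why the built library has to sit exactly where the script looks.

```python
from pathlib import Path

def find_shared_library(base: str, search_dir: Path) -> Path:
    # Platform-specific shared-library names for a base name like "bert":
    # libbert.so (Linux), libbert.dylib (macOS), bert.dll (Windows).
    candidates = [f"lib{base}.so", f"lib{base}.dylib", f"{base}.dll"]
    for name in candidates:
        candidate = search_dir / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f'Shared library with base name "{base}" not found')
```

Building the project and copying libbert.dylib (or libbert.so) next to bert.py makes this kind of lookup succeed, which is what steps (1)-(3) above accomplish.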
