
Unable to execute the generate.py script - You need to have sentencepiece installed to convert a slow tokenizer to a fast one. #4

Closed
nnganesha opened this issue Apr 29, 2024 · 2 comments

Comments


nnganesha commented Apr 29, 2024

Hi,

I first ran the command python generate.py --base_model 'budecosystem/genz-13b-v2'.

For this, I got the below error:

ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.

I searched online and found a suggested fix: adding the two offload parameters (the last two lines below) to the from_pretrained call.

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=load_8bit,
    torch_dtype=torch.float16,
    device_map="auto",
    offload_folder="offload",
    offload_state_dict=True,
)
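As a side note, a tiny helper can make the offload path explicit: accelerate writes the offloaded weight shards into offload_folder, so creating the directory up front surfaces permission problems before the slow model load. (prepare_offload_dir is a hypothetical helper name, not part of transformers or accelerate.)

```python
import pathlib

# Hypothetical helper (not part of transformers/accelerate): create the
# offload directory before loading so permission problems fail fast and
# the shard location is explicit.
def prepare_offload_dir(path="offload"):
    p = pathlib.Path(path)
    p.mkdir(parents=True, exist_ok=True)  # no-op if it already exists
    return str(p.resolve())

# Usage sketch: pass offload_folder=prepare_offload_dir("offload")
# to AutoModelForCausalLM.from_pretrained(...)
```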

After making this change and running the same command, I get the error below.

python generate.py --base_model 'budecosystem/genz-13b-v2'
/Users/userid/Library/Python/3.9/lib/python/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: urllib3/urllib3#3020
warnings.warn(
Loading checkpoint shards: 100%

WARNING:root:Some parameters are on the meta device device because they were offloaded to the disk.
Traceback (most recent call last):
File "/Users/userid/BudEcosystem/GenZ/generate.py", line 107, in <module>
fire.Fire(main)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/Users/userid/BudEcosystem/GenZ/generate.py", line 37, in main
tokenizer = AutoTokenizer.from_pretrained(base_model)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/transformers/models/auto/tokenization_auto.py", line 862, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/transformers/tokenization_utils_base.py", line 2089, in from_pretrained
return cls._from_pretrained(
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/transformers/tokenization_utils_base.py", line 2311, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 124, in __init__
super().__init__(
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/transformers/tokenization_utils_fast.py", line 120, in __init__
raise ValueError(
ValueError: Couldn't instantiate the backend tokenizer from one of:

(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

Is this related to the RAM/CPU of my MacBook? Can someone help, please?

Regards,
Ganesh.


nnganesha commented Apr 29, 2024

Update: the error above was resolved by installing sentencepiece via pip. I then got an error about protobuf; after installing protobuf, I get the error below.

Traceback (most recent call last):
File "/Users/userid/BudEcosystem/GenZ/generate.py", line 107, in <module>
fire.Fire(main)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/Users/userid/Library/Python/3.9/lib/python/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/Users/userid/BudEcosystem/GenZ/generate.py", line 97, in main
gr.inputs.Textbox(
AttributeError: module 'gradio' has no attribute 'inputs'.
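Besides pinning the version, the call site can be made tolerant of both API generations. A minimal sketch, assuming newer gradio releases removed the gr.inputs namespace and expose components such as Textbox at the top level; make_textbox and the SimpleNamespace stand-ins are hypothetical, not real gradio objects.

```python
from types import SimpleNamespace

# Version-tolerant wrapper (hypothetical, for illustration): use the legacy
# gr.inputs.Textbox when the installed gradio still ships it, otherwise fall
# back to the top-level gr.Textbox.
def make_textbox(gr_module, **kwargs):
    legacy = getattr(gr_module, "inputs", None)
    if legacy is not None and hasattr(legacy, "Textbox"):
        return legacy.Textbox(**kwargs)
    return gr_module.Textbox(**kwargs)

# Stand-ins for the two library generations (not real gradio modules):
old_gr = SimpleNamespace(inputs=SimpleNamespace(Textbox=lambda **kw: ("legacy", kw)))
new_gr = SimpleNamespace(Textbox=lambda **kw: ("modern", kw))
```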

@nnganesha

Finally got it running by downgrading gradio to 3.50:
pip install gradio==3.50

tloen/alpaca-lora#605

Thanks to the link above.
