
Error while performing Inference with llama-3.2 3B checkpoint #1749

Vattikondadheeraj opened this issue Oct 3, 2024 · 2 comments

@Vattikondadheeraj

Hey,

I am trying to perform inference with a llama-3.2 3B Instruct checkpoint, but I am running into errors. I haven't gone deep into the code or tried to debug yet; I'm posting here in the hope of a quick resolution. I've attached the error below.

DEBUG:torchtune.utils._logging:Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
Traceback (most recent call last):
  File "/home/toolkit/.conda/envs/torch/bin/tune", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/tune.py", line 49, in main
    parser.run(args)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/tune.py", line 43, in run
    args.func(args)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/run.py", line 187, in _run_cmd
    self._run_single_device(args)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/run.py", line 96, in _run_single_device
    runpy.run_path(str(args.recipe), run_name="__main__")
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/recipes/generate.py", line 211, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/config/_parse.py", line 99, in wrapper
    sys.exit(recipe_main(conf))
             ^^^^^^^^^^^^^^^^^
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/recipes/generate.py", line 206, in main
    recipe.setup(cfg=cfg)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/recipes/generate.py", line 55, in setup
    self._model = self._setup_model(
                  ^^^^^^^^^^^^^^^^^^
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/recipes/generate.py", line 73, in _setup_model
    model.load_state_dict(model_state_dict)
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TransformerDecoder:
        Unexpected key(s) in state_dict: "output.weight".

Here's my config file:

# Config for running the InferenceRecipe in generate.py to generate output from an LLM
#
# To launch, run the following command from root torchtune directory:
#    tune run generate --config generation

# Model arguments
model:
  _component_: torchtune.models.llama3_2.llama3_2_3b

checkpointer:
  _component_: torchtune.training.FullModelMetaCheckpointer
  checkpoint_dir: /home/toolkit/scratch/LLMcode/Checkpoints/Fine_tuning_models-3B/output-0.5
  checkpoint_files: [
    meta_model_0.pt,
  ]
  output_dir: /tmp/Llama-2-7b-hf/
  model_type: LLAMA2

device: cuda
dtype: bf16

seed: 1234

# Tokenizer arguments
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: /home/toolkit/scratch/LLMcode/Train/llama-3.2-3B/original/tokenizer.model
  max_seq_len: null

# Generation arguments; defaults taken from gpt-fast
prompt: "Hey, tell me a joke"
instruct_template: null
chat_format: null
max_new_tokens: 300
temperature: 0.0 # 0.8 and 0.6 are popular values to try
top_k: 300

enable_kv_cache: False

quantizer: null
@SalmanMohammadi
Collaborator

Hey @Vattikondadheeraj. One quick thing to try: could you change to model_type: LLAMA3?

I'd also recommend giving our dev/generate_v2 recipe a try, which we'll be switching to as our default generation recipe soon. There's an example config in recipes/configs/llama2/generation_v2.yaml.
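
For reference, your checkpointer block with just that one change (paths and filenames kept exactly as in your config):

checkpointer:
  _component_: torchtune.training.FullModelMetaCheckpointer
  checkpoint_dir: /home/toolkit/scratch/LLMcode/Checkpoints/Fine_tuning_models-3B/output-0.5
  checkpoint_files: [
    meta_model_0.pt,
  ]
  output_dir: /tmp/Llama-2-7b-hf/
  model_type: LLAMA3

The 3B model ties its output projection to the token embeddings, so my guess is that converting the checkpoint under model_type: LLAMA2 leaves an output.weight entry behind that the model doesn't expect.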

@Vattikondadheeraj
Author

Hey @SalmanMohammadi, I tried the generate_v2 recipe, but I'm getting another error, which I guess is related to something missing from the config file. I've attached the error below:

Traceback (most recent call last):
  File "/home/toolkit/.conda/envs/torch/bin/tune", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/tune.py", line 49, in main
    parser.run(args)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/tune.py", line 43, in run
    args.func(args)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/run.py", line 187, in _run_cmd
    self._run_single_device(args)
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/_cli/run.py", line 96, in _run_single_device
    runpy.run_path(str(args.recipe), run_name="__main__")
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "generate_v2.py", line 247, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/toolkit/scratch/LLMcode/Train/torchtune-2/torchtune/torchtune/config/_parse.py", line 99, in wrapper
    sys.exit(recipe_main(conf))
             ^^^^^^^^^^^^^^^^^
  File "generate_v2.py", line 235, in main
    recipe = InferenceRecipe(cfg=cfg)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "generate_v2.py", line 78, in __init__
    self._logger = utils.get_logger(cfg.log_level)
                                    ^^^^^^^^^^^^^
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/dictconfig.py", line 355, in __getattr__
    self._format_and_raise(
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/dictconfig.py", line 351, in __getattr__
    return self._get_impl(
           ^^^^^^^^^^^^^^^
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/dictconfig.py", line 442, in _get_impl
    node = self._get_child(
           ^^^^^^^^^^^^^^^^
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/basecontainer.py", line 73, in _get_child
    child = self._get_node(
            ^^^^^^^^^^^^^^^
  File "/home/toolkit/.conda/envs/torch/lib/python3.11/site-packages/omegaconf/dictconfig.py", line 480, in _get_node
    raise ConfigKeyError(f"Missing key {key!s}")
omegaconf.errors.ConfigAttributeError: Missing key log_level
    full_key: log_level
    object_type=dict
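
From the traceback, the recipe reads cfg.log_level in its __init__ (utils.get_logger(cfg.log_level)), so I'm guessing my config just needs a top-level entry along these lines (the value is my assumption; any standard logging level name should do):

log_level: INFO  # assumed value, e.g. DEBUG / INFO / WARNING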
