OLMo 0724-hf checkpoints don't load the proper config when instantiating with OLMoForCausalLM
#689
Labels: type/bug (an issue about a bug)
🐛 Describe the bug
Hi, when I attempt to load an HF checkpoint as follows, there seems to be a config mismatch that prevents the checkpoint from loading. (In general I'm not sure I understand the difference between the models ending in -hf and those that are not, but I'd like to use intermediate checkpoints, which are currently only released for the 0424-hf model.) It seems to be loading the OlmoConfig for a much smaller model.
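A minimal sketch of the kind of call that hits this, assuming the hf_olmo OLMoForCausalLM class and a 0724-hf model ID (the exact snippet from the original attempt is not reproduced here):

```python
# Minimal sketch, not the original repro; the model ID below is an assumption.
from hf_olmo import OLMoForCausalLM  # OLMo's own modeling class from the ai2-olmo package

# Pointing the hf_olmo class at a transformers-format "-hf" checkpoint is what
# appears to pull in an OlmoConfig for a much smaller model than the
# checkpoint actually contains, so loading fails.
model = OLMoForCausalLM.from_pretrained("allenai/OLMo-1B-0724-hf")
```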
Note that everything works as expected for the following commands:
I am assuming this has to do with the warning message "You are using a model of type olmo to instantiate a model of type hf_olmo. This is not supported for all configurations of models and can yield errors." Notably, though, I get the same warning message when running the working command above, which does seemingly load the model.
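For context on the -hf vs. non -hf distinction mentioned above, the two documented loading paths look roughly like this (the model IDs here are illustrative assumptions, not the commands from this report):

```python
from transformers import AutoModelForCausalLM
from hf_olmo import OLMoForCausalLM

# "-hf" repos are converted to transformers' native "olmo" format, so they are
# meant to be loaded through transformers itself.
olmo_hf = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0724-hf")

# Repos without "-hf" keep the original hf_olmo format and load through the
# OLMo package's own class.
olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-7B")
```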
Versions
Python 3.12.4
ai2-olmo==0.4.0
ai2-olmo-core==0.1.0
aiohappyeyeballs==2.3.4
aiohttp==3.10.0
aiosignal==1.3.1
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
attrs==23.2.0
boto3==1.34.152
botocore==1.34.152
cached_path==1.6.3
cachetools==5.4.0
certifi==2024.7.4
charset-normalizer==3.3.2
contourpy==1.2.1
cycler==0.12.1
datasets==2.20.0
dill==0.3.8
filelock==3.13.4
fonttools==4.53.1
frozenlist==1.4.1
fsspec==2024.5.0
google-api-core==2.19.1
google-auth==2.32.0
google-cloud-core==2.4.1
google-cloud-storage==2.18.0
google-crc32c==1.5.0
google-resumable-media==2.7.1
googleapis-common-protos==1.63.2
huggingface-hub==0.23.5
idna==3.7
importlib_resources==6.4.0
Jinja2==3.1.4
jmespath==1.0.1
jsonlines==4.0.0
kiwisolver==1.4.5
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.1
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
networkx==3.3
numpy==2.0.1
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.6.20
nvidia-nvtx-cu12==12.1.105
omegaconf==2.3.0
packaging==24.1
pandas==2.2.2
pillow==10.4.0
proto-plus==1.24.0
protobuf==5.27.3
pyarrow==17.0.0
pyarrow-hotfix==0.6
pyasn1==0.6.0
pyasn1_modules==0.4.0
pydantic==2.8.2
pydantic_core==2.20.1
Pygments==2.18.0
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
regex==2024.7.24
requests==2.32.3
rich==13.7.1
rsa==4.9
s3transfer==0.10.2
safetensors==0.4.3
setuptools==72.1.0
six==1.16.0
sympy==1.13.1
tokenizers==0.19.1
torch==2.3.1
tqdm==4.66.4
transformers==4.43.3
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
wheel==0.43.0
xxhash==3.4.1
yarl==1.9.4