[Install issue]: Chromadb and transformers require different versions of tokenizer. #3265

Closed
ayaazuddin opened this issue Dec 8, 2024 · 7 comments

@ayaazuddin

ayaazuddin commented Dec 8, 2024

What happened?

image

Upgrading tokenizers then gives me the same warning, this time for chromadb.

Versions

chromadb-0.5.23
pip 24.3.1
python 3.12

Relevant log output

No response

ayaazuddin added the "installation trouble" label Dec 8, 2024
@Cirr0e

Cirr0e commented Dec 17, 2024

Hey there! I see you're running into a dependency conflict with tokenizers. This is actually a known issue when using newer versions of chromadb with certain package combinations. Let me help you resolve this.

Here's what I recommend:

  1. First, create a fresh virtual environment with Python 3.10 (it has better compatibility with these packages):
python3.10 -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows
  2. Install the packages in this specific order:
pip install --upgrade pip
pip install tokenizers==0.22.0
pip install transformers
pip install chromadb==0.5.23

If you absolutely need to use Python 3.12, you can try this alternative approach:

pip install chromadb==0.5.23 --no-deps
pip install tokenizers==0.22.0
pip install -r requirements.txt  # if you have one

Important Notes:

  • Make sure to backup any existing data before making these changes
  • You might need to reinstall other dependencies after this
  • If you're using transformers, you may need to pin its version as well

The reason this happens is that chromadb and transformers both depend on tokenizers but sometimes require different versions. By installing tokenizers first with a compatible version, we can avoid the conflict.

Let me know if you run into any issues with these steps, and I'll help you troubleshoot further!

References:

  • Similar issue was resolved in chromadb issues with version pinning
  • Tokenizers compatibility matrix from their documentation

Would you like me to provide more specific guidance about any of these steps?

@sorgfresser

Pretty sure it is fixed with #3322

@Vraised3

I have been facing the same issue for the past few days, two weeks after this was posted.

I got the following errors with a plain `pip install chromadb`:

chromadb 0.5.23 requires tokenizers<=0.20.3,>=0.13.2
transformers 4.47.1 requires tokenizers<0.22,>=0.21

This makes it nearly impossible to use chromadb.
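
To see why no single tokenizers version can satisfy both requirements quoted above, the two ranges can be compared directly. This is only a minimal sketch, not how pip resolves dependencies: the helpers `parse` and `satisfies` are made up for illustration, they handle plain numeric versions only, and the candidate versions are just a few examples.

```python
import operator

# Comparison operators, two-character symbols first so "<=" is matched before "<".
OPS = [("<=", operator.le), (">=", operator.ge), ("==", operator.eq),
       ("<", operator.lt), (">", operator.gt)]


def parse(version: str, width: int = 3) -> tuple:
    """Turn '0.21' into (0, 21, 0). Plain numeric releases only."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec such as '<0.22,>=0.21'."""
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPS:
            if clause.startswith(symbol):
                if not compare(parse(version), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


# Ranges copied from the pip error messages above.
CHROMADB_0_5_23 = "<=0.20.3,>=0.13.2"
TRANSFORMERS_4_47_1 = "<0.22,>=0.21"

# Illustrative candidates only, not an exhaustive list of releases.
for candidate in ["0.13.2", "0.20.3", "0.21.0"]:
    print(candidate,
          "chromadb ok:", satisfies(candidate, CHROMADB_0_5_23),
          "transformers ok:", satisfies(candidate, TRANSFORMERS_4_47_1))
```

chromadb 0.5.23 caps tokenizers at 0.20.3 while transformers 4.47.1 needs at least 0.21, so every candidate fails one of the two checks.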

Tried @Cirr0e's solution:

pip install --upgrade pip
pip install tokenizers==0.22.0
pip install transformers
pip install chromadb==0.5.23

Didn't work:
transformers 4.47.1 requires tokenizers<0.22,>=0.21, but you have tokenizers 0.20.3 which is incompatible.

Maybe this requires downgrading transformers to a version whose tokenizers requirement overlaps with chromadb's.

If anyone knows such a version, please comment.

@headbug

headbug commented Dec 25, 2024

I'm confused: as of right now, when I look at the tokenizers project page, the latest release is still 0.21.0 (https://pypi.org/project/tokenizers/0.21.0/)

Why is 0.22.0 put down in the solution above?!

@Dzalhaqi

Dzalhaqi commented Dec 27, 2024

I fixed this by downgrading transformers to version 4.45.0:
pip install transformers==4.45.0

My tokenizers version is 0.20.3 and my chromadb version is 0.5.23.

@Vraised3

Vraised3 commented Dec 27, 2024

Yeah, for me the following config works for now:

pip install chromadb==0.5.23
pip uninstall transformers
pip uninstall tokenizers
pip install transformers==4.46.1
pip install tokenizers==0.20.3
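
After applying a workaround like the ones above, it can help to confirm what actually ended up installed. The following is a rough sketch using the standard library's `importlib.metadata`. The helper names are hypothetical, and the `startswith("tokenizers")` filter is a simple heuristic, not a full requirement parser.

```python
from importlib.metadata import PackageNotFoundError, requires, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def tokenizers_requirements(package: str) -> list:
    """Return the declared requirement strings of `package` that name tokenizers."""
    try:
        declared = requires(package) or []
    except PackageNotFoundError:
        return []
    # Heuristic: requirement strings start with the distribution name.
    return [req for req in declared if req.lower().startswith("tokenizers")]


for name in ("chromadb", "transformers", "tokenizers"):
    print(name, installed_version(name), tokenizers_requirements(name))
```

Running `pip check` in the same environment reports any remaining incompatible requirements.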

@itaismith
Contributor

Hi all, we no longer have an upper bound on tokenizers in Chroma v0.6.0. You can use any tokenizers>=0.13.2.
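
As a quick sanity check of what the relaxed bound means in practice, the sketch below tests one tokenizers version against both requirements. It assumes the transformers 4.47.1 range quoted earlier in the thread (`<0.22,>=0.21`) and uses a made-up helper, `as_tuple`, that only handles plain numeric versions.

```python
def as_tuple(version: str) -> tuple:
    """'0.21.0' -> (0, 21, 0); plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


# Latest tokenizers release mentioned in this thread.
candidate = as_tuple("0.21.0")

# Chroma v0.6.0: tokenizers>=0.13.2, with no upper bound.
ok_for_chroma = candidate >= as_tuple("0.13.2")

# transformers 4.47.1: tokenizers<0.22,>=0.21 (from the pip error earlier in the thread).
ok_for_transformers = as_tuple("0.21.0") <= candidate < as_tuple("0.22.0")

print(ok_for_chroma and ok_for_transformers)  # True
```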
