Support free-threaded Python and ship 3.13t wheels #1767

Open
ngoldbaum opened this issue Apr 21, 2025 · 6 comments · May be fixed by #1774
Comments

@ngoldbaum
It looks like when I build tokenizers on a free-threaded Python, the build chokes because of the dependency on rust-numpy 0.23.

If I manually update the dependencies on pyo3 and rust-numpy:

diff --git a/bindings/python/Cargo.toml b/bindings/python/Cargo.toml
index 6e8b0c34..3dfdf2ca 100644
--- a/bindings/python/Cargo.toml
+++ b/bindings/python/Cargo.toml
@@ -14,8 +14,8 @@ serde = { version = "1.0", features = ["rc", "derive"] }
 serde_json = "1.0"
 libc = "0.2"
 env_logger = "0.11"
-pyo3 = { version = "0.23", features = ["abi3", "abi3-py39", "py-clone"] }
-numpy = "0.23"
+pyo3 = { version = "0.24", features = ["abi3", "abi3-py39", "py-clone"] }
+numpy = "0.24"
 ndarray = "0.16"
 itertools = "0.12"
 
@@ -24,7 +24,7 @@ path = "../../tokenizers"
 
 [dev-dependencies]
 tempfile = "3.10"
-pyo3 = { version = "0.23", features = ["auto-initialize"] }
+pyo3 = { version = "0.24", features = ["auto-initialize"] }
 
 [features]
 defaut = ["pyo3/extension-module"]

Then everything builds. I didn't try running the tests or anything else beyond that.

I'm a PyO3 maintainer and helped initially add support for the free-threaded build in PyO3. Happy to help out with adding support here.
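Since "built free-threaded" and "GIL currently enabled" are separate properties (a 3.13t interpreter can re-enable the GIL at runtime, e.g. when an imported extension doesn't declare free-threaded support), a small sketch for checking both from Python can help when testing wheels like this; `free_threading_status` is an illustrative helper name, not an API from tokenizers or PyO3:

```python
import sys
import sysconfig

def free_threading_status():
    """Return (built_free_threaded, gil_enabled) for this interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded (3.13t) builds, 0/None otherwise.
    built_ft = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on 3.13+; on older versions the
    # GIL is always enabled.
    gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    return built_ft, gil_enabled

print(free_threading_status())
```

On a stock CPython this prints `(False, True)`; on a 3.13t build it should report `built_free_threaded == True`, with `gil_enabled` flipping back to `True` if any imported extension forced the GIL on.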

@electroglyph

thanks for the details @ngoldbaum! i just came here to figure out what i need to do for python 3.13t

@ArthurZucker
Collaborator

Happy to update, and sorry I did not have the time to do it yet! Will have a look.

@Qubitium

Qubitium commented May 2, 2025

@ngoldbaum @ArthurZucker Env is Ubuntu 24.04 + python3.13t

Even with the latest pyo3 0.24.2 I cannot compile tokenizers in my local python3.13t env. Please help, since I need this package to test hf transformers (a dependent) with free threading. Thanks!

UPDATE: Fixed my compile error after updating Rust to 1.86.0. So it appears we also need an explicit Rust version (MSRV) set and checked as well.

Upgraded to rust 1.86.0 from 1.76.0 using `rustup update`:
stable-x86_64-unknown-linux-gnu updated - rustc 1.86.0 (05f9846f8 2025-03-31) (from rustc 1.76.0 (07dca489a 2024-02-04))

UPDATE: The log below is from rust 1.76.0; upgrading to 1.86.0 resolved all of the errors.

(vm313t) root@gpu-base:~/tokenizers/bindings/python# pip install . -v
Using pip 25.1 from /root/vm313t/lib/python3.13t/site-packages/pip (python 3.13)
Processing /root/tokenizers/bindings/python
  Running command pip subprocess to install build dependencies
  Using pip 25.1 from /root/vm313t/lib/python3.13t/site-packages/pip (python 3.13)
  Collecting maturin<2.0,>=1.0
    Obtaining dependency information for maturin<2.0,>=1.0 from https://files.pythonhosted.org/packages/2e/6d/bf1b8bb9a8b1d9adad242b4089794be318446142975762d04f04ffabae40/maturin-1.8.3-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.musllinux_1_1_x86_64.whl.metadata
    Using cached maturin-1.8.3-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.musllinux_1_1_x86_64.whl.metadata (16 kB)
  Using cached maturin-1.8.3-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.musllinux_1_1_x86_64.whl (8.3 MB)
  Installing collected packages: maturin
  Successfully installed maturin-1.8.3
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  🍹 Building a mixed python/rust project
  🔗 Found pyo3 bindings with abi3 support for Python ≥ 3.9
  🐍 Not using a specific python interpreter
  📡 Using build options features, bindings from pyproject.toml
  tokenizers-0.21.2.dev0.dist-info
  Checking for Rust toolchain....
  Running `maturin pep517 write-dist-info --metadata-directory /tmp/pip-modern-metadata-lu43tsqe --interpreter /root/vm313t/bin/python3.13t`
  Preparing metadata (pyproject.toml) ... done
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)")': /simple/huggingface-hub/                                                                                      
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)")': /simple/huggingface-hub/                                                                                      
Collecting huggingface-hub<1.0,>=0.16.4 (from tokenizers==0.21.2.dev0)                                                                                               
  Obtaining dependency information for huggingface-hub<1.0,>=0.16.4 from https://files.pythonhosted.org/packages/93/27/1fb384a841e9661faad1c31cbfa62864f59632e876df5d795234da51c395/huggingface_hub-0.30.2-py3-none-any.whl.metadata
  Using cached huggingface_hub-0.30.2-py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: filelock in /root/vm313t/lib/python3.13t/site-packages (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0) (3.18.0)
Requirement already satisfied: fsspec>=2023.5.0 in /root/vm313t/lib/python3.13t/site-packages (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0) (2025.3.2)
Collecting packaging>=20.9 (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for packaging>=20.9 from https://files.pythonhosted.org/packages/20/12/38679034af332785aac8774540895e234f4d07f7545804097de4b666afd8/packaging-25.0-py3-none-any.whl.metadata
  Using cached packaging-25.0-py3-none-any.whl.metadata (3.3 kB)
Collecting pyyaml>=5.1 (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Using cached pyyaml-6.0.2-cp313-cp313t-linux_x86_64.whl
Collecting requests (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for requests from https://files.pythonhosted.org/packages/f9/9b/335f9764261e915ed497fcdeb11df5dfd6f7bf257d4a6a2a686d80da4d54/requests-2.32.3-py3-none-any.whl.metadata
  Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting tqdm>=4.42.1 (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for tqdm>=4.42.1 from https://files.pythonhosted.org/packages/d0/30/dc54f88dd4a2b5dc8a0279bdd7270e735851848b762aeb1c1184ed1f6b14/tqdm-4.67.1-py3-none-any.whl.metadata
  Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /root/vm313t/lib/python3.13t/site-packages (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0) (4.13.2)
Collecting charset-normalizer<4,>=2 (from requests->huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for charset-normalizer<4,>=2 from https://files.pythonhosted.org/packages/0e/f6/65ecc6878a89bb1c23a086ea335ad4bf21a588990c3f535a227b9eea9108/charset_normalizer-3.4.1-py3-none-any.whl.metadata
  Using cached charset_normalizer-3.4.1-py3-none-any.whl.metadata (35 kB)
Collecting idna<4,>=2.5 (from requests->huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for idna<4,>=2.5 from https://files.pythonhosted.org/packages/76/c6/c88e154df9c4e1a2a66ccf0005a88dfb2650c1dffb6f5ce603dfbd452ce3/idna-3.10-py3-none-any.whl.metadata
  Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests->huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for urllib3<3,>=1.21.1 from https://files.pythonhosted.org/packages/6b/11/cc635220681e93a0183390e26485430ca2c7b5f9d33b15c74c2861cb8091/urllib3-2.4.0-py3-none-any.whl.metadata
  Using cached urllib3-2.4.0-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests->huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Obtaining dependency information for certifi>=2017.4.17 from https://files.pythonhosted.org/packages/4a/7e/3db2bd1b1f9e95f7cddca6d6e75e2f2bd9f51b1246e546d88addca0106bd/certifi-2025.4.26-py3-none-any.whl.metadata
  Using cached certifi-2025.4.26-py3-none-any.whl.metadata (2.5 kB)
Using cached huggingface_hub-0.30.2-py3-none-any.whl (481 kB)
Using cached packaging-25.0-py3-none-any.whl (66 kB)
Using cached tqdm-4.67.1-py3-none-any.whl (78 kB)
Using cached requests-2.32.3-py3-none-any.whl (64 kB)
Using cached charset_normalizer-3.4.1-py3-none-any.whl (49 kB)
Using cached idna-3.10-py3-none-any.whl (70 kB)
Using cached urllib3-2.4.0-py3-none-any.whl (128 kB)
Using cached certifi-2025.4.26-py3-none-any.whl (159 kB)
Building wheels for collected packages: tokenizers
  Running command Building wheel for tokenizers (pyproject.toml)
  Running `maturin pep517 build-wheel -i /root/vm313t/bin/python3.13t --compatibility off`
  🍹 Building a mixed python/rust project
  🔗 Found pyo3 bindings with abi3 support for Python ≥ 3.9
  🐍 Not using a specific python interpreter
  📡 Using build options features, bindings from pyproject.toml
  ⚠️ Warning: CPython 3.13t at /root/vm313t/bin/python3.13t does not yet support abi3 so the build artifacts will be version-specific.
     Compiling tokenizers v0.21.2-dev.0 (/root/tokenizers/tokenizers)
  error[E0658]: use of unstable library feature 'lazy_cell'
   --> /root/tokenizers/tokenizers/src/normalizers/byte_level.rs:5:5
    |
  5 | use std::sync::LazyLock;
    |     ^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/normalizers/byte_level.rs:11:20
     |
  11 | static BYTES_CHAR: LazyLock<HashMap<u8, char>> = LazyLock::new(bytes_char);
     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/normalizers/byte_level.rs:11:50
     |
  11 | static BYTES_CHAR: LazyLock<HashMap<u8, char>> = LazyLock::new(bytes_char);
     |                                                  ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
   --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:2:5
    |
  2 | use std::sync::LazyLock;
    |     ^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:43:12
     |
  43 | static RE: LazyLock<SysRegex> = LazyLock::new(|| {
     |            ^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:43:33
     |
  43 | static RE: LazyLock<SysRegex> = LazyLock::new(|| {
     |                                 ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:47:20
     |
  47 | static BYTES_CHAR: LazyLock<HashMap<u8, char>> = LazyLock::new(bytes_char);
     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:47:50
     |
  47 | static BYTES_CHAR: LazyLock<HashMap<u8, char>> = LazyLock::new(bytes_char);
     |                                                  ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:48:20
     |
  48 | static CHAR_BYTES: LazyLock<HashMap<char, u8>> =
     |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:49:5
     |
  49 |     LazyLock::new(|| bytes_char().into_iter().map(|(c, b)| (b, c)).collect());
     |     ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
   --> /root/tokenizers/tokenizers/src/pre_tokenizers/whitespace.rs:1:5
    |
  1 | use std::sync::LazyLock;
    |     ^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/whitespace.rs:22:20
     |
  22 |         static RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\w+|[^\w\s]+").unwrap());
     |                    ^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/whitespace.rs:22:38
     |
  22 |         static RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\w+|[^\w\s]+").unwrap());
     |                                      ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
   --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:8:5
    |
  8 | use std::sync::LazyLock;
    |     ^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:98:26
     |
  98 | static STARTS_WITH_WORD: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\w").unwrap());
     |                          ^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:98:44
     |
  98 | static STARTS_WITH_WORD: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\w").unwrap());
     |                                            ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:99:24
     |
  99 | static ENDS_WITH_WORD: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\w$").unwrap());
     |                        ^^^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:99:42
     |
  99 | static ENDS_WITH_WORD: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\w$").unwrap());
     |                                          ^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
     --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:100:34
      |
  100 | static RIGHTMOST_SPACE_AT_START: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\s*").unwrap());
      |                                  ^^^^^^^^^^^^^^^
      |
      = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
     --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:100:52
      |
  100 | static RIGHTMOST_SPACE_AT_START: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\s*").unwrap());
      |                                                    ^^^^^^^^
      |
      = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
     --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:101:31
      |
  101 | static LEFTMOST_SPACE_AT_END: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\s*$").unwrap());
      |                               ^^^^^^^^^^^^^^^
      |
      = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
     --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:101:49
      |
  101 | static LEFTMOST_SPACE_AT_END: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\s*$").unwrap());
      |                                                 ^^^^^^^^
      |
      = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/normalizers/byte_level.rs:11:50
     |
  11 | static BYTES_CHAR: LazyLock<HashMap<u8, char>> = LazyLock::new(bytes_char);
     |                                                  ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:43:33
     |
  43 | static RE: LazyLock<SysRegex> = LazyLock::new(|| {
     |                                 ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:47:50
     |
  47 | static BYTES_CHAR: LazyLock<HashMap<u8, char>> = LazyLock::new(bytes_char);
     |                                                  ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/byte_level.rs:49:5
     |
  49 |     LazyLock::new(|| bytes_char().into_iter().map(|(c, b)| (b, c)).collect());
     |     ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/pre_tokenizers/whitespace.rs:22:38
     |
  22 |         static RE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\w+|[^\w\s]+").unwrap());
     |                                      ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:98:44
     |
  98 | static STARTS_WITH_WORD: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\w").unwrap());
     |                                            ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
    --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:99:42
     |
  99 | static ENDS_WITH_WORD: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\w$").unwrap());
     |                                          ^^^^^^^^^^^^^
     |
     = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
     --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:100:52
      |
  100 | static RIGHTMOST_SPACE_AT_START: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"^\s*").unwrap());
      |                                                    ^^^^^^^^^^^^^
      |
      = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0658]: use of unstable library feature 'lazy_cell'
     --> /root/tokenizers/tokenizers/src/tokenizer/added_vocabulary.rs:101:49
      |
  101 | static LEFTMOST_SPACE_AT_END: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\s*$").unwrap());
      |                                                 ^^^^^^^^^^^^^
      |
      = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information

  error[E0599]: no method named `is_none_or` found for enum `std::option::Option` in the current scope
     --> /root/tokenizers/tokenizers/src/models/bpe/word.rs:204:22
      |
  202 |                   if merges
      |  ____________________-
  203 | |                     .get(&target_new_pair)
  204 | |                     .is_none_or(|(_, new_id)| *new_id != top.new_id)
      | |                     -^^^^^^^^^^ help: there is a method with a similar name: `is_none`
      | |_____________________|
      |

  error[E0599]: no method named `is_none_or` found for enum `std::option::Option` in the current scope
     --> /root/tokenizers/tokenizers/src/processors/template.rs:469:48
      |
  469 |         let pair_has_both = self.pair.as_ref().is_none_or(|pair| {
      |                             -------------------^^^^^^^^^^ help: there is a method with a similar name: `is_none`

  Some errors have detailed explanations: E0599, E0658.
  For more information about an error, try `rustc --explain E0599`.
  error: could not compile `tokenizers` (lib) due to 33 previous errors
  💥 maturin failed
    Caused by: Failed to build a native library through cargo
    Caused by: Cargo build finished with "exit status: 101": `env -u CARGO PYO3_ENVIRONMENT_SIGNATURE="cpython-3.13-64bit" PYO3_PYTHON="/root/vm313t/bin/python3.13t" PYTHON_SYS_EXECUTABLE="/root/vm313t/bin/python3.13t" "cargo" "rustc" "--features" "pyo3/extension-module" "--message-format" "json-render-diagnostics" "--manifest-path" "/root/tokenizers/bindings/python/Cargo.toml" "--release" "--lib"`
  Error: command ['maturin', 'pep517', 'build-wheel', '-i', '/root/vm313t/bin/python3.13t', '--compatibility', 'off'] returned non-zero exit status 1
  error: subprocess-exited-with-error
  
  × Building wheel for tokenizers (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /root/vm313t/bin/python3.13t /root/vm313t/lib/python3.13t/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpveyjk4vu                                                                                                                                                               
  cwd: /root/tokenizers/bindings/python
  Building wheel for tokenizers (pyproject.toml) ... error
  ERROR: Failed building wheel for tokenizers
Failed to build tokenizers                                                                                                                                           
ERROR: Failed to build installable wheels for some pyproject.toml based projects (tokenizers)
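On the MSRV point raised above: the two error classes in this log line up with stabilization dates — `std::sync::LazyLock` was stabilized in Rust 1.80 and `Option::is_none_or` in Rust 1.82, both after 1.76. A hedged sketch of making that requirement explicit via Cargo's `rust-version` field (the exact minimum chosen here is an assumption, not something the project has declared):

```toml
# Cargo.toml sketch — exact MSRV is an assumption
[package]
rust-version = "1.82"  # LazyLock needs >= 1.80, Option::is_none_or needs >= 1.82
```

With this set, an older toolchain fails fast with a clear "package requires rustc 1.82 or newer" message instead of the wall of E0658/E0599 errors above.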

@szalpal

szalpal commented May 8, 2025

@Qubitium ,

did you have any luck resolving the hf-xet dependency problem? When I try to build tokenizers for 3.13t, I get this error:

root@3747b124ab1e:/home# pip install git+https://github.com/Qubitium/tokenizers.git@pyo3-update#subdirectory=bindings/python
[...]
Collecting hf-xet<2.0.0,>=1.1.0 (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.21.2.dev0)
  Downloading hf_xet-1.1.0.tar.gz (263 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [343 lines of output]
          Updating crates.io index
       Downloading crates ...
        [...]
      💥 maturin failed
        Caused by: Failed to normalize python source path `python`
        Caused by: No such file or directory (os error 2)
      Error running maturin: Command '['maturin', 'pep517', 'write-dist-info', '--metadata-directory', '/tmp/pip-modern-metadata-wsrfquq_', '--interpreter', '/usr/bin/python']' returned non-zero exit status 1.
      Checking for Rust toolchain....
      Running `maturin pep517 write-dist-info --metadata-directory /tmp/pip-modern-metadata-wsrfquq_ --interpreter /usr/bin/python`

(The same build command works for 3.12)

@Qubitium

Qubitium commented May 8, 2025

@szalpal I remember I had to upgrade rust/cargo to the latest version and also install the maturin package afterwards. Then the build completed without error.

@szalpal

szalpal commented May 8, 2025

Unfortunately, this didn't help. It looks like the huggingface_hub release from 2 days ago introduced the problem: when I pinned huggingface_hub==0.30.2, the build passed.
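For anyone hitting the same failure, a minimal sketch of that workaround using a pip constraints file (the filename and the claim that 0.30.2 avoids the hf-xet sdist are taken from the comment above, not verified here):

```text
# constraints.txt — pin huggingface_hub to the release that built cleanly on 3.13t
huggingface_hub==0.30.2
```

Then build with something like `pip install -c constraints.txt .` from `bindings/python`, so the resolver never reaches the newer huggingface_hub that pulls in the hf-xet source build.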

I created an issue in xet-core to know if my reasoning about this problem is correct: huggingface/xet-core#304
