
Added extra option to control HF Token #898

Merged
16 commits merged into microsoft:main on Sep 30, 2024

Conversation

nmoeller
Contributor

fixes #830.

I've added a new option, hf_token, in the extra options; the user can pass a token, True, or False.
I also updated the deprecated use_auth_token argument to token.

@nmoeller nmoeller marked this pull request as ready for review September 18, 2024 06:47
@nmoeller nmoeller marked this pull request as draft September 18, 2024 06:48
@nmoeller nmoeller marked this pull request as ready for review September 18, 2024 06:51
@kunal-vaishnavi
Contributor

Thank you for your contribution!

There are many ways that users can authenticate with Hugging Face.

  1. Always logged in via Hugging Face CLI (comes installed with transformers): huggingface-cli login
  2. Global environment variable: export HF_TOKEN=<token>
  3. Per-run environment variable: HF_TOKEN=<token> python3 -m onnxruntime_genai.models.builder <...>
  4. Token in code: from_pretrained(token=<token>)
  5. Boolean in code: from_pretrained(token=True)
  6. Default in Hugging Face's code: from_pretrained(token=None)

There are also several scenarios for when the token is needed or not needed.

|  | Trust Remote Code = True | Trust Remote Code = False |
| --- | --- | --- |
| Use Auth Token = True | Gated model, modeling file not implemented in installed transformers package | Gated model |
| Use Auth Token = False | Modeling file not implemented in installed transformers package | Local model |

Given this information, I think we should change the behavior of how the Hugging Face token is obtained to the following.

```python
import os


def parse_hf_token(hf_token):
    """
    Returns the authentication token needed for Hugging Face.
    Token is obtained either from the user or the environment.
    """
    if hf_token.lower() in {"false", "0"}:
        # Default is `None` for disabling authentication
        return None

    if hf_token.lower() in {"true", "1"}:
        # Return the token stored in the environment
        return os.environ["HF_TOKEN"]

    # Return the user-provided token as a string
    return hf_token


# For the `Model` class
self.hf_token = parse_hf_token(extra_options.get("hf_token", "true"))

# For `create_model`
hf_token = parse_hf_token(extra_options.get("hf_token", "true"))
```

Then, the two options for hf_token would be false or the user-provided token. While not officially supported, the "0" and "1" flags are also checked because some users don't follow the documentation.

The hf_token flag would allow us to disable authentication with Hugging Face or provide a custom authentication token that differs from the one stored in the user's environment. When not using the flag, the default behavior would be to use the authentication token stored by huggingface-cli login in the HF_TOKEN environment variable.
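
For example, assuming the sketch above is adopted, invocations might look like this (mirroring the README's builder command; all paths and flags are placeholders):

```
# Disable authentication with Hugging Face:
python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options hf_token=false

# Use a custom token that differs from the one stored in the environment:
python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options hf_token=<token>
```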

What do you think?

@nmoeller
Contributor Author

nmoeller commented Sep 25, 2024

You are absolutely right, and your feedback is much appreciated! I implemented the changes you proposed.
Sorry I didn't have the idea on my own 😄

@kunal-vaishnavi, are we sure that huggingface-cli login sets the HF_TOKEN variable? I think the token is stored locally in a file, right?

I think we should return True from the method instead of the environment variable. When using from_pretrained(token=True), it will check for a local token set by huggingface-cli login, and if no token is set, it will use HF_TOKEN automatically, right?

```python
def parse_hf_token(hf_token):
    """
    Returns the authentication token needed for Hugging Face.
    Token is obtained either from the user or the environment.
    """
    if hf_token.lower() in {"false", "0"}:
        # Default is `None` for disabling authentication
        return None

    if hf_token.lower() in {"true", "1"}:
        # Return `True` so `from_pretrained` picks up the locally stored token
        return True

    # Return the user-provided token as a string
    return hf_token
```
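
As a quick illustration (a hypothetical sanity check, not part of the PR; the token string is a placeholder):

```python
assert parse_hf_token("false") is None       # authentication disabled
assert parse_hf_token("TRUE") is True        # case-insensitive; `from_pretrained(token=True)` finds the stored token
assert parse_hf_token("hf_example") == "hf_example"  # explicit token passed through unchanged
```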

@kunal-vaishnavi
Contributor

> You are absolutely right, and your feedback is much appreciated! I implemented the changes you proposed.
> Sorry I didn't have the idea on my own 😄

No worries and happy to review!

> @kunal-vaishnavi, are we sure that huggingface-cli login sets the HF_TOKEN variable? I think the token is stored locally in a file, right?
>
> I think we should return True from the method instead of the environment variable. When using from_pretrained(token=True), it will check for a local token set by huggingface-cli login, and if no token is set, it will use HF_TOKEN automatically, right?

Upon further testing, it appears that the environment variables are not always set. I agree that returning True instead makes sense since the from_pretrained(token=True) command should pick up the token saved locally by huggingface-cli login.

According to Hugging Face's documentation, the HF_TOKEN environment variable can override the token saved at $HF_HOME/token, which is where huggingface-cli login saves the token. If a user wants to use a different token and doesn't want to pass it via --extra_options hf_token=<token>, then the user can either log in with a different token via huggingface-cli login or use a per-run environment variable (e.g. HF_TOKEN=<token> python3 -m onnxruntime_genai.models.builder <...>).
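
Put differently, here is a simplified sketch of the resolution order (the token file path is the documented default when HF_HOME is unset):

```
# Where `huggingface-cli login` stores the token:
cat "${HF_HOME:-$HOME/.cache/huggingface}/token"

# HF_TOKEN overrides the stored token for a single run:
HF_TOKEN=<token> python3 -m onnxruntime_genai.models.builder <...>
```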

@kunal-vaishnavi
Contributor

In the README, can you add a new section in "Extra Options" after "Use 8 Bits Quantization in QMoE" to show how to use this new extra option?

#### Use 8 Bits Quantization in QMoE
This scenario is for when you want to use 8-bit quantization for MoE layers. Default is using 4-bit quantization.
```
# From wheel:
python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options use_8bits_moe=1
# From source:
python3 builder.py -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options use_8bits_moe=1
```
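
For reference, a possible draft of the new section (illustrative wording, not the final README text) could be:

#### Use Hugging Face Authentication Token
This scenario is for when you want to disable authentication with Hugging Face or provide a custom token. Default is using the token stored by `huggingface-cli login`.
```
# From wheel:
python3 -m onnxruntime_genai.models.builder -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options hf_token=false
# From source:
python3 builder.py -i path_to_local_folder_on_disk -o path_to_output_folder -p precision -e execution_provider -c cache_dir_to_store_temp_files --extra_options hf_token=false
```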

Once added, can you also add a reference to the new section in the table of contents after "Use 8 Bits Quantization in QMoE"?

- [Extra Options](#extra-options)
  - [Config Only](#config-only)
  - [Exclude Embedding Layer](#exclude-embedding-layer)
  - [Exclude Language Modeling Head](#exclude-language-modeling-head)
  - [Enable Cuda Graph](#enable-cuda-graph)
  - [Use 8 Bits Quantization in QMoE](#use-8-bits-quantization-in-qmoe)

@kunal-vaishnavi
Contributor

This PR looks good to me now! It appears you need to resolve two conflicting files as those files were recently updated by another PR. Once those conflicts are resolved, I can approve and merge your changes!

@nmoeller
Contributor Author

I merged the changes. I wasn't sure about your desired order for the chapters in the README.md.
If I should change the order, let me know 👍

kunal-vaishnavi merged commit 4332f98 into microsoft:main on Sep 30, 2024
13 checks passed
Successfully merging this pull request may close this issue: huggingface-cli login error when building quantized model (#830)