Truncation not explicitly mentioned #813
Comments
I also tried to run a query and hit the same problem, but the system only shows "Setting ...".
I got the same message and the query takes forever...
I found that the problem is that the author built the program to run serially instead of in parallel. While run_localGPT is running, you can monitor your CPU usage (with top or htop). In my case only 1~2 CPU cores are utilized, which is why it runs so slowly (see the sketch below the quoted reply).
> On Jul 25, 2024, at 4:20 PM, KansaiTraining wrote:
> I got the same message and the query takes forever...
> Any explanation of the error and if it has influence on the query results?
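As a quick way to check this, here is a minimal Python sketch (my own illustration, not localGPT's code), assuming the model is running on CPU through PyTorch. It reports how many threads the intra-op pool uses and, as a hypothetical tweak, raises it to the number of available cores:

```python
import os
import torch

# How many threads PyTorch currently uses for intra-op parallelism.
print("Threads PyTorch will use:", torch.get_num_threads())

# Hypothetical tweak: allow one thread per available core
# (the same effect can be had by setting OMP_NUM_THREADS before launch).
torch.set_num_threads(os.cpu_count())
print("Now using:", torch.get_num_threads())
```

If top/htop still shows only one or two busy cores after this, the bottleneck is likely elsewhere (e.g. disk I/O while loading the model).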
Same issue here... I also see a lot of SSD reads from Python 3.10, even after getting the "Truncation was not explicitly activated but ..." message. Has anyone found a solution?
I get this error when I try to run a query:

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
C:\Users\Tarun Sridhar\.conda\envs\mummy\lib\site-packages\transformers\models\llama\modeling_llama.py:648: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(

What are possible fixes?
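These lines are informational warnings from transformers/PyTorch rather than errors, so they should not change the query results by themselves. The first two can be silenced explicitly; the sketch below is my own illustration (not localGPT's actual code), and the model name and `max_length` value are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Passing truncation=True alongside max_length removes the
# "Truncation was not explicitly activated" warning.
inputs = tokenizer(
    "What does this warning mean?",
    truncation=True,
    max_length=4096,        # illustrative; match your model's context window
    return_tensors="pt",
)

# Passing pad_token_id explicitly removes the
# "Setting pad_token_id to eos_token_id" message.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The flash-attention UserWarning only means your PyTorch build falls back to a non-flash scaled-dot-product-attention kernel, which affects speed, not correctness.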