How does progen handle sequences greater than 2048? #44

Open
wenyuhaokikika opened this issue Jan 10, 2024 · 0 comments

Comments

@wenyuhaokikika

I ran progen until I hit the first sequence longer than 2048 tokens, and it threw an exception:

2024-01-10 10:26:32.104 | INFO     | __main__:main:100 - falling back to cpu
2024-01-10 10:26:32.105 | WARNING  | __main__:main:105 - falling back to fp32
2024-01-10 10:26:32.105 | INFO     | __main__:main:107 - loading parameters
2024-01-10 10:26:38.012 | INFO     | __main__:main:111 - loading tokenizer
  0%|          | 0/10 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/public/home/wenyuhao/embedding/progen/progen2/embedding.py", line 137, in <module>
    main(args)
  File "/public/home/wenyuhao/embedding/progen/progen2/embedding.py", line 120, in main
    hidden_states,lm_logits = model.embedding(target) #.logits
  File "/public/home/wenyuhao/embedding/progen/progen2/models/progen/modeling_progen.py", line 700, in embedding
    transformer_outputs = self.transformer(
  File "/public/home/wenyuhao/embedding/progen/progen2/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/public/home/wenyuhao/embedding/progen/progen2/models/progen/modeling_progen.py", line 503, in forward
    outputs = block(
  File "/public/home/wenyuhao/embedding/progen/progen2/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/public/home/wenyuhao/embedding/progen/progen2/models/progen/modeling_progen.py", line 265, in forward
    attn_outputs = self.attn(
  File "/public/home/wenyuhao/embedding/progen/progen2/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/public/home/wenyuhao/embedding/progen/progen2/models/progen/modeling_progen.py", line 213, in forward
    attn_output, attn_weights = self._attn(query, key, value, attention_mask, head_mask)
  File "/public/home/wenyuhao/embedding/progen/progen2/models/progen/modeling_progen.py", line 131, in _attn
    attn_weights = torch.where(causal_mask, attn_weights, self.masked_bias.to(attn_weights.dtype))
RuntimeError: The size of tensor a (2048) must match the size of tensor b (2073) at non-singleton dimension 3
  0%|          | 0/10 [00:08<?, ?it/s]%       

I found that when the sequence length is greater than 2048, the dimensions of the query, key, and causal mask no longer match.
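For context, the mismatch seems to come from the causal-mask buffer in the attention module. I haven't verified the exact ProGen2 source, but assuming it follows the usual GPT-J/CodeGen pattern, the mask is preallocated with a fixed `max_positions`, so a 2073-token input can never be covered by it (a sketch of that pattern, not the actual ProGen2 code):

```python
import torch

# Assumption: the attention module registers a fixed-size lower-triangular
# mask at construction time, as GPT-J/CodeGen-style models do.
max_positions = 2048
bias = torch.tril(
    torch.ones((max_positions, max_positions), dtype=torch.bool)
).view(1, 1, max_positions, max_positions)

# In _attn the mask is sliced to the current query/key lengths; that slice
# can never exceed 2048, while attn_weights has 2073 key positions,
# hence "tensor a (2048) must match ... tensor b (2073)".
```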

Can progen handle sequences longer than 2048? If not, should I truncate sequences to 2048 tokens before feeding them into the model?
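In case it helps, this is the minimal truncation I have in mind (`model` and `input_ids` are placeholders for whatever embedding.py builds, not the actual API):

```python
import torch

MAX_CONTEXT = 2048  # ProGen2 context / causal-mask limit seen in the traceback

def embed_truncated(model, input_ids: torch.Tensor):
    """Clip token ids to the first MAX_CONTEXT positions before the forward pass."""
    if input_ids.size(-1) > MAX_CONTEXT:
        input_ids = input_ids[..., :MAX_CONTEXT]
    with torch.no_grad():
        return model(input_ids)
```

Splitting long sequences into 2048-token windows and pooling the embeddings would be another option, but I'm not sure which approach the authors intended.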
