This repository has been archived by the owner on Jan 1, 2025. It is now read-only.

Division by zero caused by mask operation #243

Open
Chenyang-1024 opened this issue Jul 28, 2024 · 1 comment


Chenyang-1024 commented Jul 28, 2024

If no pixel in the input image belongs to the q-th class, the mask generated for masked attention is True across that entire row: attn_mask[b, q, :] = True. Inside nn.MultiheadAttention this is converted to attn_mask[b, q, :] = float('-inf'). When the masked attention scores then go through Softmax(dim=-1) to compute the attention map, every entry in that row is -inf, so the normalization is a division by zero and the row comes out as NaN. :(
I ran into this problem when applying masked attention to my semantic segmentation task. :(
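
To make the arithmetic concrete, here is a minimal plain-Python sketch of softmax over one row of attention scores. It is not the code from nn.MultiheadAttention; the softmax helper and the sample rows are made up for illustration. For a fully masked row every term is exp(-inf) = 0, so the normalizer is 0 and the result is 0/0. In the max-subtracted form used below, the same row produces -inf - (-inf), which is also NaN.

```python
import math

NEG_INF = float("-inf")

def softmax(row):
    """Max-subtracted softmax over a list of floats (illustrative helper)."""
    m = max(row)
    # If the whole row is -inf, then x - m is -inf - (-inf), which is nan.
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# One masked key: its weight becomes exactly 0 and the rest still sum to 1.
partly_masked = softmax([0.5, NEG_INF, 1.5])

# Every key masked, as for a query whose class has no pixels: all weights are nan.
fully_masked = softmax([NEG_INF, NEG_INF, NEG_INF])

print(partly_masked)
print(fully_masked)
```

A single masked position is harmless; only a row in which every position is masked produces NaN.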

@q1556450920

Isn't there a line like this in the code? attn_mask[torch.where(attn_mask.sum(-1) == attn_mask.shape[-1])] = False
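
As a rough sketch of what the quoted line does, the snippet below applies it to a small boolean mask and then runs softmax on the masked scores. The shapes, the sample mask, and the zero-valued scores are assumptions chosen for illustration; only the guard itself comes from the comment above. The guard finds rows where every key is masked and resets them to False, so those queries attend to all positions instead of none.

```python
import torch

# Hypothetical shapes: 1 batch entry, 2 queries, 4 key positions.
# True means "do not attend to this key".
attn_mask = torch.tensor([[[True, True, True, True],      # query 0: fully masked
                           [True, False, True, False]]])  # query 1: partly masked

# A row is fully masked when its count of True values equals the row length.
fully_masked = attn_mask.sum(-1) == attn_mask.shape[-1]

# The guard quoted in the comment: un-mask every fully masked row.
attn_mask[torch.where(fully_masked)] = False

# Stand-in attention scores (all zeros), masked with -inf and normalized.
scores = torch.zeros(1, 2, 4).masked_fill(attn_mask, float("-inf"))
probs = torch.softmax(scores, dim=-1)

print(probs)
```

With the guard in place, query 0 falls back to uniform attention over all keys, and query 1 keeps its original mask.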
