
Question about the formula calculating c value #39

Open
ghost opened this issue Mar 26, 2021 · 10 comments

Comments

@ghost

ghost commented Mar 26, 2021

In the paper,

[image: the formula for the crowd feature c from the paper, a softmax-weighted combination of the h_i]

What does this formula mean? softmax gives a vector and h_i is also a vector. What does the multiplication of softmax and h_i mean?

@ChanganVR
Contributor

Hi @guldamkwak3114, we meant to normalize the scores alpha_i and take a weighted sum of all h_i. Sorry for the confusing writing.

@ghost
Author

ghost commented Mar 30, 2021

What do you mean by normalizing the scores alpha_i?

@ChanganVR
Contributor

Softmax is the normalization operation.

@ghost
Author

ghost commented Mar 30, 2021

This still confuses me. softmax(alpha_i) and h_i are both vectors, so how can we multiply them?

@ghost
Author

ghost commented Mar 30, 2021

Do you mean the dot product of softmax(alpha_i) and h_i?

@ChanganVR
Contributor

softmax(alpha_i) here means taking the softmax over all the scores and then taking the i-th component, which is a single value. So we normalize the scores with a softmax over all neighbors and take a weighted sum of each neighbor's interaction feature.
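To illustrate the point, here is a minimal NumPy sketch of that operation (the sizes n and d are made up for the example): each neighbor gets one scalar score, the softmax turns the n scores into n scalar weights, and c is the weighted sum of the n feature vectors.

```python
import numpy as np

# Hypothetical sizes: n neighbors, each with an interaction feature h_i of dim d.
n, d = 5, 8
rng = np.random.default_rng(0)
alpha = rng.normal(size=n)       # one scalar attention score per neighbor
h = rng.normal(size=(n, d))      # interaction features h_1 ... h_n, stacked as rows

# Softmax over the n scores: softmax(alpha)_i is a single scalar weight.
weights = np.exp(alpha - alpha.max())
weights /= weights.sum()

# Crowd feature c: weighted sum of the h_i (NOT an elementwise vector product).
c = weights @ h                  # shape (d,)
print(c.shape)  # (8,)
```

So "softmax(alpha_i) * h_i" is scalar-times-vector for each i, summed over i.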

@ghost
Author

ghost commented Apr 1, 2021

alpha_i itself is a vector. Then do you mean: take the softmax of (a_1, a_2, ..., a_n) and get the i-th component?

Thanks

@ChanganVR
Contributor

Is alpha_i a vector? alpha_i is the score for each pair, right?

> take the softmax of (a_1, a_2, ..., a_n) and get the i-th component?

This is right.

@ghost
Author

ghost commented Apr 2, 2021

According to this, alpha_i should be a length-100 vector. Am I misreading something?

[image: screenshot of the network configuration showing 100 hidden units in the attention MLP]

@ChanganVR
Contributor

The hidden units refer to all the MLP layers up to, but not including, the last layer. The last layer outputs a single value for each pair/human as the attention score. The corresponding code:

scores = self.attention(attention_input).view(size[0], size[1], 1).squeeze(dim=2)
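To make the shapes concrete, here is a NumPy sketch of what that view/squeeze does (batch and human counts are made up for the example; the repo itself uses PyTorch tensors): the attention MLP emits one value per (state, human) pair, and the reshape just arranges those scalars into a (batch, n_humans) score matrix.

```python
import numpy as np

# Hypothetical shapes: a batch of 2 states, each with 5 humans.
batch, n_humans = 2, 5
rng = np.random.default_rng(1)

# The attention MLP's last layer emits ONE scalar per (state, human) pair,
# flattened over the batch, so its output has shape (batch * n_humans, 1).
raw = rng.normal(size=(batch * n_humans, 1))

# Analogous to .view(size[0], size[1], 1).squeeze(dim=2) in the PyTorch code:
scores = raw.reshape(batch, n_humans, 1).squeeze(2)
print(scores.shape)  # (2, 5) -- one scalar score alpha_i per human
```

These per-human scalars are what the softmax in the formula normalizes before the weighted sum.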
