Hello RETAIN team,
Great job! Thank you for sharing it.
Do you have an explanation for why you use two sets of attention weights (visits and variables) instead of only one for the variables?
With a single set you could still get a visit-level contribution by aggregating, for instance averaging or summing, the variable weights of each visit.
Thanks in advance for your help.
Thanks for taking an interest in our work.
It's an interesting question; a few other researchers have asked me the same thing.
You can certainly do what you suggested (i.e., use only code-level attention, then aggregate the weights).
For example, you could use a single LSTM to encode a flat sequence of codes (no visit structure, just one long sequence of codes) and apply attention on top of the hidden states, as in the sketch below. But this way you lose the visit-level information (i.e., which codes belong to the same visit).
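A minimal sketch of that code-only variant (written in PyTorch rather than the repository's Theano code; the layer sizes and vocabulary size are illustrative assumptions, not values from the paper):

```python
import torch
import torch.nn as nn


class CodeAttentionLSTM(nn.Module):
    """Single LSTM over a flat code sequence with one attention layer on top."""

    def __init__(self, num_codes, emb_dim=128, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(num_codes, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)       # one scalar weight per code
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, codes):                      # codes: (batch, seq_len) code ids
        h, _ = self.lstm(self.embed(codes))        # (batch, seq_len, hidden_dim)
        alpha = torch.softmax(self.attn(h), dim=1) # code-level attention weights
        context = (alpha * h).sum(dim=1)           # weighted sum over all codes
        return self.out(context), alpha.squeeze(-1)


# The per-code weights can then be summed or averaged within each visit
# to recover a visit-level contribution, as suggested in the question.
model = CodeAttentionLSTM(num_codes=1000)
logits, code_weights = model(torch.randint(0, 1000, (4, 30)))
```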
The more interesting alternative would be to use the RETAIN architecture but remove the visit-level attention component. This way you still tell the model which codes belong to the same visit. I am actually quite curious how this would turn out :)
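Sketched below (again a PyTorch illustration, not the repository's implementation): the RETAIN layout with multi-hot visit inputs, keeping only the variable-level (beta) attention and dropping the visit-level (alpha) attention. The reverse-time RNN ordering used in the original paper is omitted here for brevity, and all dimensions are placeholder assumptions.

```python
import torch
import torch.nn as nn


class RetainBetaOnly(nn.Module):
    """RETAIN-style model that keeps only the variable-level (beta) attention."""

    def __init__(self, num_codes, emb_dim=128, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Linear(num_codes, emb_dim, bias=False)  # visit embedding
        self.rnn_beta = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.w_beta = nn.Linear(hidden_dim, emb_dim)             # beta attention
        self.out = nn.Linear(emb_dim, num_classes)

    def forward(self, visits):             # visits: (batch, n_visits, num_codes) multi-hot
        v = self.embed(visits)              # (batch, n_visits, emb_dim)
        h, _ = self.rnn_beta(v)             # (batch, n_visits, hidden_dim)
        beta = torch.tanh(self.w_beta(h))   # per-visit, per-dimension weights
        context = (beta * v).sum(dim=1)     # no alpha: visits are summed unweighted
        return self.out(context), beta


# Usage with random multi-hot visit vectors, just to show the shapes.
model = RetainBetaOnly(num_codes=1000)
logits, beta = model(torch.rand(4, 10, 1000).round())
```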
If you happen to run this experiment, please share your results with everyone.