1604KWWeights
The YodaQA type of the anssel task datasets includes an additional feature for the input pairs: the weights of keywords and about-keywords of s0 that are matched in s1.
They are pretty strong predictors on their own (curatedv2 devMRR 0.337348, large2470 devMRR 0.318246).
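To make it concrete, here is a minimal sketch of how such a per-pair feature could be computed; the `s0_kws` dict mapping keywords to their weights is a hypothetical representation for illustration, not the dataset's actual column layout:

```python
def kw_weight_features(s0_kws, s1_tokens):
    """Weight of s0 keywords (or about-keywords) that occur in s1.

    s0_kws: dict mapping keyword token -> weight (assumed structure).
    s1_tokens: token list of the candidate sentence s1.
    Returns the absolute and normalized matched-weight sums.
    """
    s1_set = set(t.lower() for t in s1_tokens)
    matched = sum(w for kw, w in s0_kws.items() if kw.lower() in s1_set)
    total = sum(s0_kws.values()) or 1.0  # avoid division by zero
    return [matched, matched / total]
```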
TODO: We could also augment this with (or use only) BM25 weights. That could work for other datasets as well, and would be an alternative use for the prescoring logic.
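A rough sketch of what the BM25 variant could look like, assuming document frequencies are available from some background corpus (e.g. the prescoring index); the function name and parameter defaults are illustrative only:

```python
import math
from collections import Counter

def bm25_weights(s0_tokens, s1_tokens, df, n_docs, k1=1.2, b=0.75, avgdl=20.0):
    """BM25-style weight of each s0 token, with s1 treated as the 'document'.

    df: dict token -> document frequency in a background corpus (assumed
    input); n_docs: corpus size; avgdl: average document length.
    """
    tf = Counter(t.lower() for t in s1_tokens)
    dl = len(s1_tokens)
    weights = {}
    for t in set(t.lower() for t in s0_tokens):
        idf = math.log(1 + (n_docs - df.get(t, 0) + 0.5) / (df.get(t, 0) + 0.5))
        if tf[t]:
            w = idf * (tf[t] * (k1 + 1)) / (tf[t] + k1 * (1 - b + b * dl / avgdl))
        else:
            w = 0.0
        weights[t] = w
    return weights
```

Summing these weights over the s0 keywords matched in s1 would give a drop-in alternative to the current keyword-weight feature.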
Baselines (we did these measurements with the vocabcase setting):
8x R_ay_3rnn - 0.419903 (95% [0.399927, 0.439880])
4x R_al_3rnn - 0.395602 (95% [0.383595, 0.407609])
4x R_al_3a51 - 0.404151 (95% [0.382397, 0.425904])
8x R_ay_3rnn_kw - 0.452198 (95% [0.436496, 0.467899]):
10884109.arien.ics.muni.cz.R_ay_3rnn_kw etc.
[0.467730, 0.466489, 0.458678, 0.480130, 0.427241, 0.423624, 0.452207, 0.441481, ]
4x R_al_3rnn_kw - 0.411832 (95% [0.388420, 0.435244]):
10884136.arien.ics.muni.cz.R_al_3rnn_kw etc.
[0.400349, 0.424932, 0.427774, 0.394274, ]
4x R_al_3a51_kw - 0.465138 (95% [0.461127, 0.469148]):
10884138.arien.ics.muni.cz.R_al_3a51_kw etc.
[0.465793, 0.468988, 0.462912, 0.462857, ]
Wrt. the master baselines:
8x R_ay_2rnn_kw - 0.470143 (95% [0.444607, 0.495678]):
10911926.arien.ics.muni.cz.R_ay_2rnn_kw etc.
[0.432749, 0.442759, 0.479331, 0.504750, 0.480979, 0.422751, 0.501615, 0.496206, ]
4x R_al_2rnn_kw - 0.423874 (95% [0.406368, 0.441380]):
10911924.arien.ics.muni.cz.R_al_2rnn_kw etc.
[0.418824, 0.418950, 0.442729, 0.414993, ]
4x R_al_2a51_kw - 0.457016 (95% [0.434550, 0.479482]):
10930683.arien.ics.muni.cz.R_al_2a51_kw etc.
[0.469147, 0.470216, 0.453156, 0.435544, ]
8x R_ay_2rnnd0_kw - 0.434593 (95% [0.420952, 0.448234]):
10911927.arien.ics.muni.cz.R_ay_2rnnd0_kw etc.
[0.446196, 0.467435, 0.432201, 0.426130, 0.432023, 0.435823, 0.405995, 0.430943, ]
4x R_al_2rnnd0_kw - 0.446685 (95% [0.438853, 0.454517]):
10911925.arien.ics.muni.cz.R_al_2rnnd0_kw etc.
[0.439836, 0.452840, 0.444595, 0.449469, ]
Same trend as with Ubuntu: with the large dataset, the dropout advantage tapers off.
TODO: transfer learning check
4x R_al_2rnnd0L0_kw - 0.442456 (95% [0.431101, 0.453812]):
10930681.arien.ics.muni.cz.R_al_2rnnd0L0_kw etc.
[0.440065, 0.441733, 0.453828, 0.434200, ]
4x R_al_2a51d0L0_kw - 0.441782 (95% [0.415093, 0.468470]):
10930684.arien.ics.muni.cz.R_al_2a51d0L0_kw etc.
[0.467851, 0.422707, 0.432930, 0.443639, ]