1604KWWeights
The YodaQA-type datasets of the anssel task include an additional feature for each input pair: the weights of the keywords and about-keywords of s0 that are matched in s1.
They are pretty strong predictors on their own (curatedv2 devMRR 0.337348, large2470 devMRR 0.318246).
TODO: We could also augment this with BM25 weights (or use BM25 weights alone). That could work for other datasets as well, and is an alternative use for the prescoring logic.
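For reference, a minimal sketch of what a BM25 weight for an (s0, s1) pair could look like. This is plain Okapi BM25 with the common `k1=1.2`, `b=0.75` defaults; the function name `bm25_scores`, the whitespace tokenization, and computing IDF over the candidate pool of a single question are all assumptions made for illustration, not the prescoring code itself.

```python
import math
from collections import Counter


def bm25_scores(query, candidates, k1=1.2, b=0.75):
    """Score each candidate token list against the query token list (Okapi BM25).

    Hypothetical helper: IDF is estimated from the candidate pool itself.
    """
    if not candidates:
        return []
    n_docs = len(candidates)
    # Average candidate length; fall back to 1 if every candidate is empty.
    avg_len = sum(len(c) for c in candidates) / n_docs or 1.0
    # Document frequency: in how many candidates does each term occur?
    df = Counter(term for c in candidates for term in set(c))
    scores = []
    for cand in candidates:
        tf = Counter(cand)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(cand) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


s0 = "who wrote hamlet".split()
s1_pool = [
    "shakespeare wrote hamlet".split(),
    "the weather is nice".split(),
    "hamlet is a play".split(),
]
print(bm25_scores(s0, s1_pool))
```

A candidate sharing no term with s0 gets a score of exactly 0, so the same score can serve either as an extra input feature or as a ranking for prescoring.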
Baselines (we did these measurements with the vocabcase setting):
8x R_ay_3rnn - 0.419903 (95% [0.399927, 0.439880])
4x R_al_3rnn - 0.395602 (95% [0.383595, 0.407609])
4x R_al_3a51 - 0.404151 (95% [0.382397, 0.425904])
8x R_ay_3rnn_kw - 0.452198 (95% [0.436496, 0.467899]):
10884109.arien.ics.muni.cz.R_ay_3rnn_kw etc.
[0.467730, 0.466489, 0.458678, 0.480130, 0.427241, 0.423624, 0.452207, 0.441481, ]
4x R_al_3rnn_kw - 0.411832 (95% [0.388420, 0.435244]):
10884136.arien.ics.muni.cz.R_al_3rnn_kw etc.
[0.400349, 0.424932, 0.427774, 0.394274, ]
4x R_al_3a51_kw - 0.465138 (95% [0.461127, 0.469148]):
10884138.arien.ics.muni.cz.R_al_3a51_kw etc.
[0.465793, 0.468988, 0.462912, 0.462857, ]
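The summary lines (`Nx name - mean (95% [lo, hi])`) appear consistent with a Student-t interval around the mean of the per-run devMRR values, with the standard deviation computed over n rather than n-1. The sketch below is a reconstruction from the reported numbers, not the actual evaluation script; `summarize` and `T_CRIT_95` are hypothetical names, and the critical values are hardcoded only for run counts of 4, 8, 16 and 32.

```python
import math

# Two-sided 95% Student-t critical values, keyed by degrees of freedom
# (number of runs - 1). Other run counts would need more entries.
T_CRIT_95 = {3: 3.182446, 7: 2.364624, 15: 2.131450, 31: 2.039513}


def summarize(scores):
    """Return (mean, lower, upper) of a 95% t-interval over per-run scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Population standard deviation (divides by n, not n - 1).
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    half_width = T_CRIT_95[n - 1] * std / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Per-run values of the 4x R_al_3rnn_kw entry above.
runs = [0.400349, 0.424932, 0.427774, 0.394274]
mean, lo, hi = summarize(runs)
print("%dx - %f (95%% [%f, %f])" % (len(runs), mean, lo, hi))
```

For the R_al_3rnn_kw runs this lands on the reported mean and interval up to rounding in the last digit. With only 4 runs the interval is wide, which is worth keeping in mind when comparing 4x entries to 8x or 16x ones.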
Wrt. the master baselines:
8x R_ay_2rnn_kw - 0.470143 (95% [0.444607, 0.495678]):
10911926.arien.ics.muni.cz.R_ay_2rnn_kw etc.
[0.432749, 0.442759, 0.479331, 0.504750, 0.480979, 0.422751, 0.501615, 0.496206, ]
4x R_al_2rnn_kw - 0.423874 (95% [0.406368, 0.441380]):
10911924.arien.ics.muni.cz.R_al_2rnn_kw etc.
[0.418824, 0.418950, 0.442729, 0.414993, ]
4x R_al_2a51_kw - 0.457016 (95% [0.434550, 0.479482]):
10930683.arien.ics.muni.cz.R_al_2a51_kw etc.
[0.469147, 0.470216, 0.453156, 0.435544, ]
8x R_ay_2rnnd0_kw - 0.434593 (95% [0.420952, 0.448234]):
10911927.arien.ics.muni.cz.R_ay_2rnnd0_kw etc.
[0.446196, 0.467435, 0.432201, 0.426130, 0.432023, 0.435823, 0.405995, 0.430943, ]
4x R_al_2rnnd0_kw - 0.446685 (95% [0.438853, 0.454517]):
10911925.arien.ics.muni.cz.R_al_2rnnd0_kw etc.
[0.439836, 0.452840, 0.444595, 0.449469, ]
Same trend as with Ubuntu: with a large dataset, the dropout advantage tapers off.
TODO transfer learning check
4x R_al_2rnnd0L0_kw - 0.442456 (95% [0.431101, 0.453812]):
10930681.arien.ics.muni.cz.R_al_2rnnd0L0_kw etc.
[0.440065, 0.441733, 0.453828, 0.434200, ]
4x R_al_2a51d0L0_kw - 0.441782 (95% [0.415093, 0.468470]):
10930684.arien.ics.muni.cz.R_al_2a51d0L0_kw etc.
[0.467851, 0.422707, 0.432930, 0.443639, ]
Let's check if forcing `_PAD_` to zero (part of the argus clasrel pull request) is harmless.
16x R_ay_2rnn_kw - 0.469423 (95% [0.455889, 0.482957]):
10911926.arien.ics.muni.cz.R_ay_2rnn_kw etc.
[0.432749, 0.442759, 0.479331, 0.504750, 0.480979, 0.422751, 0.501615, 0.496206, 0.485784, 0.452306, 0.443067, 0.458959, 0.505849, 0.474536, 0.471225, 0.457896, ]
16x R_ay_2a51_kw - 0.485543 (95% [0.476930, 0.494155]):
11123025.arien.ics.muni.cz.R_ay_2a51_kw etc.
[0.477314, 0.495001, 0.475405, 0.482468, 0.470529, 0.492458, 0.498071, 0.475183, 0.461710, 0.480581, 0.484353, 0.524057, 0.514815, 0.473767, 0.469940, 0.493029, ]
8x R_ay_2rnn_kw_pad0 - 0.450365 (95% [0.430197, 0.470533]):
11141639.arien.ics.muni.cz.R_ay_2rnn_kw_pad0 etc.
[0.441757, 0.446003, 0.430938, 0.493640, 0.425181, 0.430856, 0.448221, 0.486323, ]
8x R_ay_2a51_kw_pad0 - 0.496796 (95% [0.489058, 0.504534]):
11141678.arien.ics.muni.cz.R_ay_2a51_kw_pad0 etc.
[0.502875, 0.506105, 0.484980, 0.494001, 0.479094, 0.504348, 0.501516, 0.501447, ]
Ok, looks harmless enough.
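To make the `_PAD_` change concrete, here is a minimal sketch of the idea, assuming it means that the padding token's embedding row is pinned to an all-zero vector instead of being randomly initialized like other tokens. `build_embedding_matrix`, the toy vocabulary and the init range are hypothetical and only illustrate the shape of the change.

```python
import random


def build_embedding_matrix(vocab, dim, pad_token="_PAD_", seed=0):
    """One vector per token; the padding token is forced to all zeros.

    Hypothetical helper: other tokens get a small random init here,
    where a real setup would use pretrained vectors where available.
    """
    rng = random.Random(seed)
    matrix = []
    for token in vocab:
        if token == pad_token:
            matrix.append([0.0] * dim)
        else:
            matrix.append([rng.uniform(-0.25, 0.25) for _ in range(dim)])
    return matrix


emb = build_embedding_matrix(["_PAD_", "what", "is", "yodaqa"], dim=4)
print(emb[0])  # the padding row
```

With a zero padding row, padded positions contribute nothing to sums or averages over the sequence, which is presumably why models that average embeddings are the ones most likely to notice the difference.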
| Model | trainAllMRR | devMRR | testMAP | testMRR | settings |
|---|---|---|---|---|---|
| yodaqakw | 0.368773 | 0.337348 | 0.284100 | 0.383238 | (defaults) |
| termfreq BM25 #w | 0.483538 | 0.452647 | 0.294300 | 0.484530 | (defaults) |
| -------------------------- | ------------- | ---------- | ---------- | ---------- | --------- |
| avg | 0.422881 | 0.402618 | 0.229694 | 0.329356 | (defaults) |
| | ±0.024685 | ±0.006664 | ±0.001715 | ±0.003511 | |
| DAN | 0.437119 | 0.430754 | 0.233000 | 0.354075 | inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu' |
| | ±0.014494 | ±0.014477 | ±0.002657 | ±0.010307 | |
| rnn | 0.459869 | 0.429780 | 0.228869 | 0.341706 | (defaults) |
| | ±0.035981 | ±0.015609 | ±0.005554 | ±0.010643 | |
| cnn | 0.544067 | 0.363028 | 0.228538 | 0.309165 | (defaults) |
| | ±0.037730 | ±0.011041 | ±0.004791 | ±0.009649 | |
| rnncnn | 0.578608 | 0.374195 | 0.238200 | 0.344659 | (defaults) |
| | ±0.044228 | ±0.023533 | ±0.007741 | ±0.014747 | |
| attn1511 | 0.432403 | 0.475125 | 0.275219 | 0.468555 | (defaults) |
| | ±0.016183 | ±0.012810 | ±0.006562 | ±0.014433 | |
| -------------------------- | ------------- | ---------- | ---------- | ---------- | --------- |
| avg | 0.487246 | 0.451062 | 0.250563 | 0.370919 | f_add_kw=True |
| | ±0.046523 | ±0.007836 | ±0.005624 | ±0.008380 | |
| DAN | 0.492934 | 0.483218 | 0.279650 | 0.441829 | inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu' f_add_kw=True |
| | ±0.037740 | ±0.007931 | ±0.004544 | ±0.009156 | |
| rnn | 0.488602 | 0.469423 | 0.255750 | 0.403185 | f_add_kw=True |
| | ±0.030025 | ±0.013534 | ±0.005382 | ±0.010489 | |
| cnn | 0.572758 | 0.410014 | 0.248494 | 0.350063 | f_add_kw=True |
| | ±0.025883 | ±0.012990 | ±0.005084 | ±0.010683 | |
| rnncnn | 0.555559 | 0.419693 | 0.259669 | 0.386323 | f_add_kw=True |
| | ±0.035131 | ±0.019007 | ±0.005742 | ±0.015534 | |
| attn1511 | 0.475656 | 0.485543 | 0.299025 | 0.473519 | f_add_kw=True |
| | ±0.014700 | ±0.008612 | ±0.004635 | ±0.004926 | |
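For readers unfamiliar with the table columns: MRR and MAP are per-question ranking metrics averaged over questions. The following is a minimal sketch of the standard definitions, not the evaluation code behind these numbers; `mrr_and_map` is a hypothetical name, and scoring a question without any correct candidate as 0 is an assumption (evaluation scripts differ on whether such questions are skipped instead).

```python
def reciprocal_rank(ranked_labels):
    """1 / rank of the first correct candidate, 0 if there is none."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_labels):
    """Mean of precision@rank taken at each correct candidate."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mrr_and_map(questions):
    """questions: list of (scores, labels) pairs, one pair per s0."""
    rrs, aps = [], []
    for scores, labels in questions:
        # Rank the candidates of this question by descending model score.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        ranked = [labels[i] for i in order]
        rrs.append(reciprocal_rank(ranked))
        aps.append(average_precision(ranked))
    return sum(rrs) / len(rrs), sum(aps) / len(aps)


toy = [
    ([0.9, 0.2, 0.5], [0, 1, 1]),  # first correct answer at rank 2
    ([0.1, 0.8], [0, 1]),          # correct answer at rank 1
]
print(mrr_and_map(toy))
```

MRR only looks at the first correct candidate, while MAP rewards ranking all correct candidates high, so the two can move differently on datasets with several correct s1 per s0.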
16x R_ay_2avg_preBM25f - 0.456198 (95% [0.441375, 0.471021]):
11164666.arien.ics.muni.cz.R_ay_2avg_preBM25f etc.
[0.498219, 0.479567, 0.484712, 0.504871, 0.490978, 0.437424, 0.426823, 0.441328, 0.457319, 0.437506, 0.418024, 0.427199, 0.428851, 0.431862, 0.469301, 0.465179, ]
16x R_ay_2dan_preBM25f - 0.529542 (95% [0.518664, 0.540419]):
11164667.arien.ics.muni.cz.R_ay_2dan_preBM25f etc.
[0.550406, 0.532501, 0.501926, 0.502845, 0.524899, 0.518520, 0.554686, 0.509359, 0.533622, 0.554307, 0.534756, 0.531446, 0.509179, 0.565314, 0.500386, 0.548515, ]
15x R_ay_2rnn_preBM25f - 0.495364 (95% [0.487221, 0.503508]):
11164668.arien.ics.muni.cz.R_ay_2rnn_preBM25f etc.
[0.488752, 0.493261, 0.513562, 0.466224, 0.497205, 0.477107, 0.498736, 0.486709, 0.527588, 0.479961, 0.497441, 0.502171, 0.502466, 0.509093, 0.490191, ]
16x R_ay_2cnn_preBM25f - 0.483106 (95% [0.472069, 0.494143]):
11164669.arien.ics.muni.cz.R_ay_2cnn_preBM25f etc.
[0.488898, 0.472309, 0.493535, 0.506969, 0.505426, 0.470859, 0.468073, 0.483682, 0.480535, 0.478417, 0.486767, 0.454820, 0.473261, 0.441064, 0.493540, 0.531538, ]
16x R_ay_2rnncnn_preBM25f - 0.481229 (95% [0.465358, 0.497100]):
11164670.arien.ics.muni.cz.R_ay_2rnncnn_preBM25f etc.
[0.505181, 0.483252, 0.451131, 0.454070, 0.433947, 0.536033, 0.446546, 0.537421, 0.490100, 0.490110, 0.461282, 0.499851, 0.476747, 0.468474, 0.457429, 0.508093, ]
16x R_ay_2a51_preBM25f - 0.475746 (95% [0.464648, 0.486844]):
11164671.arien.ics.muni.cz.R_ay_2a51_preBM25f etc.
[0.427083, 0.514587, 0.464229, 0.494987, 0.466147, 0.464223, 0.487800, 0.455298, 0.512902, 0.482585, 0.471736, 0.463600, 0.475212, 0.473154, 0.487878, 0.470510, ]
BM25 prescoring is even better than kw prescoring! (Also, high DAN performance is intriguing in principle...)
16x R_ay_2cnn_preBM25P20_i0d0w0 - 0.529302 (95% [0.513541, 0.545063]):
11233395.arien.ics.muni.cz.R_ay_2cnn_preBM25P20_i0d0w0 etc.
[0.560938, 0.539073, 0.447483, 0.549825, 0.523195, 0.541627, 0.535003, 0.466559, 0.529262, 0.524238, 0.544389, 0.526275, 0.559718, 0.547152, 0.536153, 0.537940, ]
| Model | trainAllMRR | devMRR | testMAP | testMRR | settings |
|---|---|---|---|---|---|
| yodaqakw | 0.332693 | 0.318246 | 0.303900 | 0.376465 | (defaults) |
| termfreq BM25 #w | 0.441573 | 0.432115 | 0.313900 | 0.490822 | (defaults) |
| -------------------------- | ------------- | ---------- | ---------- | ---------- | --------- |
| avg | 0.798883 | 0.408034 | 0.262569 | 0.362190 | (defaults) |
| | ±0.026554 | ±0.004656 | ±0.002054 | ±0.005725 | |
| DAN | 0.646481 | 0.404210 | 0.272675 | 0.386522 | inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu' |
| | ±0.070994 | ±0.005378 | ±0.003028 | ±0.007627 | |
| rnn | 0.460984 | 0.382949 | 0.262463 | 0.381298 | (defaults) |
| | ±0.023715 | ±0.006451 | ±0.002641 | ±0.007643 | |
| cnn | 0.550441 | 0.348247 | 0.264476 | 0.353243 | (defaults) |
| | ±0.069701 | ±0.006217 | ±0.002918 | ±0.009620 | |
| rnncnn | 0.681908 | 0.408662 | 0.286118 | 0.394865 | (defaults) |
| | ±0.114967 | ±0.008659 | ±0.003501 | ±0.011895 | |
| attn1511 | 0.445635 | 0.408495 | 0.288100 | 0.430892 | (defaults) |
| | ±0.056352 | ±0.008744 | ±0.005601 | ±0.017858 | |
| -------------------------- | ------------- | ---------- | ---------- | ---------- | --------- |
| avg | 0.647144 | 0.420943 | 0.289044 | 0.419559 | f_add_kw=True |
| | ±0.068187 | ±0.004745 | ±0.002541 | ±0.011235 | |
| DAN | 0.578884 | 0.454751 | 0.316606 | 0.472173 | inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu' f_add_kw=True |
| | ±0.051564 | ±0.005778 | ±0.004260 | ±0.006205 | |
| rnn | 0.471287 | 0.423417 | 0.296419 | 0.446478 | f_add_kw=True |
| | ±0.021866 | ±0.007853 | ±0.005486 | ±0.011307 | |
| cnn | 0.532295 | 0.375244 | 0.285288 | 0.398820 | f_add_kw=True |
| | ±0.052085 | ±0.006402 | ±0.002901 | ±0.009145 | |
| rnncnn | 0.595107 | 0.430172 | 0.308475 | 0.444440 | f_add_kw=True |
| | ±0.091860 | ±0.010868 | ±0.005538 | ±0.013976 | |
| attn1511 | 0.488763 | 0.455023 | 0.330781 | 0.492604 | f_add_kw=True |
| | ±0.015243 | ±0.006933 | ±0.002899 | ±0.005126 | |
7x R_al_2rnn_preBM25 - 0.430736 (95% [0.416538, 0.444934]):
11206422.arien.ics.muni.cz.R_al_2rnn_preBM25 etc.
[0.425198, 0.445204, 0.420176, 0.455230, 0.410737, 0.417173, 0.441434, ]
8x R_al_2rnn_preBM25p20 - 0.458792 (95% [0.444031, 0.473553]):
11206423.arien.ics.muni.cz.R_al_2rnn_preBM25p20 etc.
[0.432324, 0.474313, 0.485069, 0.463772, 0.457373, 0.433791, 0.471632, 0.452063, ]
8x R_al_2dan_preBM25 - 0.461616 (95% [0.451250, 0.471982]):
11206425.arien.ics.muni.cz.R_al_2dnn_preBM25 etc.
[0.460337, 0.446763, 0.457631, 0.470528, 0.446659, 0.475015, 0.482622, 0.453372, ]
8x R_al_2dan_preBM25p20 - 0.466494 (95% [0.458017, 0.474971]):
11206426.arien.ics.muni.cz.R_al_2dnn_preBM25p20 etc.
[0.449215, 0.481465, 0.468312, 0.462902, 0.454681, 0.469513, 0.467835, 0.478032, ]
4x R_al_2cnn_preBM25 - 0.408968 (95% [0.377780, 0.440156]):
11206501.arien.ics.muni.cz.R_al_2cnn_preBM25 etc.
[0.429302, 0.426465, 0.382930, 0.397176, ]
4x R_al_2a51_preBM25 - 0.443483 (95% [0.431687, 0.455280]):
11206502.arien.ics.muni.cz.R_al_2a51_preBM25 etc.
[0.442740, 0.446063, 0.432295, 0.452835, ]
Overall:
16x R_al_2avg_preBM25P20 - 0.461915 (95% [0.457745, 0.466086]):
11233427.arien.ics.muni.cz.R_al_2avg_preBM25P20 etc.
[0.471524, 0.462527, 0.465099, 0.449557, 0.460591, 0.450419, 0.457353, 0.456975, 0.477616, 0.468689, 0.469820, 0.455841, 0.454979, 0.464222, 0.455686, 0.469746, ]
16x R_al_2dan_preBM25P20 - 0.466123 (95% [0.462176, 0.470069]):
11233428.arien.ics.muni.cz.R_al_2dan_preBM25P20 etc.
[0.469671, 0.460734, 0.450934, 0.460854, 0.464449, 0.466615, 0.471752, 0.470239, 0.466706, 0.467664, 0.461909, 0.464538, 0.478914, 0.460452, 0.482539, 0.459995, ]
16x R_al_2rnn_preBM25P20 - 0.456415 (95% [0.449226, 0.463605]):
11233429.arien.ics.muni.cz.R_al_2rnn_preBM25P20 etc.
[0.472133, 0.441332, 0.458510, 0.450369, 0.480289, 0.444070, 0.452240, 0.461032, 0.449289, 0.442795, 0.429064, 0.455386, 0.459916, 0.459284, 0.468555, 0.478382, ]
16x R_al_2cnn_preBM25P20 - 0.463438 (95% [0.460091, 0.466785]):
11233430.arien.ics.muni.cz.R_al_2cnn_preBM25P20 etc.
[0.472172, 0.465146, 0.461809, 0.455908, 0.470847, 0.459910, 0.462170, 0.453509, 0.466360, 0.463259, 0.468308, 0.470299, 0.452752, 0.462315, 0.457224, 0.473015, ]
16x R_al_2rnncnn_preBM25P20 - 0.479459 (95% [0.474738, 0.484179]):
11233431.arien.ics.muni.cz.R_al_2rnncnn_preBM25P20 etc.
[0.483343, 0.492789, 0.461518, 0.474034, 0.468161, 0.470757, 0.490504, 0.488389, 0.486903, 0.474913, 0.473768, 0.481255, 0.484023, 0.490777, 0.473109, 0.477093, ]
16x R_al_2a51_preBM25P20 - 0.462170 (95% [0.455248, 0.469092]):
11233432.arien.ics.muni.cz.R_al_2a51_preBM25P20 etc.
[0.448268, 0.458436, 0.458336, 0.442503, 0.491185, 0.480569, 0.466483, 0.458774, 0.481210, 0.460603, 0.450264, 0.460054, 0.454358, 0.446956, 0.467947, 0.468770, ]
What if no dropout?
8x R_al_2avg_preBM25P20_i0d0w0 - 0.465999 (95% [0.459798, 0.472200]):
11226136.arien.ics.muni.cz.R_al_2avg_preBM25P20_i0d0w0 etc.
[0.479083, 0.475092, 0.464605, 0.467555, 0.458380, 0.466827, 0.459283, 0.457167, ]
8x R_al_2dan_preBM25P20_i0d0w0 - 0.460120 (95% [0.454159, 0.466080]):
11226138.arien.ics.muni.cz.R_al_2dan_preBM25P20_i0d0w0 etc.
[0.451056, 0.461086, 0.448264, 0.457592, 0.460070, 0.465834, 0.469669, 0.467385, ]
8x R_al_2rnn_preBM25P20_i0d0w0 - 0.436054 (95% [0.427893, 0.444216]):
11226139.arien.ics.muni.cz.R_al_2rnn_preBM25P20_i0d0w0 etc.
[0.431153, 0.423863, 0.436515, 0.447881, 0.451811, 0.425841, 0.442645, 0.428726, ]
8x R_al_2cnn_preBM25P20_i0d0w0 - 0.480980 (95% [0.474132, 0.487828]):
11226140.arien.ics.muni.cz.R_al_2cnn_preBM25P20_i0d0w0 etc.
[0.463730, 0.480296, 0.491250, 0.480813, 0.475103, 0.487560, 0.480675, 0.488414, ]
8x R_al_2rnncnn_preBM25P20_i0d0w0 - 0.488244 (95% [0.480346, 0.496142]):
11226142.arien.ics.muni.cz.R_al_2rnncnn_preBM25P20_i0d0w0 etc.
[0.477642, 0.480781, 0.485465, 0.476986, 0.503215, 0.495628, 0.486586, 0.499649, ]
8x R_al_2a51_preBM25P20_i0d0w0 - 0.470792 (95% [0.464074, 0.477510]):
11226143.arien.ics.muni.cz.R_al_2a51_preBM25P20_i0d0w0 etc.
[0.463315, 0.467176, 0.487459, 0.465585, 0.477057, 0.475802, 0.467270, 0.462674, ]
8x R_al_2avg_preBM25P20_i0d0w0_ef1 - 0.458608 (95% [0.451086, 0.466129]):
11229294.arien.ics.muni.cz.R_al_2avg_preBM25P20_i0d0w0_ef1 etc.
[0.448979, 0.457682, 0.469837, 0.453343, 0.476811, 0.452102, 0.454278, 0.455830, ]
8x R_al_2dan_preBM25P20_i0d0w0_ef1 - 0.465446 (95% [0.455627, 0.475265]):
11229295.arien.ics.muni.cz.R_al_2dan_preBM25P20_i0d0w0_ef1 etc.
[0.445543, 0.488141, 0.467816, 0.470790, 0.468154, 0.454502, 0.460534, 0.468090, ]
8x R_al_2rnn_preBM25P20_i0d0w0_ef1 - 0.447995 (95% [0.445026, 0.450964]):
11229296.arien.ics.muni.cz.R_al_2rnn_preBM25P20_i0d0w0_ef1 etc.
[0.448334, 0.445451, 0.447079, 0.452070, 0.442781, 0.453125, 0.444168, 0.450952, ]
8x R_al_2cnn_preBM25P20_i0d0w0_ef1 - 0.482116 (95% [0.477632, 0.486600]):
11229297.arien.ics.muni.cz.R_al_2cnn_preBM25P20_i0d0w0_ef1 etc.
[0.480895, 0.478787, 0.479880, 0.484137, 0.474332, 0.480507, 0.484450, 0.493941, ]
8x R_al_2rnncnn_preBM25P20_i0d0w0_ef1 - 0.473144 (95% [0.468930, 0.477359]):
11229298.arien.ics.muni.cz.R_al_2rnncnn_preBM25P20_i0d0w0_ef1 etc.
[0.467316, 0.468593, 0.475112, 0.483169, 0.475523, 0.476252, 0.468228, 0.470961, ]
8x R_al_2a51_preBM25P20_i0d0w0_ef1 - 0.480831 (95% [0.473527, 0.488135]):
11229299.arien.ics.muni.cz.R_al_2a51_preBM25P20_i0d0w0_ef1 etc.
[0.480658, 0.467799, 0.480404, 0.487879, 0.477993, 0.469366, 0.495353, 0.487194, ]
Well, default ef (1/4) looks fine here.
Dropout is fine, except for cnn where it is actually clearly a bad idea.
| Model | trainAllMRR | devMRR | testMAP | testMRR | settings |
|---|---|---|---|---|---|
| termfreq BM25 #w | 0.813992 | 0.829004 | 0.630100 | 0.765363 | (defaults) |
| -------------------------- | ------------- | ---------- | ---------- | ---------- | --------- |
| avg | 0.786983 | 0.799939 | 0.607031 | 0.689948 | (defaults) |
| | ±0.019449 | ±0.007218 | ±0.005516 | ±0.009912 | |
| DAN | 0.838842 | 0.828035 | 0.643288 | 0.734727 | inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu' |
| | ±0.013775 | ±0.007839 | ±0.009993 | ±0.008747 | |
| rnn | 0.791770 | 0.842155 | 0.648863 | 0.742747 | (defaults) |
| | ±0.017036 | ±0.009447 | ±0.010918 | ±0.009896 | |
| cnn | 0.845162 | 0.841343 | 0.690906 | 0.770042 | (defaults) |
| | ±0.015552 | ±0.005409 | ±0.006910 | ±0.010381 | |
| rnncnn | 0.922721 | 0.849363 | 0.716519 | 0.797826 | (defaults) |
| | ±0.019407 | ±0.006259 | ±0.007169 | ±0.011460 | |
| attn1511 | 0.852364 | 0.851368 | 0.708163 | 0.789822 | (defaults) |
| | ±0.017280 | ±0.005533 | ±0.008958 | ±0.013308 | |
32x R_aw_2avg_preBM25f - 0.859210 (95% [0.855743, 0.862678]):
11165383.arien.ics.muni.cz.R_aw_2avg_preBM25f etc.
[0.860256, 0.865385, 0.866667, 0.860256, 0.844872, 0.861538, 0.866667, 0.862821, 0.869231, 0.851282, 0.880769, 0.857692, 0.855128, 0.852564, 0.844219, 0.855128, 0.869231, 0.860256, 0.856410, 0.862821, 0.837179, 0.876154, 0.857692, 0.844872, 0.869231, 0.862821, 0.851282, 0.870513, 0.858974, 0.843590, 0.860256, 0.858974, ]
32x R_aw_2dan_preBM25f - 0.850233 (95% [0.846307, 0.854159]):
11165384.arien.ics.muni.cz.R_aw_2dan_preBM25f etc.
[0.836923, 0.842821, 0.840769, 0.856410, 0.863333, 0.864103, 0.853077, 0.830513, 0.858205, 0.847949, 0.831282, 0.841026, 0.862051, 0.849231, 0.852308, 0.850513, 0.855641, 0.868462, 0.864615, 0.840886, 0.859487, 0.835128, 0.846795, 0.847949, 0.846154, 0.840897, 0.839744, 0.852465, 0.865385, 0.858974, 0.867949, 0.836410, ]
16x R_aw_2rnn_preBM25f - 0.872479 (95% [0.865383, 0.879576]):
11165385.arien.ics.muni.cz.R_aw_2rnn_preBM25f etc.
[0.891026, 0.876154, 0.883333, 0.867949, 0.871538, 0.871795, 0.887692, 0.883333, 0.847436, 0.882051, 0.880403, 0.847179, 0.876923, 0.853846, 0.859890, 0.879121, ]
16x R_aw_2cnn_preBM25f - 0.867242 (95% [0.862184, 0.872300]):
11165386.arien.ics.muni.cz.R_aw_2cnn_preBM25f etc.
[0.856923, 0.872308, 0.854274, 0.869231, 0.875641, 0.852564, 0.880427, 0.865385, 0.856410, 0.870513, 0.878205, 0.863333, 0.877839, 0.865385, 0.856667, 0.880769, ]
16x R_aw_2rnncnn_preBM25f - 0.862151 (95% [0.856422, 0.867880]):
11165387.arien.ics.muni.cz.R_aw_2rnncnn_preBM25f etc.
[0.860256, 0.858718, 0.839744, 0.872308, 0.856410, 0.842308, 0.868462, 0.860403, 0.858974, 0.873590, 0.871429, 0.861538, 0.882564, 0.862051, 0.870385, 0.855275, ]
16x R_aw_2a51_preBM25f - 0.864038 (95% [0.859672, 0.868405]):
11165388.arien.ics.muni.cz.R_aw_2a51_preBM25f etc.
[0.855128, 0.871795, 0.872308, 0.875641, 0.855641, 0.859487, 0.847436, 0.862821, 0.864615, 0.866667, 0.871795, 0.859487, 0.852564, 0.865385, 0.874872, 0.868974, ]
BM25 prescoring is again awesome, though it doesn't help that much here, as the source dataset already comes from an IR system that probably uses BM25.
8x R_aw_2cnn_preBM25P20_i0d0w0 - 0.868741 (95% [0.858212, 0.879270]):
11233397.arien.ics.muni.cz.R_aw_2cnn_preBM25P20_i0d0w0 etc.
[0.870513, 0.850916, 0.857070, 0.892308, 0.876923, 0.865385, 0.858974, 0.877839, ]
Pruning test:
16x R_aw_2avg_preBM25P20 - 0.857195 (95% [0.852724, 0.861667]):
11236063.arien.ics.muni.cz.R_aw_2avg_preBM25P20 etc.
[0.855128, 0.866667, 0.862821, 0.855128, 0.873077, 0.847436, 0.860256, 0.855128, 0.860256, 0.853077, 0.861538, 0.848718, 0.835897, 0.855128, 0.860769, 0.864103, ]
16x R_aw_2dan_preBM25P20 - 0.860895 (95% [0.857066, 0.864724]):
11236064.arien.ics.muni.cz.R_aw_2dan_preBM25P20 etc.
[0.845501, 0.856923, 0.863736, 0.849231, 0.863333, 0.856923, 0.864615, 0.872308, 0.858120, 0.868462, 0.863846, 0.871795, 0.861172, 0.857326, 0.865897, 0.855128, ]
16x R_aw_2rnn_preBM25P20 - 0.858265 (95% [0.853689, 0.862840]):
11236065.arien.ics.muni.cz.R_aw_2rnn_preBM25P20 etc.
[0.858718, 0.857436, 0.860128, 0.856044, 0.855128, 0.867949, 0.839744, 0.858974, 0.843223, 0.850513, 0.863974, 0.865897, 0.862051, 0.866300, 0.852564, 0.873590, ]
16x R_aw_2cnn_preBM25P20 - 0.863481 (95% [0.855863, 0.871100]):
11236066.arien.ics.muni.cz.R_aw_2cnn_preBM25P20 etc.
[0.875641, 0.865385, 0.851282, 0.859915, 0.883333, 0.867949, 0.839744, 0.853846, 0.864615, 0.861538, 0.870513, 0.832564, 0.861538, 0.859890, 0.887179, 0.880769, ]
16x R_aw_2rnncnn_preBM25P20 - 0.855948 (95% [0.849122, 0.862774]):
11236067.arien.ics.muni.cz.R_aw_2rnncnn_preBM25P20 etc.
[0.836557, 0.844231, 0.847949, 0.843736, 0.864103, 0.863333, 0.861538, 0.837821, 0.867949, 0.860769, 0.876923, 0.850000, 0.877436, 0.868205, 0.850000, 0.844615, ]
16x R_aw_2a51_preBM25P20 - 0.868992 (95% [0.864326, 0.873657]):
11236068.arien.ics.muni.cz.R_aw_2a51_preBM25P20 etc.
[0.863480, 0.879487, 0.851795, 0.874872, 0.876154, 0.877436, 0.865897, 0.858974, 0.856410, 0.876026, 0.871026, 0.858718, 0.868974, 0.882564, 0.873590, 0.868462, ]
cnn without dropout?
16x R_aw_2cnn_preBM25P20_i0d0w0 - 0.872277 (95% [0.865844, 0.878710]):
11233397.arien.ics.muni.cz.R_aw_2cnn_preBM25P20_i0d0w0 etc.
[0.870513, 0.850916, 0.857070, 0.892308, 0.876923, 0.865385, 0.858974, 0.877839, 0.880769, 0.875018, 0.876923, 0.873223, 0.885897, 0.855275, 0.891453, 0.867949, ]
4x R_ay_2dan_preBM25f - 0.521920 (95% [0.489241, 0.554598]):
4x R_ay_2dan_preBM25fp10 - 0.498505 (95% [0.459013, 0.537998]):
11164679.arien.ics.muni.cz.R_ay_2dan_preBM25fp10 etc.
[0.515782, 0.458379, 0.497800, 0.522061, ]
4x R_ay_2dan_preBM25fp20 - 0.472205 (95% [0.413860, 0.530549]):
11164672.arien.ics.muni.cz.R_ay_2dan_preBM25fp20 etc.
[0.529796, 0.427913, 0.467881, 0.463228, ]
4x R_ay_2dan_preBM25fp40 - 0.498207 (95% [0.471961, 0.524453]):
11164683.arien.ics.muni.cz.R_ay_2dan_preBM25fp40 etc.
[0.511627, 0.507601, 0.503527, 0.470072, ]
4x R_ay_2dan_preBM25fp80 - 0.498641 (95% [0.483196, 0.514086]):
11164704.arien.ics.muni.cz.R_ay_2dan_preBM25fp80 etc.
[0.511180, 0.499969, 0.483906, 0.499509, ]
4x R_ay_2rnn_preBM25f - 0.490450 (95% [0.463683, 0.517216]):
4x R_ay_2rnn_preBM25fp10 - 0.451735 (95% [0.431087, 0.472382]):
11164718.arien.ics.muni.cz.R_ay_2rnn_preBM25fp10 etc.
[0.429715, 0.462510, 0.459515, 0.455199, ]
4x R_ay_2rnn_preBM25fp20 - 0.498031 (95% [0.480156, 0.515906]):
11164710.arien.ics.muni.cz.R_ay_2rnn_preBM25fp20 etc.
[0.508158, 0.483139, 0.491218, 0.509608, ]
4x R_ay_2rnn_preBM25fp40 - 0.462664 (95% [0.435296, 0.490033]):
11164714.arien.ics.muni.cz.R_ay_2rnn_preBM25fp40 etc.
[0.460831, 0.472898, 0.481267, 0.435661, ]
non-conclusive; next step: 32-way baseline vs. p20.
16x R_ay_2dan_preBM25f - 0.529542 (95% [0.518664, 0.540419]):
15x R_ay_2rnn_preBM25f - 0.495364 (95% [0.487221, 0.503508]):
32x R_ay_2dan_preBM25fp20 - 0.505769 (95% [0.486799, 0.524739]):
11164672.arien.ics.muni.cz.R_ay_2dan_preBM25fp20 etc.
[0.529796, 0.427913, 0.467881, 0.463228, 0.525758, 0.479401, 0.521772, 0.517636, 0.523367, 0.539304, 0.531952, 0.532263, 0.530724, 0.527800, 0.520635, 0.531143, 0.518818, 0.526477, 0.534160, 0.546105, 0.463003, 0.519570, 0.454251, 0.532180, 0.522453, 0.526706, 0.512854, 0.514134, 0.257657, 0.532079, 0.531198, 0.522390, ]
32x R_ay_2rnn_preBM25fp20 - 0.486556 (95% [0.479236, 0.493876]):
11164710.arien.ics.muni.cz.R_ay_2rnn_preBM25fp20 etc.
[0.508158, 0.483139, 0.491218, 0.509608, 0.469012, 0.494205, 0.483975, 0.480952, 0.461927, 0.504582, 0.461186, 0.460236, 0.481414, 0.480188, 0.457711, 0.461436, 0.491194, 0.483480, 0.478641, 0.484989, 0.494288, 0.466136, 0.491828, 0.526553, 0.473151, 0.493774, 0.466335, 0.492406, 0.509150, 0.465757, 0.525069, 0.538084, ]
32x R_aw_2dan_preBM25f - 0.850233 (95% [0.846307, 0.854159]):
32x R_aw_2dan_preBM25fp5 - 0.863100 (95% [0.860764, 0.865435]):
11171392.arien.ics.muni.cz.R_aw_2dan_preBM25fp5 etc.
[0.871513, 0.854846, 0.856128, 0.862539, 0.870744, 0.868949, 0.871513, 0.861257, 0.862539, 0.861770, 0.861257, 0.854846, 0.863052, 0.857411, 0.859205, 0.870231, 0.865103, 0.859205, 0.869462, 0.868949, 0.861257, 0.859205, 0.870744, 0.856128, 0.862539, 0.871513, 0.872795, 0.868949, 0.865103, 0.845103, 0.855359, 0.859975, ]
32x R_aw_2dan_preBM25fp10 - 0.856020 (95% [0.852684, 0.859356]):
11171394.arien.ics.muni.cz.R_aw_2dan_preBM25fp10 etc.
[0.842637, 0.856740, 0.835861, 0.844689, 0.861355, 0.851099, 0.860073, 0.847766, 0.854945, 0.854176, 0.847912, 0.850330, 0.843919, 0.851612, 0.849817, 0.854945, 0.856740, 0.855458, 0.856227, 0.854945, 0.849817, 0.869560, 0.860073, 0.866484, 0.866996, 0.881868, 0.867766, 0.854945, 0.858791, 0.851099, 0.863150, 0.870842, ]
4x R_aw_2dan_preBM25fp20 - 0.855614 (95% [0.840413, 0.870814]):
11171395.arien.ics.muni.cz.R_aw_2dan_preBM25fp20 etc. [0.853077, 0.847436, 0.871795, 0.850147, ]
4x R_aw_2dan_preBM25fp40 - 0.876602 (95% [0.869336, 0.883869]):
11171397.arien.ics.muni.cz.R_aw_2dan_preBM25fp40 etc. [0.873077, 0.871795, 0.883333, 0.878205, ]
4x R_aw_2dan_preBM25fp60 - 0.852212 (95% [0.826522, 0.877901]):
11171399.arien.ics.muni.cz.R_aw_2dan_preBM25fp60 etc. [0.829359, 0.856410, 0.874359, 0.848718, ]
----
conclusion: pruning not detrimental (possibly slight difference either way), but massive speedup
----
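A minimal sketch of the pruning step discussed above, assuming the `p20` / `P20` suffix means that only the 20 best-prescored s1 candidates are kept for each s0 before the neural model sees them. `prune_candidates` is a hypothetical name and the prescores could come from any cheap scorer such as BM25.

```python
def prune_candidates(candidates, prescores, keep=20):
    """Keep the `keep` candidates with the highest prescore, best first."""
    order = sorted(range(len(candidates)), key=lambda i: prescores[i], reverse=True)
    return [candidates[i] for i in order[:keep]]


s1_pool = ["a", "b", "c", "d"]
prescores = [0.1, 0.9, 0.5, 0.3]
print(prune_candidates(s1_pool, prescores, keep=2))  # ['b', 'c']
```

The speedup follows from the model scoring at most `keep` pairs per question; the price is that a correct answer ranked below the cutoff by the prescorer can no longer be recovered.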
general next step: apply to other tasks, transfer learning test
Parameter Tuning
----------------
32x R_ay_2rnn_preBM25fp20 - 0.486556 (95% [0.479236, 0.493876]):
16x R_ay_2rnn_preBM25fp20_d12 - 0.449365 (95% [0.438045, 0.460684]):
11192930.arien.ics.muni.cz.R_ay_2rnn_preBM25fp20_d12 etc. [0.459300, 0.413526, 0.443900, 0.468764, 0.462032, 0.435436, 0.428294, 0.457646, 0.471286, 0.458531, 0.410506, 0.487281, 0.453725, 0.421118, 0.463696, 0.454792, ]
8x rnnlevels=2 (with skipconnections already) R_al_2rnn_preBM25p20_L2 - 0.448833 (95% [0.440927, 0.456738]):
11242308.arien.ics.muni.cz.R_al_2rnn_preBM25p20_L2 etc. [0.456784, 0.434763, 0.440868, 0.463849, 0.442656, 0.454120, 0.441772, 0.455851, ]
16x R_ay_2a51_preBM25f - 0.475746 (95% [0.464648, 0.486844]):
16x R_ay_2a51_preBM25fp20 - 0.462924 (95% [0.454943, 0.470906]):
11192931.arien.ics.muni.cz.R_ay_2a51_preBM25fp20 etc. [0.454828, 0.475757, 0.465232, 0.475431, 0.439217, 0.477389, 0.475898, 0.462111, 0.470197, 0.430249, 0.470613, 0.440611, 0.475972, 0.448077, 0.473609, 0.471597, ]
16x R_ay_2a51_preBM25fp20_d0 - 0.476786 (95% [0.469887, 0.483685]):
11192932.arien.ics.muni.cz.R_ay_2a51_preBM25fp20_d0 etc. [0.476644, 0.481178, 0.460922, 0.464233, 0.469220, 0.463495, 0.471208, 0.467415, 0.473137, 0.505563, 0.468908, 0.485027, 0.482633, 0.478444, 0.474024, 0.506526, ]
16x R_ay_2a51_preBM25fp20_d0s2 - 0.474812 (95% [0.466648, 0.482976]):
11192937.arien.ics.muni.cz.R_ay_2a51_preBM25fp20_d0s2 etc. [0.490791, 0.452166, 0.467313, 0.516252, 0.468192, 0.467569, 0.478512, 0.468858, 0.459677, 0.475398, 0.470101, 0.479097, 0.471638, 0.457159, 0.477836, 0.496436, ]
4x R_al_2a51_preBM25 - 0.443483 (95% [0.431687, 0.455280]):
8x R_al_2a51_preBM25p20 - 0.460622 (95% [0.452172, 0.469071]):
11214582.arien.ics.muni.cz.R_al_2a51_preBM25p20 etc. [0.451943, 0.454549, 0.462602, 0.466850, 0.447732, 0.452256, 0.470065, 0.478977, ]
8x R_al_2a51_preBM25p20_i0w13 - 0.472262 (95% [0.457551, 0.486973]):
11214584.arien.ics.muni.cz.R_al_2a51_preBM25p20_d0w13 etc. [0.472762, 0.458824, 0.478270, 0.500431, 0.449539, 0.448243, 0.480109, 0.489919, ]
8x R_al_2a51_preBM25p20_i0w13_prelu - 0.474473 (95% [0.462383, 0.486563]):
11214585.arien.ics.muni.cz.R_al_2a51_preBM25p20_d0w13_prelu etc. [0.484829, 0.479326, 0.451413, 0.498001, 0.471630, 0.481713, 0.454565, 0.474307, ]
8x R_al_2a51_preBM25p20_d0w13 - 0.474366 (95% [0.460094, 0.488637]):
11215621.arien.ics.muni.cz.R_al_2a51_preBM25p20_d0w13 etc. [0.497713, 0.484796, 0.454662, 0.495258, 0.484598, 0.465900, 0.455081, 0.456918, ]
8x R_al_2a51_preBM25p20_d0w0 - 0.485359 (95% [0.476708, 0.494010]):
11215622.arien.ics.muni.cz.R_al_2a51_preBM25p20_d0w0 etc. [0.485409, 0.480781, 0.488977, 0.497029, 0.469303, 0.473415, 0.485694, 0.502266, ]
Ok, looks like DAN might win b/c of no dropout!
Let's check weird modes of attn1511...
16x R_al_2a51_preBM25P20 - 0.462170 (95% [0.455248, 0.469092]):
8x no-cnn R_al_2a51_preBM25p20_C - 0.468163 (95% [0.455092, 0.481234]):
11240080.arien.ics.muni.cz.R_al_2a51_preBM25p20_C etc. [0.452860, 0.481340, 0.449502, 0.491312, 0.466209, 0.455380, 0.459863, 0.488839, ]
8x no-cnn, focus_act=sigmoid/maxnorm R_al_2a51_preBM25p20_C_fsgmn - 0.481765 (95% [0.470110, 0.493420]):
11240081.arien.ics.muni.cz.R_al_2a51_preBM25p20_C_fsgmn etc. [0.498151, 0.482446, 0.453792, 0.483498, 0.485660, 0.499646, 0.482023, 0.468902, ]
8x no-cnn, ptscorer=1 R_al_2a51_preBM25p20_C1 - 0.481306 (95% [0.465911, 0.496702]):
11241046.arien.ics.muni.cz.R_al_2a51_preBM25p20_C1 etc. [0.470427, 0.506705, 0.457468, 0.466662, 0.501454, 0.503142, 0.480096, 0.464497, ]
8x ptscorer=1 R_al_2a51_preBM25p20_1 - 0.471499 (95% [0.460678, 0.482321]):
11241047.arien.ics.muni.cz.R_al_2a51_preBM25p20_1 etc. [0.482353, 0.476608, 0.487028, 0.464827, 0.448505, 0.481613, 0.475740, 0.455321, ]
8x focus_act=sigmoid/maxnorm R_al_2a51_preBM25p20_fsgmn - 0.468097 (95% [0.454369, 0.481825]):
11242053.arien.ics.muni.cz.R_al_2a51_preBM25p20_fsgmn etc. [0.471967, 0.447971, 0.472648, 0.486816, 0.436733, 0.466272, 0.485442, 0.476928, ]
Transfer Learning
-----------------
8x R_uay10648965rnn_mlp_bal_rmsprop - 0.529475 (95% [0.514200, 0.544750]):
11215958.arien.ics.muni.cz.R_uay10648965rnn_mlp_bal_rmsprop etc.
[0.522882, 0.545980, 0.541742, 0.506562, 0.562878, 0.512001, 0.530328, 0.513430, ]
16x R_uay10648965rnn_preBM25P20_mlp - 0.485414 (95% [0.476747, 0.494080]):
11214536.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20_mlp etc.
[0.495352, 0.468777, 0.480837, 0.480265, 0.464107, 0.476667, 0.483804, 0.510484, 0.516230, 0.457343, 0.474302, 0.493713, 0.509364, 0.483478, 0.494318, 0.477576, ]
16x R_uay10648965rnn_preBM25P20 - 0.408626 (95% [0.378800, 0.438452]):
11214592.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20 etc.
[0.449880, 0.359314, 0.340186, 0.323530, 0.450352, 0.446520, 0.470018, 0.454720, 0.427594, 0.453920, 0.401562, 0.483681, 0.402855, 0.435852, 0.324844, 0.313184, ]
16x R_uay10648965rnn_preBM25P20_bal_rmsprop - 0.388771 (95% [0.357806, 0.419735]):
11215654.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20_bal_rmsprop etc.
[0.453258, 0.268581, 0.401399, 0.302827, 0.412854, 0.462879, 0.433766, 0.345730, 0.370663, 0.365843, 0.405426, 0.458271, 0.463735, 0.346361, 0.405338, 0.323397, ]
16x R_uay10648965rnn_preBM25P20mlp_bal_rmsprop - 0.454084 (95% [0.447206, 0.460961]):
11215655.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20mlp_bal_rmsprop etc.
[0.449616, 0.470698, 0.451049, 0.450397, 0.439677, 0.455919, 0.440617, 0.444572, 0.450119, 0.442915, 0.466197, 0.462505, 0.483581, 0.441397, 0.443150, 0.472930, ]
16x R_uay10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 - 0.475303 (95% [0.466186, 0.484420]):
11225691.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 etc. [0.448802, 0.463850, 0.475736, 0.507712, 0.470277, 0.479006, 0.475616, 0.494298, 0.453132, 0.494382, 0.489861, 0.492906, 0.458863, 0.485691, 0.457334, 0.457378, ]
16x R_ual10648965rnn_preBM25P20mlp_bal_rmsprop - 0.451539 (95% [0.449595, 0.453483]):
11225687.arien.ics.muni.cz.R_ual10648965rnn_preBM25P20mlp_bal_rmsprop etc. [0.457980, 0.455706, 0.454605, 0.455440, 0.445675, 0.447934, 0.450084, 0.452073, 0.454488, 0.450616, 0.449980, 0.449009, 0.451489, 0.449900, 0.454953, 0.444689, ]
17x R_ual10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 - 0.484058 (95% [0.475866, 0.492250]):
11229119.arien.ics.muni.cz.R_ual10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 etc. [0.496242, 0.446991, 0.487968, 0.487219, 0.488113, 0.500758, 0.490518, 0.487409, 0.493103, 0.490052, 0.455980, 0.491550, 0.494241, 0.490836, 0.490761, 0.488457, 0.448789, ]
ef=1 seems interesting b/c of the small number of pairs now
16x R_uay10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 - 0.473516 (95% [0.466476, 0.480557]):
11236486.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 etc. [0.462663, 0.463296, 0.480945, 0.446515, 0.459608, 0.492760, 0.495482, 0.469041, 0.474507, 0.461925, 0.486585, 0.460019, 0.478681, 0.483390, 0.482902, 0.477942, ]
16x R_uay10648965rnn_preBM25P20dot_bal_rmsprop_ef1 - 0.436844 (95% [0.411885, 0.461803]):
11236487.arien.ics.muni.cz.R_uay10648965rnn_preBM25P20dot_bal_rmsprop_ef1 etc. [0.452369, 0.373691, 0.401155, 0.373203, 0.468152, 0.491609, 0.391696, 0.446588, 0.457276, 0.496784, 0.454121, 0.476732, 0.391180, 0.458268, 0.501988, 0.354688, ]
16x R_uaw10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 - 0.852837 (95% [0.851152, 0.854522]):
11236488.arien.ics.muni.cz.R_uaw10648965rnn_preBM25P20mlp_bal_rmsprop_ef1 etc. [0.851282, 0.852198, 0.849145, 0.855769, 0.850275, 0.853846, 0.853480, 0.852198, 0.853846, 0.857051, 0.853205, 0.858974, 0.856410, 0.849359, 0.852198, 0.846154, ]
11x R_uaw10648965rnn_preBM25P20dot_bal_rmsprop_ef1 - 0.837014 (95% [0.798067, 0.875961]):
11236491.arien.ics.muni.cz.R_uaw10648965rnn_preBM25P20dot_bal_rmsprop_ef1 etc. [0.849359, 0.882051, 0.854359, 0.807839, 0.894872, 0.888462, 0.853077, 0.894872, 0.764103, 0.816417, 0.701747, ]