1605EightGrade
All these experiments are done with the new BV_EP100 vocabulary mode as considered in 1605BigVocab.
We have three variants: r8c for ck12 memory snippet sources, r8e for enwiki memory snippet sources, and r8 which has the two merged together.
(Acc is AbcdAccuracy.)
r8c:
| Model | trn Acc | val Acc | val MRR | tst Acc | tst MRR | settings |
|---|---|---|---|---|---|---|
| avg | 0.290301 | 0.290850 | 0.638889 | 0.293686 | 0.592443 | (defaults) |
| | ±0.033995 | ±0.024967 | ±0.019934 | ±0.011219 | ±0.011900 | |
| DAN | 0.245902 | 0.303922 | 0.408565 | 0.287812 | 0.427867 | `inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu'` |
| | ±0.013140 | ±0.030866 | ±0.057874 | ±0.015715 | ±0.046143 | |
| -------------------------- | ---------- | ---------- | ---------- | ---------- | ----------- | ---------- |
| rnn | 0.722678 | 0.424837 | 0.769676 | 0.390602 | 0.680556 | (defaults) |
| | ±0.156078 | ±0.049935 | ±0.041505 | ±0.033515 | ±0.028432 | |
| cnn | 0.300546 | 0.271242 | 0.618056 | 0.284141 | 0.590576 | (defaults) |
| | ±0.040626 | ±0.030094 | ±0.028330 | ±0.021643 | ±0.011880 | |
| rnncnn | 0.353825 | 0.323529 | 0.668981 | 0.298091 | 0.608498 | (defaults) |
| | ±0.092234 | ±0.044053 | ±0.043229 | ±0.021297 | ±0.019228 | |
| attn1511 | 0.477459 | 0.339869 | 0.687114 | 0.326725 | 0.630003 | (defaults) |
| | ±0.193444 | ±0.038801 | ±0.029131 | ±0.037975 | ±0.030858 | |

r8e:
| Model | trn Acc | val Acc | val MRR | tst Acc | tst MRR | settings |
|---|---|---|---|---|---|---|
| avg | 0.280330 | 0.348958 | 0.636616 | 0.266049 | 0.547683 | (defaults) |
| | ±0.044841 | ±0.037472 | ±0.028759 | ±0.018288 | ±0.009624 | |
| DAN | 0.244994 | 0.296875 | 0.534848 | 0.266049 | 0.513158 | `inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu'` |
| | ±0.027805 | ±0.025047 | ±0.082815 | ±0.008811 | ±0.056038 | |
| -------------------------- | ---------- | ---------- | ---------- | ---------- | ----------- | ---------- |
| rnn | 0.463486 | 0.361979 | 0.642424 | 0.295062 | 0.575405 | (defaults) |
| | ±0.193759 | ±0.029051 | ±0.018036 | ±0.025256 | ±0.018475 | |
| cnn | 0.219670 | 0.299479 | 0.596212 | 0.261111 | 0.548752 | (defaults) |
| | ±0.014743 | ±0.033359 | ±0.022407 | ±0.013417 | ±0.009095 | |
| rnncnn | 0.316254 | 0.335938 | 0.618182 | 0.258025 | 0.547402 | (defaults) |
| | ±0.068402 | ±0.024596 | ±0.010901 | ±0.014658 | ±0.018273 | |
| attn1511 | 0.397527 | 0.390625 | 0.664141 | 0.236420 | 0.532276 | (defaults) |
| | ±0.069729 | ±0.042338 | ±0.027718 | ±0.028422 | ±0.022669 | |

r8:
| Model | trn Acc | val Acc | val MRR | tst Acc | tst MRR | settings |
|---|---|---|---|---|---|---|
| avg | 0.261484 | 0.315104 | 0.594303 | 0.275309 | 0.537929 | (defaults) |
| | ±0.040622 | ±0.029051 | ±0.018859 | ±0.027908 | ±0.019492 | |
| DAN | 0.290342 | 0.302083 | 0.575893 | 0.289506 | 0.542025 | `inp_e_dropout=0 inp_w_dropout=1/3 deep=2 pact='relu'` |
| | ±0.017316 | ±0.024444 | ±0.016973 | ±0.010625 | ±0.009295 | |
| -------------------------- | ---------- | ---------- | ---------- | ---------- | ----------- | ---------- |
| rnn | 0.392815 | 0.341146 | 0.611529 | 0.300617 | 0.558611 | (defaults) |
| | ±0.165470 | ±0.025782 | ±0.035079 | ±0.014786 | ±0.010929 | |
| cnn | 0.213781 | 0.328125 | 0.568729 | 0.267284 | 0.515060 | (defaults) |
| | ±0.009753 | ±0.051853 | ±0.039688 | ±0.034358 | ±0.066781 | |
| rnncnn | 0.285630 | 0.351562 | 0.620571 | 0.273457 | 0.546830 | (defaults) |
| | ±0.041758 | ±0.045155 | ±0.037842 | ±0.019228 | ±0.011815 | |
| attn1511 | 0.476443 | 0.359375 | 0.619522 | 0.293210 | 0.551151 | (defaults) |
| | ±0.187898 | ±0.035423 | ±0.027051 | ±0.044679 | ±0.024191 | |

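r8c appears to be the best dataset to use: on test accuracy, r8e and the merged r8 come out lower for nearly every model, so neither enwiki snippets alone nor the combination of both sources seems to help.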
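
As a quick sanity check of that reading, the test accuracies from the three tables can be compared per model. This is only a sketch: the numbers are copied from the `tst Acc` columns above, while the dictionary layout and the helper name `best_dataset_per_model` are made up for illustration. It ignores the ± ranges, so small differences (such as DAN on r8c vs. r8) should not be over-interpreted.

```python
# Test accuracy ("tst Acc") per model, copied from the r8c, r8e and r8 tables.
TST_ACC = {
    "avg":      {"r8c": 0.293686, "r8e": 0.266049, "r8": 0.275309},
    "DAN":      {"r8c": 0.287812, "r8e": 0.266049, "r8": 0.289506},
    "rnn":      {"r8c": 0.390602, "r8e": 0.295062, "r8": 0.300617},
    "cnn":      {"r8c": 0.284141, "r8e": 0.261111, "r8": 0.267284},
    "rnncnn":   {"r8c": 0.298091, "r8e": 0.258025, "r8": 0.273457},
    "attn1511": {"r8c": 0.326725, "r8e": 0.236420, "r8": 0.293210},
}


def best_dataset_per_model(table):
    """Hypothetical helper: map each model to the dataset with the highest score."""
    return {model: max(scores, key=scores.get) for model, scores in table.items()}


winners = best_dataset_per_model(TST_ACC)
for model, dataset in winners.items():
    print(f"{model:10s} best on {dataset}")
```
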
6x R_r8_2avgBV_EP100_L1e-4 - 0.463542 (95% [0.434107, 0.492976]):
11288548.arien.ics.muni.cz.R_r8_2avgBV_EP100_L1e-4 etc.
[0.468750, 0.468750, 0.468750, 0.468750, 0.500000, 0.406250, ]
6x R_r8_2danBV_EP100_L1e-4 - 0.460938 (95% [0.429898, 0.491977]):
11288549.arien.ics.muni.cz.R_r8_2danBV_EP100_L1e-4 etc.
[0.453125, 0.500000, 0.468750, 0.484375, 0.453125, 0.406250, ]
6x R_r8_2rnnBV_EP100_L1e-4 - 0.354167 (95% [0.326296, 0.382037]):
11288550.arien.ics.muni.cz.R_r8_2rnnBV_EP100_L1e-4 etc.
[0.312500, 0.328125, 0.359375, 0.390625, 0.359375, 0.375000, ]
5x R_r8_2cnnBV_EP100_L1e-4 - 0.400000 (95% [0.304481, 0.495519]):
11288551.arien.ics.muni.cz.R_r8_2cnnBV_EP100_L1e-4 etc.
[0.406250, 0.359375, 0.406250, 0.296875, 0.531250, ]
6x R_r8_2rnncnnBV_EP100_L1e-4 - 0.395833 (95% [0.319117, 0.472550]):
11288552.arien.ics.muni.cz.R_r8_2rnncnnBV_EP100_L1e-4 etc.
[0.484375, 0.406250, 0.437500, 0.421875, 0.375000, 0.250000, ]
6x R_r8_2a51BV_EP100_L1e-4 - 0.361979 (95% [0.324808, 0.399151]):
11288553.arien.ics.muni.cz.R_r8_2a51BV_EP100_L1e-4 etc.
[0.390625, 0.312500, 0.406250, 0.328125, 0.390625, 0.343750, ]
Seems like a good idea!
6x R_r8_2cnnBV_EP100_L1e-4_c121212 - 0.403646 (95% [0.356866, 0.450426]):
11288556.arien.ics.muni.cz.R_r8_2cnnBV_EP100_L1e-4_c121212 etc.
[0.406250, 0.312500, 0.453125, 0.437500, 0.406250, 0.406250, ]
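
The mean and 95% interval in each run summary above can be reproduced from the listed per-run values. The sketch below assumes a Student-t interval built from the population standard deviation (ddof=0) divided by sqrt(n); this is inferred from the reported numbers, not taken from the evaluation script itself. The function name `mean_with_ci95` and the small critical-value table are illustrative assumptions.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, indexed by degrees of freedom
# (standard table values, rounded to four decimals).
T_CRIT_95 = {1: 12.7062, 2: 4.3027, 3: 3.1824, 4: 2.7764, 5: 2.5706,
             6: 2.4469, 7: 2.3646, 8: 2.3060, 9: 2.2622, 10: 2.2281}


def mean_with_ci95(values):
    """Return (mean, low, high) for a small sample of run accuracies.

    Assumption: the interval half-width is t * pstdev / sqrt(n), i.e. it uses
    the population standard deviation, which is what matches the numbers above.
    """
    n = len(values)
    mean = statistics.fmean(values)
    half_width = T_CRIT_95[n - 1] * statistics.pstdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Per-run values copied from the R_r8_2avgBV_EP100_L1e-4 (6 runs)
# and R_r8_2cnnBV_EP100_L1e-4 (5 runs) entries.
avg_runs = [0.468750, 0.468750, 0.468750, 0.468750, 0.500000, 0.406250]
cnn_runs = [0.406250, 0.359375, 0.406250, 0.296875, 0.531250]

avg_ci = mean_with_ci95(avg_runs)
cnn_ci = mean_with_ci95(cnn_runs)
print("avg: %.6f (95%% [%.6f, %.6f])" % avg_ci)
print("cnn: %.6f (95%% [%.6f, %.6f])" % cnn_ci)
```

With only five or six runs per configuration the intervals are wide, so overlapping intervals (for example avg vs. DAN) should be read as "no clear difference" rather than as a ranking.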