Commit

cand. gen. with stemmed emb. + s-gram cos. (data)
Lenz Furrer committed Jul 31, 2018
1 parent 96785f9 commit 3d106d7
Showing 3 changed files with 38 additions and 50 deletions.
3 changes: 2 additions & 1 deletion config
@@ -14,7 +14,8 @@ prediction_fn = ${rootpath}/runs/predictions/${timestamp}.tsv
 detailed_fn = ${rootpath}/runs/detailed/${timestamp}.{}.tsv

 [candidates]
-generator = PhraseVecFixedSet(10, "mean", "emb_stem")
+generator = SGramCosine(.5, 10)
+    PhraseVecFixedSet(10, "mean", "emb_stem")
 oracle = none
 workers = 0
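
The new `generator` value lists two generators. As a rough illustration of what an s-gram cosine generator might do, the sketch below compares character skip-gram count vectors by cosine similarity and keeps the best-scoring names. Everything in it is an assumption rather than the repository's code: the function names are hypothetical, and reading `.5` as a similarity threshold and `10` as a candidate limit is a guess, as is the choice of skip sizes 0 and 1.

```python
from collections import Counter
from math import sqrt


def sgrams(text, skips=(0, 1)):
    """Count character pairs that are `skip` characters apart (assumed s-gram definition)."""
    padded = f"#{text.lower()}#"  # mark word boundaries
    counts = Counter()
    for skip in skips:
        step = skip + 1
        for i in range(len(padded) - step):
            counts[padded[i] + padded[i + step]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(n * b[gram] for gram, n in a.items())
    norm = sqrt(sum(n * n for n in a.values())) * sqrt(sum(n * n for n in b.values()))
    return dot / norm if norm else 0.0


def candidates(mention, names, threshold=0.5, limit=10):
    """Return up to `limit` names whose similarity to `mention` reaches `threshold`."""
    query = sgrams(mention)
    scored = [(cosine(query, sgrams(name)), name) for name in names]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [name for _, name in kept[:limit]]


result = candidates("headache", ["fever", "headaches", "head ache"])
```

Under these assumptions, a surface-similar name such as `headaches` ranks first for the mention `headache`, while an unrelated name such as `fever` falls below the threshold and is dropped.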

79 changes: 33 additions & 46 deletions log
@@ -1,46 +1,33 @@
-2018-07-31 07:38:07,635 - 'pattern' package not found; tag filters are not available for English
-2018-07-31 07:38:07,641 - loading terminology...
-2018-07-31 07:38:07,856 - loading pretrained embeddings...
-2018-07-31 07:38:07,857 - loading projection weights from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
-2018-07-31 07:38:53,703 - loaded (2231686, 200) matrix from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
-2018-07-31 07:38:57,688 - loading vectorizer...
-2018-07-31 07:38:57,688 - loading candidate generator...
-2018-07-31 07:38:57,689 - loading pretrained embeddings...
-2018-07-31 07:38:57,689 - loading projection weights from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
-2018-07-31 07:39:43,628 - loaded (2231686, 200) matrix from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
-2018-07-31 07:41:06,475 - loading vectorizer...
-2018-07-31 07:41:17,449 - preprocessing validation data...
-2018-07-31 07:41:17,449 - loading corpus...
-2018-07-31 07:41:17,459 - generating candidates with 0 workers...
-2018-07-31 07:41:19,905 - generated 3680 pair-wise samples (7870 with duplicates)
-2018-07-31 07:41:19,907 - compiling model architecture...
-2018-07-31 07:41:31,438 - preprocessing training data...
-2018-07-31 07:41:31,438 - loading corpus...
-2018-07-31 07:41:31,486 - generating candidates with 0 workers...
-2018-07-31 07:41:43,159 - generated 17100 pair-wise samples (51450 with duplicates)
-2018-07-31 07:41:43,165 - training CNN...
-2018-07-31 07:42:06,823 - Ranking accuracy: 0.632783
-2018-07-31 07:42:33,161 - Ranking accuracy: 0.672173
-2018-07-31 07:42:59,333 - Ranking accuracy: 0.695044
-2018-07-31 07:43:25,963 - Ranking accuracy: 0.70521
-2018-07-31 07:43:52,621 - Ranking accuracy: 0.700127
-2018-07-31 07:44:15,388 - Ranking accuracy: 0.736976
-2018-07-31 07:44:41,879 - Ranking accuracy: 0.7446
-2018-07-31 07:45:09,455 - Ranking accuracy: 0.750953
-2018-07-31 07:45:36,947 - Ranking accuracy: 0.747141
-2018-07-31 07:46:00,250 - Ranking accuracy: 0.756036
-2018-07-31 07:46:26,797 - Ranking accuracy: 0.759848
-2018-07-31 07:46:53,455 - Ranking accuracy: 0.767471
-2018-07-31 07:47:19,997 - Ranking accuracy: 0.770013
-2018-07-31 07:47:46,816 - Ranking accuracy: 0.775095
-2018-07-31 07:48:13,728 - Ranking accuracy: 0.775095
-2018-07-31 07:48:36,227 - Ranking accuracy: 0.776366
-2018-07-31 07:49:02,801 - Ranking accuracy: 0.782719
-2018-07-31 07:49:29,332 - Ranking accuracy: 0.778907
-2018-07-31 07:49:52,294 - Ranking accuracy: 0.778907
-2018-07-31 07:49:52,295 - Epoch 00019: early stopping
-2018-07-31 07:49:52,295 - done training.
-2018-07-31 07:49:52,301 - load best model...
-2018-07-31 07:50:04,107 - predict scores for validation data...
-2018-07-31 07:50:05,411 - evaluate and/or serialize...
-2018-07-31 07:50:05,439 - done.
+2018-07-31 08:07:42,706 - 'pattern' package not found; tag filters are not available for English
+2018-07-31 08:07:42,713 - loading terminology...
+2018-07-31 08:07:42,922 - loading pretrained embeddings...
+2018-07-31 08:07:42,922 - loading projection weights from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
+2018-07-31 08:08:28,042 - loaded (2231686, 200) matrix from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
+2018-07-31 08:08:32,130 - loading vectorizer...
+2018-07-31 08:08:32,131 - loading candidate generator...
+2018-07-31 08:08:43,654 - loading pretrained embeddings...
+2018-07-31 08:08:43,654 - loading projection weights from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
+2018-07-31 08:09:29,441 - loaded (2231686, 200) matrix from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.bin
+2018-07-31 08:10:53,552 - loading vectorizer...
+2018-07-31 08:11:04,731 - preprocessing validation data...
+2018-07-31 08:11:04,731 - loading corpus...
+2018-07-31 08:11:04,741 - generating candidates with 0 workers...
+2018-07-31 08:11:09,634 - generated 5578 pair-wise samples (11853 with duplicates)
+2018-07-31 08:11:09,636 - compiling model architecture...
+2018-07-31 08:11:21,755 - preprocessing training data...
+2018-07-31 08:11:21,755 - loading corpus...
+2018-07-31 08:11:21,803 - generating candidates with 0 workers...
+2018-07-31 08:11:44,819 - generated 26131 pair-wise samples (77122 with duplicates)
+2018-07-31 08:11:44,828 - training CNN...
+2018-07-31 08:12:20,410 - Ranking accuracy: 0.757306
+2018-07-31 08:12:59,131 - Ranking accuracy: 0.776366
+2018-07-31 08:13:37,762 - Ranking accuracy: 0.767471
+2018-07-31 08:14:12,247 - Ranking accuracy: 0.778907
+2018-07-31 08:14:50,669 - Ranking accuracy: 0.776366
+2018-07-31 08:15:26,574 - Ranking accuracy: 0.777637
+2018-07-31 08:15:26,574 - Epoch 00006: early stopping
+2018-07-31 08:15:26,575 - done training.
+2018-07-31 08:15:26,582 - load best model...
+2018-07-31 08:15:40,322 - predict scores for validation data...
+2018-07-31 08:15:42,189 - evaluate and/or serialize...
+2018-07-31 08:15:42,224 - done.
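
To compare the two training runs without reading every line, a small helper can pull the `Ranking accuracy` values out of a log and report where the best one occurred. This is a minimal sketch, not part of the repository: the function name is hypothetical, and it assumes one accuracy line is logged per epoch, which matches the line counts and the `Epoch 000NN: early stopping` messages above.

```python
import re

# Matches the accuracy lines regardless of timestamp or diff prefix.
ACCURACY = re.compile(r"Ranking accuracy: ([0-9.]+)")


def best_epoch(log_lines):
    """Return (1-based epoch of best accuracy, best accuracy, epochs run), or None."""
    scores = []
    for line in log_lines:
        match = ACCURACY.search(line)
        if match:
            scores.append(float(match.group(1)))
    if not scores:
        return None
    best = max(scores)
    return scores.index(best) + 1, best, len(scores)


# Accuracy lines of the newer run, copied from the log above.
new_run = [
    "2018-07-31 08:12:20,410 - Ranking accuracy: 0.757306",
    "2018-07-31 08:12:59,131 - Ranking accuracy: 0.776366",
    "2018-07-31 08:13:37,762 - Ranking accuracy: 0.767471",
    "2018-07-31 08:14:12,247 - Ranking accuracy: 0.778907",
    "2018-07-31 08:14:50,669 - Ranking accuracy: 0.776366",
    "2018-07-31 08:15:26,574 - Ranking accuracy: 0.777637",
]
summary = best_epoch(new_run)
```

For the newer run, `summary` is `(4, 0.778907, 6)`: the best score came at the fourth epoch, and training stopped two epochs later.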
6 changes: 3 additions & 3 deletions results
@@ -1,6 +1,6 @@
-accuracy 0.7827191867852605
-correct 616
+accuracy 0.7789072426937739
+correct 613
 total 787
-unreachable 133
+unreachable 104
 nocandidates 0
 ambiguous 2
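
The figures in `results` can be related to each other with a little arithmetic. The sketch below recomputes `accuracy` as `correct / total` and derives two further ratios. It assumes that `unreachable` counts mentions whose correct entry is missing from the generated candidates, so that `total - unreachable` bounds the number of mentions the ranker could get right. That reading, and the helper name, are assumptions and are not confirmed by the commit.

```python
def summarize(correct, total, unreachable):
    """Relate the counts from a results file (interpretation of `unreachable` is assumed)."""
    reachable = total - unreachable
    return {
        "accuracy": correct / total,                   # as reported in the file
        "upper_bound": reachable / total,              # best accuracy the ranker could reach
        "accuracy_on_reachable": correct / reachable,  # ranker quality on reachable mentions
    }


before = summarize(correct=616, total=787, unreachable=133)
after = summarize(correct=613, total=787, unreachable=104)

for label, stats in (("before", before), ("after", after)):
    print(label, {name: round(value, 4) for name, value in stats.items()})
```

On this reading, the combined generator raises the upper bound (fewer unreachable mentions), while overall accuracy drops slightly because the ranker picks the correct entry less often among the reachable ones.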
