use subword-unit embeddings only (data)
Lenz Furrer committed Jun 20, 2018
1 parent ebd271a commit 376b1e1
Showing 3 changed files with 44 additions and 41 deletions.
4 changes: 2 additions & 2 deletions config
@@ -1,6 +1,6 @@
 [DEFAULT]
-timestamp = 20180620-102115
 rootpath = /home/lenz/disease-normalization
+timestamp = 20180620-102655
 workers = 0
 
 [general]
@@ -39,7 +39,7 @@ embedding_fn = ${rootpath}/data/embeddings/bpe_vectors_10000_50_w2v.txt
 trainable = False
 
 [rank]
-embeddings = ["emb", "emb_sub"]
+embeddings = ["emb_sub"]
 n_kernels = 50
 filter_width = 3
 activation = tanh
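
To illustrate what the `[rank]` change means for code that reads this file, here is a minimal sketch using Python's standard `configparser`. It assumes the config uses extended interpolation (suggested by the `${rootpath}` syntax) and that the `embeddings` value is parsed as a JSON list. The section name `[emb_sub]` is hypothetical, since the diff shows only the options and not the section header, and none of this is taken from the project's actual loading code.

```python
import json
from configparser import ConfigParser, ExtendedInterpolation

# Excerpt modelled on the diff above. The section name [emb_sub] is an
# assumption: the diff shows the options but not the section header.
SAMPLE_CONFIG = """
[DEFAULT]
rootpath = /home/lenz/disease-normalization

[emb_sub]
embedding_fn = ${rootpath}/data/embeddings/bpe_vectors_10000_50_w2v.txt
trainable = False

[rank]
embeddings = ["emb_sub"]
n_kernels = 50
filter_width = 3
activation = tanh
"""

# ExtendedInterpolation resolves ${rootpath} from the [DEFAULT] section.
parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(SAMPLE_CONFIG)

rank = parser["rank"]
# The list of embedding names is valid JSON, so json.loads can decode it.
embeddings = json.loads(rank["embeddings"])
n_kernels = rank.getint("n_kernels")

# Look up the vector file for each embedding that the ranker will use.
paths = {name: parser[name]["embedding_fn"] for name in embeddings}
print(embeddings, n_kernels, paths)
```

With `"emb"` dropped from the list, only the subword-unit (BPE) vector file is looked up, which is consistent with the new log no longer loading `wvec_50_haodi-li-et-al.bin`.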
75 changes: 39 additions & 36 deletions log
@@ -1,38 +1,41 @@
-2018-06-20 10:21:17,417 - The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
+2018-06-20 10:26:56,560 - The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
  https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29
 
-2018-06-20 10:21:21,003 - 'pattern' package not found; tag filters are not available for English
-2018-06-20 10:21:21,011 - loading terminology...
-2018-06-20 10:21:21,341 - loading pretrained embeddings...
-2018-06-20 10:21:21,342 - loading projection weights from /home/lenz/disease-normalization/data/embeddings/wvec_50_haodi-li-et-al.bin
-2018-06-20 10:21:26,664 - loaded (309058, 50) matrix from /home/lenz/disease-normalization/data/embeddings/wvec_50_haodi-li-et-al.bin
-2018-06-20 10:21:27,131 - loading vectorizer...
-2018-06-20 10:21:27,131 - loading pretrained embeddings...
-2018-06-20 10:21:27,132 - loading projection weights from /home/lenz/disease-normalization/data/embeddings/bpe_vectors_10000_50_w2v.txt
-2018-06-20 10:21:28,197 - loaded (10257, 50) matrix from /home/lenz/disease-normalization/data/embeddings/bpe_vectors_10000_50_w2v.txt
-2018-06-20 10:21:28,209 - loading vectorizer...
-2018-06-20 10:21:28,279 - loading candidate generator...
-2018-06-20 10:21:44,536 - preprocessing validation data...
-2018-06-20 10:21:44,536 - loading corpus...
-2018-06-20 10:21:44,545 - generating candidates with 0 workers...
-2018-06-20 10:21:48,033 - generated 5671 pair-wise samples (11585 with duplicates)
-2018-06-20 10:21:48,035 - compiling model architecture...
-2018-06-20 10:21:49,772 - preprocessing training data...
-2018-06-20 10:21:49,772 - loading corpus...
-2018-06-20 10:21:49,838 - generating candidates with 0 workers...
-2018-06-20 10:22:12,981 - generated 26308 pair-wise samples (71125 with duplicates)
-2018-06-20 10:22:12,991 - training CNN...
-2018-06-20 10:22:45,777 - Ranking accuracy: 0.603558
-2018-06-20 10:22:55,188 - Ranking accuracy: 0.659466
-2018-06-20 10:23:04,362 - Ranking accuracy: 0.674714
-2018-06-20 10:23:14,144 - Ranking accuracy: 0.675985
-2018-06-20 10:23:23,346 - Ranking accuracy: 0.68615
-2018-06-20 10:23:32,105 - Ranking accuracy: 0.697586
-2018-06-20 10:23:41,308 - Ranking accuracy: 0.707751
-2018-06-20 10:23:50,760 - Ranking accuracy: 0.711563
-2018-06-20 10:23:59,715 - Ranking accuracy: 0.730623
-2018-06-20 10:24:08,834 - Ranking accuracy: 0.739517
-2018-06-20 10:24:18,445 - Ranking accuracy: 0.733164
-2018-06-20 10:24:27,860 - Ranking accuracy: 0.731893
-2018-06-20 10:24:27,860 - Epoch 00012: early stopping
-2018-06-20 10:24:27,861 - done training.
+2018-06-20 10:26:59,651 - 'pattern' package not found; tag filters are not available for English
+2018-06-20 10:26:59,659 - loading terminology...
+2018-06-20 10:26:59,972 - loading pretrained embeddings...
+2018-06-20 10:26:59,973 - loading projection weights from /home/lenz/disease-normalization/data/embeddings/bpe_vectors_10000_50_w2v.txt
+2018-06-20 10:27:00,835 - loaded (10257, 50) matrix from /home/lenz/disease-normalization/data/embeddings/bpe_vectors_10000_50_w2v.txt
+2018-06-20 10:27:00,846 - loading vectorizer...
+2018-06-20 10:27:00,957 - loading candidate generator...
+2018-06-20 10:27:16,208 - preprocessing validation data...
+2018-06-20 10:27:16,209 - loading corpus...
+2018-06-20 10:27:16,224 - generating candidates with 0 workers...
+2018-06-20 10:27:20,309 - generated 5671 pair-wise samples (11585 with duplicates)
+2018-06-20 10:27:20,311 - compiling model architecture...
+2018-06-20 10:27:21,473 - preprocessing training data...
+2018-06-20 10:27:21,473 - loading corpus...
+2018-06-20 10:27:21,667 - generating candidates with 0 workers...
+2018-06-20 10:27:42,750 - generated 26308 pair-wise samples (71125 with duplicates)
+2018-06-20 10:27:42,762 - training CNN...
+2018-06-20 10:28:00,728 - Ranking accuracy: 0.593393
+2018-06-20 10:28:07,438 - Ranking accuracy: 0.6277
+2018-06-20 10:28:13,852 - Ranking accuracy: 0.631512
+2018-06-20 10:28:20,425 - Ranking accuracy: 0.635324
+2018-06-20 10:28:27,022 - Ranking accuracy: 0.655654
+2018-06-20 10:28:33,199 - Ranking accuracy: 0.674714
+2018-06-20 10:28:39,943 - Ranking accuracy: 0.682338
+2018-06-20 10:28:46,831 - Ranking accuracy: 0.710292
+2018-06-20 10:28:53,180 - Ranking accuracy: 0.722999
+2018-06-20 10:28:59,295 - Ranking accuracy: 0.736976
+2018-06-20 10:29:06,193 - Ranking accuracy: 0.739517
+2018-06-20 10:29:12,656 - Ranking accuracy: 0.740788
+2018-06-20 10:29:19,170 - Ranking accuracy: 0.749682
+2018-06-20 10:29:25,554 - Ranking accuracy: 0.749682
+2018-06-20 10:29:31,515 - Ranking accuracy: 0.753494
+2018-06-20 10:29:38,366 - Ranking accuracy: 0.752224
+2018-06-20 10:29:45,021 - Ranking accuracy: 0.756036
+2018-06-20 10:29:51,503 - Ranking accuracy: 0.753494
+2018-06-20 10:29:57,791 - Ranking accuracy: 0.754765
+2018-06-20 10:29:57,791 - Epoch 00019: early stopping
+2018-06-20 10:29:57,792 - done training.
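
The per-epoch accuracies in a log like this can be summarized with a small script. The following is a sketch only: the function name `summarize_run` and the returned keys are hypothetical, and it assumes one `Ranking accuracy` line per epoch, which matches the epoch numbers in the `early stopping` lines above. The sample is the accuracy portion of the earlier (removed) run.

```python
import re

ACC_RE = re.compile(r"Ranking accuracy: ([0-9.]+)")
STOP_RE = re.compile(r"Epoch (\d+): early stopping")


def summarize_run(log_text):
    """Return best accuracy, its 1-based epoch, and the early-stopping epoch."""
    accuracies = [float(m.group(1)) for m in ACC_RE.finditer(log_text)]
    if not accuracies:
        raise ValueError("no 'Ranking accuracy' lines found")
    best_acc = max(accuracies)
    best_epoch = accuracies.index(best_acc) + 1  # first epoch reaching the maximum
    stop = STOP_RE.search(log_text)
    stopped_at = int(stop.group(1)) if stop else None
    return {"best_acc": best_acc, "best_epoch": best_epoch, "stopped_at": stopped_at}


# Accuracy lines of the earlier run, copied from the log.
OLD_RUN_LOG = """\
2018-06-20 10:22:45,777 - Ranking accuracy: 0.603558
2018-06-20 10:22:55,188 - Ranking accuracy: 0.659466
2018-06-20 10:23:04,362 - Ranking accuracy: 0.674714
2018-06-20 10:23:14,144 - Ranking accuracy: 0.675985
2018-06-20 10:23:23,346 - Ranking accuracy: 0.68615
2018-06-20 10:23:32,105 - Ranking accuracy: 0.697586
2018-06-20 10:23:41,308 - Ranking accuracy: 0.707751
2018-06-20 10:23:50,760 - Ranking accuracy: 0.711563
2018-06-20 10:23:59,715 - Ranking accuracy: 0.730623
2018-06-20 10:24:08,834 - Ranking accuracy: 0.739517
2018-06-20 10:24:18,445 - Ranking accuracy: 0.733164
2018-06-20 10:24:27,860 - Ranking accuracy: 0.731893
2018-06-20 10:24:27,860 - Epoch 00012: early stopping
"""

summary = summarize_run(OLD_RUN_LOG)
print(summary)
```

For the earlier run the best accuracy falls on epoch 10 and training stops at epoch 12. Read the same way, the new run peaks at epoch 17 (0.756036) and stops at epoch 19, so in both runs training ends two epochs after the best validation score.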
6 changes: 3 additions & 3 deletions results
@@ -1,6 +1,6 @@
-accuracy 0.7395171537484116
-correct 582
+accuracy 0.7560355781448539
+correct 595
 total 787
 unreachable 129
 nocandidates 10
-ambiguous 6
+ambiguous 2
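
As a quick check of the numbers in this file, the reported accuracy is the ratio `correct / total`. The sketch below recomputes it for both versions; the helper name `accuracy` is hypothetical and not taken from the project.

```python
def accuracy(correct, total):
    """Fraction of correctly ranked items."""
    return correct / total


TOTAL = 787
old_acc = accuracy(582, TOTAL)  # word + subword-unit embeddings
new_acc = accuracy(595, TOTAL)  # subword-unit embeddings only

extra_correct = 595 - 582
gain_points = (new_acc - old_acc) * 100  # difference in percentage points

print(f"old={old_acc:.6f} new={new_acc:.6f} "
      f"extra_correct={extra_correct} gain={gain_points:.2f} points")
```

Dropping the word-level embeddings yields 13 more correct items out of 787, a gain of about 1.65 percentage points. The two accuracies also match the best `Ranking accuracy` values in the corresponding logs (0.739517 and 0.756036).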
