Commit
add an input node for token overlap (data)
Lenz Furrer committed Jul 31, 2018
1 parent c960296 commit b8252c1
Showing 3 changed files with 46 additions and 41 deletions.
2 changes: 1 addition & 1 deletion config
@@ -41,7 +41,7 @@ embedding_fn = ${rootpath}/data/embeddings/bpe_vectors_10000_50_w2v.txt
trainable = False

[rank]
-embeddings = ["emb"]
+embeddings = ["emb", "emb_sub"]
n_kernels = 50
filter_width = 3
activation = tanh
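
The `[rank]` section now lists two embedding inputs. As a minimal sketch of how such a section could be read, the snippet below parses it with the standard-library `configparser` and decodes the JSON-style list with `json`. The function name `read_rank_section` is hypothetical, and the use of `ExtendedInterpolation` is an assumption based on the `${rootpath}` syntax seen in the config; the project's actual loader is not shown in this commit.

```python
import configparser
import json

# Excerpt of the [rank] section as it stands after this commit.
CONFIG_TEXT = """
[rank]
embeddings = ["emb", "emb_sub"]
n_kernels = 50
filter_width = 3
activation = tanh
"""


def read_rank_section(text):
    """Parse the [rank] section into typed Python values (hypothetical helper)."""
    # Assumption: ${...} references elsewhere in the config imply extended interpolation.
    parser = configparser.ConfigParser(
        interpolation=configparser.ExtendedInterpolation()
    )
    parser.read_string(text)
    rank = parser["rank"]
    return {
        # The list value is valid JSON, so json.loads yields a Python list.
        "embeddings": json.loads(rank["embeddings"]),
        "n_kernels": rank.getint("n_kernels"),
        "filter_width": rank.getint("filter_width"),
        "activation": rank["activation"],
    }


rank_config = read_rank_section(CONFIG_TEXT)
print(rank_config["embeddings"])
```

Storing the list as a JSON literal keeps the file a plain INI document while still allowing more than one embedding name per option.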
81 changes: 43 additions & 38 deletions log
@@ -1,38 +1,43 @@
-2018-07-31 10:39:37,251 - 'pattern' package not found; tag filters are not available for English
-2018-07-31 10:39:37,257 - loading terminology...
-2018-07-31 10:39:37,473 - loading pretrained embeddings...
-2018-07-31 10:39:37,473 - loading Word2VecKeyedVectors object from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
-2018-07-31 10:39:44,338 - loading vectors from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv.vectors.npy with mmap=r
-2018-07-31 10:39:44,440 - setting ignored attribute vectors_norm to None
-2018-07-31 10:39:44,440 - loaded /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
-2018-07-31 10:39:47,953 - loading vectorizer...
-2018-07-31 10:39:48,715 - loading candidate generator...
-2018-07-31 10:40:00,466 - loading pretrained embeddings...
-2018-07-31 10:40:00,466 - loading Word2VecKeyedVectors object from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
-2018-07-31 10:40:06,688 - loading vectors from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv.vectors.npy with mmap=r
-2018-07-31 10:40:06,724 - setting ignored attribute vectors_norm to None
-2018-07-31 10:40:06,727 - loaded /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
-2018-07-31 10:41:30,919 - loading vectorizer...
-2018-07-31 10:41:42,594 - preprocessing validation data...
-2018-07-31 10:41:42,594 - loading corpus...
-2018-07-31 10:41:42,603 - generating candidates with 0 workers...
-2018-07-31 10:41:47,888 - generated 10888 pair-wise samples (23018 with duplicates)
-2018-07-31 10:41:47,892 - compiling model architecture...
-2018-07-31 10:42:00,214 - preprocessing training data...
-2018-07-31 10:42:00,215 - loading corpus...
-2018-07-31 10:42:00,286 - generating candidates with 0 workers...
-2018-07-31 10:42:26,589 - generated 50513 pair-wise samples (147414 with duplicates)
-2018-07-31 10:42:26,608 - training CNN...
-2018-07-31 10:43:39,606 - Ranking accuracy: 0.748412
-2018-07-31 10:44:54,632 - Ranking accuracy: 0.775095
-2018-07-31 10:46:10,977 - Ranking accuracy: 0.768742
-2018-07-31 10:47:21,665 - Ranking accuracy: 0.778907
-2018-07-31 10:48:36,318 - Ranking accuracy: 0.789072
-2018-07-31 10:49:51,169 - Ranking accuracy: 0.773825
-2018-07-31 10:51:02,169 - Ranking accuracy: 0.782719
-2018-07-31 10:51:02,169 - Epoch 00007: early stopping
-2018-07-31 10:51:02,170 - done training.
-2018-07-31 10:51:02,180 - load best model...
-2018-07-31 10:51:13,752 - predict scores for validation data...
-2018-07-31 10:51:17,237 - evaluate and/or serialize...
-2018-07-31 10:51:17,292 - done.
+2018-07-31 12:08:47,074 - 'pattern' package not found; tag filters are not available for English
+2018-07-31 12:08:47,082 - loading terminology...
+2018-07-31 12:08:47,293 - loading pretrained embeddings...
+2018-07-31 12:08:47,293 - loading Word2VecKeyedVectors object from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
+2018-07-31 12:08:54,065 - loading vectors from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv.vectors.npy with mmap=r
+2018-07-31 12:08:54,165 - setting ignored attribute vectors_norm to None
+2018-07-31 12:08:54,166 - loaded /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
+2018-07-31 12:08:57,717 - loading vectorizer...
+2018-07-31 12:08:58,484 - loading pretrained embeddings...
+2018-07-31 12:08:58,484 - loading projection weights from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/bpe_vectors_10000_50_w2v.txt
+2018-07-31 12:08:59,159 - loaded (10257, 50) matrix from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/bpe_vectors_10000_50_w2v.txt
+2018-07-31 12:08:59,164 - loading vectorizer...
+2018-07-31 12:09:00,253 - loading candidate generator...
+2018-07-31 12:09:12,776 - loading pretrained embeddings...
+2018-07-31 12:09:12,776 - loading Word2VecKeyedVectors object from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
+2018-07-31 12:09:18,775 - loading vectors from /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv.vectors.npy with mmap=r
+2018-07-31 12:09:18,812 - setting ignored attribute vectors_norm to None
+2018-07-31 12:09:18,816 - loaded /mnt/storage/karr/users/furrer/prlnk/data/embeddings/wvec_200_win-30_chiu-et-al.kv
+2018-07-31 12:10:43,746 - loading vectorizer...
+2018-07-31 12:10:55,774 - preprocessing validation data...
+2018-07-31 12:10:55,774 - loading corpus...
+2018-07-31 12:10:55,784 - generating candidates with 0 workers...
+2018-07-31 12:11:02,715 - generated 10888 pair-wise samples (23018 with duplicates)
+2018-07-31 12:11:02,720 - compiling model architecture...
+2018-07-31 12:11:14,283 - preprocessing training data...
+2018-07-31 12:11:14,283 - loading corpus...
+2018-07-31 12:11:14,335 - generating candidates with 0 workers...
+2018-07-31 12:11:45,978 - generated 50520 pair-wise samples (147490 with duplicates)
+2018-07-31 12:11:46,001 - training CNN...
+2018-07-31 12:13:19,217 - Ranking accuracy: 0.772554
+2018-07-31 12:14:54,623 - Ranking accuracy: 0.776366
+2018-07-31 12:16:33,043 - Ranking accuracy: 0.776366
+2018-07-31 12:18:04,793 - Ranking accuracy: 0.78399
+2018-07-31 12:19:40,505 - Ranking accuracy: 0.787802
+2018-07-31 12:21:16,661 - Ranking accuracy: 0.800508
+2018-07-31 12:22:52,005 - Ranking accuracy: 0.795426
+2018-07-31 12:24:23,730 - Ranking accuracy: 0.781449
+2018-07-31 12:24:23,730 - Epoch 00008: early stopping
+2018-07-31 12:24:23,731 - done training.
+2018-07-31 12:24:23,741 - load best model...
+2018-07-31 12:24:36,804 - predict scores for validation data...
+2018-07-31 12:24:42,674 - evaluate and/or serialize...
+2018-07-31 12:24:42,729 - done.
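
The training portion of the newer log can be summarised by pulling out the `Ranking accuracy` values and locating the best one. The sketch below does this with a regular expression; `best_epoch` is a hypothetical helper, and it assumes that each `Ranking accuracy` line corresponds to one training epoch, which the log suggests but does not state.

```python
import re

# Training lines copied from the newer run in the log above.
LOG_LINES = """\
2018-07-31 12:13:19,217 - Ranking accuracy: 0.772554
2018-07-31 12:14:54,623 - Ranking accuracy: 0.776366
2018-07-31 12:16:33,043 - Ranking accuracy: 0.776366
2018-07-31 12:18:04,793 - Ranking accuracy: 0.78399
2018-07-31 12:19:40,505 - Ranking accuracy: 0.787802
2018-07-31 12:21:16,661 - Ranking accuracy: 0.800508
2018-07-31 12:22:52,005 - Ranking accuracy: 0.795426
2018-07-31 12:24:23,730 - Ranking accuracy: 0.781449
2018-07-31 12:24:23,730 - Epoch 00008: early stopping
""".splitlines()

ACCURACY = re.compile(r"Ranking accuracy: (\d+(?:\.\d+)?)")


def best_epoch(lines):
    """Return (1-based index of best score, best score, number of scores)."""
    scores = []
    for line in lines:
        match = ACCURACY.search(line)
        if match:
            scores.append(float(match.group(1)))
    # max() returns the first maximum, i.e. the earliest best epoch.
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1, scores[best], len(scores)


epoch, accuracy, n_epochs = best_epoch(LOG_LINES)
print(epoch, accuracy, n_epochs)  # 6 0.800508 8
```

The best score falls on the sixth evaluation and training stops after the eighth. The older run shows the same gap (best at the fifth, stop after the seventh), which would be consistent with a patience of two epochs, although that setting is not visible in this commit.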
4 changes: 2 additions & 2 deletions results
@@ -1,5 +1,5 @@
-accuracy 0.7890724269377383
-correct 621
+accuracy 0.8005082592121983
+correct 630
total 787
unreachable 95
nocandidates 0
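
The reported accuracy values are simply `correct / total`. As a quick check of the two figures in this hunk (the function name `accuracy` is chosen here for illustration only):

```python
def accuracy(correct, total):
    """Fraction of correctly ranked items."""
    return correct / total


TOTAL = 787
before = accuracy(621, TOTAL)  # results before this commit
after = accuracy(630, TOTAL)   # results after this commit

print(round(before, 4))  # 0.7891
print(round(after, 4))   # 0.8005
print(630 - 621)         # 9 additional correct items
```

Nine more correct items out of 787 amount to a gain of roughly 1.1 percentage points on this validation set.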