A General Purpose Tagger for POS Tagging, NER Tagging, and Chunking.

sfu-natlang/neural-network-tagger

neural-network-tagger

SyntaxNet

  1. Prepare the WSJ data for part-of-speech tagging:
     a. Convert it to CoNLL format: https://gist.github.com/khlmnn/3cc07407a002bb1773cd
     b. Map XPOSTAG to UPOSTAG before training, using convert.py.

  2. Install SyntaxNet: https://github.com/tensorflow/models/tree/a9133ae914b44602c5f26afbbd7dd794ff9c6637/syntaxnet

  3. Train and test the model using taggerTrain.sh, taggerTest.sh and tagger.pbtxt
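The XPOSTAG-to-UPOSTAG mapping in step 1b can be pictured with a minimal sketch. It assumes tab-separated CoNLL lines with UPOSTAG in the fourth column and XPOSTAG in the fifth. The table `PTB_TO_UPOS` and the function `map_line` are hypothetical names used for illustration only; the table is deliberately partial and is not the mapping implemented in convert.py.

```python
# Illustrative, partial Penn Treebank -> universal POS table.
# Assumption: this is NOT the table used by convert.py in this repository.
PTB_TO_UPOS = {
    "NN": "NOUN", "NNS": "NOUN",
    "NNP": "PROPN", "NNPS": "PROPN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB",
    "VBN": "VERB", "VBP": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "RB": "ADV", "RBR": "ADV", "RBS": "ADV",
    "DT": "DET", "CD": "NUM",
    ".": "PUNCT", ",": "PUNCT",
}


def map_line(line):
    """Fill the UPOSTAG column (index 3) from the XPOSTAG column (index 4)."""
    stripped = line.rstrip("\n")
    # Blank lines separate sentences and '#' lines are comments: pass through.
    if not stripped or stripped.startswith("#"):
        return stripped
    cols = stripped.split("\t")
    # Tags missing from the partial table fall back to "X" (the catch-all tag).
    cols[3] = PTB_TO_UPOS.get(cols[4], "X")
    return "\t".join(cols)


sample = [
    "1\tStocks\t_\t_\tNNS\t_\t0\t_\t_\t_",
    "2\trose\t_\t_\tVBD\t_\t0\t_\t_\t_",
    "3\t.\t_\t_\t.\t_\t0\t_\t_\t_",
    "",
]
for row in sample:
    print(map_line(row))
```

Falling back to `X` keeps the file well-formed when a tag is missing from the table, at the cost of hiding gaps; a stricter variant could raise an error instead.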

FeedForward Model

cd PATH_TO_TAGGER/src/feedforward_model

Training: python tagger_trainer.py

Evaluating: python tagger_eval.py

BiLSTM-CRF Model

Reference: https://github.com/guillaumegenthial/sequence_tagging

Mention2Vec with FF/BiLSTMs

python build_data.py

python main_ff.py (or python main.py)

Experiments

| Model | POS | NER | Chunk |
| --- | --- | --- | --- |
| Feedforward (word) | 95.89 | - | - |
| Feedforward (history and spelling features) | 97.31 | - | - |
| Bi-LSTM (word) | 95.88 | 78.66 | - |
| Bi-LSTM (Character Embedding) | 97.08 | 78.87 | - |
| Bi-LSTM-CRF (Character Embedding) | 97.34 | - | - |
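The README does not say which metric the scores above use. As a rough sketch, assuming the POS column is token-level accuracy in percent, such a score could be computed as below. The function name `token_accuracy` is hypothetical and is not taken from tagger_eval.py. NER results are commonly reported as entity-level F1, which this sketch does not compute.

```python
def token_accuracy(gold_sentences, pred_sentences):
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)
    # Guard against an empty evaluation set.
    return 100.0 * correct / total if total else 0.0


gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
pred = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
print(token_accuracy(gold, pred))  # 4 of 5 tokens match
```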

Decoding Speed (words/sec, with GPU enabled)

| Model | POS | NER | Chunk |
| --- | --- | --- | --- |
| Feedforward (word features only) | 11000 | - | - |
| Feedforward (history and spelling features) | ~8000 | - | - |
| Bi-LSTM-CRF (word features only) | ~2000 | - | - |
| Bi-LSTM-CRF (Character Embedding) | ~1500 | - | - |
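A words-per-second figure like those above can be measured with a simple timing loop. This is a minimal sketch, not the procedure used to produce the numbers in this README: `words_per_second` and `dummy_tagger` are hypothetical names, and the stand-in tagger does no real work.

```python
import time


def words_per_second(tag_fn, sentences):
    """Time tag_fn over all sentences and return tagged words per second."""
    n_words = sum(len(s) for s in sentences)
    start = time.perf_counter()
    for sentence in sentences:
        tag_fn(sentence)
    elapsed = time.perf_counter() - start
    # Avoid division by zero if the clock resolution is too coarse.
    return n_words / elapsed if elapsed > 0 else float("inf")


def dummy_tagger(sentence):
    # Hypothetical stand-in for a real model: tags every word as "X".
    return ["X"] * len(sentence)


sentences = [["Stocks", "rose", "."], ["Bonds", "fell", "sharply", "."]] * 1000
print("%.0f words/sec" % words_per_second(dummy_tagger, sentences))
```

For a real model, a warm-up pass before timing is usually worthwhile, so that one-time costs such as graph construction and GPU initialization do not skew the measurement.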
