forked from Adaxry/GCDT

Code for the paper: GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling

Stella-S-Yan/GCDT

GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling

Introduction

This repository contains the code for our proposed GCDT, which deepens the state transition path at each position in a sentence and further assigns every token a global representation learned from the entire sentence [paper]. The implementation is based on THUMT.
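
To make the idea of a per-token global representation concrete, here is a minimal sketch in plain Python. It assumes the sentence-level vector is obtained by mean pooling over encoder states and is then concatenated onto each token vector. Both choices, along with the names `mean_pool` and `attach_global`, are illustrative assumptions and not taken from this repository; see the paper for the actual architecture.

```python
from typing import List

Vector = List[float]


def mean_pool(states: List[Vector]) -> Vector:
    """Average per-token vectors into one sentence-level vector."""
    if not states:
        raise ValueError("need at least one token state")
    dim = len(states[0])
    return [sum(s[i] for s in states) / len(states) for i in range(dim)]


def attach_global(token_vectors: List[Vector],
                  encoder_states: List[Vector]) -> List[Vector]:
    """Concatenate the same global vector onto every token vector."""
    g = mean_pool(encoder_states)  # assumption: mean pooling
    return [tok + g for tok in token_vectors]


# Toy inputs: three tokens, two dimensions each (values are arbitrary).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = [[2.0, 4.0], [4.0, 8.0], [6.0, 0.0]]
augmented = attach_global(tokens, states)
print(augmented[0])  # [1.0, 0.0, 4.0, 4.0]
```

Every token receives the same global vector, so each position sees a summary of the whole sentence in addition to its own features.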

Usage

  • Trim GloVe
sh trim_glove.sh path_to_glove

path_to_glove is the path to your decompressed GloVe embeddings.

  • Training
sh train.sh task_name

task_name is the task to train on: either ner or chunking.

  • Evaluation and Testing
sh test.sh task_name test_type

Set test_type to testa for evaluation and testb for testing. Please note there is no evaluation set for the chunking task.
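
The README does not describe what trim_glove.sh does internally. A common reason to trim pretrained embeddings is to keep only the words that occur in the task vocabulary, and the sketch below illustrates that idea under this assumption. The function `trim_glove` and the toy data are hypothetical and are not part of this repository.

```python
from typing import Iterable, Iterator, Set


def trim_glove(lines: Iterable[str], vocab: Set[str]) -> Iterator[str]:
    """Yield only the embedding lines whose word is in `vocab`."""
    for line in lines:
        entry = line.rstrip("\n")
        # GloVe text files hold one entry per line: the word, then its values.
        word, _, values = entry.partition(" ")
        if values and word in vocab:
            yield entry


# Toy stand-in for a GloVe file (3-dimensional vectors, arbitrary values).
glove_lines = [
    "the 0.1 0.2 0.3\n",
    "zebra 0.4 0.5 0.6\n",
    "London 0.7 0.8 0.9\n",
]
vocab = {"the", "London", "unseen"}
trimmed = list(trim_glove(glove_lines, vocab))
print(trimmed)  # ['the 0.1 0.2 0.3', 'London 0.7 0.8 0.9']
```

Because the function works on any iterable of lines, the same code can be pointed at an open file handle for a real embedding file.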

Requirements

  • tensorflow 1.12
  • python 3.5

Citation

Please cite the following paper if you use the code:

@InProceedings{Liu:19,
  author    = {Yijin Liu and Fandong Meng and Jinchao Zhang and Jinan Xu and Yufeng Chen and Jie Zhou},
  title     = {GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling},
  booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
  year      = {2019}
}

FAQ

  • Why not evaluate along with training?

    For training efficiency, we first train a model for a specified number of steps and then restore the saved checkpoints for evaluation and testing. For CoNLL03, we report the test-set score of the checkpoint that performs best on the evaluation set. For CoNLL2000, which has no evaluation set, we compute the score on the test set directly.

  • How to get BERT embeddings?

    We provide a simple tool to generate BERT embeddings for sequence labeling tasks. After generating them, set bert_emb_path to the correct path and set use_bert to True in train.sh.
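
The checkpoint selection described in the first answer (pick the checkpoint that scores best on the evaluation set, then report its test score) can be sketched as follows. The function name `pick_by_dev` and the score dictionaries are hypothetical, and the numbers are placeholders chosen for illustration, not results from the paper or this repository.

```python
from typing import Dict, Tuple


def pick_by_dev(dev_scores: Dict[int, float],
                test_scores: Dict[int, float]) -> Tuple[int, float]:
    """Return (step, test score) for the checkpoint with the best dev score."""
    best_step = max(dev_scores, key=dev_scores.get)
    return best_step, test_scores[best_step]


# Placeholder scores keyed by training step (not real results).
dev = {1000: 0.50, 2000: 0.70, 3000: 0.60}
test = {1000: 0.40, 2000: 0.55, 3000: 0.60}
step, score = pick_by_dev(dev, test)
print(step, score)  # 2000 0.55
```

Note that the checkpoint at step 3000 has the higher placeholder test score, but it is not selected: the test set plays no role in choosing the checkpoint.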
