Seminar Extraction

Main script: seminarextract.py

Before running the program, complete the following steps:

  • Set the "dir" and "dataDir" variables in seminarextract.py to point to your stanford-ner and nltk-data directories, respectively

  • Download GoogleNews-vectors-negative300.bin and place it in the nltk-data folder

  • Install with pip: gensim, nltk, sner (pprint ships with the Python standard library, so it needs no separate install)

  • Ensure the NLTK stopwords corpus is downloaded, for example with nltk.download('stopwords')

  • Move the required seminar files into the "seminars_training/training" folder

  • Results will appear in "seminars_training/training/results"
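
The checklist above can be verified before a run with a small pre-flight script. This is only a sketch and is not part of the repository: the function name check_setup and the constants are hypothetical, and it assumes the word-vector file sits directly inside the nltk-data folder and that seminar files sit directly inside the training folder. It does not check the stopwords corpus.

```python
"""Pre-flight check for the setup steps (illustrative sketch, not part of the repository)."""
import importlib.util
from pathlib import Path

# Third-party packages named in the README.
REQUIRED_PACKAGES = ("gensim", "nltk", "sner")
# Word-vector file name taken from the README.
WORD_VECTORS = "GoogleNews-vectors-negative300.bin"


def check_setup(ner_dir, data_dir, seminar_dir="seminars_training/training"):
    """Return a list of problems found; an empty list means the setup looks complete."""
    problems = []

    # Each package must be importable in the current environment.
    for name in REQUIRED_PACKAGES:
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name} (pip install {name})")

    # The directory that "dir" in seminarextract.py should point to.
    if not Path(ner_dir).is_dir():
        problems.append(f"stanford-ner directory not found: {ner_dir}")

    # Assumption: the .bin file is placed directly in the nltk-data folder.
    if not Path(data_dir, WORD_VECTORS).is_file():
        problems.append(f"{WORD_VECTORS} not found in {data_dir}")

    # The training folder must exist and contain at least one entry.
    seminars = Path(seminar_dir)
    if not seminars.is_dir() or not any(seminars.iterdir()):
        problems.append(f"no seminar files found in {seminar_dir}")

    return problems


if __name__ == "__main__":
    # Placeholder paths: substitute the values you set for "dir" and "dataDir".
    for problem in check_setup("path/to/stanford-ner", "path/to/nltk-data"):
        print(problem)
```

Run it from the repository root so the default "seminars_training/training" path resolves. No output means none of the checked steps is missing.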