Implementation of Iterative Extraction method in "Probase: A Probabilistic Taxonomy for Text Understanding" (Wu et al.)

coffeebeanustb/Probase

 
 

Probase

Implementation of Iterative Extraction method in "Probase: A Probabilistic Taxonomy for Text Understanding" (Wu et al., 2012)

Usage

usage: main.py [-h] [--epsilon EPSILON]
               [--threshold_super_concept THRESHOLD_SUPER_CONCEPT]
               [--threshold_k THRESHOLD_K]
               corpus_file output_file

Generate probase

positional arguments:
  corpus_file           Path to corpus
  output_file           Path to output file

optional arguments:
  -h, --help            show this help message and exit
  --epsilon EPSILON     Epsilon
  --threshold_super_concept THRESHOLD_SUPER_CONCEPT
                        Threshold (super-concept)
  --threshold_k THRESHOLD_K
                        Threshold (k)
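
As a rough illustration of the command line described by the help text above, the following Python sketch assembles an invocation of main.py and can run it with subprocess. The helper names build_command and run_extraction, the file names corpus.txt and probase.txt, and the value 0.01 are hypothetical and chosen only for the example; only the flag and argument names are taken from the help text.

```python
import subprocess
import sys


def build_command(corpus_file, output_file, epsilon=None,
                  threshold_super_concept=None, threshold_k=None,
                  script="main.py"):
    """Build the argument list for main.py (hypothetical helper).

    Optional flags are included only when a value is supplied, so the
    script falls back to its own defaults otherwise.
    """
    cmd = [sys.executable, script]
    optional = {
        "--epsilon": epsilon,
        "--threshold_super_concept": threshold_super_concept,
        "--threshold_k": threshold_k,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    # Positional arguments, in the order shown in the usage line.
    cmd += [corpus_file, output_file]
    return cmd


def run_extraction(*args, **kwargs):
    """Run main.py and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_command(*args, **kwargs), check=True)


# Placeholder file names and epsilon value, shown for illustration only.
print(build_command("corpus.txt", "probase.txt", epsilon=0.01))
```

Calling run_extraction with the same arguments would execute the script, provided main.py is in the current working directory.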

Output

Each line of the output file gives a super-concept, then a sub-concept, then the number of times the pair appears in the corpus.

animal cat 100
animal dog 200
mammal cat 50
mammal dog 70
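
A minimal sketch of reading this output back into Python might look like the following. It assumes the three fields are tab-separated when tabs are present and otherwise whitespace-separated with single-token concepts, as in the sample above; the actual delimiter is not stated here. The function names parse_line, load_pairs, and relative_frequency are hypothetical, and the relative frequency computed is a simple count ratio for illustration, not necessarily any score defined in the paper.

```python
from collections import defaultdict


def parse_line(line):
    """Split one output line into (super_concept, sub_concept, count)."""
    # Assumption: tab-separated if a tab is present, else whitespace-separated.
    line = line.rstrip("\n")
    parts = line.split("\t") if "\t" in line else line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    super_concept, sub_concept, count = parts
    return super_concept, sub_concept, int(count)


def load_pairs(lines):
    """Collect counts as {super_concept: {sub_concept: count}}."""
    counts = defaultdict(dict)
    for line in lines:
        if line.strip():  # skip blank lines
            sup, sub, n = parse_line(line)
            counts[sup][sub] = counts[sup].get(sub, 0) + n
    return dict(counts)


def relative_frequency(counts, super_concept):
    """Share of each sub-concept among all pairs under one super-concept."""
    subs = counts[super_concept]
    total = sum(subs.values())
    return {sub: n / total for sub, n in subs.items()}


sample = ["animal cat 100", "animal dog 200", "mammal cat 50", "mammal dog 70"]
counts = load_pairs(sample)
print(relative_frequency(counts, "animal"))
```

With the sample lines, "dog" accounts for two thirds of the pairs under "animal" and "cat" for one third. To read a real output file, pass an open file object to load_pairs in place of the sample list.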

Authors
