Skip to content

tanumitra/lda-implementation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LDA Implementation Using Gibbs Sampler:

Input arguments:
-f filename The file should have a collection of documents seperated by new line
-k number of topics
-a hyper-parameter alpha
-b hyper-parameter beta
-i number of iterations
-w top w words to be output in the topic_word distributions
-o output folder where all the output files will be saves

Output files:
Four files in the output folder : (xxx is the input filename)
=======
Four files: (xxx is the input filename)
1. doc_topic_input_xxx - document topic distributions of the entire document collection
2. topic_word_xxx - topic word distributions of the k topics
3. vocabinput_xxx - list of vocabulary terms
4. loglike-input_xxx - loglikelihood at each iteration.

Example usage:-
python lda_gibbs.py -f input.data -k 18 -a 0.05 -b 0.05 -i 100 -w 20 -o output_folder

About

lda with Gibbs Sampler

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages