StriderJGalt/wiki-search-engine

Installation of libraries

pip install PyStemmer

To run

To split the bz2 dump, first adjust the number of pages per chunk in the code if needed, then run:

$ python3 dsplit.py <path_to_bz2_file>
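As a rough illustration of what page-based splitting can look like, here is a minimal sketch. It is not the code in dsplit.py. The function name `split_dump`, the `out_prefix` naming scheme, and the default of 1000 pages per chunk are all hypothetical. It assumes that `<page>` and `</page>` tags each sit on their own line, as they do in standard Wikipedia XML dumps.

```python
import bz2


def split_dump(src_path, out_prefix, pages_per_chunk=1000):
    """Copy consecutive groups of <page> elements into numbered bz2 files.

    Hypothetical helper: lines outside <page>...</page> (the dump header and
    footer) are skipped, so each chunk is a bare sequence of pages rather
    than a complete XML document. Returns the list of chunk paths.
    """
    chunk_paths = []
    out = None            # currently open chunk file, if any
    pages_in_chunk = 0
    in_page = False
    with bz2.open(src_path, "rt", encoding="utf-8") as src:
        for line in src:
            stripped = line.strip()
            if stripped.startswith("<page>"):
                in_page = True
                if out is None:
                    # Open the next chunk lazily, only when a page arrives.
                    path = f"{out_prefix}{len(chunk_paths)}.bz2"
                    out = bz2.open(path, "wt", encoding="utf-8")
                    chunk_paths.append(path)
            if in_page:
                out.write(line)
            if in_page and stripped.endswith("</page>"):
                in_page = False
                pages_in_chunk += 1
                if pages_in_chunk >= pages_per_chunk:
                    out.close()
                    out = None
                    pages_in_chunk = 0
    if out is not None:
        out.close()
    return chunk_paths
```

Streaming line by line keeps memory use flat regardless of dump size, which matters for multi-gigabyte dumps.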

To build the index:

$ python3 indexer.py <path_to_wiki_dump.bz2> <path_to_inverted_index_dir>

To merge the index:

$ for i in {a..z};
do
	python3 merge_index.py "$i"
done
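For readers unfamiliar with index merging, the general technique is a k-way merge of sorted partial index files. The sketch below illustrates that idea only. It is not the code in merge_index.py, and the on-disk format is an assumption: each partial file is taken to hold one `term:postings` entry per line, sorted by term. The names `read_entries` and `merge_sorted_indexes` are hypothetical.

```python
import heapq
from itertools import groupby


def read_entries(handle):
    """Yield (term, postings) pairs from an assumed 'term:postings' line format."""
    for line in handle:
        term, _, postings = line.rstrip("\n").partition(":")
        yield term, postings


def merge_sorted_indexes(partial_paths, merged_path):
    """K-way merge of partial index files that are each sorted by term.

    Postings for the same term are joined with a space, in the order the
    partial files were given (heapq.merge is stable for equal keys).
    """
    handles = [open(p, encoding="utf-8") for p in partial_paths]
    try:
        merged = heapq.merge(*(read_entries(h) for h in handles),
                             key=lambda entry: entry[0])
        with open(merged_path, "w", encoding="utf-8") as out:
            for term, group in groupby(merged, key=lambda entry: entry[0]):
                postings = " ".join(p for _, p in group)
                out.write(f"{term}:{postings}\n")
    finally:
        for h in handles:
            h.close()
```

Because `heapq.merge` reads its inputs lazily, only one entry per partial file is held in memory at a time.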

To merge page_info:

$ python3 merge_page_info.py

To search:

$ python3 search.py <path_to_input_queries_file>
