- parsers:
  - docx
  - pptx and other files
- more docs
- I've never done a file-I/O-intensive project.
- I'm really bad at organizing files, so I have files scattered everywhere on my system.
- With Calsen, my aim is to be able to search for a function name that I remember from some file and have Calsen 'seamlessly' find the source file that contains it.
- It's fun to do something like this.
- And I can learn a lot of things from this.
- clone the repo and cd into calsen.
git clone --depth=1 https://github.com/Adwaith-Rajesh/calsen.git
cd calsen
- install the dependencies (I have plans to make this optional, see #2)
apt install libmagic-dev
Calsen makes use of nobuild as its build system. To compile, run the following commands:
gcc -o nobuild ./nobuild.c
./nobuild --release
ln -s ./build/bin/calsen ./calsen
To index the required directories, run:
./calsen reindex --dir path/to/dir/1 --dir path/to/dir/2 -o sample.index
This creates a .index file (sample.index in the command above) that Calsen uses during the search process.
Use --verbose to get additional output.
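To give a rough feel for what an index holds, here is a minimal Python sketch. It is not Calsen's implementation, and it says nothing about the real .index file format, which this README does not describe. The names `tokenize` and `build_index` are hypothetical, and the sketch assumes an index can be modelled as per-file term counts held in memory.

```python
import os
import re
from collections import Counter

# Assumption: a "term" is an identifier-like word. Calsen's real tokenizer may differ.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(text):
    """Split text into lowercase identifier-like tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def build_index(dirs):
    """Map every readable file under the given directories to its term counts."""
    index = {}
    for root_dir in dirs:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as fh:
                        index[path] = Counter(tokenize(fh.read()))
                except OSError:
                    continue  # skip files that cannot be opened
    return index
```

Calling `build_index(["path/to/dir/1", "path/to/dir/2"])` mirrors the two --dir arguments in the command above, except that the result stays in memory instead of being written to a file.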
To search through the index file, use the following command:
./calsen search -i sample.index -q 'search query'
Calsen finds all the files that match the search query and lists them in descending order of relevance.
- To get the top N files:
./calsen search -i sample.index -q 'search query' -n 10
Use --verbose to get the calculated TF-IDF score for each file.
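For readers unfamiliar with TF-IDF, the ranking idea can be sketched in a few lines of Python. This is an illustration of one common textbook variant (term frequency times the log of inverse document frequency), not necessarily the exact formula Calsen uses. The function name `tf_idf_rank` and the in-memory index shape (file path mapped to a `Counter` of term counts) are assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_rank(index, query_terms, top_n=None):
    """Rank files by summed TF-IDF over the query terms, best first.

    index: dict mapping file path -> Counter of term counts (assumed shape).
    """
    n_docs = len(index)
    scores = {}
    for path, counts in index.items():
        total = sum(counts.values())
        score = 0.0
        for term in query_terms:
            if total == 0 or counts[term] == 0:
                continue  # term absent from this file
            tf = counts[term] / total  # how prominent the term is in this file
            df = sum(1 for c in index.values() if c[term] > 0)
            idf = math.log(n_docs / df)  # rarer across files -> larger weight
            score += tf * idf
        if score > 0:
            scores[path] = score
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Tiny hand-made index, purely illustrative.
example = {
    "a.c": Counter({"parse": 2, "x": 2}),
    "b.c": Counter({"parse": 1, "x": 1, "y": 2}),
    "c.c": Counter({"z": 4}),
}
for path, score in tf_idf_rank(example, ["parse"], top_n=10):
    print(path, round(score, 4))
```

With this plain variant, a term that occurs in every file gets an IDF of zero and contributes nothing to the score. The `top_n` argument plays the same role as -n in the command above.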