Follow the steps below to set up and run the project:
- Clone the repository from a terminal
git clone <repository_url>
- Change into the cloned folder
cd valis_kodune
- Activate the conda environment in which you want to use the tool
conda activate your_environment_name
- Install the requirements the tool needs
pip install -r requirements.txt
- Run the Script
doc_query_tool
- Choose an Option
After starting, the program prompts you to choose from three options:
- Option 1: Query the indexed documents.
- Option 2: Index PDF documents.
- Option 3: Index manually entered text.
Enter the corresponding number (1, 2, or 3) to proceed.
- Query the Index (Option 1)
- If you’ve already indexed documents, choose this option to ask questions.
- Enter your question (in English) when prompted.
- Set a similarity threshold (default: 1.0, range: 0.0 to 2.0). A lower threshold returns broader matches, while a higher threshold returns only stricter matches.
- View the results with relevant snippets and metadata.
- Add Documents to the Index (Option 2)
- Select PDF files to index via a file dialog.
- The program extracts and processes the content for efficient searching.
- Once complete, you can start querying the index.
- Input Text Manually (Option 3)
- Type or paste text into the console, then press Enter twice to finish.
- The entered text is indexed and made available for querying.
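The "press Enter twice" convention for manual input can be read as "stop at the first empty line". The following is a minimal sketch of that behaviour, not the tool's actual code; the function name `read_manual_text` and the injectable `read_line` parameter are hypothetical and exist only to make the example testable.

```python
from typing import Callable, List


def read_manual_text(read_line: Callable[[], str] = input) -> str:
    """Collect lines until the first empty line, then return them joined by newlines.

    Assumption: an empty line (the second Enter) marks the end of the input,
    so text containing blank lines between paragraphs would be cut short.
    """
    lines: List[str] = []
    while True:
        try:
            line = read_line()
        except EOFError:
            # End of input (e.g. piped stdin) also finishes the entry.
            break
        if line.strip() == "":
            break
        lines.append(line)
    return "\n".join(lines)
```

Calling `read_manual_text()` with no arguments reads from the console; passing a different callable lets the same logic be driven from a list of lines.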
When querying, the program returns results in the following format:
- Score
- Indicates how closely the result matches your query. Higher scores signify better matches.
- Snippet
- A contextual portion of the document or text where your query matches.
- The relevant section is highlighted for clarity (e.g., **matched text**).
- Metadata
- Provides additional information about the document, such as its origin or section.
Example output:
Result 1 (Score: 1.85):
Snippet: The **quick brown fox** jumps over the lazy dog...
Metadata: No metadata provided.
Result 2 (Score: 1.50):
Snippet: In the story, the **quick fox** symbolizes agility and cunning...
Metadata: Chapter 2, Page 15.
If no results meet the threshold, you’ll see:
No satisfactory results found. Try rephrasing your question or using more specific terms.
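The threshold and output format described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the tool's implementation: the `Result` class and the `highlight` and `format_results` functions are hypothetical names, and the rule "keep a result when its score is at least the threshold" is inferred from the statements that higher scores are better and higher thresholds are stricter.

```python
import re
from dataclasses import dataclass
from typing import Iterable

NO_RESULTS_MESSAGE = (
    "No satisfactory results found. "
    "Try rephrasing your question or using more specific terms."
)


@dataclass
class Result:
    """Hypothetical container for one query hit."""
    score: float
    snippet: str
    metadata: str = ""


def highlight(snippet: str, phrase: str) -> str:
    """Wrap every case-insensitive occurrence of phrase in ** markers."""
    if not phrase:
        return snippet
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return pattern.sub(lambda m: f"**{m.group(0)}**", snippet)


def format_results(results: Iterable[Result], threshold: float = 1.0) -> str:
    """Keep results scoring at or above threshold and render them best-first."""
    if not 0.0 <= threshold <= 2.0:
        raise ValueError("threshold must be between 0.0 and 2.0")
    # Assumption: a result is kept when score >= threshold.
    kept = sorted(
        (r for r in results if r.score >= threshold),
        key=lambda r: r.score,
        reverse=True,
    )
    if not kept:
        return NO_RESULTS_MESSAGE
    blocks = []
    for rank, r in enumerate(kept, start=1):
        metadata = r.metadata or "No metadata provided."
        blocks.append(
            f"Result {rank} (Score: {r.score:.2f}):\n"
            f"Snippet: {r.snippet}\n"
            f"Metadata: {metadata}"
        )
    return "\n".join(blocks)
```

Passing a list of `Result` objects to `format_results` with the default threshold of 1.0 drops anything scoring below 1.0 and returns the rest as one string, highest score first. If nothing passes, the function returns the "No satisfactory results" message instead.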