Added code for JSON oriented model approach #4

deepnayak · 2024-06-17T19:23:43Z

Summary by Sourcery

This pull request introduces a JSON-oriented model approach by adding a new script for generating query datasets, a new Flask application for handling user queries, and enhancing the existing index and query engine functionalities. It also removes obsolete files to clean up the codebase.

New Features:
- Introduced a new script generate_query_dataset.py to generate sample queries and their corresponding JSON outputs and API queries.
- Added a new Flask application in src/app.py to handle user queries and interact with the query engine.
Enhancements:
- Updated the build_index function to use a simplified directory name and a different method for creating the index.
- Modified the load_index function to load queries from a JSON file and use a question store for query retrieval.
- Enhanced the GoaTAPIQueryEngine class to include a question_store and updated the custom_query method to format the context string using JSON data from the question store.
Chores:
- Removed obsolete files: query_reformulation.py, rich_queries/queryV1.json, app.py, and prompt.py.

sourcery-ai · 2024-06-17T19:23:49Z

Reviewer's Guide by Sourcery

This pull request introduces a JSON-oriented model approach for handling queries. The main changes include updating the embedding and LLM models, refactoring the indexing and query engine to use JSON-based queries, and adding new scripts and a Flask application for generating and handling queries. Obsolete files related to the old query reformulation approach have been removed.

File-Level Changes

| Files | Changes |
| --- | --- |
| `index.py`, `query_engine.py` | Refactored the indexing and query engine to use JSON-based queries and updated the LLM model. |
| `src/scripts/generate_query_dataset.py`, `src/app.py` | Added new scripts and Flask application for generating and handling queries. |
| `query_reformulation.py`, `rich_queries/queryV1.json`, `app.py`, `prompt.py` | Removed obsolete files related to the old query reformulation approach. |


sourcery-ai

Hey @deepnayak - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟡 General issues: 2 issues found
🟡 Security: 1 issue found
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good


sourcery-ai · 2024-06-17T19:25:17Z

src/scripts/generate_query_dataset.py

@@ -0,0 +1,136 @@
+import json


suggestion: New script for generating query dataset.

Consider adding docstrings to the functions in this script to improve readability and maintainability.

src/app.py

sourcery-ai · 2024-06-17T19:25:18Z

src/prompt.py

@@ -0,0 +1,32 @@
+from llama_index.core import PromptTemplate


suggestion: New prompt template for query parsing.

Ensure that the prompt template is comprehensive enough to handle a wide range of queries and edge cases.

Suggested change

-from llama_index.core import PromptTemplate
+from llama_index.core import PromptTemplate
+prompt_template = PromptTemplate(
+    template="Please provide a detailed response to the following query: {query}. Ensure your response covers all possible edge cases and scenarios.",
+    variables=["query"]
+)

Refined logic for time based queries

rjchallis · 2024-06-19T12:52:25Z

I've not started the review yet: when I try to run the code using the instructions in INSTALL.md (from the src directory), I get an error in the browser and `httpx.HTTPStatusError: Client error '404 Not Found' for url 'http://127.0.0.1:11434/api/generate'` on the server. If I swap to the main branch with the same ollama server running, the code works, so I'm not clear why the 404. Any thoughts @deepnayak?

rjchallis · 2024-06-19T13:18:34Z

> I'm not clear why the 404

no problem, I see the model is now codellama

rjchallis · 2024-06-19T13:12:41Z

.github/workflows/flake8.yml

@@ -20,7 +20,7 @@ jobs:
      - name: flake8 Lint
        uses: TrueBrain/actions-flake8@v2
        with:
-          ignore: E203,E701
+          ignore: E203,E701,W503


I think W504 can go in this list too. I hadn't spotted the difference between ignore and extend-ignore, so we should list both W503 and W504 to match setup.cfg.

rjchallis · 2024-06-19T14:21:25Z

src/app.py

+
+    query_string = " AND ".join(params)
+    return (
+        base_url + endpoint + "query=" + urllib.parse.quote_plus(query_string) + suffix


For the API, spaces need to be escaped as %20, not +.
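The difference can be checked with the standard library: `urllib.parse.quote_plus` encodes a space as `+`, while `urllib.parse.quote` encodes it as `%20`. The following is a minimal sketch only; the helper name `encode_query` and the sample query value are illustrative assumptions, not code from this PR.

```python
from urllib.parse import quote, quote_plus


def encode_query(query_string: str) -> str:
    """Percent-encode a query string so that spaces become %20.

    Hypothetical helper for illustration; not part of the PR.
    """
    # quote() leaves "/" unescaped by default (safe="/"); passing safe=""
    # escapes it as well, which mirrors what quote_plus() does by default.
    return quote(query_string, safe="")


# Sample value chosen for illustration only.
sample = "tax_tree(* Canis lupus)"

plus_encoded = quote_plus(sample)      # spaces become "+"
percent_encoded = encode_query(sample)  # spaces become "%20"
```

Swapping `quote_plus` for `quote(..., safe="")` in the URL builder would be one way to address the comment without changing how other reserved characters are escaped.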

rjchallis · 2024-06-19T14:26:40Z

src/app.py

+    )
+
+    if json_output["intent"] == "count":
+        endpoint = "count?"


For now this could be `endpoint = "api/v2/count?"` to generate valid links.
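A rough sketch of how the endpoint prefix could be chosen from the parsed intent is shown below. Only `api/v2/count?` comes from the comment above; the base URL, the `search` endpoint, the fallback behaviour, and the names `BASE_URL`, `ENDPOINTS` and `build_url` are assumptions made for illustration. Spaces are escaped as `%20`.

```python
from urllib.parse import quote

# Assumed values for illustration; only "api/v2/count?" is taken from the review.
BASE_URL = "https://goat.genomehubs.org/"
ENDPOINTS = {
    "count": "api/v2/count?",
    "search": "api/v2/search?",  # assumed default endpoint
}


def build_url(json_output: dict, query_string: str, suffix: str = "") -> str:
    """Build a link from the parsed JSON intent and an unencoded query string."""
    # Fall back to the assumed search endpoint when the intent is missing
    # or not recognised.
    intent = json_output.get("intent", "search")
    endpoint = ENDPOINTS.get(intent, ENDPOINTS["search"])
    return BASE_URL + endpoint + "query=" + quote(query_string, safe="") + suffix
```

Keeping the full `api/v2/...` prefix in one mapping means a later change of API version only touches a single place.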

rjchallis · 2024-06-19T14:30:53Z

src/templates/chat.html

-                var messageElement = $('<div class="message ' + sender + '-message"></div>').text(message);
+                var messageElement;
+                if (sender == 'bot' && message != 'Bot is typing...' && message != 'Sorry, something went wrong.')
+                    messageElement = $('<div class="message ' + sender + '-message"><a href=' + message + ' target="_blank">GoaT Link!</a></div>');


this should include a JSON representation for debugging
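One possible way to support this is for the server to return the parsed JSON alongside the link, so the template can render both. The sketch below is an assumption about how that might look on the Python side; the helper name `build_bot_payload` and the keys `url` and `json_debug` are hypothetical and do not come from the PR.

```python
import json


def build_bot_payload(url: str, json_output: dict) -> dict:
    """Bundle the generated link with a readable dump of the parsed JSON.

    Hypothetical helper: the key names are illustrative only.
    """
    return {
        "url": url,
        # Pretty-printed and key-sorted so the debug output is stable
        # and easy to compare between queries.
        "json_debug": json.dumps(json_output, indent=2, sort_keys=True),
    }


# Example values for illustration only.
payload = build_bot_payload(
    "https://example.org/api/v2/count?query=x",
    {"intent": "count", "taxon": "Canis lupus"},
)
```

The template could then show `payload["url"]` as the link and `payload["json_debug"]` in a preformatted block beneath it.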

rjchallis · 2024-06-19T14:38:10Z

src/scripts/generate_query_dataset.py

Good for now - for next iteration will want to include:

- plurals
- name class (common/scientific)

rjchallis · 2024-06-19T14:41:29Z

src/app.py

+    params = []
+
+    if "taxon" in json_output:
+        params.append(f"tax_tree(* {json_output['taxon']})")


fine for now - for next iteration use tax_name search to replace user tax name with taxon_id

rjchallis

Looks good

Added code for JSON oriented model approach

3abfcdd

sourcery-ai bot reviewed Jun 17, 2024


deepnayak added 6 commits June 18, 2024 22:31

chore: Update code formatting and editor settings

43c77e6

Refined logic for time based queries

refactor: Reorganize imports and update code formatting

b0ddec5

chore: Refactor build_index function to simplify code

2727fc8

Minor changes

c770b71

Minor changes

cd314da

Removed conflicting flake8 error

80e24dc

deepnayak requested a review from rjchallis June 18, 2024 17:25

Updated model in INSTALL.md

a7f98f7

rjchallis requested changes Jun 19, 2024


Resolved PR Comments

9a06fa5

deepnayak requested a review from rjchallis June 23, 2024 06:44

rjchallis approved these changes Jun 24, 2024


rjchallis merged commit 7f2131f into main Jun 24, 2024
2 checks passed
