Commit

* Adding github assistant code
* fmt
* formatting evaluation.py
* lint
* formatting, second attempt
* removing extra spaces
* formatting index.py
* formatting file_summary.append
* formatting parse_document function
* third attempt at fixing index.py
* lint 4x
* removing extra lines
* Adding evaluation-result.txt
* adding README.md

Showing 6 changed files with 580 additions and 0 deletions.

README.md (28 additions)

# GitHub Assistant

Easily ask questions about your GitHub repository using RAG and Elasticsearch as a vector database.

### How to use this code

1. Install the required libraries:

```bash
pip install -r requirements.txt
```
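
The exact contents of `requirements.txt` are not shown in this diff; judging from the imports in `evaluation.py` later in this commit, it likely includes at least something along these lines:

```
llama-index
llama-index-llms-openai
pandas
python-dotenv
tabulate
httpx
```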

2. Set up the environment variables:
`GITHUB_TOKEN`, `GITHUB_OWNER`, `GITHUB_REPO`, `GITHUB_BRANCH`, `ELASTIC_CLOUD_ID`, `ELASTIC_USER`, `ELASTIC_PASSWORD`, `ELASTIC_INDEX`, `OPENAI_API_KEY`
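
Since `evaluation.py` in this commit loads its configuration with `load_dotenv(".env")`, one convenient option is a `.env` file at the project root. A minimal sketch, with placeholder values only:

```
GITHUB_TOKEN=ghp_your_token_here
GITHUB_OWNER=your-org
GITHUB_REPO=your-repo
GITHUB_BRANCH=main
ELASTIC_CLOUD_ID=your-cloud-id
ELASTIC_USER=elastic
ELASTIC_PASSWORD=your-password
ELASTIC_INDEX=your-index-name
OPENAI_API_KEY=sk-your-key-here
```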

3. Index your data and create the embeddings by running:

```bash
python index.py
```

An Elasticsearch index will be generated, housing the embeddings. You can then connect to your ESS deployment and run a search query against the index; you will see a new field named `embeddings`.
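
One quick way to verify the new field is a match-all search that lists the fields of a single indexed document. Below is a sketch using the official `elasticsearch` Python client and the same environment variables as above; this helper is not part of the commit:

```python
import os

from elasticsearch import Elasticsearch

# Connect to the ESS deployment using credentials from the environment.
es = Elasticsearch(
    cloud_id=os.environ["ELASTIC_CLOUD_ID"],
    basic_auth=(os.environ["ELASTIC_USER"], os.environ["ELASTIC_PASSWORD"]),
)

# Fetch one indexed document and list its fields; "embeddings" should appear.
resp = es.search(index=os.environ["ELASTIC_INDEX"], size=1, query={"match_all": {}})
print(list(resp["hits"]["hits"][0]["_source"].keys()))
```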

4. Ask questions about your codebase by running:

```bash
python query.py
```

supporting-blog-content/github-assistant/evaluation-result.txt (90 additions)

```
Number of documents loaded: 5
\All available questions generated:
0. What is the purpose of chunking monitors in the updated push command as mentioned in the changelog?
1. How does the changelog describe the improvement made to the performance of the push command?
2. What new feature is added to the synthetics project when it is created via the `init` command?
3. According to the changelog, what is the file size of the CHANGELOG.md document?
4. On what date was the CHANGELOG.md file last modified?
5. What is the significance of the example lightweight monitor yaml file mentioned in the changelog?
6. How might the changes described in the changelog impact the workflow of users creating or updating monitors?
7. What is the file path where the CHANGELOG.md document is located?
8. Can you identify the issue numbers associated with the changes mentioned in the changelog?
9. What is the creation date of the CHANGELOG.md file as per the context information?
10. What type of file is the document described in the context information?
11. On what date was the CHANGELOG.md file last modified?
12. What is the file size of the CHANGELOG.md document?
13. Identify one of the bug fixes mentioned in the CHANGELOG.md file.
14. What command is referenced in the context of creating new synthetics projects?
15. How does the CHANGELOG.md file address the issue of varying NDJSON chunked response sizes?
16. What is the significance of the number #680 in the context of the document?
17. What problem is addressed by skipping the addition of empty values for locations?
18. How many bug fixes are explicitly mentioned in the provided context?
19. What is the file path of the CHANGELOG.md document?
20. What is the file path of the document being referenced in the context information?
...

Generated questions:
1. What command is referenced in relation to the bug fix in the CHANGELOG.md?
2. On what date was the CHANGELOG.md file created?
3. What is the primary purpose of the document based on the context provided?

Total number of questions generated: 3

Processing Question 1 of 3:

Evaluation Result:
+---------------------------------------------------+-------------------------------------------------+----------------------------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
| Query | Response | Source | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+===================================================+=================================================+====================================================+======================+======================+===================+==================+==================+
| What command is referenced in relation to the bug | The `init` command is referenced in relation to | Bug Fixes | Pass | YES | 1 | Pass | YES |
| fix in the CHANGELOG.md? | the bug fix in the CHANGELOG.md. | | | | | | |
| | | | | | | | |
| | | - Pick the correct loader when bundling TypeScript | | | | | |
| | | or JavaScript journey files | | | | | |
| | | | | | | | |
| | | during push command #626 | | | | | |
+---------------------------------------------------+-------------------------------------------------+----------------------------------------------------+----------------------+----------------------+-------------------+------------------+------------------+

Processing Question 2 of 3:

Evaluation Result:
+-------------------------------------------------+------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
| Query | Response | Source | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+=================================================+================================================+==============================+======================+======================+===================+==================+==================+
| On what date was the CHANGELOG.md file created? | The date mentioned in the CHANGELOG.md file is | v1.0.0-beta-38 (20222-11-02) | Pass | YES | 1 | Pass | YES |
| | November 2, 2022. | | | | | | |
+-------------------------------------------------+------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+

Processing Question 3 of 3:

Evaluation Result:
+---------------------------------------------------+---------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
| Query | Response | Source | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+===================================================+===================================================+==============================+======================+======================+===================+==================+==================+
| What is the primary purpose of the document based | The primary purpose of the document is to provide | v1.0.0-beta-38 (20222-11-02) | Pass | YES | 1 | Pass | YES |
| on the context provided? | a changelog detailing the features and | | | | | | |
| | improvements made in version 1.0.0-beta-38 of a | | | | | | |
| | software project. It highlights specific | | | | | | |
| | enhancements such as improved validation for | | | | | | |
| | monitor schedules and an enhanced push command | | | | | | |
| | experience. | | | | | | |
+---------------------------------------------------+---------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
```

evaluation.py (197 additions)

import logging
import sys
import os
import pandas as pd
from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Response
from llama_index.core.evaluation import (
    DatasetGenerator,
    RelevancyEvaluator,
    FaithfulnessEvaluator,
    EvaluationResult,
)
from llama_index.llms.openai import OpenAI
from tabulate import tabulate
import textwrap
import argparse
import traceback
from httpx import ReadTimeout

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
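# Note: basicConfig() above already attaches a stdout handler to the root
# logger, so the extra addHandler() call makes every log record print twice;
# dropping one of the two lines would silence the duplicates.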

parser = argparse.ArgumentParser(
    description="Process documents and questions for evaluation."
)
parser.add_argument(
    "--num_documents",
    type=int,
    default=None,
    help="Number of documents to process (default: all)",
)
parser.add_argument(
    "--skip_documents",
    type=int,
    default=0,
    help="Number of documents to skip at the beginning (default: 0)",
)
parser.add_argument(
    "--num_questions",
    type=int,
    default=None,
    help="Number of questions to process (default: all)",
)
parser.add_argument(
    "--skip_questions",
    type=int,
    default=0,
    help="Number of questions to skip at the beginning (default: 0)",
)
parser.add_argument(
    "--process_last_questions",
    action="store_true",
    help="Process last N questions instead of first N",
)
args = parser.parse_args()
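
# Example invocations (flag values are illustrative):
#   python evaluation.py --num_documents 5 --num_questions 3
#   python evaluation.py --num_questions 5 --process_last_questions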

load_dotenv(".env")

# Load the source documents that questions will be generated from.
reader = SimpleDirectoryReader("/tmp/elastic/production-readiness-review")
documents = reader.load_data()
print(f"First document: {documents[0].text}")
print(f"Second document: {documents[1].text}")
print(f"Third document: {documents[2].text}")

if args.skip_documents > 0:
    documents = documents[args.skip_documents :]

if args.num_documents is not None:
    documents = documents[: args.num_documents]

print(f"Number of documents loaded: {len(documents)}")

llm = OpenAI(model="gpt-4o", request_timeout=120)

data_generator = DatasetGenerator.from_documents(documents, llm=llm)

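# DatasetGenerator prompts the LLM to draft evaluation questions from each
# document node. Depending on the llama-index version, the result may come
# back as a list or as one newline-separated string, hence the isinstance
# check below.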
try:
    eval_questions = data_generator.generate_questions_from_nodes()
    if isinstance(eval_questions, str):
        eval_questions_list = eval_questions.strip().split("\n")
    else:
        eval_questions_list = eval_questions
    eval_questions_list = [q for q in eval_questions_list if q.strip()]

    # Keep the full question set before any skip/limit slicing so it can
    # still be printed in its entirety below.
    all_questions_list = list(eval_questions_list)

    if args.skip_questions > 0:
        eval_questions_list = eval_questions_list[args.skip_questions :]

    if args.num_questions is not None:
        if args.process_last_questions:
            eval_questions_list = eval_questions_list[-args.num_questions :]
        else:
            eval_questions_list = eval_questions_list[: args.num_questions]

    print("\nAll available questions generated:")
    for idx, q in enumerate(all_questions_list):
        print(f"{idx}. {q}")

    print("\nGenerated questions:")
    for idx, q in enumerate(eval_questions_list, start=1):
        print(f"{idx}. {q}")
except ReadTimeout as e:
    print(
        "Request to OpenAI timed out during question generation. Please check connectivity or increase the timeout duration."
    )
    traceback.print_exc()
    sys.exit(1)
except Exception as e:
    print(f"An error occurred while generating questions: {e}")
    traceback.print_exc()
    sys.exit(1)

print(f"\nTotal number of questions generated: {len(eval_questions_list)}")

evaluator_relevancy = RelevancyEvaluator(llm=llm)
evaluator_faith = FaithfulnessEvaluator(llm=llm)

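# The same gpt-4o instance doubles as the judge for both evaluators:
# relevancy (does the response answer the query given the retrieved context?)
# and faithfulness (is the response grounded in that context rather than
# hallucinated?). The vector index below is built in memory with the default
# embedding model (OpenAI's, unless configured otherwise).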
vector_index = VectorStoreIndex.from_documents(documents)


def display_eval_df(
    query: str,
    response: Response,
    eval_result_relevancy: EvaluationResult,
    eval_result_faith: EvaluationResult,
) -> None:
    """Render one query/response pair and its evaluation results as a grid table."""
    relevancy_feedback = getattr(eval_result_relevancy, "feedback", "")
    relevancy_passing = getattr(eval_result_relevancy, "passing", False)
    relevancy_passing_str = "Pass" if relevancy_passing else "Fail"

    relevancy_score = 1.0 if relevancy_passing else 0.0

    faithfulness_feedback = getattr(eval_result_faith, "feedback", "")
    faithfulness_passing_bool = getattr(eval_result_faith, "passing", False)
    faithfulness_passing = "Pass" if faithfulness_passing_bool else "Fail"

    def wrap_text(text, width=50):
        # Normalize to str, drop carriage returns, and hard-wrap each line so
        # long cells stay readable inside the tabulate grid.
        if text is None:
            return ""
        text = str(text)
        text = text.replace("\r", "")
        lines = text.split("\n")
        wrapped_lines = []
        for line in lines:
            wrapped_lines.extend(textwrap.wrap(line, width=width))
            wrapped_lines.append("")
        return "\n".join(wrapped_lines)

    if response.source_nodes:
        source_content = wrap_text(response.source_nodes[0].node.get_content())
    else:
        source_content = ""

    eval_data = {
        "Query": wrap_text(query),
        "Response": wrap_text(str(response)),
        "Source": source_content,
        "Relevancy Response": relevancy_passing_str,
        "Relevancy Feedback": wrap_text(relevancy_feedback),
        "Relevancy Score": wrap_text(str(relevancy_score)),
        "Faith Response": faithfulness_passing,
        "Faith Feedback": wrap_text(faithfulness_feedback),
    }

    eval_df = pd.DataFrame([eval_data])

    print("\nEvaluation Result:")
    print(
        tabulate(
            eval_df, headers="keys", tablefmt="grid", showindex=False, stralign="left"
        )
    )


query_engine = vector_index.as_query_engine(llm=llm)

total_questions = len(eval_questions_list)
for idx, question in enumerate(eval_questions_list, start=1):
    try:
        response_vector = query_engine.query(question)
        eval_result_relevancy = evaluator_relevancy.evaluate_response(
            query=question, response=response_vector
        )
        eval_result_faith = evaluator_faith.evaluate_response(response=response_vector)

        print(f"\nProcessing Question {idx} of {total_questions}:")
        display_eval_df(
            question, response_vector, eval_result_relevancy, eval_result_faith
        )
    except ReadTimeout as e:
        print(f"Request to OpenAI timed out while processing question {idx}.")
        traceback.print_exc()
        continue
    except Exception as e:
        print(f"An error occurred while processing question {idx}: {e}")
        traceback.print_exc()
        continue