feat: Implement user input handling for codebase and output directories
- Add command-line interface (CLI) to prompt user for codebase and output directory paths
- Validate user input and handle directory creation if necessary
- Update .gitignore to exclude .venv and __pycache__ directories
- Optimize prompt in docs/Prompts.md for concise information extraction
- Remove outdated execution flow details from docs/ExecutionFlow.md
PriNova committed May 2, 2024
1 parent 94508b2 commit 6c0ff61
Showing 4 changed files with 50 additions and 47 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
.venv
**/__pycache__
45 changes: 0 additions & 45 deletions docs/ExecutionFlow.md
@@ -1,48 +1,3 @@
1. User Input:
- Prompt the user to provide the directory path of the codebase they want to analyze.

2. Codebase Traversal:
- Recursively traverse the provided codebase directory and its subdirectories.
- Identify and collect all relevant source code files based on their file extensions.

3. Code Analysis:
- For each source code file:
  - Read the file contents.
  - Extract relevant information such as module names, class names, function names, and their respective descriptions or comments.
  - Identify dependencies, libraries, and frameworks used in the file.
  - Store the extracted information in a structured format (e.g., a dictionary or a custom data structure).

4. LLM Interaction:
- For each module, class, or function:
  - Prepare a prompt for the LLM, including the extracted information and any relevant context.
  - Send the prompt to the LLM via the OpenAI API.
  - Receive the LLM's response, which should provide a concise description or summary of the module, class, or function.
  - Store the LLM-generated descriptions alongside the corresponding code elements.

5. Embedding Generation:
- For each module, class, or function:
  - Generate a vector embedding using the extracted information and the LLM-generated description.
  - Store the vector embedding in ChromaDB along with relevant metadata (e.g., file path, module name, class name, function name).

6. Report Generation:
- High-level Overview:
  - Retrieve the stored information and embeddings from ChromaDB.
  - Generate a high-level overview of the codebase's architecture and structure based on the collected information.
  - Include a summary of the modules, classes, and functions, along with their descriptions.
- Detailed Reports:
  - Generate separate reports for each module, class, or function, providing more detailed information and descriptions.
  - Include information about dependencies, libraries, and frameworks used.
- Save the generated reports in Markdown format within the codebase directory or a specified output directory.

7. Error Handling:
- Implement error handling mechanisms to gracefully handle any exceptions or errors that may occur during the execution of the program.
- Display appropriate error messages to the user and terminate the program if necessary.

8. CLI Interface:
- Implement a command-line interface (CLI) that allows users to specify the codebase directory they want to analyze.
- Provide options for generating different types of reports (e.g., high-level overview, detailed reports) based on user preferences.


**Hierarchical Execution Flow**

1. User Input:
4 changes: 2 additions & 2 deletions docs/Prompts.md
@@ -57,9 +57,9 @@
- "deployment_instructions"
- "additional_resources"
-2. For each key, provide the corresponding information extracted from the documentation.
+2. For each key, provide the corresponding information extracted from the documentation very briefly.
-3. If any information is missing or couldn't be extracted, set the value of the corresponding key to "UNKNOWN".
+3. If any information is missing, couldn't be extracted or is not known, set the value of the corresponding key to "UNKNOWN".
4. Ensure that the JSON object is well-formatted, with proper indentation and syntax.
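
The "UNKNOWN" rule in the prompt can also be enforced on the consumer side, in case the model omits a key or returns an empty value. The following is a minimal sketch, not code from this commit: it assumes the reply arrives as a JSON string and that the caller supplies the expected keys. The function name `normalize_response` is hypothetical, and only `deployment_instructions` and `additional_resources` are taken from the visible part of the prompt.

```python
import json

UNKNOWN = "UNKNOWN"


def normalize_response(raw_reply, expected_keys):
    """Parse a JSON reply and set missing, null, or blank values to "UNKNOWN".

    Hypothetical helper for illustration; not part of the commit.
    """
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        # An unparsable reply is treated as if no information was extracted.
        data = {}
    if not isinstance(data, dict):
        data = {}

    normalized = {}
    for key in expected_keys:
        value = data.get(key)
        # Absent keys, null, and whitespace-only strings all count as unknown.
        if value is None or (isinstance(value, str) and not value.strip()):
            value = UNKNOWN
        normalized[key] = value
    return normalized


if __name__ == "__main__":
    reply = '{"deployment_instructions": "Run docker compose up."}'
    keys = ["deployment_instructions", "additional_resources"]
    print(json.dumps(normalize_response(reply, keys), indent=2))
```

Keys that the model returns but the caller did not ask for are dropped in this sketch; whether that is desirable depends on how strictly the downstream code treats the schema.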
46 changes: 46 additions & 0 deletions main.py
@@ -0,0 +1,46 @@
""" 1. The directory path of the codebase to be analyzed
2. The output directory for the generated reports
To implement this step, we'll break it down into smaller tasks:
1. Create a command-line interface (CLI) for the program
2. Prompt the user to enter the codebase directory path
3. Validate the provided codebase directory path
4. Prompt the user to enter the output directory path
5. Validate the provided output directory path
6. Store the user input for later use in the program """

import argparse
import os

def validate_codebase_dir(codebase_dir):
    # Reject paths that do not exist
    if not os.path.exists(codebase_dir):
        raise argparse.ArgumentTypeError(f"Codebase directory '{codebase_dir}' does not exist.")

    # A '.git' subdirectory marks a local Git repository (not necessarily one hosted on GitHub)
    git_dir = os.path.join(codebase_dir, '.git')
    if not os.path.isdir(git_dir):
        raise argparse.ArgumentTypeError(f"Codebase directory '{codebase_dir}' is not a local Git repository.")

    return codebase_dir

def main():
    # Create a command-line interface (CLI) for the program
    parser = argparse.ArgumentParser(description='CodyArchitect')
    parser.add_argument('codebase_dir', type=validate_codebase_dir, help='Path to the codebase directory')
    parser.add_argument('--output_dir', '-o', help='Path to the output directory for generated reports')

    # Parse the command-line arguments; argparse prints an error and exits on invalid input
    args = parser.parse_args()

    # Store the user input for later use in the program
    codebase_dir = args.codebase_dir
    output_dir = args.output_dir

    # If the user did not provide an output directory path, default to one inside the codebase directory
    if not output_dir:
        output_dir = os.path.join(codebase_dir, '.codyarchitect')

    # Create the output directory if it does not exist yet
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

if __name__ == '__main__':
    main()
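
The argument handling above can be exercised without a terminal session by passing an explicit argument list to `parse_args`. The following is a minimal sketch under stated assumptions, not the commit's code: it factors parser construction into a hypothetical `build_parser()` helper and output directory resolution into a hypothetical `resolve_output_dir()` helper, and its validator mirrors the checks in main.py rather than importing them.

```python
import argparse
import os
import tempfile


def validate_codebase_dir(codebase_dir):
    """Mirror of the main.py checks: the path must exist and contain '.git'."""
    if not os.path.exists(codebase_dir):
        raise argparse.ArgumentTypeError(
            f"Codebase directory '{codebase_dir}' does not exist.")
    if not os.path.isdir(os.path.join(codebase_dir, '.git')):
        raise argparse.ArgumentTypeError(
            f"Codebase directory '{codebase_dir}' is not a local Git repository.")
    return codebase_dir


def build_parser():
    # Hypothetical helper: same arguments as main(), returned for reuse in tests.
    parser = argparse.ArgumentParser(description='CodyArchitect')
    parser.add_argument('codebase_dir', type=validate_codebase_dir,
                        help='Path to the codebase directory')
    parser.add_argument('--output_dir', '-o',
                        help='Path to the output directory for generated reports')
    return parser


def resolve_output_dir(codebase_dir, output_dir=None):
    # Hypothetical helper: default to <codebase_dir>/.codyarchitect, then create it.
    if not output_dir:
        output_dir = os.path.join(codebase_dir, '.codyarchitect')
    os.makedirs(output_dir, exist_ok=True)
    return output_dir


def demo():
    """Run the CLI logic against a throwaway directory that looks like a repository."""
    with tempfile.TemporaryDirectory() as repo:
        os.makedirs(os.path.join(repo, '.git'))
        args = build_parser().parse_args([repo])
        out = resolve_output_dir(args.codebase_dir, args.output_dir)
        return os.path.isdir(out), os.path.basename(out)


if __name__ == '__main__':
    print(demo())
```

Using `os.makedirs(..., exist_ok=True)` avoids a separate existence check and the small race between checking and creating. When validation fails, argparse converts the `ArgumentTypeError` into a usage message and exits, so callers see `SystemExit` rather than the original exception.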
