AI Dev Team #819

Closed
wants to merge 83 commits into from

ElishaKay
Collaborator

@ElishaKay ElishaKay commented Sep 2, 2024

Setup:

Step 1: Generate a GitHub Personal Access Token

The GITHUB_TOKEN value is a GitHub Personal Access Token. To generate one:
Click on your profile picture in the top-right corner of GitHub and select "Settings".
In the left sidebar, click on "Developer settings".
Click on "Personal access tokens", then "Tokens (classic)".
Click "Generate new token" and select the appropriate scopes for your needs.
Copy the generated token.
Paste the token you generated into the value field of the new secret.
Click "Add secret" to save it.

Step 2: Set these environment variables:

OPENAI_API_KEY={Your OpenAI API Key here}
TAVILY_API_KEY={Your Tavily API Key here}

PGVECTOR_CONNECTION_STRING=postgresql://username:password...
GITHUB_TOKEN=
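
A quick way to catch a missing variable before running Step 3 is a small check like the one below. This is only a sketch: the variable names come from the list above, while the `missing_env_vars` helper is hypothetical and not part of this PR.

```python
import os

# Names taken from Step 2; the helper below is a hypothetical illustration.
REQUIRED_ENV_VARS = (
    "OPENAI_API_KEY",
    "TAVILY_API_KEY",
    "PGVECTOR_CONNECTION_STRING",
    "GITHUB_TOKEN",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not (env.get(name) or "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; an empty list means all four values are set.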

Step 3: Install the dependencies and run the dev team:

pip install -r multi_agents/requirements.txt
python -m multi_agents.dev_team.main

Status: Running to completion

@assafelovic
Owner

This is sick, @ElishaKay. I have to admit I'm hooked on this!

@assafelovic
Owner

Just please try to reuse existing code wherever possible; I see a lot of duplicates!

@ElishaKay
Collaborator Author

ElishaKay commented Sep 10, 2024

Status report:
The bot supports 1-on-1 chats.
Within Discord, the user can trigger a modal by typing "/ask" in the chat and pressing Enter.
Here is what the form looks like:

[Screenshot: Screen Shot 2024-09-10 at 10 12 08]

@ElishaKay changed the title from "AI Dev Team" to "🛑: AI Dev Team" on Sep 13, 2024
ElishaKay and others added 14 commits September 22, 2024 03:26
…js server should log errors without restarting
…nnels - also added a cool down logic so that it only advises about the /ask command every 30 minutes per channel
…me & embedding everything with metadata in gptr-compatible format
…ning to completion with relevant files fetched from gptr's __get_similar_content_by_query_with_vectorstore method
…tructure as long as PGVECTOR_CONNECTION_STRING & GITHUB_TOKEN env vars are set
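
The cool-down described in the commit list above (advising about the /ask command at most once every 30 minutes per channel) could look roughly like this. It is a Python sketch for illustration only; the `ChannelCooldown` class and its method names are hypothetical and are not taken from the bot's code.

```python
import time


class ChannelCooldown:
    """Allow an action at most once per `cooldown_seconds` for each channel."""

    def __init__(self, cooldown_seconds=30 * 60, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock      # injectable so the logic can be tested
        self._last_sent = {}     # channel_id -> time the hint was last sent

    def should_send(self, channel_id):
        """Return True and start a new cool-down window if the channel is ready."""
        now = self._clock()
        last = self._last_sent.get(channel_id)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_sent[channel_id] = now
        return True
```

Keeping the timestamps in a per-channel dictionary means a busy channel never suppresses the hint in a quiet one.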
@ElishaKay
Collaborator Author

ElishaKay commented Sep 22, 2024

Status Report:

Flow:

Step 1: The GithubAgent handles the first step: fetching the repo data from GitHub with an API tool & logging the directory structure.

He's also in charge of saving the GitHub repo in a LangChain VectorStore.

Step 2: The RepoAnalyzerAgent leverages the vectorstore with the repo by running GPTResearcher like so:

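   # Run GPTResearcher against the repo's vector store
   # (query and vector_store are prepared by the earlier steps)
   from gpt_researcher import GPTResearcher

   researcher = GPTResearcher(
       query=query,
       report_type="research_report",
       report_source="langchain_vectorstore",
       vector_store=vector_store,
   )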

Step 3: The WebSearchAgent takes the output of the RepoAnalyzerAgent & complements any insights from his analysis with info from the web. He runs GPTR like so:

   # Run GPTResearcher
   researcher = GPTResearcher(
       query=query,
       report_type="research_report",
       report_source="web"
   )

Step 4: The RubberDuckerAgent talks out loud about what the game plan is for answering the user. He's forced to talk through his reasoning based on the outputs of the RepoAnalyzerAgent & WebSearchAgent.
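
To make the hand-offs between the four steps concrete, here is a minimal sketch of the flow as a sequential async pipeline. Every function body is a stub and all state keys are assumptions; the actual agents in this PR call the GitHub API and GPTResearcher, and may be orchestrated differently.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

State = Dict[str, object]
Step = Callable[[State], Awaitable[State]]


async def github_agent(state: State) -> State:
    # Stub: the real agent fetches the repo and builds the vector store.
    return {**state, "vector_store": "<vector store placeholder>"}


async def repo_analyzer_agent(state: State) -> State:
    # Stub: the real agent runs GPTResearcher against the vector store.
    return {**state, "repo_analysis": f"repo analysis for: {state['query']}"}


async def web_search_agent(state: State) -> State:
    # Stub: the real agent runs GPTResearcher with report_source="web".
    return {**state, "web_research": f"web findings for: {state['query']}"}


async def rubber_ducker_agent(state: State) -> State:
    # Stub: the real agent reasons out loud over both earlier outputs.
    plan = f"Plan based on [{state['repo_analysis']}] and [{state['web_research']}]"
    return {**state, "game_plan": plan}


PIPELINE: List[Step] = [
    github_agent,
    repo_analyzer_agent,
    web_search_agent,
    rubber_ducker_agent,
]


async def run_pipeline(query: str) -> State:
    """Run each step in order, threading the shared state through."""
    state: State = {"query": query}
    for step in PIPELINE:
        state = await step(state)
    return state
```

Running `asyncio.run(run_pipeline("some question"))` returns the accumulated state, so each later agent can read what the earlier ones produced.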

The GithubAgent now has the ability to parse & embed the entire GitHub repo contents in optimized LangChain Documents format. He passes those LangChain Documents to the VectorAgent, who saves them in the LangChain VectorStore.
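
As a rough illustration of the parsing step, the sketch below turns (path, text) pairs into chunked documents with source metadata. The `Document` dataclass is a stand-in that mirrors the `page_content` and `metadata` fields of LangChain's Document; the chunk size and the metadata keys are assumptions, not the format this PR actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Document:
    """Stand-in mirroring the two fields of LangChain's Document."""
    page_content: str
    metadata: Dict[str, object] = field(default_factory=dict)


def files_to_documents(
    files: Iterable[Tuple[str, str]],
    repo: str,
    branch: str,
    chunk_size: int = 1000,
) -> List[Document]:
    """Split each (path, text) pair into fixed-size chunks with source metadata."""
    docs: List[Document] = []
    for path, text in files:
        for index, start in enumerate(range(0, len(text), chunk_size)):
            docs.append(
                Document(
                    page_content=text[start:start + chunk_size],
                    # Hypothetical metadata layout, chosen for illustration.
                    metadata={"source": path, "repo": repo, "branch": branch, "chunk": index},
                )
            )
    return docs
```

Storing the file path in the metadata is what would later let a vector search report which files its results came from.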

That VectorStore is then passed into the GPTR report flow by the RepoAnalyzerAgent, like so:

researcher = GPTResearcher(
    query=query,
    report_type="research_report",
    report_source="langchain_vectorstore",
    vector_store=vector_store,
)

The resulting report is solid, but:

  • properly embedding the full github repo takes a minute or so

Areas for improvement:

  • leverage the same vector_store across reports & follow-up questions (i.e. don't re-embed the entire repo on every run)
  • add the WebSearchAgent into the mix
  • save the file names from the results of the vector search within LangGraph State (to be leveraged by FileSearchAgent and/or RubberDuckerAgent)
  • give the GithubAgent a method to calculate the delta between commits & only re-embed the files that changed since the latest commits on a given branch
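
One possible shape for the last item is sketched below. Note that it swaps in a simpler technique: instead of computing a commit-to-commit diff, it compares content hashes from the previous run with the current files. The function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(files: Dict[str, str]) -> Dict[str, str]:
    """Map each file path to a SHA-256 digest of its contents."""
    return {
        path: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for path, text in files.items()
    }


def plan_reembedding(
    previous: Dict[str, str], current: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (paths to embed again, paths to delete from the store)."""
    # New or modified files: digest is missing or differs from the last run.
    to_embed = sorted(p for p, digest in current.items() if previous.get(p) != digest)
    # Files that no longer exist should have their stale vectors removed.
    to_delete = sorted(p for p in previous if p not in current)
    return to_embed, to_delete
```

Persisting the fingerprint map next to the vector store would let a follow-up run skip unchanged files entirely.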

Also:

@ElishaKay
Collaborator Author

ElishaKay commented Sep 22, 2024

Current flow

[Image: GPTR_ AI Dev Team Flow]

@assafelovic
Copy link
Owner

@ElishaKay what's up brother, what's the status on this? Please also note that we've massively refactored the codebase, so there are some conflicts here.

…yncronous langchain vector store is passed into GPTR for async vector search - generated report is looking good
@ElishaKay
Collaborator Author

ElishaKay commented Oct 6, 2024

@assafelovic

Status Update:

LangChain PGVector support is looking good for the saving & retrieving parts of the flow.

That is an important step toward handling long-form documents without re-embedding them on every run, and toward seamless support for LangGraph Cloud.

The next steps are the items listed under "Areas for improvement" above. I'll probably end up resolving the merge conflicts after those are implemented.

@ElishaKay changed the title from "🛑: AI Dev Team" to "AI Dev Team" on Oct 13, 2024
@ElishaKay
Collaborator Author

This PR got too big - moving the gist of it here: #1075

We can rethink whether we want a separate repo for "GPTR Plugins", which could include the ability to ingest custom data sources (like GitHub repos and others) for seamless use with GPTR.

@ElishaKay closed this on Jan 11, 2025