feat: Provide traceback and function source code to LLM advice model #48

Open
wants to merge 36 commits into
base: main

Conversation

PCain02
Collaborator

@PCain02 PCain02 commented Nov 10, 2024

Pull Request

1. feat: Provide traceback and function source code to LLM advice model

2. List the names of those who contributed to the project.
@PCain02

3. Link the issue the pull request is meant to fix/resolve.

4. Add all labels that apply. (e.g., documentation, ready-for-review)

  • enhancement
  • test
  • bug

5. Describe the contents and goal of the pull request.
The goal of this PR is to provide more information to the LLM advice model. This includes the traceback and the actual source code of the function(s) that failed the test.
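
To illustrate the idea described above, here is a minimal sketch of how a traceback and the source code of a failing function could be gathered with the standard library. It is not the code in this PR: `average` and `collect_failure_context` are hypothetical names, and the fallback string used when the source is unavailable is an assumption.

```python
import inspect
import traceback


def average(values):
    """Hypothetical function under test; fails on an empty list."""
    return sum(values) / len(values)


def collect_failure_context(func, *args):
    """Run func; if it raises, return its traceback text and source code."""
    try:
        func(*args)
    except Exception as error:
        # Format the full traceback of the raised exception as one string.
        trace_text = "".join(
            traceback.format_exception(type(error), error, error.__traceback__)
        )
        try:
            # Retrieve the source code of the function that failed.
            source_text = inspect.getsource(func)
        except (OSError, TypeError):
            # Source is not always available (e.g., built-ins or exec'd code).
            source_text = "<source unavailable>"
        return {"traceback": trace_text, "source": source_text}
    # No exception was raised, so there is nothing to report.
    return None


context = collect_failure_context(average, [])
```

Both strings in `context` can then be placed in the prompt so that the model sees where the failure happened and what the failing function looks like.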

6. Will coverage be maintained/increased?
Coverage will decrease slightly, but test cases were added to compensate for that, so it still remains at about 61%.

7. What operating systems has this been tested on? How were these tests conducted?
Windows 10, so please test on macOS and Linux. The tests were run with `poetry run task test` and also by prompting the advice model and checking whether the responses are more valuable than before. Now that the traceback and the failing function's source code are given to the LLM, the responses are much more relevant and should no longer try to correct the test cases.

8. Include a code block and/or screenshots displaying the functionality of your feature, if applicable/possible.

I know this is a lot to process, so these figures should help with code comprehension.
image
image
image

This is an example output:
image

@PCain02 added the enhancement (New feature or request) and test (Test cases or test running) labels Nov 10, 2024
@PCain02 added the bug (Something isn't working) label Nov 10, 2024
@gkapfham
Collaborator

Hi @PCain02 it looks like this branch now has conflicts that need to be resolved. Can you please investigate this issue when you have time?

@gkapfham
Collaborator

Also, @PCain02 can you give an example of a command line that can be run and a repository on which it can be run so that we can all quickly test this feature?

@PCain02
Collaborator Author

PCain02 commented Nov 13, 2024

> Hi @PCain02 it looks like this branch now has conflicts that need to be resolved. Can you please investigate this issue when you have time?

Oh, thanks for pointing that out. I can resolve those ASAP!

@PCain02
Collaborator Author

PCain02 commented Nov 13, 2024

> Also, @PCain02 can you give an example of a command-line that can be run and a repository on which it can be run so that we can all quickly test this feature?

Absolutely, the command I have been using is `poetry run execexam . ./tests/test_question_one.py --report trace --report status --report failure --report code --report setup --advice-method apiserver --advice-model anthropic/claude-3-haiku-20240307 --advice-server https://execexamadviser.fly.dev/ --report advice --fancy --debug`, but the model and server can be swapped out for whatever you would like to test with. Because of the way the advice works, the prompt remains the same no matter the model.
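
As a rough illustration of the point that the prompt does not depend on the model, here is a small sketch. It is only an assumption about the general shape, not the actual code or request format used by execexam: `build_advice_prompt`, `build_request`, the prompt wording, the dictionary layout, and the `another/model` name are all hypothetical.

```python
def build_advice_prompt(traceback_text, source_text):
    """Assemble one prompt string; the model name is deliberately not an input."""
    return (
        "A test failed. Suggest fixes to the function under test, "
        "not to the test cases.\n\n"
        f"Traceback:\n{traceback_text}\n\n"
        f"Function source code:\n{source_text}\n"
    )


def build_request(model, traceback_text, source_text):
    """Pair the shared prompt with whichever model name was requested."""
    prompt = build_advice_prompt(traceback_text, source_text)
    # Hypothetical request layout, used only for this illustration.
    return {"model": model, "prompt": prompt}


# Placeholder failure details for the example.
example_trace = "ZeroDivisionError: division by zero"
example_source = "def average(values):\n    return sum(values) / len(values)"

first = build_request(
    "anthropic/claude-3-haiku-20240307", example_trace, example_source
)
second = build_request("another/model", example_trace, example_source)

# Only the model name differs between the two requests.
same_prompt = first["prompt"] == second["prompt"]
```

Keeping the prompt builder free of any model parameter means that swapping the model or server changes only where the request is sent, not what is asked.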

@PCain02
Collaborator Author

PCain02 commented Nov 13, 2024

Oh, also, the toml version will need to be changed before merging. I am not sure what people decided in class about merge order.

@PCain02
Collaborator Author

PCain02 commented Nov 13, 2024

PR #41 for the Windows spacing bug should get merged before this PR because I made this with that fix in mind.

Labels: bug (Something isn't working), enhancement (New feature or request), test (Test cases or test running)
2 participants