This is a simple code evaluator that uses an LLM to evaluate the correctness of a code snippet.
Demo: `demo.mp4`
- Clone this repository
- Run `pip install -r requirements.txt`
- Add a `.env` file with the following variable:
  - `OPENAI_API_KEY`: Your OpenAI API key
- Run `python app.py`
- The server will run on `localhost:8000`
- Clone the client repository and follow the instructions to run the client.
- The evaluator uses an LLM to evaluate the correctness of a code snippet via the server's `evaluate` endpoint.
- The `messages` field carries the conversation history between the user and the LLM.
- The `user_code_snippet` field carries the code snippet that the user has written.
- The `thread_id` field identifies the conversation thread.
- The evaluator returns a `ModelResponse` object.
- Endpoint: `POST /init/`
- Description: Initializes a new conversation and returns a thread ID
- Request: No payload required
- Response:

```json
{ "messages": Array<Message>, "final_output": { "response": string }, "thread_id": string }
```

Messages in the array can be of three types:

- System Message: `{"type": "system", "content": string}`
- Human Message: `{"type": "human", "content": string}`
- AI Message: `{"type": "ai", "content": string}`
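To start a conversation from a client, `/init/` can be called with an empty POST. Below is a minimal sketch using only the Python standard library, assuming the server from the setup steps is listening on `localhost:8000`; the helper names are illustrative, not part of the project:

```python
import json
import urllib.request

def post_json(url, payload=None):
    """POST a JSON body (empty object by default) and decode the JSON reply."""
    data = json.dumps(payload or {}).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

def extract_thread_id(init_response):
    """Pull the thread ID out of an /init/ response for use in later requests."""
    return init_response["thread_id"]

if __name__ == "__main__":
    # Assumes `python app.py` is already running locally.
    init = post_json("http://localhost:8000/init/")
    print(extract_thread_id(init))
    print(init["final_output"]["response"])
```

The returned `thread_id` is what the later `/evaluate/` and `/feed-question/` calls expect.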
- Endpoint: `POST /evaluate/`
- Description: Evaluates a code snippet and provides feedback
- Request Body:

```json
{ "messages": Array<Message>, "user_code_snippet": string, "thread_id": string }
```

- Response:

```json
{ "messages": Array<Message>, "final_output": { "status": string, "score": number, "comment": string, "hint": string }, "thread_id": string }
```
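Given a thread ID from `/init/`, a client can submit code for feedback. A sketch of building and sending the `/evaluate/` request body with the standard library (the helper names and sample snippet are illustrative assumptions, not part of the project):

```python
import json
import urllib.request

def build_evaluate_payload(messages, user_code_snippet, thread_id):
    """Assemble the request body documented for POST /evaluate/."""
    return {
        "messages": messages,
        "user_code_snippet": user_code_snippet,
        "thread_id": thread_id,
    }

def evaluate(base_url, payload):
    """POST the payload to /evaluate/ and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/evaluate/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    payload = build_evaluate_payload(
        messages=[{"type": "human", "content": "Write an add function."}],
        user_code_snippet="def add(a, b):\n    return a + b",
        thread_id="<thread id from /init/>",
    )
    out = evaluate("http://localhost:8000", payload)["final_output"]
    print(out["status"], out["score"], out["comment"], out["hint"])
```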
- Endpoint: `POST /feed-question/`
- Description: Feeds a new question into the conversation
- Request Body:

```json
{ "question": string, "messages": Array<Message>, "thread_id": string }
```

- Response:

```json
{ "messages": Array<Message>, "final_output": { "response": string }, "thread_id": string }
```
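A new question can be pushed into an existing thread the same way. A minimal sketch of the `/feed-question/` call, again assuming a local server and using illustrative helper names:

```python
import json
import urllib.request

def build_feed_question_payload(question, messages, thread_id):
    """Assemble the request body documented for POST /feed-question/."""
    return {"question": question, "messages": messages, "thread_id": thread_id}

def feed_question(base_url, question, messages, thread_id):
    """POST a new question to /feed-question/ and return the decoded JSON reply."""
    body = json.dumps(build_feed_question_payload(question, messages, thread_id))
    req = urllib.request.Request(
        f"{base_url}/feed-question/",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    reply = feed_question(
        "http://localhost:8000",
        question="Reverse a string without slicing.",
        messages=[],  # prior history from /init/ would normally go here
        thread_id="<thread id from /init/>",
    )
    print(reply["final_output"]["response"])
```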