
As a CASA researcher, I want to run Monte Carlo simulations of social interactions between AI agents, so I can measure the stability and variance of social rules in AI-to-AI interactions. #56

Merged
leonvanbokhorst merged 1 commit into main from leonvanbokhorst/issue55 on Nov 16, 2024

Conversation

@leonvanbokhorst leonvanbokhorst commented Nov 16, 2024

Fixes #55

Summary by Sourcery

New Features:

  • Introduce a Monte Carlo simulation framework for evaluating social interactions between AI agents, focusing on the stability and variance of social rules.


sourcery-ai bot commented Nov 16, 2024

Reviewer's Guide by Sourcery

This PR implements a Monte Carlo simulation framework for studying AI agent interactions in educational settings. The implementation uses async Python to manage conversations between AI agents (tutor, student, and evaluator) through the Ollama API, with comprehensive logging and experiment parameter management.
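
The basic building block the guide describes is a single asynchronous completion request to Ollama. A minimal sketch follows; the `ollama` Python client, the model name, and the local host default are assumptions, not details confirmed by this PR:

```python
# Minimal sketch of one async agent turn through the Ollama API.
# Assumes the official `ollama` Python client and a locally running Ollama server.
import asyncio

from ollama import AsyncClient


async def generate_turn(prompt: str, model: str = "llama3") -> str:
    """Send a single prompt to Ollama and return the model's reply text."""
    client = AsyncClient()  # defaults to http://localhost:11434
    response = await client.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]


if __name__ == "__main__":
    print(asyncio.run(generate_turn("Politely explain what recursion is.")))
```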

Sequence diagram for AI agent interaction in Monte Carlo simulation

sequenceDiagram
    actor Researcher
    participant StudentAI as Student_AI
    participant TutorAI as Tutor_AI
    participant EvaluatorAI as Evaluator_AI
    participant Ollama

    Researcher->>StudentAI: Send initial query
    StudentAI->>Ollama: Generate response
    Ollama-->>StudentAI: Return response
    StudentAI->>TutorAI: Send response
    TutorAI->>Ollama: Generate response
    Ollama-->>TutorAI: Return response
    TutorAI->>StudentAI: Send response
    StudentAI->>Ollama: Generate follow-up
    Ollama-->>StudentAI: Return follow-up
    StudentAI->>TutorAI: Send follow-up
    TutorAI->>Ollama: Generate final response
    Ollama-->>TutorAI: Return final response
    TutorAI->>StudentAI: Send final response

    StudentAI->>EvaluatorAI: Send conversation for evaluation
    EvaluatorAI->>Ollama: Generate evaluation
    Ollama-->>EvaluatorAI: Return evaluation

Class diagram for AI agent and experiment setup

classDiagram
    class ExperimentParams {
        float temperature
        float top_p
        string model_name
        int num_iterations
    }

    class LLMAgent {
        string name
        string role
        string personality
        ExperimentParams params
        List~Dict~ conversation_history
        string agent_id
        +_construct_prompt(message: str) str
        +_format_history() str
        +send_message(message: str) Dict
    }

    class CASAExperiment {
        string experiment_id
        ExperimentParams params
        LLMAgent tutor
        LLMAgent student
        LLMAgent evaluator
        +run_direct_evaluation(topic: str) Dict
        +run_indirect_evaluation(topic: str) Dict
        +run_monte_carlo(topics: List~str~) List~Dict~
        +_format_conversation_for_evaluation(conversation: List~Dict~) str
        +_save_results(results: List~Dict~, topic: str, suffix: str) void
    }

    ExperimentParams --> LLMAgent : uses
    ExperimentParams --> CASAExperiment : uses
    LLMAgent --> CASAExperiment : used by
    CASAExperiment --> LLMAgent : has
    CASAExperiment --> ExperimentParams : has
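
Read as Python, the two data-holding classes in the diagram might look like the sketch below. Field names mirror the diagram; the default values, placeholder model name, and uuid-based IDs are illustrative assumptions:

```python
# Sketch of ExperimentParams and the LLMAgent skeleton from the class diagram.
# Field names follow the diagram; defaults and stubbed method bodies are assumptions.
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ExperimentParams:
    temperature: float = 0.7
    top_p: float = 0.9
    model_name: str = "llama3"   # placeholder model name
    num_iterations: int = 100


class LLMAgent:
    def __init__(self, name: str, role: str, personality: str, params: ExperimentParams):
        self.name = name
        self.role = role
        self.personality = personality
        self.params = params
        self.conversation_history: List[Dict] = []
        self.agent_id = str(uuid.uuid4())  # UUID-based identification

    def _construct_prompt(self, message: str) -> str: ...
    def _format_history(self) -> str: ...
    async def send_message(self, message: str) -> Dict: ...
```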

File-Level Changes

Implementation of the core experiment parameter and agent management system (casa/01_politeness.py)
  • Created ExperimentParams dataclass to manage simulation parameters
  • Implemented LLMAgent class with conversation history tracking
  • Added UUID-based identification for experiments and agents
  • Implemented prompt construction and history formatting methods (sketched below)
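
For illustration, the prompt-construction and history-formatting logic might reduce to the functions below. The PR implements these as LLMAgent methods; the template wording here is an assumption:

```python
# Hypothetical, self-contained versions of the prompt-construction and
# history-formatting logic; the exact template in casa/01_politeness.py may differ.
from typing import Dict, List


def format_history(history: List[Dict]) -> str:
    """Render prior turns as 'role: content' lines."""
    return "\n".join(f"{turn['role']}: {turn['content']}" for turn in history)


def construct_prompt(name: str, role: str, personality: str,
                     history: List[Dict], message: str) -> str:
    """Persona + conversation so far + the incoming message, as one prompt string."""
    return (
        f"You are {name}, a {role}. Personality: {personality}\n\n"
        f"Conversation so far:\n{format_history(history)}\n\n"
        f"Respond to: {message}"
    )
```
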
Development of the conversation simulation system (casa/01_politeness.py)
  • Implemented async message sending using Ollama API (sketched below)
  • Added comprehensive response metadata collection
  • Created direct evaluation flow between tutor and student
  • Implemented indirect evaluation with third-party evaluator assessment
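
A hedged sketch of how the send/record step and the direct tutor-student exchange could look, assuming the `ollama` AsyncClient shown earlier; the metadata fields beyond the reply text are illustrative, not the PR's exact schema:

```python
# Sketch of the async send/record step and a direct tutor-student exchange.
# Assumes the `ollama` AsyncClient; metadata fields are illustrative.
import time
from typing import Dict

from ollama import AsyncClient


async def send_message(agent, message: str) -> Dict:
    """Generate a reply for `agent`, record it, and return the reply plus metadata."""
    prompt = agent._construct_prompt(message)
    start = time.monotonic()
    response = await AsyncClient().chat(
        model=agent.params.model_name,
        messages=[{"role": "user", "content": prompt}],
        options={"temperature": agent.params.temperature, "top_p": agent.params.top_p},
    )
    reply = response["message"]["content"]
    agent.conversation_history.append({"role": agent.role, "content": reply})
    return {
        "agent_id": agent.agent_id,
        "content": reply,
        "latency_s": time.monotonic() - start,   # illustrative metadata
        "model": agent.params.model_name,
    }


async def run_direct_evaluation(tutor, student, topic: str) -> Dict:
    """Two tutor-student rounds; the transcript is handed to the evaluator elsewhere."""
    question = await send_message(student, f"Ask the tutor about {topic}.")
    answer = await send_message(tutor, question["content"])
    follow_up = await send_message(student, answer["content"])
    final = await send_message(tutor, follow_up["content"])
    return {"topic": topic, "turns": [question, answer, follow_up, final]}
```
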
Implementation of Monte Carlo simulation framework (casa/01_politeness.py)
  • Added progress tracking using tqdm (sketched below)
  • Implemented periodic result saving
  • Created JSON-based result storage system
  • Added error handling and recovery for failed iterations
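
The outer Monte Carlo loop could be structured roughly as below; the file naming, save interval, and topic cycling are assumptions, not details confirmed by the diff:

```python
# Sketch of the Monte Carlo loop: tqdm progress, per-iteration error handling,
# and periodic JSON result saving. File naming and save interval are assumptions.
import json
from pathlib import Path
from typing import Dict, List

from tqdm import tqdm


async def run_monte_carlo(experiment, topics: List[str], save_every: int = 10) -> List[Dict]:
    results: List[Dict] = []
    out_dir = Path("results")
    out_dir.mkdir(exist_ok=True)

    for i in tqdm(range(experiment.params.num_iterations), desc="iterations"):
        topic = topics[i % len(topics)]
        try:
            result = await experiment.run_direct_evaluation(topic)
            results.append(result)
        except Exception as exc:  # recover from a failed iteration and keep going
            results.append({"iteration": i, "topic": topic, "error": str(exc)})

        if (i + 1) % save_every == 0:  # periodic saving so partial runs survive crashes
            path = out_dir / f"{experiment.experiment_id}_partial.json"
            path.write_text(json.dumps(results, indent=2))

    (out_dir / f"{experiment.experiment_id}_final.json").write_text(json.dumps(results, indent=2))
    return results
```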

Assessment against linked issues

Objectives from issue #55:
  • Implement parameter control (temperature, top_p, model selection, iterations)
  • Implement experimental design with multiple iterations, conversation structure, and variable recording
  • Implement data collection and storage with specified JSON structure


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@leonvanbokhorst leonvanbokhorst self-assigned this Nov 16, 2024
@leonvanbokhorst leonvanbokhorst added the enhancement New feature or request label Nov 16, 2024
@leonvanbokhorst leonvanbokhorst added this to the Phase 1 milestone Nov 16, 2024

@sourcery-ai sourcery-ai bot left a comment


Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider implementing proper error handling with retries for failed API calls to improve reliability during long-running simulations.
  • The current implementation stores all results in memory before saving. Consider streaming results to disk to prevent memory issues with large numbers of iterations.
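
Both suggestions above can be prototyped in a few lines; a hedged sketch with a simple retry wrapper and JSON Lines streaming, where the backoff values and file layout are illustrative assumptions:

```python
# Sketch of the two review suggestions: retry failed Ollama calls with backoff,
# and stream each result to disk instead of holding everything in memory.
import asyncio
import json
from pathlib import Path
from typing import Awaitable, Callable, Dict


async def with_retries(call: Callable[[], Awaitable[Dict]],
                       attempts: int = 3, base_delay: float = 1.0) -> Dict:
    """Retry an async API call with exponential backoff before giving up."""
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)


def append_result(path: Path, result: Dict) -> None:
    """Append one result as a JSON line, so memory use stays flat across iterations."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(result) + "\n")
```
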
Here's what I looked at during the review
  • 🟡 General issues: 2 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 2 issues found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

casa/01_politeness.py: 5 review comment threads, all resolved
@leonvanbokhorst leonvanbokhorst merged commit d436cb0 into main Nov 16, 2024
1 check failed
@leonvanbokhorst leonvanbokhorst deleted the leonvanbokhorst/issue55 branch November 16, 2024 09:55