Skip to content

StephenHuo-code/AI_agents

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI Agents

中文

A research-oriented AI Agent framework focused on practical applications and exploring scalable architecture implementation.

Features

  • 🤖 Modular Agent Architecture
  • 🛠️ Extensible Tool Integration
  • 🔄 Flexible Workflow Management
  • 📊 Performance Monitoring
  • 🧪 Convenient Experiment Support

Requirements

  • Python 3.10+
  • pip or other package managers

Installation

For Users

pip install -r requirements.txt

Agents Description

1. Simple Conversational Agent

A dialogue system based on LangChain and OpenAI, with the following main components:

  1. LLM

    • ChatOpenAI
      • OpenAI GPT model interface
      • Handles natural language generation
      • Supports model parameter configuration
  2. Agent

    • RunnableWithMessageHistory
      • Manages conversation flow
      • Maintains session state
      • Handles multi-turn dialogues
    • chain
      • Builds conversation processing chain
      • Connects various components
  3. Prompt

    • ChatPromptTemplate
      • Manages overall prompt structure
      • Combines multiple prompt components
    • MessagesPlaceholder
      • Handles history message insertion
    • SystemMessagePromptTemplate
      • Defines system role and behavior
    • HumanMessagePromptTemplate
      • Formats user input
  4. Tools

    • ChatMessageHistory
      • Stores conversation history
      • Supports message tracking
    • gr.ChatInterface
      • Provides Web interaction interface
      • Displays conversation content

Usage

python agents/1_simple_conversational_agent.py

Note: Requires OpenAI API key configuration

2. Reason Act Agent

Features

  1. Intelligent agent implemented based on ReAct (Reasoning and Acting) paradigm
  2. Possesses dual capabilities of reasoning and acting
  3. Integrates search tools for real-time information
  4. Supports multi-turn dialogue and continuous reasoning

Core Components

  1. Large Language Model and Interface

    • OpenAI API interface encapsulation
    • ChatOpenAI model integrated with LangChain
  2. Prompt System

    • Uses standard ReAct prompt templates from LangChain Hub
    • Supports structured reasoning and action instructions
  3. Agent Framework

    • ReAct Agent implementation
    • AgentExecutor
    • Supports thought-action-observation loop
  4. Tool Integration

    • SerpAPIWrapper search tool
    • Extensible tool registration mechanism
    • Supports dynamic tool invocation
python agents/2_reason_act.py

3. Function Calling Agent

Features

  1. Text Summarization: Uses GPT model for intelligent text summarization
  2. Chinese Translation: Automatically translates English text to Chinese
  3. Tool Chain Combination: Implements multi-functionality integration through function calls
  4. Automated Processing: Agent can automatically decide on appropriate tools for task completion

Core Components

  1. LLM

    • ChatOpenAI
      • OpenAI GPT model interface
      • Supports function calling capabilities
  2. Prompt

    • PromptTemplate
      • Defines prompt templates for agent behavior
      • Used for summarization, translation, and agent instructions
  3. Agent

    • create_tool_calling_agent
      • Creates function-calling capable agent
      • Built based on tools and prompt templates
    • AgentExecutor
      • Responsible for executing agent tasks
      • Coordinates tool calling process
  4. Tools

    • StructuredTool
      • Wraps functions as structured tools
    • BaseModel (Pydantic)
      • Defines tool input parameter schema
    • Field
      • Adds description information for tool parameters
  5. UI

    • gradio.ChatInterface
      • Provides interactive chat interface
      • Supports conversation history
python agents/3_function_calling_agent.py

4. Reasoning with O1 Agent

Features

  1. Multimodal Analysis: Support for joint analysis of images and text
  2. Org Structure Parsing: Specialized in parsing and understanding organizational charts
  3. JSON Structured Output: Convert analysis results into standardized JSON format
  4. Interactive Interface: Support for image upload and multi-turn dialogue

Core Components

  1. LLM

    • Model Configuration
      • GPT-4O-Mini: Text processing model
      • O1: Vision-language model
    • o1_vision
      • Supports multimodal input of text and images
      • Optional JSON format output
      • Base64-based image encoding
      • Supports custom prompts
  2. Prompt

    • structured_prompt
      • Defines structured instruction templates
      • Guides model for org structure analysis
      • Standardizes JSON output format
  3. Agent

    • process_message
      • Handles multimodal input
      • Coordinates image and text analysis
      • Generates structured responses
  4. UI

    • gr.ChatInterface
      • Supports multimodal input interface
      • Allows multiple file uploads
      • Displays interactive dialogue

Usage

python agents/4_reasoning_with_o1.py

Note: Requires OpenAI API key and O1 model access permissions

Contributing

We welcome all forms of contributions! Please check the contribution guidelines for more information.

License

MIT License - See LICENSE file for details

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages