A demo voice application that acts as a virtual receptionist for a healthcare clinic, powered by OpenAI's Realtime API.
- Real-time Voice Conversation: Uses OpenAI's Realtime API for natural voice interactions
- Virtual Clinic Receptionist: Handles appointment scheduling, doctor availability, and general clinic inquiries
- Function Calling: Integrates with backend functions for appointment scheduling and doctor lookups
- Audio File Testing: Comprehensive test suite that can process audio files and capture responses
- Professional Healthcare Context: Designed specifically for medical receptionist scenarios with appropriate guardrails
- Python 3.12+
- OpenAI API key with Realtime API access
- Poetry for dependency management
- Clone and set up the project:
  git clone <repository-url>
  cd python-demo-voice-app
- Install dependencies:
  poetry install
- Set up your environment:
  cp env.example .env
  # Edit .env and add your OpenAI API key
- Run the voice app:
  poetry run python -m python_demo_voice_app.cli run
The app uses environment variables for configuration. Copy env.example to .env and configure:
# Required
OPENAI_API_KEY=your_openai_api_key_here
# Optional
OPENAI_ORG_ID=your_org_id_here
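As a rough illustration of how these two variables might be read at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from config.py, and the sketch assumes the variables are already in the process environment (loading the .env file itself would need a separate step or library).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; the real config.py may differ."""

    openai_api_key: str
    openai_org_id: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing fast if the key is missing."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        # Same message the troubleshooting notes list for a missing key.
        raise ValueError("OPENAI_API_KEY environment variable is required")
    # The organization ID is optional; treat an empty value as "not set".
    org_id = env.get("OPENAI_ORG_ID", "").strip() or None
    return Settings(openai_api_key=api_key, openai_org_id=org_id)
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.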
You can view the current configuration with:
poetry run python -m python_demo_voice_app.cli config
Start the virtual receptionist:
poetry run python -m python_demo_voice_app.cli run
The app will:
- Connect to OpenAI's Realtime API
- Configure the session as a virtual clinic receptionist
- Listen for audio input and respond naturally
- Handle function calls for appointments and doctor availability
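The session configuration step above can be sketched as building a `session.update` event. This is an assumption-laden sketch: `build_session_update` is a hypothetical helper, and the field names follow the beta Realtime API event shape as I understand it, which may differ in other API versions and from what voice_client.py actually sends.

```python
import json


def build_session_update(instructions: str, tools: list[dict]) -> str:
    """Serialize a session.update event that sets up the receptionist session."""
    event = {
        "type": "session.update",
        "session": {
            # System prompt describing the receptionist's role and guardrails.
            "instructions": instructions,
            # 16-bit PCM in both directions (assumed field names).
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
            # Function definitions the model is allowed to call.
            "tools": tools,
            "tool_choice": "auto",
        },
    }
    return json.dumps(event)
```

The returned string would be sent as a text frame over the WebSocket once the connection is open.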
Run the complete test suite:
poetry run python -m python_demo_voice_app.cli test
Test with a specific audio file:
poetry run python -m python_demo_voice_app.cli test-audio path/to/audio.wav
The test suite will:
- Send synthetic or real audio to the API
- Capture the AI's audio responses
- Save complete conversations as WAV files
- Generate individual scenario recordings
- Provide detailed conversation summaries
The AI receptionist can handle:
- Check doctor availability
- Schedule appointments with specific doctors
- Collect patient information (name, phone, reason for visit)
- Provide confirmation numbers
- Clinic hours and location information
- Available services and specialties
- Doctor information and specialties
- Insurance and billing questions
- Recognize urgent medical situations
- Direct patients to emergency services when appropriate
- Maintain professional medical boundaries
- Dr. Sarah Smith - Family Medicine
- Dr. Michael Johnson - Internal Medicine
- Dr. Emily Williams - Pediatrics
python_demo_voice_app/
├── __init__.py # Package initialization
├── config.py # Configuration and settings
├── receptionist.py # Virtual receptionist logic
├── voice_client.py # WebSocket client for Realtime API
├── main.py # Main application entry point
└── cli.py # Command-line interface
tests/
├── __init__.py
└── test_voice_app.py # Comprehensive test suite
pyproject.toml # Poetry configuration
env.example # Environment variable template
README.md # This file
Handles the conversation logic and function calls:
- System instructions for medical receptionist behavior
- Function definitions for appointments and doctor lookup
- Mock data for doctors and services
Manages the WebSocket connection to OpenAI:
- Session configuration
- Audio streaming (send/receive)
- Message handling and function call processing
Comprehensive testing framework:
- Audio file processing and format conversion
- Conversation scenario testing
- Response capture and analysis
The app handles audio as 16-bit PCM at 24 kHz (the format the OpenAI Realtime API expects for pcm16 audio):
- Input: Accepts various audio formats via file upload, converts to required format using pydub
- Output: Receives and captures PCM audio chunks from the AI via WebSocket
- Testing: Can generate synthetic audio or process real audio files
- Recording: Saves complete conversations as WAV files for analysis and demonstration
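To make the input path concrete, the sketch below generates a synthetic tone as 16-bit mono PCM at 24 kHz and splits it into base64-encoded `input_audio_buffer.append` events. The helper names and the 100 ms chunk size are illustrative assumptions, and the event shape reflects my understanding of the Realtime API rather than the project's own code. It uses only the standard library, so it does not cover the pydub conversion of other formats.

```python
import base64
import json
import math
import struct

SAMPLE_RATE = 24_000  # samples per second, mono
BYTES_PER_SAMPLE = 2  # 16-bit PCM


def synth_tone(freq_hz: float = 440.0, seconds: float = 0.5, amplitude: float = 0.3) -> bytes:
    """Return a sine tone as little-endian 16-bit mono PCM."""
    n = int(SAMPLE_RATE * seconds)
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)


def audio_append_events(pcm: bytes, chunk_ms: int = 100) -> list[str]:
    """Split PCM bytes into JSON events carrying base64-encoded audio chunks."""
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    events = []
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        events.append(json.dumps({
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        }))
    return events
```

Each event would be sent over the WebSocket in order; smaller chunks lower latency at the cost of more messages.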
The app creates comprehensive audio recordings in the test_audio/ directory:
- Full conversations - Complete back-and-forth interactions
- Individual scenarios - Separate files for each test case
- Response analysis - Captured AI responses for evaluation
The receptionist can call these functions:
Parameters for scheduling an appointment:
- patient_name: Patient's full name
- doctor: Requested doctor
- time_slot: Requested appointment time
- reason: Reason for the visit
- phone: Patient's contact number
Parameters for looking up doctor availability:
- specialty: Optional medical specialty filter
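The parameter lists above could be expressed as tool definitions along these lines. The function names `schedule_appointment` and `check_doctor_availability` are hypothetical (the names used in receptionist.py are not shown here), and the flat `type`/`name`/`parameters` layout is my understanding of the Realtime API's tool format.

```python
# Hypothetical tool definitions; names and descriptions are illustrative only.
TOOLS = [
    {
        "type": "function",
        "name": "schedule_appointment",  # assumed name
        "description": "Schedule an appointment with a specific doctor.",
        "parameters": {
            "type": "object",
            "properties": {
                "patient_name": {"type": "string", "description": "Patient's full name"},
                "doctor": {"type": "string", "description": "Requested doctor"},
                "time_slot": {"type": "string", "description": "Requested appointment time"},
                "reason": {"type": "string", "description": "Reason for the visit"},
                "phone": {"type": "string", "description": "Patient's contact number"},
            },
            "required": ["patient_name", "doctor", "time_slot", "reason", "phone"],
        },
    },
    {
        "type": "function",
        "name": "check_doctor_availability",  # assumed name
        "description": "List available doctors, optionally filtered by specialty.",
        "parameters": {
            "type": "object",
            "properties": {
                "specialty": {"type": "string", "description": "Optional medical specialty filter"},
            },
            "required": [],
        },
    },
]
```

Whether every scheduling field is truly required is also an assumption; a real deployment might let the model collect some fields later in the conversation.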
- New Functions: Add to receptionist.py in the get_conversation_config() method
- Function Handlers: Implement in the receptionist.py handle_function_call() method
- Tests: Add scenarios to test_voice_app.py
The app uses Python's logging module:
- INFO: General application flow
- DEBUG: Detailed WebSocket and API interactions
- ERROR: Error conditions and failures
Logs are written to both the console and voice_app.log.
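A logging setup that writes to both destinations might look like the sketch below. The `setup_logging` function, the logger name, and the message format are assumptions; only the voice_app.log file name and the INFO/DEBUG levels come from this README.

```python
import logging


def setup_logging(log_file: str = "voice_app.log", verbose: bool = False) -> logging.Logger:
    """Configure a logger that writes to the console and to a log file."""
    logger = logging.getLogger("python_demo_voice_app")
    # DEBUG shows detailed WebSocket/API traffic; INFO shows general flow.
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)

    # Remove handlers from any earlier call so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    for handler in (logging.StreamHandler(), logging.FileHandler(log_file, encoding="utf-8")):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling `setup_logging(verbose=True)` would correspond to running the CLI with verbose logging enabled.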
- Missing OpenAI API Key
  Error: OPENAI_API_KEY environment variable is required
  Solution: Set your API key in the .env file
- Connection Issues
  Error: Failed to connect to WebSocket
  Solution: Check your internet connection and API key validity
- Audio Format Issues
  Error loading audio file
  Solution: Ensure audio files are in supported formats (WAV, MP3, etc.)
Run with verbose logging:
poetry run python -m python_demo_voice_app.cli run -v
- API Keys: Never commit API keys to version control
- Medical Information: This is a demo app; do not use it with real patient data
- Function Calls: Validate all function inputs in production
- Audio Storage: Be mindful of audio data privacy
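As one possible starting point for validating function inputs, the sketch below checks the appointment fields before any handler acts on them. The function name and the specific rules (non-empty strings, 7 to 15 digits in the phone number) are illustrative assumptions, not requirements taken from the project.

```python
import re

REQUIRED_FIELDS = ("patient_name", "doctor", "time_slot", "reason", "phone")
ALLOWED_PHONE_CHARS = re.compile(r"^[0-9+\-\s().]+$")


def _plausible_phone(phone: str) -> bool:
    """Accept common phone formats that contain between 7 and 15 digits."""
    digits = re.sub(r"\D", "", phone)
    return bool(ALLOWED_PHONE_CHARS.match(phone)) and 7 <= len(digits) <= 15


def validate_appointment_args(args: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the input passed."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = args.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{field} is required")
    phone = args.get("phone")
    if isinstance(phone, str) and phone.strip() and not _plausible_phone(phone.strip()):
        errors.append("phone is not a plausible phone number")
    return errors
```

A production system would likely go further, for example checking the doctor against a known list and parsing the time slot into a real datetime.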
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
This project is for demonstration purposes. Please ensure compliance with healthcare regulations if adapting for real medical use.
For issues and questions:
- Check the troubleshooting section
- Review the logs for error details
- Ensure your OpenAI API key has Realtime API access
- Verify all dependencies are installed correctly