Build and test AI applications with confidence. This repository contains examples and starter code from our workshop on implementing reliable testing practices for generative AI systems.
The code demonstrates three key patterns for testing AI applications:
- comedian.yaml shows how to test for consistent AI personality and behavior. We use a simple comedian bot to demonstrate how automated testing can drive prompt engineering.
- hr-docs.yaml demonstrates testing document Q&A capabilities by connecting to a knowledge base of HR policies and verifying response accuracy. Before running it, upload this file to a new 'hr-docs' folder: open the ... menu, then Files.
- exchange-rates.yaml showcases API integration testing, checking that user inputs such as currency pairs are handled properly and that responses stay focused and relevant.
# Install Helix CLI
curl https://deploy.helix.ml/install.sh | bash
# Deploy an app
helix apply -f comedian.yaml
# Run tests
helix test -f comedian.yaml
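
To run the tests for all three example files in one pass, a small wrapper along these lines might help. This is only a sketch: `run_helix_tests` and the `HELIX_BIN` variable are hypothetical names introduced here, and it assumes that `helix test` exits with a non-zero status when a test fails, which this README does not confirm.

```shell
# Sketch: run `helix test -f <file>` for each app definition passed in.
# HELIX_BIN and run_helix_tests are illustrative names, not part of Helix.
# Assumption: `helix test` returns a non-zero exit status on failure.
HELIX_BIN="${HELIX_BIN:-helix}"

run_helix_tests() {
  local failed=0
  local file
  for file in "$@"; do
    echo "Testing ${file}"
    if ! "$HELIX_BIN" test -f "$file"; then
      echo "FAILED: ${file}" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every file passed.
  [ "$failed" -eq 0 ]
}
```

It could then be called as `run_helix_tests comedian.yaml hr-docs.yaml exchange-rates.yaml`. Continuing after a failure, rather than stopping at the first one, means a single run reports every failing app.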
The repository includes GitHub Actions and GitLab CI configurations to automatically test your AI applications on every commit. See .github/workflows/helix.yml for a complete example of:
- Running tests on pull requests
- Automated deployment on merge to main
- Test report generation and PR comments
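
The split between testing on pull requests and deploying on merge can be pictured as a small decision function. The sketch below is not taken from the repository's workflow file. `helix_ci_action` is a hypothetical helper, and the only outside facts it relies on are the `GITHUB_EVENT_NAME` and `GITHUB_REF` variables that GitHub Actions sets for each run.

```shell
# Sketch: decide which Helix command a CI job should run.
# helix_ci_action is a hypothetical helper, not part of the repository.
# Arguments: the event name and git ref, e.g. "$GITHUB_EVENT_NAME" "$GITHUB_REF".
helix_ci_action() {
  local event="$1"
  local ref="$2"
  if [ "$event" = "pull_request" ]; then
    echo "test"    # pull requests only run the tests
  elif [ "$event" = "push" ] && [ "$ref" = "refs/heads/main" ]; then
    echo "apply"   # a push to main (for example a merge) deploys the app
  else
    echo "skip"    # anything else does nothing
  fi
}
```

A CI step might call `helix_ci_action "$GITHUB_EVENT_NAME" "$GITHUB_REF"` and then run `helix test -f comedian.yaml` or `helix apply -f comedian.yaml` depending on the result. Keeping the decision in one function makes the branching easy to check locally without triggering a pipeline.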
- Watch the workshop recording: video
- Join our next workshop: registration link
- Try Helix: https://deploy.helix.ml/
Questions? Join us on Discord or the MLOps Community Slack #helix channel.