helixml/testing-genai

Testing & CI for GenAI Applications

Build and test AI applications with confidence. This repository contains examples and starter code from our workshop on implementing reliable testing practices for generative AI systems.

  • Workshop recording, which you can follow along with at your own pace: video
  • Full workshop documentation: doc

What's Inside

The code demonstrates three key patterns for testing AI applications:

  1. comedian.yaml shows how to test for consistent AI personality and behavior. We use a simple comedian bot to demonstrate how automated testing can drive prompt engineering.

  2. hr-docs.yaml tests document Q&A: the app connects to a knowledge base of HR policies, and the tests verify that its answers are accurate. First, upload this file to a new 'hr-docs' folder (open the ... menu, then Files).

  3. exchange-rates.yaml showcases API integration testing, checking that user inputs such as currency pairs are handled correctly and that responses stay focused and relevant.

Quick Start

# Install Helix CLI
curl https://deploy.helix.ml/install.sh | bash

# Deploy an app
helix apply -f comedian.yaml

# Run tests
helix test -f comedian.yaml
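
To exercise all three example apps in one go, the apply and test commands above can be wrapped in a small loop. This is a minimal sketch, not part of the repository: the `run_all` function name is hypothetical, and it assumes the Helix CLI is on your PATH, that you are already authenticated, and that `helix test` exits non-zero when a test fails. For hr-docs.yaml, the HR policy file must already be uploaded as described above.

```shell
# Sketch: deploy and test each example app in turn.
# Assumption: `helix apply` and `helix test` return a non-zero exit
# status on failure (not confirmed by this README).
run_all() {
  failed=0
  for app in comedian.yaml hr-docs.yaml exchange-rates.yaml; do
    echo "== $app =="
    # Only run the tests if the deploy step succeeded.
    if helix apply -f "$app" && helix test -f "$app"; then
      echo "PASS $app"
    else
      echo "FAIL $app"
      failed=1
    fi
  done
  # Non-zero if any app failed, so callers can stop on failure.
  return "$failed"
}
```

Call `run_all` from the repository root; its return status can be used to fail a larger script.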

Setting Up CI

The repository includes GitHub Actions and GitLab CI configurations to automatically test your AI applications on every commit. See .github/workflows/helix.yml for a complete example of:

  • Running tests on pull requests
  • Automated deployment on merge to main
  • Test report generation and PR comments
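
The first two bullets can be pictured as a single script step that a CI job might call. This is a hedged sketch, not the contents of .github/workflows/helix.yml: the `ci_step` function name is hypothetical, it relies on the standard GitHub Actions variables `GITHUB_EVENT_NAME` and `GITHUB_REF_NAME`, and it assumes that `helix test` exits non-zero on failure and that CLI credentials are already configured in the job. Report generation and PR comments are left out.

```shell
# Sketch: always run the tests; deploy only on a push to main.
# Assumption: `helix test` returns a non-zero exit status on failure.
ci_step() {
  app="$1"
  # Stop immediately if the tests fail, so nothing gets deployed.
  helix test -f "$app" || return 1
  # A merge to main arrives as a push event on the main branch.
  if [ "${GITHUB_EVENT_NAME:-}" = "push" ] && [ "${GITHUB_REF_NAME:-}" = "main" ]; then
    helix apply -f "$app"
  fi
}
```

In a workflow, a step such as `ci_step comedian.yaml` would then test on pull requests and additionally deploy after a merge to main.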

Learn More

Questions? Join us on Discord or the MLOps Community Slack #helix channel.
