
Empirical


Empirical is the fastest way to test different LLMs and model configurations, across all the scenarios that matter for your application.

With Empirical, you can run your scenarios against multiple models and compare the results side-by-side.

[Demo video: Empirical-TS-demo-video.mp4]

Usage

See all docs →

Empirical bundles a test runner and a web app, both of which are used through the CLI in your terminal.

Empirical relies on a configuration file, typically located at empiricalrc.js, that describes the tests to run.

Start with a basic example

In this example, we will ask an LLM to extract entities from user messages and return structured JSON. For example, "I'm Alice from Maryland" becomes {"name": "Alice", "location": "Maryland"}.

Our test will succeed if the model outputs valid JSON.
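The pass criterion above — the output must be valid JSON — can be expressed as a small JavaScript predicate. Note that `isValidJson` is an illustrative helper written for this sketch, not part of Empirical's API:

```javascript
// Returns true when the model output parses as JSON, false otherwise.
// This mirrors the test's pass criterion: the output must be valid JSON.
function isValidJson(output) {
  try {
    JSON.parse(output);
    return true;
  } catch {
    return false;
  }
}

// A well-formed extraction passes; free-form prose fails.
console.log(isValidJson('{"name": "Alice", "location": "Maryland"}')); // true
console.log(isValidJson("I'm Alice from Maryland")); // false
```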

  1. Use the CLI to create a sample configuration file called empiricalrc.js.

    npm init empiricalrun
    
    # For TypeScript
    npm init empiricalrun -- --using-ts
  2. Run the example dataset against the selected models.

    npx empiricalrun

    This step requires the OPENAI_API_KEY environment variable to authenticate with OpenAI. Based on the selected models, this execution will cost about $0.0026.

  3. Use the ui command to open the reporter web app and see side-by-side results.

    npx empiricalrun ui
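Step 2 above assumes the OPENAI_API_KEY environment variable is already set. A typical way to export it for the current shell session (the key value below is a placeholder, not a real credential):

```shell
# Make the OpenAI key available to the test runner; replace the
# placeholder with your real key before running `npx empiricalrun`.
export OPENAI_API_KEY="sk-your-key-here"
```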

Make it yours

Edit the empiricalrc.js file to adapt Empirical to your use case.
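To get oriented before editing, the sketch below shows the general shape such a file might take for the entity-extraction example: the models to compare, the scenarios to run, and the pass criterion. The actual schema is defined in the Empirical docs; every field name here is an assumption made for illustration only.

```javascript
// Hypothetical sketch of an empiricalrc.js-style configuration; the field
// names below are assumptions for illustration, not Empirical's real schema.
const config = {
  // Model configurations to compare side-by-side.
  runs: [
    { provider: "openai", model: "gpt-3.5-turbo" },
    { provider: "openai", model: "gpt-4" },
  ],
  // Scenarios to test: user messages to extract entities from.
  dataset: {
    samples: [{ inputs: { user_message: "I'm Alice from Maryland" } }],
  },
  // Pass criterion: the output must be valid JSON.
  scorers: [{ type: "is-json" }],
};

module.exports = config;
```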

Contribution guide

See development docs.