v0.1.5: Tabular Data Node, Evaluation Output #67
ianarawjo announced in Announcements
We've added Tabular Data to ChainForge, to help conduct ground truth evaluations. Full release notes below.
Tabular Data Nodes 🗂️
You can now input and import tabular data (spreadsheets) into ChainForge. Accepted formats are jsonl, xlsx, and csv. Excel and CSV files must have a header row with column names.
Tabular data provides an easy way to enter associated prompt parameters or import existing datasets and benchmarks. A typical use case is ground truth evaluation, where we have some inputs to a prompt, and an "ideal" or expected answer:
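To make the row-wise association concrete, here is a minimal sketch in plain Python. It is not ChainForge's own code: the table contents, the prompt template, and the helper name prompts_from_rows are all made up for illustration. It only shows the idea that each row of a table with a header row fills the template once.

```python
import csv
import io

# Hypothetical table (not from the release). The first line is the header row
# with column names, which CSV and Excel input must have.
CSV_TEXT = """first,last,invention
Thomas,Edison,phonograph
Tim,Berners-Lee,World Wide Web
Johannes,Gutenberg,printing press
Alessandro,Volta,electric battery
"""

# Hypothetical prompt template whose variables match the column names.
TEMPLATE = "In what year did {first} {last} invent the {invention}?"


def prompts_from_rows(csv_text, template):
    """Fill the template once per row, so values on the same row stay together."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [template.format(**row) for row in rows]


prompts = prompts_from_rows(CSV_TEXT, TEMPLATE)
for prompt in prompts:
    print(prompt)
```

With four rows this yields four prompts, one per row. Combining the three columns as independent variables would instead give 4 × 4 × 4 = 64 combinations, most of them pairing a person with the wrong invention.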
Here, we see variables {first}, {last}, and {invention} "carry together" when filling the prompt template: ChainForge knows they are all associated with one another, connected via the row. Thus, it constructs 4 prompts from the input parameters.
Accessing tabular data, even if it's not input into the prompt directly
Alongside tabular data is a new property of response objects in Evaluation nodes: the meta dict. This allows you to get access to column data that is associated with inputs to a prompt template, but was not itself directly input into the prompt template. For instance, in the new example flow for ground truth evaluation of math problems:
Notice the evaluator uses meta to get "Expected", which is associated with the prompt input variable question by virtue of it being on the same row of the table.
Example flows
Tabular data allows us to run many more types of LLM evaluations. For instance, here is the ground truth evaluation multistep-word-problems from OpenAI evals, loaded into ChainForge:
We've added an Example Flow for ground truth evaluation that provides a good starting point.
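As a rough idea of what a ground truth evaluator built on meta might look like, here is a minimal sketch. The evaluate function, the meta dict, and the "Expected" column come from these notes; the StubResponse class and its text and var attributes are assumptions added so the snippet runs on its own, and they may not match the real response object exactly.

```python
from dataclasses import dataclass, field


@dataclass
class StubResponse:
    """Hypothetical stand-in for the response object passed to evaluate()."""
    text: str                                  # assumed: the LLM's reply
    var: dict = field(default_factory=dict)    # assumed: inputs to the template
    meta: dict = field(default_factory=dict)   # column data from the same row


def evaluate(response):
    # "Expected" sits on the same table row as the question,
    # but was never inserted into the prompt itself.
    expected = str(response.meta["Expected"]).strip()
    # Crude check: does the expected answer appear anywhere in the reply?
    return expected in response.text


# Made-up example row: a question column and an Expected column.
sample = StubResponse(
    text="The answer is 12.",
    var={"question": "What is 7 + 5?"},
    meta={"Expected": "12"},
)
print(evaluate(sample))
```

Substring matching is deliberately simple and will also accept a reply such as "112", so a real evaluation would likely want stricter parsing of the model's answer.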
Evaluation Node output 📟
Curious what the format of a response object is like? You can now print inside evaluate functions to print output directly to the browser:
In addition, exceptions raised inside your evaluation function will also print to the node output:
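A minimal sketch of an evaluate function that exercises both behaviours is below. The SimpleNamespace object only stands in for a real response so the snippet runs outside ChainForge, and its text attribute and the "Expected" key are assumptions for illustration.

```python
from types import SimpleNamespace


def evaluate(response):
    # Inside an Evaluation node, printed output is shown in the browser.
    print(response)
    if "Expected" not in response.meta:
        # Inside an Evaluation node, a raised exception is shown in the node output.
        raise KeyError("no 'Expected' column for this row")
    return len(response.text)


# Hypothetical stand-in for a response object.
stub = SimpleNamespace(text="The answer is 12.", meta={"Expected": "12"})
score = evaluate(stub)
```

Run as a plain script, the exception simply propagates to the caller; showing it in the node is behaviour of ChainForge itself.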
Slight styling improvements in Response Inspectors
We removed the blue Badges used to display unselected prompt variables and replaced them with text that blends into the background:
The fullscreen inspector also uses a slightly larger font size for readability:
Final thoughts / comments
<textarea>
is an element of a table cell. We are working on a fix so columns are automatically sized based on their content.
Want to see a feature / have a comment? Start a Discussion or submit an Issue!
This discussion was created from the release v0.1.5: Tabular Data Node, Evaluation Output.