Add act evals on stagehand.page
#328
approved pending act evals pass
act evals passed, merging
* Use CI on v2 branch
* branch
* add docs, move scoring functions to scoring.ts, move experiment naming to utils.ts
* add initStagehand.ts
* break up index.evals.ts and utils into smaller files
* export LogLineEval
* typing
* follow StagehandConfig pattern
* choose api key based on model name
* stagehand.act -> page.act (#326)
* need to actually move to act to page now
* move act -> page
* fix e2e
* fix tests
* readme
* changeset
* package json and changeset
* don't fail on combo evals
* Add act evals on `stagehand.page` (#328)
* move act evals to stagehand.page
* add basic act and make act necessary in type
* move extract and observe to page (#329)
* move act evals to stagehand.page
* add basic act and make act necessary in type
* move extract and observe
* example
* changeset
* More playwright tests (#330)
* add docs, move scoring functions to scoring.ts, move experiment naming to utils.ts
* add initStagehand.ts
* break up index.evals.ts and utils into smaller files
* export LogLineEval
* typing
* follow StagehandConfig pattern
* choose api key based on model name
* Use CI on v2 branch
* branch
* stagehand.page tests
* dont run on BB
* prettier
* pls dont fail
* headless
---------
Co-authored-by: Anirudh Kamath <[email protected]>
* add extract evals for stagehand.page (#331)
* add extract evals for stagehand.page
* fix typign
* smh i didn't actually run extract
* add observe page evals (#332)
* change stagehand.observe to stagehand.page.observe in evals
* changeset
* Browsercontext playwright tests (#334)
* add docs, move scoring functions to scoring.ts, move experiment naming to utils.ts
* add initStagehand.ts
* break up index.evals.ts and utils into smaller files
* export LogLineEval
* typing
* follow StagehandConfig pattern
* choose api key based on model name
* Use CI on v2 branch
* branch
* BrowserContext tests
* file path
---------
Co-authored-by: Anirudh Kamath <[email protected]>
* changeset minor
* ci yml
---------
Co-authored-by: seanmcguire12 <[email protected]>
Co-authored-by: Sean McGuire <[email protected]>
why
#326 moved `act` from `stagehand` to `stagehand.page` and deprecated `stagehand.act`. This PR updates the evals to account for that change.
what changed
Evals now point to `stagehand.page.act` instead of `stagehand.act`.
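To illustrate the shape of this change, here is a minimal sketch of an eval task that calls `act` through `page` rather than on the top-level object. The names `ActResult`, `PageLike`, `StagehandLike`, `makeStubStagehand`, and `runActEval` are hypothetical and exist only for this example, and the `{ action: string }` option shape and `success` field are assumptions rather than Stagehand's confirmed types. A stub stands in for an initialized Stagehand instance so the snippet runs without a browser.

```typescript
// Hypothetical minimal shapes for illustration; NOT Stagehand's real types.
interface ActResult {
  success: boolean;
  message: string;
}

interface PageLike {
  // Assumed option shape: a single natural-language action string.
  act(options: { action: string }): Promise<ActResult>;
}

interface StagehandLike {
  page: PageLike;
}

// Stub standing in for an initialized Stagehand instance (no browser needed).
function makeStubStagehand(): StagehandLike {
  return {
    page: {
      async act({ action }) {
        return { success: true, message: `performed: ${action}` };
      },
    },
  };
}

// An eval task in the post-#326 style: it goes through stagehand.page.act
// instead of the deprecated stagehand.act.
async function runActEval(stagehand: StagehandLike): Promise<boolean> {
  const result = await stagehand.page.act({
    action: "click the login button",
  });
  return result.success;
}
```

Because `StagehandLike` exposes `act` only under `page`, an eval that still calls `stagehand.act` would fail to type-check against this sketch, which mirrors the intent of the migration.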
test plan
evals