-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate string output/input to Turn
objects
#1089
base: main
Are you sure you want to change the base?
Conversation
…ntil Conversation is implemented)
…tch the general case too
@jmartin-tech failing tests pass locally. they seem to centre on a class that's removed. could this be a caching thing? |
Yes, test failures are a caching related issue. I will see about addressing and offering a fix here or in a separate PR. |
sorry about the extra churn in test json - went to match the pretty printing - PR gtg for review. |
40633a1
to
061dbcc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to bake this a bit more. I think we need a better abstraction layer between Attempt
and Turn
. Let me diagram it out.
@@ -13,6 +13,73 @@ | |||
roles = {"system", "user", "assistant"} | |||
|
|||
|
|||
class Turn(dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I have a system prompt, a user message, and an assistant response, is that one turn or three turns?
This is not super clear to me from the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turn
covers message content
(cf. OpenAI chat API spec). Because of this, the example is three turns - one for the sysprompt, one for the user message, one for the response.
translated_prompt = response.text | ||
attempt.prompt = translated_prompt | ||
yield self._derive_new_attempt(attempt) | ||
|
||
def untransform(self, attempt: garak.attempt.Attempt) -> garak.attempt.Attempt: | ||
translator = Translator(self.api_key) | ||
outputs = attempt.outputs | ||
attempt.notes["original_responses"] = outputs | ||
attempt.notes["original_responses"] = [ | ||
turn.text for turn in outputs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels unintuitive to me -- we're, presumably, annotating the attempt.notes
for the attempt.prompt.text
with the original output, but shouldn't these both be encapsulated into a Turn
object? Maybe this is just because we're doing translation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PLEASE SEND DIAGRAM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting case, I like it.
Reasoning:
- garak will only do detections on text parts of output
- this buff only affects the text part of a prompt
- because (2) we only need to store the text part of the prompt
Resolves #602
summary
garak prompts & outputs are all captured as
str
We can about more modalities than strings - scope encompasses image attachments, files, rich outputs. Though the Language part of LLM is requisite for garak scope to apply.
This PR abstracts the type sent to and received from generators away from
str
toTurn
. EachTurn()
instance corresponds to a turn in a conversation. Simple probes poseTurn()
to their target.Attempt
s manage a message history ofTurn()
items instead of strings.details
Turn
classTurn represents the content part of a query to or response from an LLM.
Turn
s have:text
: corresponds to what was previouslyprompt: str
parts
: a list of data, default []Turn
s should beNone
on when the model provided no response. If the response was returned but had e.g. no text component, use aTurn()
withtext==None
.This is only a part of a query sent to a target model.
Turn
s are agnostic to metadata about which role is uttering it. A full LLM query might be composed of multiple turns.Tests should all specifically use the
Turn
object.NB it's now preferred to copy turn text locally instead of manipulating it in place. e.g.
-- unless dealing things that need to edit to object, e.g. buffs
Updates to
Attempt
Attempt is a core place this change affects.
attempt.prompt
is now aTurn
; refer to what wasattempt.prompt
, withattempt.prompt.text
attempt.outputs
is now alist
ofTurn
sattempt.messages
is still a list of lists of dicts, but thecontent
part of the dicts must be aTurn
Attempt
s, where some methods will accept aList
ofstr
.Attempt
internals requires usingTurn
s.Turn
type changes
attempt
will need a changeoutputs
which is now alist
ofTurn
sprompt
which is now aTurn
generators.base.generate()
should take aTurn
and not a stringgenerators.base.generate()
should return a list ofTurn
sgenerate()
returns a list of str, and the magic in Attempt handles the mapping toTurn()
generators.base.generate()
as returning a list of strings OK? Is it sensible to postpone migrating this to a list ofTurn
s? Would appreciate your thoughtsVerification
python -m pytest tests/test_attempt.py