Configurable TAP probe, refactor judge resources to shared red team #949
* resources for `red team` type activities as a shared lib
* Model-as-a-judge logic can apply to many techniques
* refactor TAP probe to use `Configurable`
* updated probe to accept generic model config
* set default to `cpu` for hf model in tap generation
* update util function to allow any valid generator
* warn when generator is not known supported config
* remove unused max_token params
Thanks!! Before reviewing - is the scope a full LLMaaJ implementation or a first phase?
This is just the foundation that will aid code reuse in the detector implementation; the PR enables the TAP probe to accept model configuration in its existing usage of LLMaaJ.
* match project expectations for `data` to be static fixtures
* free up `tests/resources` path for adding python code tests for that package

Signed-off-by: Jeffrey Martin <[email protected]>
Signed-off-by: Jeffrey Martin <[email protected]>
02755bf to 4d1d39a
Signed-off-by: Jeffrey Martin <[email protected]>
4d1d39a to e4d4a5c
minor comments re: tests, constants, deps; generally lgtm
Signed-off-by: Jeffrey Martin <[email protected]>
* set TAPCached default filename param to a relative path
* enforce relative path is in `data_path` searched locations
* print message from exception on detector load failure in pxd harness

Signed-off-by: Jeffrey Martin <[email protected]>
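For readers unfamiliar with the relative-path enforcement mentioned in the commit above, here is a rough sketch of the general idea. It is not the code from this PR: `resolve_in_search_paths` and its parameters are hypothetical names chosen for illustration, and the real `data_path` lookup may behave differently.

```python
from pathlib import Path


def resolve_in_search_paths(filename, search_paths):
    """Resolve a relative `filename` against a list of search locations.

    Hypothetical helper for illustration only; not the project's actual API.
    """
    relative = Path(filename)
    if relative.is_absolute():
        raise ValueError(f"expected a relative path, got: {filename}")
    for base in search_paths:
        base = Path(base).resolve()
        candidate = (base / relative).resolve()
        # Skip candidates that escape the search location, e.g. via "..".
        if base not in candidate.parents:
            continue
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{filename} not found in any search location")
```

Resolving both the base and the candidate before comparing them means a filename such as `../other.json` is rejected rather than silently read from outside the searched locations.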
lgtm
Verification
List the steps needed to make sure this thing works
config.file.yaml:
garak -m huggingface -n meta-llama/Llama-2-7b-chat-hf --config config.file.yaml --probes tap.TAP
python -m pytest tests/