Configurable TAP probe, refactor judge resources to shared red team #949

Merged

Conversation

jmartin-tech
Collaborator

  • resources for red team type activities as a shared lib
    • Model-as-a-judge logic can apply to many techniques
  • refactor TAP probe to use Configurable
    • updated probe to accept generic model config
    • set default to cpu for hf model in tap generation
    • update util function to allow any valid generator
    • warn when the generator is not a known supported config
    • remove unused max_token params
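
As a rough illustration of the "warn when generator is not known supported config" item, here is a minimal sketch of a warn-but-allow check. The function name `check_generator_type` and the contents of `KNOWN_SUPPORTED_TYPES` are assumptions made for this example and are not taken from the PR diff; the actual utility in garak may be structured differently.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical set of generator families treated as "known supported".
# The real list lives in the PR's code and is not reproduced here.
KNOWN_SUPPORTED_TYPES = {"huggingface", "nim", "openai"}


def check_generator_type(model_type: str) -> bool:
    """Return True if the generator family is known, otherwise warn and return False.

    Unknown types are not rejected; the caller may still try to load them.
    """
    # Accept both "huggingface" and dotted forms such as "huggingface.Model".
    family = model_type.split(".")[0]
    if family in KNOWN_SUPPORTED_TYPES:
        return True
    logger.warning(
        "Generator type %r is not a known supported configuration; continuing anyway",
        model_type,
    )
    return False
```

The check only logs a warning, so any valid generator can still be used while untested combinations remain visible in the logs.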

Verification

List the steps needed to make sure this thing works

config.file.yaml:

plugins:
  generators:
    huggingface:
      hf_args:
        device: cpu
  probes:
    tap:
      attack_model_config:
        max_tokens: 500
        hf_args:
          device: cpu
      evaluator_model_type: nim
      evaluator_model_name: meta/llama3-8b-instruct
      evaluator_model_config:
        uri: http://0.0.0.0:8000/v1
        api_key: fake
  • garak -m huggingface -n meta-llama/Llama-2-7b-chat-hf --config config.file.yaml --probes tap.TAP
  • Run the tests and ensure they pass: python -m pytest tests/
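
To show how the nested `plugins.probes.tap` values above could end up on top of a probe's defaults, here is a small sketch of a recursive merge. This is not garak's `Configurable` implementation; `merge_config`, `TAP_DEFAULTS`, and the `hypothetical_extra_option` key are assumptions made for illustration. Only the override values are taken from the YAML in this description.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay `overrides` on `defaults` without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and new keys simply replace the default.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults; the real defaults are defined by the TAP probe.
TAP_DEFAULTS = {
    "attack_model_config": {"max_tokens": 150, "hf_args": {"device": "cpu"}},
    "evaluator_model_type": "openai",
    "evaluator_model_config": {},
    "hypothetical_extra_option": True,
}

# Mirrors the `plugins.probes.tap` section of config.file.yaml above.
tap_overrides = {
    "attack_model_config": {"max_tokens": 500, "hf_args": {"device": "cpu"}},
    "evaluator_model_type": "nim",
    "evaluator_model_name": "meta/llama3-8b-instruct",
    "evaluator_model_config": {"uri": "http://0.0.0.0:8000/v1", "api_key": "fake"},
}

effective = merge_config(TAP_DEFAULTS, tap_overrides)
```

Keys that the YAML does not mention keep their default values, while the evaluator settings are replaced as a group.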

@leondz
Owner

leondz commented Oct 11, 2024

Thanks!! Before reviewing - is the scope a full LLMaaJ implementation or a first phase?

@jmartin-tech
Collaborator Author

This is just the foundation that will aid code reuse in the detector implementation. The PR enables the TAP probe to accept model configuration in its existing usage of LLMaaJ.

* match project expectations for `data` to be static fixtures
* free up `tests/resources` path for adding python code tests for that package

Signed-off-by: Jeffrey Martin <[email protected]>
@jmartin-tech jmartin-tech force-pushed the task/refactor-tap-as-configurable branch 3 times, most recently from 02755bf to 4d1d39a Compare October 14, 2024 22:04
@jmartin-tech jmartin-tech force-pushed the task/refactor-tap-as-configurable branch from 4d1d39a to e4d4a5c Compare October 14, 2024 22:11
Owner

@leondz leondz left a comment

minor comments re: tests, constants, deps; generally lgtm

Review threads:
garak/probes/tap.py (outdated, resolved)
garak/resources/red_team/evaluation.py (outdated, resolved)
garak/resources/red_team/evaluation.py (outdated, resolved)
garak/resources/red_team/evaluation.py (resolved)
garak/resources/red_team/conversation.py (outdated, resolved)
garak/resources/red_team/evaluation.py (resolved)
garak/resources/tap/tap_main.py (outdated, resolved)
tests/detectors/test_detectors_fileformats.py (resolved)
garak/resources/red_team/evaluation.py (resolved)
jmartin-tech and others added 3 commits October 17, 2024 15:13
* set TAPCached default filename param to a relative path
* enforce relative path is in `data_path` searched locations
* print message from exception on detector load failure in pxd harness

Signed-off-by: Jeffrey Martin <[email protected]>
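
For the "enforce relative path is in `data_path` searched locations" item, the idea can be sketched with `pathlib` as below. The function name `resolve_in_search_paths` and its signature are assumptions made for this example; garak's `data_path` handling is not reproduced here and may behave differently. The sketch needs Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path
from typing import Iterable, Union


def resolve_in_search_paths(
    filename: Union[str, Path], search_dirs: Iterable[Union[str, Path]]
) -> Path:
    """Resolve a relative filename against search directories, refusing escapes.

    Raises ValueError for absolute paths or paths that leave a search
    directory, and FileNotFoundError if no search directory has the file.
    """
    candidate = Path(filename)
    if candidate.is_absolute():
        raise ValueError(f"expected a relative path, got {candidate}")
    for base in search_dirs:
        base_resolved = Path(base).resolve()
        resolved = (base_resolved / candidate).resolve()
        # Reject "../" style paths that resolve outside the search directory.
        if not resolved.is_relative_to(base_resolved):
            raise ValueError(f"{candidate} escapes search location {base_resolved}")
        if resolved.exists():
            return resolved
    raise FileNotFoundError(f"{candidate} not found in any search location")
```

Resolving both the base and the candidate before comparing them means symlinked directories and `..` segments are normalised before the containment check.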
@leondz
Owner

leondz commented Oct 18, 2024

lgtm

@jmartin-tech jmartin-tech merged commit a5be6eb into leondz:main Oct 18, 2024
8 checks passed
@jmartin-tech jmartin-tech deleted the task/refactor-tap-as-configurable branch October 18, 2024 13:59
@github-actions github-actions bot locked and limited conversation to collaborators Oct 18, 2024
2 participants