feature: give kinda helpful message if too many open files #1110

Open
wants to merge 8 commits into
base: main
3 changes: 3 additions & 0 deletions docs/source/configurable.rst
@@ -46,6 +46,8 @@ Let's take a look at the core config.
parallel_attempts: false
lite: true
show_z: false
enable_experimental: false
max_workers: 500

run:
seed:
@@ -93,6 +95,7 @@ such as ``show_100_pass_modules``.
* ``narrow_output`` - Support output on narrower CLIs
* ``show_z`` - Display Z-scores and visual indicators on CLI. It's good, but may be too much info until one has seen garak run a couple of times
* ``enable_experimental`` - Enable experimental function CLI flags. Disabled by default. Experimental functions may disrupt your installation and provide unusual/unstable results. Can only be set by editing core config, so a git checkout of garak is recommended for this.
* ``max_workers`` - Cap on how many parallel workers can be requested. When raising this to use higher parallelisation, keep an eye on system resources such as the open-file limit (e.g. ``ulimit -n 4096`` on Linux)

``run`` config items
""""""""""""""""""""
20 changes: 16 additions & 4 deletions garak/cli.py
@@ -64,6 +64,16 @@ def main(arguments=None) -> None:

import argparse

def worker_count_validation(workers):
    iworkers = int(workers)
    if iworkers <= 0:
        # the format string needs a placeholder, otherwise "%" raises TypeError
        raise argparse.ArgumentTypeError("Need >0 workers (int), got %s" % workers)
    if iworkers > _config.system.max_workers:
Collaborator

Testing shows this is not really configurable: at this point _config has not yet loaded garak.site.yaml, nor any file supplied via --config on the CLI.

Collaborator Author

Great catch, thanks. Will amend, and I guess write a test for it.

Collaborator Author

I guess, without undue gymnastics, we can either:

  1. have a hardcoded, non-configurable cap on CLI param validation (but where? garak.cli isn't universal, though otoh argparse only applies within this module; top-level values in garak._config are the opposite of the direction we're trying to move in)
  2. drop CLI max_workers validation but let the config-based one be enforced

Having params that are only configurable in garak.core.yaml isn't a good option.

wdyt? Are there other options that make sense? Currently leaning toward (2).

Collaborator

Can we just do the CLI-provided validation call after config is loaded and before starting to instantiate things, say around here?

garak/garak/cli.py

Lines 369 to 371 in 27d4554

# base config complete
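
A minimal sketch of the suggestion above, not the PR's actual code: argparse checks only what it can know before config loads (a positive integer), and the cap is compared once all config sources are in place. The names `positive_int`, `check_worker_cap`, and `loaded_system_config` are hypothetical, and the `SimpleNamespace` merely stands in for `_config.system` after garak.site.yaml and any --config file have been loaded.

```python
import argparse
from types import SimpleNamespace

# Hypothetical stand-in for _config.system once all config files are loaded.
loaded_system_config = SimpleNamespace(max_workers=500)


def positive_int(value: str) -> int:
    """argparse type callable: validates only what is knowable before config loads."""
    ivalue = int(value)  # a ValueError here is reported by argparse as an invalid value
    if ivalue <= 0:
        raise argparse.ArgumentTypeError("Need >0 workers (int), got %s" % value)
    return ivalue


def check_worker_cap(parser, args, system_config) -> None:
    """Post-load check: compare requested workers against the configured cap."""
    for name in ("parallel_requests", "parallel_attempts"):
        requested = getattr(args, name)
        if requested is not None and requested > system_config.max_workers:
            # parser.error prints usage plus the message and exits with status 2
            parser.error(
                "--%s capped at %s (config.system.max_workers)"
                % (name, system_config.max_workers)
            )


parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument("--parallel_requests", type=positive_int, default=None)
parser.add_argument("--parallel_attempts", type=positive_int, default=None)

args = parser.parse_args(["--parallel_attempts", "16"])
# ... site config and any --config file would be loaded at this point ...
check_worker_cap(parser, args, loaded_system_config)
```

With this split, a cap set in a site or user-supplied config file is honoured, because the comparison happens after those files are read.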

        raise argparse.ArgumentTypeError(
            "Parallel worker count capped at %s (config.system.max_workers)" % _config.system.max_workers
        )
    return iworkers

parser = argparse.ArgumentParser(
prog="python -m garak",
description="LLM safety & security scanning tool",
@@ -92,15 +102,15 @@ def main(arguments=None) -> None:
)
parser.add_argument(
"--parallel_requests",
type=int,
type=worker_count_validation,
default=_config.system.parallel_requests,
help="How many generator requests to launch in parallel for a given prompt. Ignored for models that support multiple generations per call.",
)
parser.add_argument(
"--parallel_attempts",
type=int,
type=worker_count_validation,
default=_config.system.parallel_attempts,
help="How many probe attempts to launch in parallel.",
help="How many probe attempts to launch in parallel. Raise this for faster runs when using non-local models.",
)
parser.add_argument(
"--skip_unknown",
@@ -484,7 +494,9 @@ def main(arguments=None) -> None:
if has_changes:
exit(1) # exit with error code to denote changes
else:
print("No revisions applied. Please verify options provided for `--fix`")
print(
"No revisions applied. Please verify options provided for `--fix`"
)
elif args.report:
from garak.report import Report

29 changes: 22 additions & 7 deletions garak/generators/base.py
@@ -12,6 +12,7 @@

from garak import _config
from garak.configurable import Configurable
from garak.exception import GarakException
import garak.resources.theme


@@ -162,13 +163,27 @@ def generate(
)
multi_generator_bar.set_description(self.fullname[:55])

with Pool(_config.system.parallel_requests) as pool:
for result in pool.imap_unordered(
self._call_model, [prompt] * generations_this_call
):
self._verify_model_result(result)
outputs.append(result[0])
multi_generator_bar.update(1)
pool_size = min(
generations_this_call,
_config.system.parallel_requests,
_config.system.max_workers,
)
Comment on lines +166 to +170
Collaborator

Direct access to config here suggests we should have a helper that owns process or thread pools and references _config.system.
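
One possible shape for such a helper, offered as a sketch under stated assumptions rather than a proposal for garak's API: `bounded_pool_size` and `worker_pool` are hypothetical names, the `SimpleNamespace` stands in for `_config.system`, and `ThreadPool` is swapped in for the `Pool` used in the PR only to keep the example lightweight.

```python
from contextlib import contextmanager
from multiprocessing.pool import ThreadPool
from types import SimpleNamespace

# Hypothetical stand-in for _config.system; the values are illustrative.
system_config = SimpleNamespace(parallel_requests=8, parallel_attempts=8, max_workers=500)


def bounded_pool_size(task_count: int, requested: int, system=system_config) -> int:
    """Smallest of: tasks available, workers requested, configured cap (never below 1)."""
    return max(1, min(task_count, requested, system.max_workers))


@contextmanager
def worker_pool(task_count: int, requested: int, system=system_config):
    """Own pool creation in one place so callers never size pools from config directly."""
    with ThreadPool(bounded_pool_size(task_count, requested, system)) as pool:
        yield pool


# Usage: callers pass the work size and the setting they want to honour.
with worker_pool(task_count=3, requested=system_config.parallel_requests) as pool:
    results = sorted(pool.imap_unordered(abs, [-1, -2, -3]))
```

Centralising the sizing logic would also give the too-many-open-files handling a single home instead of one copy per call site.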


try:
    with Pool(pool_size) as pool:
        for result in pool.imap_unordered(
            self._call_model, [prompt] * generations_this_call
        ):
            self._verify_model_result(result)
            outputs.append(result[0])
            multi_generator_bar.update(1)
except OSError as o:
    if o.errno == 24:  # EMFILE: too many open files
        msg = "Parallelisation limit hit. Try reducing parallel_requests or raising limit (e.g. ulimit -n 4096)"
        logging.critical(msg)
        raise GarakException(msg) from o
    else:
        raise  # re-raise unrelated OSErrors unchanged

else:
generation_iterator = tqdm.tqdm(
38 changes: 26 additions & 12 deletions garak/probes/base.py
@@ -17,7 +17,7 @@

from garak import _config
from garak.configurable import Configurable
from garak.exception import PluginConfigurationError
from garak.exception import GarakException
import garak.attempt
import garak.resources.theme

@@ -178,17 +178,31 @@ def _execute_all(self, attempts) -> Iterable[garak.attempt.Attempt]:
attempt_bar = tqdm.tqdm(total=len(attempts), leave=False)
attempt_bar.set_description(self.probename.replace("garak.", ""))

with Pool(_config.system.parallel_attempts) as attempt_pool:
for result in attempt_pool.imap_unordered(
self._execute_attempt, attempts
):
_config.transient.reportfile.write(
json.dumps(result.as_dict()) + "\n"
)
attempts_completed.append(
result
) # these will be out of original order
attempt_bar.update(1)
pool_size = min(
len(attempts),
_config.system.parallel_attempts,
_config.system.max_workers,
)

try:
    with Pool(pool_size) as attempt_pool:
        for result in attempt_pool.imap_unordered(
            self._execute_attempt, attempts
        ):
            _config.transient.reportfile.write(
                json.dumps(result.as_dict()) + "\n"
            )
            attempts_completed.append(
                result
            )  # these will be out of original order
            attempt_bar.update(1)
except OSError as o:
    if o.errno == 24:  # EMFILE: too many open files
        msg = "Parallelisation limit hit. Try reducing parallel_attempts or raising limit (e.g. ulimit -n 4096)"
        logging.critical(msg)
        raise GarakException(msg) from o
    else:
        raise  # re-raise unrelated OSErrors unchanged

else:
attempt_iterator = tqdm.tqdm(attempts, leave=False)
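
The same error-translation pattern appears in both generators/base.py and probes/base.py. A self-contained sketch of it follows, using the standard library's `errno.EMFILE` constant in place of the literal 24. `TooManyOpenFilesError` and `run_with_fd_hint` are hypothetical names introduced only for illustration; the exception class stands in for `GarakException`.

```python
import errno
import logging


class TooManyOpenFilesError(RuntimeError):
    """Hypothetical stand-in for garak.exception.GarakException."""


def run_with_fd_hint(work, setting_name: str):
    """Run work(); turn EMFILE into an error that names the setting to lower."""
    try:
        return work()
    except OSError as e:
        if e.errno == errno.EMFILE:  # "Too many open files"; the value is 24 on Linux
            msg = (
                "Parallelisation limit hit. Try reducing %s or raising limit "
                "(e.g. ulimit -n 4096)" % setting_name
            )
            logging.critical(msg)
            raise TooManyOpenFilesError(msg) from e
        raise  # any other OSError propagates unchanged


def simulate_fd_exhaustion():
    # Simulates what pool creation raises once the descriptor limit is reached.
    raise OSError(errno.EMFILE, "Too many open files")


try:
    run_with_fd_hint(simulate_fd_exhaustion, "parallel_attempts")
except TooManyOpenFilesError as exc:
    hint = str(exc)
```

On Unix-like systems, `resource.getrlimit(resource.RLIMIT_NOFILE)` reports the current soft and hard descriptor limits, which can help when choosing a value for `ulimit -n`.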
1 change: 1 addition & 0 deletions garak/resources/garak.core.yaml
@@ -7,6 +7,7 @@ system:
lite: true
show_z: false
enable_experimental: false
max_workers: 500

run:
seed: