feat: Add IFEval Environment #412

abukharin-nv · 2025-05-19T14:05:46Z

What does this PR do ?

Adds IFEval environment and script.

Issues

List issues that this PR closes (syntax):

Usage

You can potentially add a usage example below

uv pip install absl-py langdetect nltk==3.9.1 immutabledict \
&& uv run examples/run_grpo_ifeval.py

Before your PR is "Ready for review"

Pre checks:

[Y] Make sure you read and followed Contributor guidelines
[N] Did you write any new necessary tests?
[N] Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
[N] Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

...

Signed-off-by: abukharin-nv <[email protected]>

SahilJain314 · 2025-05-19T18:13:49Z

examples/configs/grpo_math_1B.yaml

@@ -58,8 +58,8 @@ policy:
  # training and logprob stages respectively.
  dynamic_batching:
    enabled: True
-    train_mb_tokens: ${mul:${policy.max_total_sequence_length}, ${policy.train_micro_batch_size}}
-    logprob_mb_tokens: ${mul:${policy.max_total_sequence_length}, ${policy.logprob_batch_size}}
+    train_mb_tokens: 2048


Is there a particular reason for this change?

SahilJain314 · 2025-05-19T18:14:09Z

examples/configs/grpo_math_1B.yaml

@@ -126,6 +126,8 @@ data:
 env:
  math:
    num_workers: 8
+  ifeval:


can this be moved to a derivative config?

SahilJain314 · 2025-05-19T18:15:36Z

nemo_rl/environments/ifeval_environment.py

+
+
+@ray.remote
+class HFVerifyWorker:


Suggested change

-class HFVerifyWorker:
+class IFVerifyWorker:

SahilJain314 · 2025-05-19T18:21:01Z

nemo_rl/environments/instruction_following/instructions_util.py

+
+download_punkt_tab()
+
+WORD_LIST = [


@terrykong do you have an opinion on keeping this sort of 'data' list in a file that's not .py? Having it in this file might make searching harder (I frequently search with a *.py filter).

I'm indifferent. It's okay either as data or in Python.
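For reference, the "as data" option could look roughly like the sketch below. It assumes the words are stored one per line in a plain-text file; the function name `load_word_list` and the file layout are hypothetical and not part of this PR.

```python
from pathlib import Path


def load_word_list(path: Path) -> list[str]:
    """Read one word per line, skipping blank lines and `#` comment lines."""
    words = []
    for line in path.read_text(encoding="utf-8").splitlines():
        word = line.strip()
        if word and not word.startswith("#"):
            words.append(word)
    return words
```

With this layout, instructions_util.py would only hold the loader call, and a *.py search would no longer match the word entries themselves.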

SahilJain314 · 2025-05-19T18:34:52Z

Can we add the requirements for ifeval

uv pip install absl-py langdetect nltk==3.9.1 immutabledict \
&& uv run examples/run_grpo_ifeval.py

to the pyproject in a separate group? Then the uv environment method can run the ifeval workers with the correct deps, which is what that mechanism is for.

SahilJain314 · 2025-05-19T21:18:28Z

pyproject.toml

@@ -67,6 +67,12 @@ test = [
    "pytest-timeout",
    "pytest-cov",
 ]
+ifeval = [


Can this be moved to the [project.optional-dependencies] (like vllm above)?

Then, when you want to run an ifeval environment (or otherwise need its deps), run that Ray remote with a PY_EXECUTABLE that points to that optional dependency group, as is done for vLLM:

specify environment with --extra ifeval
https://github.com/NVIDIA/NeMo-RL/blob/main/nemo_rl/distributed/virtual_cluster.py#L46

launch worker with this py_executable
https://github.com/NVIDIA/NeMo-RL/blob/main/nemo_rl/models/generation/vllm.py#L52
(your sub-workers will also need it. See ifeval_environment L147)

SahilJain314 · 2025-05-19T23:54:04Z

nemo_rl/environments/instruction_following/instructions_registry.py

I assume the original is from here? https://github.com/google-research/google-research/tree/master/instruction_following_eval

@terrykong how should we include this? I can see why we may want to copy it here since the google-research GitHub repo is huge, but we probably need to be careful license-wise?

Let me check

Out of curiosity, @abukharin-nv, could you depend on https://github.com/EleutherAI/lm-evaluation-harness instead? It looks like they copy in the right files, and they also have other evals.

Great idea. I will do that

@terrykong looks like there may be the same issue copying from EleutherAI: https://github.com/EleutherAI/lm-evaluation-harness/blob/29ea6832cd913b055ec1d6962180c773e8a7ac88/lm_eval/tasks/ifeval/instructions_registry.py#L1

What do you think?

I think it's okay that it's copied since the google-research repo had a permissive license. If you think lm-eval-harness works for us, then it's probably a more convenient repo to depend on since we just need to "track" that one as a top level dependency.

Are there other things in lm-eval-harness that you think we'd eventually make use of? If not, let's just deal with the ifeval source

terrykong · 2025-05-20T18:05:44Z

nemo_rl/environments/instruction_following/instructions_util.py

+            print("Max retries reached. Could not download punkt_tab.")
+
+
+download_punkt_tab()


Is this multiprocessing-safe? If several environments import this module at the same time, will the concurrent downloads cause issues?
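One common way to make an import-time side effect like this safe across processes is to guard it with a file lock and a completion marker. The sketch below is only an illustration under stated assumptions: `run_once_across_processes` and the marker path are hypothetical names, it relies on the POSIX-only `fcntl` module, and it takes the download function as a parameter rather than calling nltk directly.

```python
import fcntl
import os
from typing import Callable


def run_once_across_processes(marker_path: str, action: Callable[[], None]) -> bool:
    """Run `action` at most once across processes.

    Returns True if this call ran the action, False if it was already done.
    """
    lock_path = marker_path + ".lock"
    with open(lock_path, "w") as lock_file:
        # Block until no other process holds the exclusive lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if os.path.exists(marker_path):
                return False  # another process already completed the work
            action()
            # Write the marker only after the action finished without raising.
            with open(marker_path, "w") as marker:
                marker.write("done\n")
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Under this sketch, the module-level call would become something like `run_once_across_processes(marker, download_punkt_tab)`, so concurrent importers wait for the first download instead of racing it.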

parthchadha · 2025-05-19T21:35:50Z

examples/configs/grpo_math_1B.yaml

@@ -58,6 +58,8 @@ policy:
  # training and logprob stages respectively.
  dynamic_batching:
    enabled: True
+    # train_mb_tokens: 2048


Remove dead code?

parthchadha · 2025-05-20T18:16:06Z

nemo_rl/environments/instruction_following/instructions_registry.py

+}
+
+
+def conflict_make(conflicts):


Is ensure_bidirectional_conflict a better name?
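To make the proposed name concrete: assuming the function's job is to guarantee that if A lists B as a conflict then B also lists A, a minimal sketch could look like this. The body is an illustration, not the code from the PR (only the `def` line is visible in the diff), and the keys in the usage note are made up.

```python
def ensure_bidirectional_conflict(
    conflicts: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Make the conflict relation symmetric, in place, and return it."""
    # Snapshot the entries first so that mutating the sets (or adding
    # new keys) does not disturb the iteration.
    snapshot = [(key, set(others)) for key, others in conflicts.items()]
    for key, others in snapshot:
        for other in others:
            # If `key` conflicts with `other`, record the reverse direction.
            conflicts.setdefault(other, set()).add(key)
    return conflicts
```

For example, `{"a": {"b"}, "b": set()}` becomes `{"a": {"b"}, "b": {"a"}}`.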

parthchadha · 2025-05-20T18:17:23Z

nemo_rl/environments/ifeval_environment.py

+        results = []
+        for response, prompt, metadata in zip(pred_responses, prompts, metadata):
+            try:
+                # with _mute_output():


Suggested change

-# with _mute_output():

parthchadha · 2025-05-20T18:17:52Z

nemo_rl/environments/ifeval_environment.py

+        chunked_prompts = chunk_list_to_workers(prompts, self.num_workers)
+        chunked_metadata = chunk_list_to_workers(metadata, self.num_workers)
+
+        # # Process each chunk in parallel


Suggested change

-# # Process each chunk in parallel
+# Process each chunk in parallel

terrykong

could you please add tests?

terrykong · 2025-05-27T22:17:57Z

nemo_rl/environments/instruction_following/instructions.py

+packages = ["absl-py", "langdetect", "nltk==3.9.1", "immutabledict"]
+# Run pip install for each package
+for package in packages:
+    subprocess.run(["python3", "-m", "pip", "install", package])


can you comment this stuff out? this isn't necessary in nemo-rl

Add IFEval Environment

b8d3bef

Signed-off-by: abukharin-nv <[email protected]>

abukharin-nv requested review from terrykong and gshennvm May 19, 2025 14:06

abukharin-nv assigned gshennvm May 19, 2025

abukharin-nv changed the title ~~Add IFEval Environment~~ feat: Add IFEval Environment May 19, 2025

SahilJain314 reviewed May 19, 2025

Fix OmegaConf Issue, clean up deps

ded64f4

Signed-off-by: abukharin-nv <[email protected]>

SahilJain314 reviewed May 19, 2025

Clean up uv stuff

4540c83

Signed-off-by: abukharin-nv <[email protected]>

terrykong reviewed May 20, 2025

parthchadha requested changes May 20, 2025

terrykong reviewed May 20, 2025

terrykong reviewed May 27, 2025
