-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BC] Add an interactive only mode for env.py #377
Changes from 22 commits
a06d452
15876f7
9bc8c05
86e4d12
27dee69
47f5efc
bf12cee
f5b6b6f
35d9e8c
5d6783d
7997f14
59d3677
6d8c0c7
3b0cefd
78460ce
5b5d67b
6be7186
6342dda
3082ae7
2e26243
56fa72a
463d813
5568aaf
0c20688
ac307fc
07a77ce
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -215,6 +215,32 @@ def test_env(self, mock_popen): | |
step = test_env.step(np.array([1], dtype=np.int64)) | ||
self.assertEqual(step.step_type, env.StepType.LAST) | ||
|
||
@mock.patch('subprocess.Popen') | ||
def test_env_interactive_only(self, mock_popen): | ||
mock_popen.side_effect = mock_interactive_clang | ||
|
||
test_env = env.MLGOEnvironmentBase( | ||
clang_path=_CLANG_PATH, | ||
task_type=MockTask, | ||
obs_spec={}, | ||
action_spec={}, | ||
interactive_only=True, | ||
) | ||
|
||
for env_itr in range(3): | ||
del env_itr | ||
step = test_env.reset(_MOCK_MODULE) | ||
self.assertEqual(step.step_type, env.StepType.FIRST) | ||
|
||
for step_itr in range(_NUM_STEPS - 1): | ||
del step_itr | ||
step = test_env.step(np.array([1], dtype=np.int64)) | ||
self.assertEqual(step.step_type, env.StepType.MID) | ||
|
||
step = test_env.step(np.array([1], dtype=np.int64)) | ||
self.assertEqual(step.step_type, env.StepType.LAST) | ||
self.assertEqual(step.reward, {'default': 0.}) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is the only way we know it's interactive by seeing the reward be 0? Can we test that the 2 iclangs are the same, too? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can directly check if test_env._iclang and test_env._clang are the same object. |
||
|
||
|
||
if __name__ == '__main__': | ||
tf.test.main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we return iclang here, i.e. not a tuple, and detect that on the receiving end? or (iclang, None)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relative reward is computed, that is, the compute_relative_rewards function of env.py always expects two scores. I found this to be the cleanest change. Returning only iclang and None will end up being identical but would require multiple changes and I didn't want to break anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - can you update the documentation string explaining this, too? (i.e. it's not quite "
interactive_only: If set to true only iclang is yielded
" - looking at the docstring, the return isn't quite well specified actually, what the generator generates is what's interesting - could you basically make the doc "right"?)