Update the TicTacToe environment #1192
Commits on Mar 9, 2024
Simplify TicTacToe reward accumulation
Rewards are only set once, so they only need to be accumulated once. There is no need to modify the accumulated rewards if they are not set.
Commit b12f6ba
Don't update TicTacToe agent on winning step
For unknown reasons, this is required for Stable Baselines training to work correctly. It does not appear to impact other usage, as the agents are still looped over correctly after the game ends, just in a different order.
Commit 24f7569
Move TicTacToe test for valid moves to Board
This makes the env code less cluttered and better encapsulates the board behavior. It also expands the checks for a valid move.
Commit e3f9cc9
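As a rough illustration of moving the valid-move test into the board, the sketch below keeps all validation inside a board class so the env only has to call one method. The `Board` layout, the `is_valid_move` and `play_turn` names, and the specific checks are assumptions made for this example, not the code from this PR.

```python
class Board:
    """Hypothetical 3x3 board: 0 = empty, 1 and 2 = player marks."""

    def __init__(self):
        self.squares = [0] * 9

    def is_valid_move(self, player, pos):
        # Goes beyond "is the square empty": also checks that the player
        # is known and that the position is an in-range integer.
        return (
            player in (1, 2)
            and isinstance(pos, int)
            and 0 <= pos < 9
            and self.squares[pos] == 0
        )

    def play_turn(self, player, pos):
        # The board rejects bad moves itself, so callers need no extra checks.
        if not self.is_valid_move(player, pos):
            raise ValueError(f"invalid move: player={player}, pos={pos}")
        self.squares[pos] = player
```

With this shape, the env can call `play_turn` directly and treat a `ValueError` as an illegal action.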
Hard code winning lines in TicTacToe board
These never change. There is no reason to recalculate them constantly.
Commit 6708fdb
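The idea of hard-coding the winning lines can be sketched as a module-level constant that the win check iterates over. The `WINNING_LINES` and `check_winner` names, the flat 9-element board, and the row-major indexing are assumptions for illustration; the PR's actual representation may differ.

```python
# The eight winning lines of a 3x3 board, written out once.
# Assumes squares are indexed 0..8 in row-major order.
WINNING_LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
)


def check_winner(squares):
    """Return 1 or 2 if that player owns a full line, otherwise 0."""
    for a, b, c in WINNING_LINES:
        if squares[a] != 0 and squares[a] == squares[b] == squares[c]:
            return squares[a]
    return 0
```

Because the tuple is built once at import time, each win check is a fixed loop over eight triples with no per-call setup.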
Commit ea15ddf
Commit 967caae
Commits on Mar 11, 2024
Commit e6851a1
Commit 42794bc
Remove the duplicate calls to check winner
Commit 2847c17
Commit eda5033
Commit 364c307
Commits on Mar 13, 2024
Add legal_moves() to TicTacToe board
This keeps the env from needing to access the internals of the board to get the moves available.
Commit 758e6a2
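A minimal sketch of how a `legal_moves()` method can hide the board internals from the env. Only the method name comes from the commit title; the `Board` layout and the `action_mask` helper are hypothetical and exist only to show a caller that never touches the board's storage.

```python
class Board:
    """Hypothetical 3x3 board: 0 = empty, 1 and 2 = player marks."""

    def __init__(self):
        self.squares = [0] * 9

    def legal_moves(self):
        # Indices of all empty squares, in ascending order.
        return [i for i, mark in enumerate(self.squares) if mark == 0]


def action_mask(board):
    # Hypothetical env-side helper: builds a 0/1 mask from the public
    # method only, without reading board.squares directly.
    legal = set(board.legal_moves())
    return [1 if i in legal else 0 for i in range(9)]
```

If the board's storage later changes, only `legal_moves()` has to be updated; callers such as `action_mask` stay the same.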
Commits on Mar 20, 2024
Remove win detection short-cut in TicTacToe
The short-cut causes problems with test cases that use invalid board configurations to test win lines.
Commit 8b7ce5e
Commit 5bebafa
Commit 8e33bf7
Commit df4c950
Commit f750754
Commit 72ef62f
Commit e4bd228
Commit 4f35a22
Commits on Mar 22, 2024
Commit 1f95299
Commit 39de495
Commit 6336393
Commits on May 3, 2024
This now always switches agents in each step. The previous change was itself a bug that only appeared to fix the behaviour because of a bug in the SB3 tutorial. The agent should be swapped every step, as is done now.
Commit 71ca217
The change to v4 was due to the agent handling at the end of an episode. The other changes to the env don't change the behaviour of the env, so it is left at v3.
Commit 4170f2b
Commit da8373e
Commit 44716c3