Update the TicTacToe environment #1192

Merged: 27 commits, May 3, 2024

Commits on Mar 9, 2024

  1. Simplify TicTacToe reward accumulation

    Rewards are only set once, so they only need to be accumulated once.
    No need to modify the accumulated rewards if they are not set.
    dm-ackerman committed Mar 9, 2024
    b12f6ba
  2. Don't update TicTacToe agent on winning step

    For unknown reasons, this is required for stable baselines
    training to work correctly. It does not appear to impact other
    usage as the agents are still looped over correctly after the
    game ends - just in a different order.
    dm-ackerman committed Mar 9, 2024
    24f7569
  3. Move TicTacToe test for valid moves to Board

    This makes the env code less cluttered and better encapsulates
    the board behavior. It also expands the checks for a valid move.
    dm-ackerman committed Mar 9, 2024
    e3f9cc9
  4. Hard code winning lines in TicTacToe board

    These never change. There is no reason to recalculate them constantly.
    dm-ackerman committed Mar 9, 2024
    6708fdb
  5. ea15ddf
  6. 967caae
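
To illustrate the ideas behind "Move TicTacToe test for valid moves to Board" and "Hard code winning lines in TicTacToe board" in the list above, here is a minimal sketch. It is not the code from this PR: the names `WINNING_LINES`, `BadMoveError`, `Board.play_turn`, and `Board.winner`, and the encoding of marks as 0/1/2, are assumptions made for the example.

```python
# Hypothetical sketch; names and encodings are illustrative assumptions.
EMPTY = 0

# The eight winning lines of a 3x3 board with squares indexed 0-8.
# They never change, so they are written out once as a constant.
WINNING_LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # three in a row along one axis
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # three in a row along the other axis
    (0, 4, 8), (2, 4, 6),             # diagonals
)


class BadMoveError(Exception):
    """Raised when a move is out of range or targets an occupied square."""


class Board:
    def __init__(self):
        self.squares = [EMPTY] * 9

    def play_turn(self, mark, pos):
        """Place `mark` (1 or 2) at `pos`; the board validates the move itself."""
        if mark not in (1, 2):
            raise BadMoveError(f"invalid mark: {mark}")
        if not 0 <= pos < 9:
            raise BadMoveError(f"position out of range: {pos}")
        if self.squares[pos] != EMPTY:
            raise BadMoveError(f"square {pos} is already taken")
        self.squares[pos] = mark

    def winner(self):
        """Return the mark that completed a line, or None if there is none."""
        for a, b, c in WINNING_LINES:
            first = self.squares[a]
            if first != EMPTY and first == self.squares[b] == self.squares[c]:
                return first
        return None
```

Keeping the validation inside the board means the environment only has to call `play_turn` and handle one exception type, rather than inspecting the squares directly.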

Commits on Mar 11, 2024

  1. e6851a1
  2. 42794bc
  3. Update TicTacToe winning code

    removes the duplicate calls to check winner
    dm-ackerman committed Mar 11, 2024
    2847c17
  4. eda5033
  5. 364c307

Commits on Mar 13, 2024

  1. Add legal_moves() to TicTacToe board

    This keeps the env from needing to access the internals
    of the board to get the moves available.
    dm-ackerman committed Mar 13, 2024
    758e6a2
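
As a rough illustration of the "Add legal_moves() to TicTacToe board" commit, the sketch below shows how an environment could build an action mask using only a public board method. Only the method name `legal_moves` comes from the commit title; the `squares` attribute, the use of 0 for an empty square, and the `action_mask` helper are assumptions for this example.

```python
# Hypothetical sketch; only the name legal_moves() comes from the commit title.
class Board:
    def __init__(self):
        self.squares = [0] * 9  # assumed encoding: 0 marks an empty square

    def legal_moves(self):
        """Return the indices of all empty squares."""
        return [i for i, mark in enumerate(self.squares) if mark == 0]


def action_mask(board):
    """Build a 9-entry 0/1 mask without touching the board's internals."""
    legal = set(board.legal_moves())
    return [1 if i in legal else 0 for i in range(9)]
```

With this split, a change to how the board stores its squares would not require any change to the code that builds the mask.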

Commits on Mar 20, 2024

  1. Remove win detection short-cut in TicTacToe

    This causes problems with test cases that have invalid
    board configurations to test win lines.
    dm-ackerman committed Mar 20, 2024
    8b7ce5e
  2. 5bebafa
  3. 8e33bf7
  4. df4c950
  5. f750754
  6. 72ef62f
  7. e4bd228
  8. 4f35a22

Commits on Mar 22, 2024

  1. 1f95299
  2. 39de495
  3. 6336393

Commits on May 3, 2024

  1. Fix agent swap in TicTacToe

    This now always switches agents in each step. The previous change was a
    bug that "fixed" the behaviour due to a bug in the SB3 tutorial.
    
    The agent should be swapped every step as is done now.
    dm-ackerman committed May 3, 2024
    71ca217
  2. revert TicTacToe version to 3

    The change to v4 was due to the agent handling at the end of an episode
    
    The other changes to the env don't change the behaviour of the env, so
    it is left at v3.
    dm-ackerman committed May 3, 2024
    4170f2b
  3. da8373e
  4. 44716c3
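
The behaviour described in "Fix agent swap in TicTacToe" (the selected agent changes on every step, including the step that ends the game) can be sketched as follows. This is a toy stand-in, not the PettingZoo environment: `TinyEnv`, its attributes, and the mark encoding are assumptions for the example.

```python
# Hypothetical sketch of "swap the agent on every step"; not the real env.
WINNING_LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)


class TinyEnv:
    agents = ("player_1", "player_2")

    def __init__(self):
        self.squares = [0] * 9  # 0 = empty, 1 / 2 = the two players' marks
        self.turn = 0           # index into self.agents
        self.done = False

    @property
    def agent_selection(self):
        return self.agents[self.turn]

    def step(self, pos):
        mark = self.turn + 1
        self.squares[pos] = mark
        won = any(all(self.squares[i] == mark for i in line)
                  for line in WINNING_LINES)
        self.done = won or 0 not in self.squares
        # Swap unconditionally, even when this move ended the game.
        self.turn = 1 - self.turn


env = TinyEnv()
for pos in (0, 3, 1, 4, 2):  # player_1 completes the line 0-1-2
    env.step(pos)
# The game is over, and the selection has still moved on to player_2.
print(env.done, env.agent_selection)  # True player_2
```

Swapping unconditionally keeps the turn order uniform, so callers do not need a special case for the final step of an episode.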