Mini Project 4: Tic Tac Toe!
In this homework, you will:
- Build an agent that plays regular 3x3 Tic Tac Toe.
- Build an agent that plays 5x5 Tic Tac Toe with random walls.
- Improve your agent as much as you can and participate in the class’s Tic Tac Toe tournament!
Code Organization
Synchronize your fork with our course’s repo. You will find the stencil here.
The code has the following directories and layout:
tic_tac_toe_stencilcontains the stencil for the Tic Tac Toe game. The most important file in the stencil (and one you should familiarize yourself with well) is board.rs which contains two type definitions:CellandBoard. You should specifically look at:- The four cases of
Cell:Emptyrepresents an empty cell where players may place an X or O,Wallrepresents a blocked cell where no players can place anything (this is only relevant for part 2), andXandOrepresent a cell where a player has put anxoroalready. board.moves()returns all available moves in that board, i.e., the locations of all empty cells.board.game_over()returns true if the board is complete and false if it is still on going.board.score()returns the score of the given board: 1 indicates thatXhas 3Xs in a row, 0 indicates a draw, and-1indicates thatOhas 3Os in a row.board.apply_move(location, player)applies a move by the given player at the given location. For example,board.apply_move((0, 0), Player::X)puts anXat the top left most corner. Hint: does this function take the board (i.e. self) by copy, move, or ref? What does that mean about the board?board.get_cells()returns all the cells in the board as a 2D vec. E.g.,board.get_cells()[1][1]returns theCellin the middle of the board, which for example you can then analyze using amatchstatement.
- The four cases of
- agents.rs contains a variety of test agents that we provide to you. Take a look at them for inspiration.
tic_tac_toe_3x3contains the stencil and tests for part 1, you should provide your solution intic_tac_toe_3x3/src/solution/agent.rs.tic_tac_toe_5x5contains the stencil and tests for part 2, you should provide your solution intic_tac_toe_3x3/src/solution/agent.rs.
We won’t tell you how to split the work between your teams or how to use git. We will check your git commit history to make sure all members of your team made equal contributions.
Make sure your submission for part 1 is on a branch called tictactoe3. For part 2, your submission should be on a branch called tictactoe5. Your tournament submission should be on a branch called tournament.
Part 1: 3x3 Regular Tic Tac Toe
We will start with the 3x3 classic tic tac toe game.
Execute the following command, which will allow you to test out the stencil and play a game against our random agent (an agent that makes moves randomly).
cd tic_tac_toe_3x3
cargo run --release -- --x manual --o random
This will allow you to play an interactive game of tic tac toe by typing in the cell you would like to make a move in. Play a couple of rounds to refresh your memory of the game and its rules.
You can also play as the O player:
cargo run --release -- --x random --o manual
You can find a listing of all available agents by running:
cargo run --release -- --help
You will notice that one of the available agents is solution. This represents your solution agent. Try running it in a game against the random agent:
cargo run --release -- --x solution --o random
You will notice that the game crashes with an not implemented error. Your task is to implement this agent! Your implementation should go in tic_tac_toe_3x3/src/solution/agent.rs.
(regular) tic tac toe is a solved game, there’s a well known strategy (you can find it online!) that ensures you never lose – you will in the worst case draw regardless of whether you are X or O. However, we do not want you to implement this strategy. This strategy only works for regular tic tac toe. It does not work for other games, including the 5x5 variant of tic tac toe you will work on in part 2.
Instead, we want you to implement a general solution/strategy which you will be able to reuse with some tweaks for part 2. This strategy is based on the min-max algorithm – a general algorithm that can play any two player game optimally, provided the state of the game is public (no fog of ware)!
Min Max
Min Max explores all the possible moves that each players can make.
This exploration takes the form of a tree. The starting root of the tree is the current state of the board. Min max finds the next level of possibilities in the tree by applying each possible move to the root independently. It then repeats this process for every new board, until no more moves are available.
When the algorithm reaches a terminal game state (i.e. a board with no more moves or a board with 3 consecutive Xs or Os), it assigns a score to the board based on who won. In this implementation, we assign a score of 1 when X wins, -1 when O wins, and 0 for draw.
The algorithm then propagates the score back up the tree all the way to the root. When the algorithm is going up moves made by player X, it selects the move that yields the maximum score, when the move is made by player O, it selects the move with the minimum score.
For more exploration of the algorithm, as well as implementations of it in other languages, check these resources out:
- https://www.geeksforgeeks.org/minimax-algorithm-in-game-theory-set-1-introduction/
- https://www.youtube.com/watch?v=5y2a0Zhgq0U
Feel free to follow this high-level implementation plan:
- Check whether the game is over by calling
board.game_over(), if it is, return the score and zero move. - If the game is not over, get all the possible/available moves using
board.moves(). - Apply each move to the board, then recursively call the solution function on that modified board, this will give you the score associated with that move. Find the move with the score that is best for the current player.
- Hint: when you apply the move to the board, you will change the board, consider cloning it to keep the original board in case you need it later.
- Hint: when you recursively call the solution function, think about whose turn it is then.
- Hint: your agent has a time limit. This limit is generous for 3x3 tic tac toe. However, consider optimizing your code a little – see if you can find a way to get rid of needless clones.
Testing Your Solution
Play a couple of rounds against your agent. Importantly, you must never be able to defeat it. If you did, then the agent has a bug that you need to fix.
cd tic_tac_toe_3x3
# Your solution is X, you are O
cargo run --release -- --x solution --o manual
# Your solution is O, you are X
cargo run --release -- --x manual --o solution
After that, consider testing your agent using the provided tests. Start by testing it against the first move agent:
cd tic_tac_toe_3x3
cargo test --release --test first
The tests will sometimes run your solution as player X and sometimes as player O. A common bug students face is that they write their solution assuming they are player X (thus, always trying to maximize the score). Remember to look at the player argument the stencil provides to your solution to decide whether to maximize or minimize in order to avoid this bug.
A correct solution should pass all these tests. If it does not, the error message will show you the game history including every move your agent took. Look at it and see if you can use that information to detect and fix bugs in your solution.
You should also test your agent against the remaining tests we provide:
cd tic_tac_toe_3x3
cargo test --release
To get full credit for this part, you solution must pass all the tests. Additionally, it must never lose to our super secret and all powerful testing agent which we have not provided to you. (The super secret agent itself also uses min max!)
Part 2: 5x5 Tic Tac Toe with Random Walls
For this part, you will work on https://github.com/rust4ds/ds210-sp26-a1-code/tree/main/project_4_tic_tac_toe/tic_tac_toe_5x5, which is the stencil for a variant of 5x5 tic tac toe game with random walls.
The purpose of the game is to get more consecutive sequences X’s or O’s than your opponent. Each 3 consecutive X’s increase the X player score by 1, and similarly for the O player. The player with the higher score wins. The sequence can be over a row, column, or diagonal.
As in the previous part, we will not tell you how to split the work between the team members, but we will check your commit history and we expect all team members to contribute equally.
Step 0 - Getting Started
We expect your solution to be on a branch called tictactoe5. It is a good idea to branch of your part 1 branch (tictactoe3) so that you
can build on top of your part 1 solution without modifying your part 1 submission.
After you branch out, you should also synchronize and merge with the main branch of our repo to have out latest changes.
Now, start by observing how this new game variant works.
First, run two random agents against each other:
cd tic_tac_toe_5x5
cargo run --release -- --x random --o random --layout 5
Observe the following:
- The game is not over when you get 3 Xs (or Os) in a row. Instead, it continue until all cells are occupied.
- The player with the most number of 3 symbols in a row wins.
- Having 4 consecutive Xs (e.g., in a row, for example, X X X X W) increases the score by 2 (one for each of X X X X W and X X X X W).
Second, play a few games against the random agent.
cargo run --release -- --x manual --o random --layout 5
When you find yourself consistently winning, try a few games against our test agent!
cargo run --release -- --x manual --o test --layout 5
In general, you can run the stencil code with different agents and for different board layouts!
cargo run --release -- --x <agent> --o <agent> --layout <layout>
agent can be any of first, random, test, manual, and solution.
layout can either be 3x3, or a number between 0 and 10. The first is a premade
layout that is identical to a regular 3x3 tic tac toe. The later correspond to a layout
with as many walls as the provided number, which are randomly placed over the board.
Step 1 - Naive Solution
Start by copying over your 3x3 solution to tic_tac_toe_5x5/src/solution/agent.rs. Your solution should be compatible with 5x5 if you implemented in a general enough way.
If your solution makes any hardcoded uses of the dimensions of the board or its state, you may need to modify that to make it compatible with the 5x5 bigger board.
Your solution should work OK for the 3x3 layout. Try it out.
cargo run --release -- --x solution --o random --layout 3x3
In fact, your solution should pass all of our 3x3 tests!
cargo test --release --test threebythree
What about other layouts? We can try:
cargo test --release
Depending on how fast your computer is, you may see one of these outcomes:
- You have a fast computer and all tests pass.
- Your computer is on the slow side, and some tests do not pass because your agent exceeds the time limit.
Notice that we run these tests in release mode, where the Rust compiler optimizes the program as much as possible. However, our tests and tournament will run your program without release mode –
because we want you to get experience manually optimizing your program both in terms of implementation and algorithms!
To mimic our test and tournament setup, run the tests without --release!
cargo test ---test threebythree
cargo test
For most of you, the naive 3x3 solution should pass the threebythree tests but time out and forfeit on the remaining tests. We will need to optimize the solution in the next steps to make it work!
Important: Calibrating the time limit on your computer
If your naive solution passes the threebythree tests without --release and without needing to adjust anything in the code, you can skip over to the next step.
For folks with slow computers: your solution may not even pass threebythree, in which case, you should increase the time limit. By default, this limit is set to 2 seconds (2000 milliseconds).
Try increasing it to 3, 4, 5, … You can increase the time limit by changing line 6 in tic_tac_toe_5x5/tests/threebythree.rs. Choose the smallest number that allows your tests to pass.
After you find the time limit that is appropriate to your computer, make sure you update the time limit consistently in all the other test files under tic_tac_toe_5x5/tests/ and in tic_tac_toe_5x5/src/main.rs.
About the tests and tournament setup: we have dedicated a fast machine to run your code for the tests and tournament, which will give you a little wiggle room. Your machine may be slower, the above approach to increasing the time limit acts as a calibration. As long as your submission passes all the tests within the calibrated time limit, you will be good to go for our tests and tournament.
Step 2 - Limit Exploration Depth and Implement a Heuristic
Modify your implementation to only explore possibilities up to a fixed depth. Try to set the depth to be the biggest possible value with which your implementation still succeeds within the time limit. Depending on how optimized your code is (e.g., whether it clones the board or is clone free), this depth limit will likely be somewhere between 3 and 6. Exploring 4 levels deep (i.e. two moves by X and two moves by O) within the time limit should be sufficient to pass the tests, but you may want to optimize your code so you can explore more levels/depth for the tournament.
You can find an example of depth-limiting your min-max algorithm in different languages here.
Specifically, note how the minimax helper function in any of the languages in that link takes a current depth and a target or maximum depth as parameters, and it compares whether the target
depth was reached before doing any minimizing or maximizing.
Hint: you should create a new helper function that tracks the depth in a similar way, and when the maximum target depth is reached, it should stop exploring deeper moves, and instead rely on a heuristic function (see below) to assign a predicted score for the board and return it.
Heuristic function: After exhausting the maximum depth or number of levels, your modified implementation needs to assign the board an approximate score, even when it is not game over. To do that we use a heuristic function (sometimes called an evaluation function).
Add a heuristic function to tic_tac_toe_5x5/src/solution/agent.rs to assign scores to incomplete boards.
Hint 1: The heuristic function signature should be fn heruistic(board: &Board) -> i32.
Hint 2: board.score() is a good starting heuristic. This will not tell you exactly what the final score of following this path will be, but it tells you whether the current state favors one player more than the other.
Try to understand what that function does. This should get you to pass most, if not all, of our tests.
You should test your modified agent on the full board, not the 3x3 layout. You can do that using this command:
cargo run -- --x solution --o first --layout 5
cargo test
Step 3 - Make your Heuristic more Intelligent
Play the game a few times and try to develop a strategy for how to evaluate the potential on the board, beyond just the current score. For example, having 2 consecutive symbols in a row may be more valuable than having none.
You can also consider whether the symbols are placed near empty cells (giving the player a chance to expand) vs near occupied cells (walls or other player cells) or near the edge of the board!
Code your heuristic up and ensure it passes all the provided tests. Your agent plus intelligent heuristic should perform better than the version of the agent that uses the board score as a heuristic. In fact, we will tests your agent against that solution and we expect you to consistently beat it to receive full grade.
You can test this by playing against the agent. You can also modify files under tic_tac_toe_5x5/src/ (e.g., main.rs, args.rs) so that it can run different versions of your solution against each other (e.g., consider saving old versions using a new file/new struct with a different name).
Ensure that you’re agent is functional when it’s used for the O player as well. The provided tests contains cases that test this, but test it manually as well just to be sure. We will play your agent as player X and player O during the tournament (more on this later).
Submission
When you are happy with your solution, make sure you have pushed it to a branch called tictactoe5 and submit via Gradescope.
You will also need to submit a team name that you will use for our tournament. The team name must start with a letter, and can only contain English letters, digits, and underscores – no spaces, no special symbols.
If you want to also use your part 2 submission for the first run of our tournament, branch off to a new branch called tournament and push your code there.
For full grade for part 2 your solution needs to pass all the tests without --release within the time limit on our auto grader machine. It will also need to consistently beat our naive solution (that matches step 2 above).
Part 3: Tournament
After part 2 is due, we will run a tournament daily at mid night where we play all your solutions against each other. The team with most wins will lead the board, and we will break ties separately.
You will be able to check the scores online – we will post the link to Piazza.
You can look at the results and games your agent performed and use that info to improve your agent. You can submit your improvements to the next tournament by pushing your code to a branch called tournament.
We will continue to run the tournament daily until May 1st, pulling in your most recent code from GitHub automatically. Our last run will be on 11:59 PM on May 1st. The results from that tournament will be our final standings! This is not cumulative – the standings reset between each run. So the team with the best agent at the end wins!
The tournament will also contain two agents created by the professor. One is a medium-difficulty and the second is an intelligent agent that the professor spent around half a day designing.
For full grade for the tournament your solution to rank higher than the medium professor agent.
Bonus Credit: All teams that additionally beat the harder professor agent will receive a 10% bonus for this project. Furthermore, the top three teams will receive an additional 5%, 10%, and 15% (for a total of 25%) provided they also beat the hard professor agent.
How to improve your solutions
There is almost an unlimited number of ways you can improve your solutions. Be creative. Here are some ideas:
- Make your heuristic smarter:
- Encode a smarter strategy for evaluating how good a board is.
- Consider using different heuristic functions for different stages of the game (e.g., in the beginning vs the end game!).
- If you are really ambitious, you could even generate and save many (thousands) of games by running your naive agents against each other, and use machine learning to learn a better heuristic. Try this at your own risk.
- Optimize your code to explore more depth within the time limit:
- Avoid clones and recomputing vectors or heuristics if you can.
- Consider implementing alpha beta pruning.
- Your agent receives the time limit as a parameter, consider exploring as deep as possible until the time limit is dangerously close, and returning the best move you found.
- Similarly, consider the effects of the game state on the maximum depth you can support.
- Redesign your solution so it does not use recursion: plain iteration is faster than recursion. Consider using a queue to perform a breadth first search traversal.
- Consider using parallelism or multi-threading to explore the search space quicker.