Why games?
Every search problem so far had one mover: you. You chose every action, and the world sat still between your moves. A game breaks that assumption. Between your moves, an opponent moves, and the opponent wants exactly what you do not. Planning now means planning against a mind that is planning against you.
The setting we study
Chapter 5 concentrates on the cleanest, most-studied class of games, the one chess, checkers, and tic-tac-toe all belong to. A game in this class is:
- Two-player. Exactly two agents move. We name them MAXThe player we are deciding for; MAX tries to maximise the final utility.اللاعب الأعظم (the player we are deciding for) and MINThe opponent; MIN tries to minimise MAX's utility.اللاعب الأصغر (the opponent).
- Turn-taking. MAX moves, then MIN, then MAX, alternating.
- Deterministic. A move has one fixed outcome. No dice, no shuffled deck.
- Perfect information. Both players see the whole board at all times. Nothing is hidden.
- Zero-sum. One player's gain is exactly the other's loss. Whatever is good for MAX by some amount is bad for MIN by the same amount. There is no outcome that is good for both.
Games as a search problem, and why they are hard
Playing a game is a search: from the current position, look ahead through the moves available to you, the replies available to the opponent, your replies to those, and so on, until you can tell winning lines from losing ones. The search space is a game treeA tree whose root is the current position, whose edges are legal moves, and whose levels alternate between the two players.شجرة اللعبة.
The catch is size. Chess has an average branching factorThe average number of legal moves available at a position; the average number of children of a node.معامل التفرّع of about b ≈ 35, and a game runs to roughly m ≈ 100 ply. A full game tree therefore holds about 35100 ≈ 10154 nodes, more than the number of atoms in the observable universe. We can never build it. So the real problem is not "search the tree" but:
The game tree
A game tree turns "what could happen" into a picture. The root is where we are now. Each downward step is one legal move by whoever is to play. Levels alternate owners: MAX chooses at one level, MIN at the next. At the bottom, the game is over and we can say, plainly, who won.
Six parts, like any search problem
A game is defined by the same kind of schema we used for ordinary search, with two twists: a function that says whose turn it is, and a utility on terminal states instead of a goal test.
Zero-sum, drawn as one number
Because the game is zero-sum, we do not need two utilities. One number suffices, read from MAX's perspective: high is good for MAX, low is good for MIN. MAX climbs, MIN drags down. Every value in every tree below is "goodness for MAX". This single convention is what makes the next algorithm so short.
Minimax
Minimax answers one question: if the opponent always replies as well as they possibly can, what is the best I can guarantee? It computes a value for every node by reading the tree from the leaves upward (MAX nodes take the maximum of their children, MIN nodes take the minimum), and then MAX plays toward the child that produced the root value.
The rule, in one line each
Read it as a recipe with four steps, exactly as the notes phrase it for the MAX player: (1) generate the tree as deep as time permits; (2) apply the utility / evaluation function to the leaves; (3) back the values up, minimum at MIN nodes and maximum at MAX nodes; (4) at the root, MAX chooses the move that led to the highest backed-up value.
Watch it back values up
Below is the canonical three-level tree (root MAX, then MIN, then the leaves), the same shape the lecture uses for its worked examples. Step through it: leaves are read left to right, each MIN node settles on the smallest of its children, and the root MAX takes the largest of the MIN values. The thick red line is the principal variationThe line of best play for both sides: MAX's best move and MIN's best replies. Its value is the minimax value of the root.المسار الرئيسي: best play for both sides.
Properties of minimax
Minimax is a depth-first walk of the game tree, so its costs are the depth-first costs, with a catch that makes the time unaffordable for any interesting game.
Alpha–beta pruning
Minimax visits every leaf. Most of them cannot possibly change the root's value. Alpha–beta pruning proves, mid-search, that a whole branch is irrelevant and skips it, and it does so without changing the final result at all. It returns exactly what minimax would; it just refuses to waste time.
Two numbers carried down the path
As the search descends, it keeps two bounds, valid for the current path from the root:
The window [α, β] is the range of outcomes still "in play". The cut rules are mirror images:
- β-cut (at a MAX node). α is raised by each child. If a child makes the node's value ≥ β, stop: the MIN parent above already has a reply at least this good for itself and will never let MAX come here. "If the current max value is greater than the successor's min value, don't explore that subtree any more."
- α-cut (at a MIN node). β is lowered by each child. If a child makes the node's value ≤ α, stop: the MAX parent above already has a move at least this good and will never come here. "If the current minimum is less than the successor's max value, don't look down that subtree any more."
The same tree, now pruned
This is the very tree from Section III, searched depth-first left to right with alpha–beta switched on. Watch the window [α, β] at each node, and watch whole branches turn to dashed grey the instant they are proven irrelevant. The backed-up root value is identical to plain minimax; count how many leaves were never read.
Move ordering and efficiency
Alpha–beta prunes more when good moves are tried first, because a strong early value tightens the window sooner. The order in which children are examined does not change the answer, but it changes how much work the answer costs, dramatically.
From bm toward bm/2
With perfect ordering (best move first at every node), alpha–beta examines only O(bm/2) nodes instead of O(bm). The exponent halves. In plain terms:
Try it: the same tree, with its three top branches reordered. Best-first ordering prunes hardest; worst-first prunes least. The leaves-read counter tells the story.
Imperfect real-time decisions
We cannot reach real terminal states in chess, so we cheat honestly: search to a fixed depth, then estimate how good each cut-off position is with an evaluation functionA heuristic that estimates the expected utility of a non-terminal position without searching further.دالة التقييم. Two changes to minimax make this work.
Cut-off test replaces terminal test
Instead of "is the game over?", we ask "have we searched deep enough?". When yes, the node is treated as a leaf and handed to the evaluation function. The nodes at the cut-off depth become the new leaves; their estimated values are backed up by minimax exactly as before.
What an evaluation function looks like
It is intensely game-specific and encodes human knowledge. The crudest ("stupid") one returns +1 / 0 / −1 for win/draw/loss, useless until the very end because it is flat everywhere else. Useful evaluators score many features and combine them, almost always as a weighted linear sum:
For chess, the classic features are material and structure: f₁ = (my pieces − your pieces), f₂ = +1 if I have a queen and you do not, a rook-advantage term, and so on. The quality of the evaluator rests on three things: which features you measure, how each is scored, and how they are weighted. And note: the absolute value does not matter, only the relative ordering of positions, because minimax only ever compares siblings.
Try a real one: tic-tac-toe
A simple, genuinely-used tic-tac-toe evaluator counts the lines (rows, columns, diagonals) still open to each player (a line is open for X if it contains no O, and vice-versa):
Click the cells to cycle empty, then X, then O, and watch the value update. This is the number minimax would back up from this position at depth 0. A position where X threatens many lines and blocks O's scores high (good for X = MAX).
Play against minimax
Tic-tac-toe is small enough to search completely: no cut-off, no evaluation heuristic, just real minimax to true terminal states. So this opponent plays the exact minimax move every time. The theorem says optimal play forces at worst a draw: you cannot win. The best you can do is draw. Try.
Case studies and state of the art
Checkers, with a real evaluation function
The lecture's checkers study uses exactly the weighted-feature shape from Section VII, weighting kings far above single pieces:
From a given position, minimax (or alpha–beta) is run over black's moves and red's replies, the leaves are scored by E(s), and the values back up to choose black's move, the same procedure you stepped through above, now with a heuristic at the leaves and a king worth five.
The milestones these two ideas built
Culminating exercise
This is the exercise tree from the lecture notes. Four levels (max, min, max, min from the top), with the leaf utilities given. Work it on paper first: back the values up, then decide which branches alpha–beta would cut under left-to-right search. Then step the widget and check yourself.
The tree
Leaves, left to right: E=10, F=11, H=9, I=12, L=14, M=15, O=13, P=14, T=15, V=2, W=4, X=1, Z1=3, Z2=22, Z4=24, Z5=25. Root A is a MAX node.
Cheat sheet
| Minimax | Alpha–Beta | |
|---|---|---|
| Returns | exact minimax value | same exact value |
| Idea | MAX maximises, MIN minimises; back up from leaves | skip branches that cannot change the result |
| Carries | just node values | also α (best for MAX) and β (best for MIN) along the path |
| Cut rule | – | at MAX: stop if value ≥ β; at MIN: stop if value ≤ α |
| Time | O(bm) | O(bm) worst, O(bm/2) with perfect ordering |
| Space | O(b·m) | O(b·m) |
| Complete / Optimal | yes / yes (vs optimal MIN) | yes / yes (identical to minimax) |