COMP338 · Artificial Intelligence · Birzeit University

Games and adversarial search

Interactive companion · Chapter 5
Russell & Norvig · AIMA (3rd ed.) · Jarrar Ch.5

Lecture slides · Mustafa Jarrar (Birzeit)

This companion follows the course's structure. The original lecture-note decks for Chapters 1–5 are linked below; this page expands Chapter 5 with worked, interactive versions of every idea.

Section I

Why games?

search, but the world fights back

Every search problem so far had one mover: you. You chose every action, and the world sat still between your moves. A game breaks that assumption. Between your moves, an opponent moves, and the opponent wants exactly what you do not. Planning now means planning against a mind that is planning against you.

The setting we study

Chapter 5 concentrates on the cleanest, most-studied class of games, the one chess, checkers, and tic-tac-toe all belong to. A game in this class is:

  • Two-player. Exactly two agents move. We name them MAXThe player we are deciding for; MAX tries to maximise the final utility.اللاعب الأعظم (the player we are deciding for) and MINThe opponent; MIN tries to minimise MAX's utility.اللاعب الأصغر (the opponent).
  • Turn-taking. MAX moves, then MIN, then MAX, alternating.
  • Deterministic. A move has one fixed outcome. No dice, no shuffled deck.
  • Perfect information. Both players see the whole board at all times. Nothing is hidden.
  • Zero-sum. One player's gain is exactly the other's loss. Whatever is good for MAX by some amount is bad for MIN by the same amount. There is no outcome that is good for both.
Why this class firstCards (hidden hands) and backgammon (dice) break perfect information and determinism. They need extra machinery (chance nodes, beliefs). The deterministic, perfect-information case already contains the two ideas that matter most (minimax and alpha–beta pruning), so we master it first. The motivating slide opens with cards, checkers, tic-tac-toe and chess precisely to ask which of these can we plan ahead in; the answer for the deterministic ones is: all of them, in principle.

Games as a search problem, and why they are hard

Playing a game is a search: from the current position, look ahead through the moves available to you, the replies available to the opponent, your replies to those, and so on, until you can tell winning lines from losing ones. The search space is a game treeA tree whose root is the current position, whose edges are legal moves, and whose levels alternate between the two players.شجرة اللعبة.

The catch is size. Chess has an average branching factorThe average number of legal moves available at a position; the average number of children of a node.معامل التفرّع of about b ≈ 35, and a game runs to roughly m ≈ 100 ply. A full game tree therefore holds about 35100 ≈ 10154 nodes, more than the number of atoms in the observable universe. We can never build it. So the real problem is not "search the tree" but:

The two real problems(1) Given the part of the tree we can afford to look at, what is the best move, assuming the opponent plays as well as possible? That is minimax. (2) How do we avoid wasting time on parts of the tree that cannot change the answer (alpha–beta pruning), and how do we stop early and still judge a position well (evaluation functions)?
Section II

The game tree

states, plies, and who wins at the leaves

A game tree turns "what could happen" into a picture. The root is where we are now. Each downward step is one legal move by whoever is to play. Levels alternate owners: MAX chooses at one level, MIN at the next. At the bottom, the game is over and we can say, plainly, who won.

Six parts, like any search problem

A game is defined by the same kind of schema we used for ordinary search, with two twists: a function that says whose turn it is, and a utility on terminal states instead of a goal test.

Initial state S₀
The starting position (e.g. the empty tic-tac-toe board).
PLAYER(s)
Whose turn it is to move in state s: MAX or MIN.
ACTIONS(s)
The set of legal moves in s.
RESULT(s, a)
The state reached by playing move a in s (the transition model).
TERMINAL‑TEST(s)
True when the game is over (win, loss, or draw).
UTILITY(s, p)
A number giving the outcome for player p at a terminal state s. For tic-tac-toe: +1 win, 0 draw, −1 loss, read from MAX's point of view.
Ply, not "move"One step by one player is a plyA single move by a single player. A "move" in everyday speech (white then black) is two plies.نِصف نقلة. Each level of the tree is one ply, so the levels alternate MAX, MIN, MAX, MIN… The tic-tac-toe tree in the lecture notes shows MAX(X) choosing among 9 first moves, MIN(O) replying, and so on, down to TERMINAL boards labelled −1, 0, +1.

Zero-sum, drawn as one number

Because the game is zero-sum, we do not need two utilities. One number suffices, read from MAX's perspective: high is good for MAX, low is good for MIN. MAX climbs, MIN drags down. Every value in every tree below is "goodness for MAX". This single convention is what makes the next algorithm so short.

Section III

Minimax

assume the opponent is perfect, then play your best

Minimax answers one question: if the opponent always replies as well as they possibly can, what is the best I can guarantee? It computes a value for every node by reading the tree from the leaves upward (MAX nodes take the maximum of their children, MIN nodes take the minimum), and then MAX plays toward the child that produced the root value.

The rule, in one line each

MINIMAX(s) =   UTILITY(s) // if s is terminal   max over a of MINIMAX(RESULT(s,a)) // if PLAYER(s) = MAX   min over a of MINIMAX(RESULT(s,a)) // if PLAYER(s) = MIN

Read it as a recipe with four steps, exactly as the notes phrase it for the MAX player: (1) generate the tree as deep as time permits; (2) apply the utility / evaluation function to the leaves; (3) back the values up, minimum at MIN nodes and maximum at MAX nodes; (4) at the root, MAX chooses the move that led to the highest backed-up value.

Watch it back values up

Below is the canonical three-level tree (root MAX, then MIN, then the leaves), the same shape the lecture uses for its worked examples. Step through it: leaves are read left to right, each MIN node settles on the smallest of its children, and the root MAX takes the largest of the MIN values. The thick red line is the principal variationThe line of best play for both sides: MAX's best move and MIN's best replies. Its value is the minimax value of the root.المسار الرئيسي: best play for both sides.

How to read this diagramUp-triangles are MAX nodes, down-triangles are MIN nodes (the left margin labels each row). The number that appears above a node is the value backed up to it; the numbers under the bottom row are the given leaf utilities. Press Step to fill them in one node at a time.
Why MIN helps MAX decideMAX does not get to pick the leaf it likes best. Whatever child MAX moves to, MIN will then steer toward the worst leaf for MAX in that subtree. So the honest value of a MAX move is the minimum its subtree can be forced to. MAX then picks the move whose forced-worst is the least bad. That double "best of the worst" is the whole idea, and it is why the algorithm is called mini‑max.
Section IV

Properties of minimax

correct, optimal, and far too slow

Minimax is a depth-first walk of the game tree, so its costs are the depth-first costs, with a catch that makes the time unaffordable for any interesting game.

Complete?
Yes
if the tree is finite. It will always return a value.
Optimal?
Yes
against an opponent who also plays optimally. If MIN blunders, MAX does at least as well as minimax promised.
Time
O(bm)
it visits every node: b moves at each of m levels.
Space
O(b·m)
depth-first: only the current path plus siblings are held.
The wallFor chess, b ≈ 35 and m ≈ 100, so bm is astronomically large, so an exact minimax is completely infeasible. Two escapes follow: prune the tree without changing the answer (alpha–beta), and stop early using a heuristic instead of reaching real terminals (evaluation functions). Real engines use both at once.
Section V

Alpha–beta pruning

the same answer, a fraction of the work

Minimax visits every leaf. Most of them cannot possibly change the root's value. Alpha–beta pruning proves, mid-search, that a whole branch is irrelevant and skips it, and it does so without changing the final result at all. It returns exactly what minimax would; it just refuses to waste time.

Two numbers carried down the path

As the search descends, it keeps two bounds, valid for the current path from the root:

α (alpha)
the best (highest) value MAX is already guaranteed somewhere along this path. Start it at −∞. It can only rise.
β (beta)
the best (lowest) value MIN is already guaranteed along this path. Start it at +∞. It can only fall.

The window [α, β] is the range of outcomes still "in play". The cut rules are mirror images:

  • β-cut (at a MAX node). α is raised by each child. If a child makes the node's value ≥ β, stop: the MIN parent above already has a reply at least this good for itself and will never let MAX come here. "If the current max value is greater than the successor's min value, don't explore that subtree any more."
  • α-cut (at a MIN node). β is lowered by each child. If a child makes the node's value ≤ α, stop: the MAX parent above already has a move at least this good and will never come here. "If the current minimum is less than the successor's max value, don't look down that subtree any more."

The same tree, now pruned

This is the very tree from Section III, searched depth-first left to right with alpha–beta switched on. Watch the window [α, β] at each node, and watch whole branches turn to dashed grey the instant they are proven irrelevant. The backed-up root value is identical to plain minimax; count how many leaves were never read.

leaves read: 0 / 0 branches pruned: 0
On the diagram: each node's value sits above it, its [α, β] window sits below it, and marks a cut (the rest of that branch is skipped, shown dashed).
Pruning is bookkeeping, not luckNotice the cut at the second MIN node: as soon as one of its children comes back ≤ α (MAX already has something at least that good on the left), the rest of that MIN node's children are skipped. MIN "knows MAX won't come here, and abandons it." The remaining leaves are never evaluated, yet the root value is exactly what full minimax computed.
Where is the β-cut?This three-level tree can only trigger α-cuts (cuts at MIN nodes), because its only MAX node is the root. The mirror case, a β-cut at a MAX node, needs a deeper tree: open the exercise tree and switch it to alpha–beta mode, whose four levels trigger both kinds of cut.
Section VI

Move ordering and efficiency

good guesses double how deep you can see

Alpha–beta prunes more when good moves are tried first, because a strong early value tightens the window sooner. The order in which children are examined does not change the answer, but it changes how much work the answer costs, dramatically.

From bm toward bm/2

With perfect ordering (best move first at every node), alpha–beta examines only O(bm/2) nodes instead of O(bm). The exponent halves. In plain terms:

The headlinePerfect move ordering lets alpha–beta search to twice the depth of plain minimax in the same time. For chess, that is the difference between looking a few ply ahead and looking far enough to play seriously. Even rough ordering (try captures first, try last iteration's best move first) captures most of the benefit.

Try it: the same tree, with its three top branches reordered. Best-first ordering prunes hardest; worst-first prunes least. The leaves-read counter tells the story.

leaves read: 0 / 0root value:
Section VII

Imperfect real-time decisions

stop early, and judge the position

We cannot reach real terminal states in chess, so we cheat honestly: search to a fixed depth, then estimate how good each cut-off position is with an evaluation functionA heuristic that estimates the expected utility of a non-terminal position without searching further.دالة التقييم. Two changes to minimax make this work.

Cut-off test replaces terminal test

Instead of "is the game over?", we ask "have we searched deep enough?". When yes, the node is treated as a leaf and handed to the evaluation function. The nodes at the cut-off depth become the new leaves; their estimated values are backed up by minimax exactly as before.

The perfect-evaluator thought experimentIf you had a perfect evaluation function (one that returned the true minimax value of any position), you would never need to search more than one level deep: just evaluate each immediate move and pick the best. You cannot build such a function (if you could, the game would be solved), but it shows what the evaluator is approximating, and why a better evaluator is worth more than a little extra depth.

What an evaluation function looks like

It is intensely game-specific and encodes human knowledge. The crudest ("stupid") one returns +1 / 0 / −1 for win/draw/loss, useless until the very end because it is flat everywhere else. Useful evaluators score many features and combine them, almost always as a weighted linear sum:

EVAL(s) = w₁·f₁(s) + w₂·f₂(s) + … + wₘ·fₘ(s)

For chess, the classic features are material and structure: f₁ = (my pieces − your pieces), f₂ = +1 if I have a queen and you do not, a rook-advantage term, and so on. The quality of the evaluator rests on three things: which features you measure, how each is scored, and how they are weighted. And note: the absolute value does not matter, only the relative ordering of positions, because minimax only ever compares siblings.

Try a real one: tic-tac-toe

A simple, genuinely-used tic-tac-toe evaluator counts the lines (rows, columns, diagonals) still open to each player (a line is open for X if it contains no O, and vice-versa):

E(s) = (open lines for X) − (open lines for O) = (rₓ + cₓ + dₓ) − (r○ + c○ + d○)

Click the cells to cycle empty, then X, then O, and watch the value update. This is the number minimax would back up from this position at depth 0. A position where X threatens many lines and blocks O's scores high (good for X = MAX).

open lines for X  (rₓ+cₓ+dₓ)0
open lines for O  (r○+c○+d○)0
E(s) = 0
Section VIII

Play against minimax

the algorithm, made tangible

Tic-tac-toe is small enough to search completely: no cut-off, no evaluation heuristic, just real minimax to true terminal states. So this opponent plays the exact minimax move every time. The theorem says optimal play forces at worst a draw: you cannot win. The best you can do is draw. Try.

You are X. Click any square to move.
What O is doingFor each of its legal replies, O runs minimax to the end of the game and reads back −1 / 0 / +1 from its own perspective, assuming you (X) play perfectly against it. It then plays the move with the highest value, breaking ties toward a quicker win or a longer-delayed loss. Tick "show O's reasoning" to see the minimax score it assigned to every square.
Section IX

Case studies and state of the art

where these ideas actually run

Checkers, with a real evaluation function

The lecture's checkers study uses exactly the weighted-feature shape from Section VII, weighting kings far above single pieces:

E(s) = (5·x₁ + x₂) − (5·r₁ + r₂) x₁,x₂ = black king / single advantage; r₁,r₂ = red king / single advantage

From a given position, minimax (or alpha–beta) is run over black's moves and red's replies, the leaves are scored by E(s), and the values back up to choose black's move, the same procedure you stepped through above, now with a heuristic at the leaves and a king worth five.

The milestones these two ideas built

Beyond the syllabusMinimax + alpha–beta + a strong evaluation function is the recipe behind the headline game-playing programs. Chinook (checkers) became world champion and the game was later solved outright. Deep Blue beat world chess champion Garry Kasparov in 1997 using alpha–beta to a great depth with a hand-tuned evaluation and special-purpose hardware. Games with huge branching factors (Go) ultimately needed a different search (Monte-Carlo tree search with learned evaluations, as in AlphaGo, 2016), but the framing is the same: search the game tree, evaluate where you must stop, and assume a strong opponent. That last idea is the durable lesson of this chapter.
Section X

Culminating exercise

Jarrar's tree, end to end

This is the exercise tree from the lecture notes. Four levels (max, min, max, min from the top), with the leaf utilities given. Work it on paper first: back the values up, then decide which branches alpha–beta would cut under left-to-right search. Then step the widget and check yourself.

The tree

Leaves, left to right: E=10, F=11, H=9, I=12, L=14, M=15, O=13, P=14, T=15, V=2, W=4, X=1, Z1=3, Z2=22, Z4=24, Z5=25. Root A is a MAX node.

root value: leaves read: 0 / 0
Self-check (click to reveal)The root value is 10. Under minimax the left half resolves to 10 (B = min(C=10, J=14) = 10) and the right half to 4 (Q = min(R=4, Y=…) = 4), so A = max(10, 4) = 10. Under left-to-right alpha–beta several branches cut once α reaches 10 on the left: subtrees that cannot beat 10 are abandoned (for example, once Q is bounded at or below 10 from MAX's view it stops mattering, and inner MIN nodes that drop to or below the running α are cut). Reveal the exact cuts by stepping the widget in alpha–beta mode.
Section XI

Cheat sheet

everything on one screen
 MinimaxAlpha–Beta
Returnsexact minimax valuesame exact value
IdeaMAX maximises, MIN minimises; back up from leavesskip branches that cannot change the result
Carriesjust node valuesalso α (best for MAX) and β (best for MIN) along the path
Cut ruleat MAX: stop if value ≥ β; at MIN: stop if value ≤ α
TimeO(bm)O(bm) worst, O(bm/2) with perfect ordering
SpaceO(b·m)O(b·m)
Complete / Optimalyes / yes (vs optimal MIN)yes / yes (identical to minimax)
When the tree is too big (always, for real games)Add a cut-off test (search to depth d) and an evaluation function at the cut-off leaves. Initialise α = −∞, β = +∞; at a MAX node raise α, cut when value ≥ β; at a MIN node lower β, cut when value ≤ α. Order moves best-first to prune hardest. That five-part combination is a working game engine.
Reference

Glossary

terms, with Arabic
Adversarial search البحث التنافسي
Search in an environment where another agent's goals oppose yours, so the value of a state depends on the opponent's best reply.
MAX / MIN الأعظم / الأصغر
The two players. MAX maximises the (single, zero-sum) utility; MIN minimises it.
Game tree شجرة اللعبة
Tree rooted at the current position; edges are legal moves; levels alternate between MAX and MIN.
Ply نِصف نقلة
One move by one player; one level of the game tree.
Utility / payoff المنفعة
The numeric outcome at a terminal state, read from MAX's perspective (e.g. +1 / 0 / −1).
Zero-sum محصّلته صفر
One player's gain equals the other's loss; one number describes the outcome for both.
Minimax value قيمة المينيماكس
The utility MAX can guarantee against an optimal MIN; the value backed up to the root.
Principal variation المسار الرئيسي
The line of best play for both sides; the path of moves that realises the minimax value.
Alpha (α) ألفا
Best value found so far for MAX along the current path; lower bound on the result. Starts at −∞.
Beta (β) بيتا
Best value found so far for MIN along the current path; upper bound on the result. Starts at +∞.
Pruning / cut-off التقليم
Skipping a branch proven unable to affect the root value (alpha–beta); or stopping the search at a depth limit (cut-off test).
Evaluation function دالة التقييم
Heuristic estimate of a non-terminal position's value, usually a weighted sum of game features. Only relative values matter.
Branching factor (b) معامل التفرّع
Average number of legal moves per position (chess ≈ 35).
Move ordering ترتيب النقلات
The order children are examined; trying strong moves first lets alpha–beta prune more, toward O(bm/2).