The Problem-Solving Agent
Imagine you are on holiday in Romania, currently in Arad, and your flight leaves tomorrow from Bucharest. The towns are linked by a network of roads. How do you decide which road to take from Arad? You cannot see Bucharest from Arad. You need a plan.
This is the situation of a problem-solving agentA goal-based agent that decides what to do by finding sequences of actions that lead to desirable states.عَميل حلّ المشكلات: an agent that decides what to do by considering future actions and their outcomes. It is one specific kind of goal-based agent. The problem-solving agent operates in four phases, in order:
1. Goal formulation
The agent decides what counts as success. "Be in Bucharest tomorrow" becomes a goal state. Goals organise behaviour by ruling out actions that don't help.
2. Problem formulation
Given the goal, the agent decides what actions and states to consider. This abstraction step is what makes search possible.
3. Search
Examine sequences of actions that lead to states of known value, then choose the best sequence. This produces a plan.
4. Execution
Carry out the actions of the chosen plan, one by one. If the environment is deterministic and observable, no further sensing is needed during execution.
Where search shows up in practice
Even today, when neural networks dominate AI, the problem-solving paradigm is everywhere. Satnav uses A* on the road graph. Logistics planners (Amazon, Aramex, BRT) use search to route packages. Game engines (chess, Go, even AlphaGo's tree-search backbone) use it to look ahead. Compiler optimisation, theorem proving, robot motion planning, protein folding, all are search problems wearing different clothes. Understanding the basics here is the foundation for everything else in this course.
Problem Formulation
A search problem must be expressed in a form the algorithm can work on. Russell and Norvig give a six-component schema that captures every search problem in this course, from finding a route to playing chess to solving the 8-puzzle.
The Six Components المُكوِّنات الستة
State
A configuration of the world that is relevant to the problem. In route-finding, a state is a city. In the 8-puzzle, it is the positions of the tiles plus the blank. The choice of what counts as a state is the heart of formulation: too much detail and the space explodes; too little and the answer is meaningless.
Initial State
The state in which the agent starts. The search begins here. For our running example: In(Arad).
Successor Function
Given a state x, returns the set of <action, successor> pairs reachable in one step. This implicitly defines the legal moves of the problem and, together with the initial state, defines the state spaceThe set of all states reachable from the initial state by any sequence of actions.فضاء الحالات.
Goal Test
A function that, applied to a state, returns true if the state is a goal. The test can be explicit (a specific state, such as In(Bucharest)) or a property (such as checkmate in chess).
Path Cost
A numeric cost assigned to a path. Usually it is the sum of step costs c(x, a, y) along the path. We assume each step cost is non-negative. For Romania, the step cost is the road distance in kilometres.
Solution
A sequence of actions that leads from the initial state to a goal state. An optimal solution is one whose path cost is the lowest among all solutions.
Worked Example · Romania مثال: التنقّل في رومانيا
The agent is in Arad, the goal is to reach Bucharest. The state space is the set of cities; the successor function returns neighbouring cities reachable by a labelled road; the goal test asks In(Bucharest)?; the path cost is the sum of road distances. Click any city below to see its <action, state> successor pairs.
Worked Example · The 8-Puzzle مثال: لعبة الثمانية
A 3×3 board with eight numbered tiles and one blank square. The blank may move up, down, left or right (equivalently: an adjacent tile slides into the blank). The goal is a specific target configuration.
Start State
Goal State
Formulation
- State: location of the eight tiles and the blank.
- Initial state: any configuration.
- Successor function: apply one of four blank moves (Up, Down, Left, Right). Returns the new state.
- Goal test: does the state match the goal?
- Path cost: 1 per move (each move counts equally).
There are 9!/2 = 181,440 reachable states. Brute force is feasible here, but the 15-puzzle has ≈ 1013 states, and the 24-puzzle ≈ 1025: heuristics become essential.
Real-world problem formulations أمثلة عمليّة من الواقع
Romania and the 8-puzzle are textbook examples chosen to fit on a page. The same six-component schema scales to genuinely industrial problems. Four short formulations follow, drawn from Jarrar's Ch3 pages 10-15.
Travelling Salesperson (TSP)
Given a set of cities and pairwise distances, find the shortest tour that visits every city exactly once and returns to the start.
- State: current city, plus the set of cities already visited.
- Initial state: the starting city, no other city visited.
- Successors: move to any city not yet visited.
- Goal test: all cities visited AND the agent is back at the starting city.
- Path cost: total tour length (sum of inter-city distances).
VLSI layout
Place electronic components on a chip and route wires between them. Each chip has billions of transistors; the layout has dramatic effects on speed, power, and yield.
- State: positions of components and wires on the chip.
- Initial state: either no components placed (incremental) or all components placed at random (complete-state).
- Successors: place a component, route a wire (incremental); or move/swap components and re-route (complete-state).
- Goal test: all components placed and connections routed without overlap, meeting timing and area constraints.
- Path cost: some combination of total wire length, congestion, and number of layer changes.
Robot navigation
A mobile robot must move from a start configuration to a target configuration in an environment that may include obstacles and other moving agents.
- State: robot's position (and orientation), plus the positions of its actuators.
- Initial state: the robot's starting configuration (task-dependent).
- Successors: apply a movement or an actuator action.
- Goal test: the robot has reached the target configuration (task-dependent).
- Path cost: often a weighted sum of distance travelled, energy consumed, and time taken.
Automatic assembly
A robot assembles a product from parts, one component at a time, respecting physical constraints (a part can only be attached when its mating surfaces are accessible).
- State: which components have been placed and where.
- Initial state: no components placed.
- Successors: place the next compatible component in a feasible position.
- Goal test: the product is fully assembled.
- Path cost: number of moves, or weighted assembly time.
The same problems can be tackled by very different algorithms depending on their formulation. TSP and VLSI are typically attacked by local search (Part Ten) because the state space is gigantic and the path of intermediate states is irrelevant: we just want a good final assignment. Route finding and assembly sequencing, where the sequence of actions matters, lean on systematic search.
The Search Tree
Once we have a formulation, the algorithm starts from the initial state and applies the successor function repeatedly, building a tree of partial plans. This is the search tree. Each branch represents an action; each node represents a state reached by a particular sequence of actions.
States versus Nodes الحالات مقابل العقد
A common source of confusion: a stateA configuration of the world. Two different paths to the same configuration share the same state.الحالة is a configuration of the world, while a nodeA bookkeeping record in the search tree, containing a state plus the parent, the action that produced it, the depth, and the path cost.العقدة is a bookkeeping record in the search tree. A node has five fields:
- STATE, the state this node corresponds to
- PARENT-NODE, the node that generated this one
- ACTION, the action applied to the parent's state to produce this node
- PATH-COST, written g(n), the cost from the root to this node
- DEPTH, the number of steps from the root
Two different nodes may correspond to the same state if there are two paths to it from the start. The state space is a graph; the search tree the algorithm builds may visit the same state through different branches. This is the source of all the trouble we are about to discuss.
Expansion توسيع العقدة
The algorithm grows the tree by expandingApplying the successor function to a node to generate its children, which are then added to the frontier.توسيع a node: it applies the successor function to that node's state, generates one new node per resulting <action, state> pair, and adds those new nodes to the collection of unexplored options. This collection is called the frontier or fringe.
Search terminology, in one place مفردات البحث
These terms recur in every algorithm below; pinning them down once avoids confusion later. The same node passes through three sets in its lifetime: it is first generated, then reached, then expanded.
Generated
A node has been generated the moment the successor function produces it. The total count of generated nodes is what the time-complexity formulas count.
Reached
A state is reached if a node for that state has been generated and placed somewhere we can find it (fringe or closed list). In graph-search, we generate a state at most once because we check against the reached set.
Expanded (visited)
A node is expanded when its successor function is applied. Synonymously called visited. The number of expansions is the main metric algorithms compare on. After expansion, the node moves from the fringe to the closed list.
Fringe (frontier, open list)
The collection of nodes that have been generated but not yet expanded. The algorithm's choice of which fringe node to expand next is what makes each algorithm distinct (FIFO for BFS, LIFO for DFS, priority queue for UCS, A*, etc).
Closed list (reached set)
The set of nodes that have already been expanded. Used by graph-search to recognise when we are about to re-process a state and skip it. Tree-search has no closed list and may revisit states.
Search strategy
The rule by which an algorithm picks the next node to expand from the fringe. Different rules give different algorithms; all of them use the same expansion machinery.
In the interactive demos below, the trace tables show Step, Popped (the node just expanded), Visited (the closed list at that point), and Fringe (what is still queued). Track these columns and the abstract terminology becomes concrete.
Tree-Search versus Graph-Search
The state space is almost always a graph: many states have more than one path leading to them, and some have edges that lead back to where you came from. The search tree the algorithm constructs is a tree: each node has exactly one parent. When the graph has cycles, building a tree by naive expansion can produce a tree that is exponentially larger than the graph, or even infinite. The fix is called graph-search: keep a list of states already expanded and refuse to expand them again.
Below are three worked examples. Each shows the same state space drawn as both a graph and as the search tree that tree-search would explore. The contrast makes the issue concrete.
Example 1 · A simple cycle المثال الأول: حلقة بسيطة
The smallest interesting state space: two cities, A and B, connected by a single bidirectional road. Start at A, goal is unreachable. The graph has just two nodes; the search tree, without a closed list, grows forever.
The state space (graph)
Two states. Each successor of A is B; each successor of B is A. A finite graph.
The search tree (tree-search)
Tree-search has no memory of where it has been. A's child is B; B's child is A; that A's child is B; on forever.
Example 2 · The diamond المثال الثاني: المعيّن
Start at S, goal at G, with two parallel paths through M1 and M2. The graph has four nodes; the search tree tree-search builds has the goal G appearing twice, once from each path.
The state space (graph)
Four states. Two paths to G, of equal or different cost.
The search tree
G appears twice: once as a child of M1, once as a child of M2. With cycles ignored, each instance is a separate node even though it is the same state.
This is wasteful but not infinite. The real problem comes when this duplication happens recursively, deeper in the tree. With d levels of diamonds stacked, the tree has 2d goal nodes; the graph still has just one.
Example 3 · A Romania sub-graph المثال الثالث: جزء من خريطة رومانيا
A more realistic case: the four cities around Arad in the north-west of Romania. The roads form a graph with cycles (Arad ↔ Sibiu ↔ Oradea ↔ Zerind ↔ Arad). Tree-search from Arad blows up; graph-search visits each city once.
The state space (graph)
Four cities, four roads, one cycle.
The search tree from Arad
Truncated at depth 3. The cycle re-enters Arad in two ways. Each re-entry can spawn its own subtree, and so on.
The graph-search fix حلّ البحث في الرسم البياني
Graph-search adds one data structure: a closed listA set of states already expanded. Before expanding any node, check whether its state is in the closed list; if so, discard the node.قائمة الحالات المُغلقة (also called the explored set). Before expanding a node, the algorithm asks "have I already expanded this state?" If yes, the node is discarded.
Three lines added; the algorithm now always terminates on a finite state space, and it never expands the same state twice. The cost is the memory for the closed list, which is bounded by the number of reachable states.
The General Search Algorithm
Every algorithm in this lecture follows the same skeleton. They all start with the initial node, expand nodes one at a time, and stop when they reach a goal. The only thing they vary is which node to expand next, and that single decision changes everything.
The skeleton الهيكل العام
The single line marked above is where all the disagreement happens. To turn this skeleton into a concrete algorithm, we need a rule for picking the next leaf. The natural place to keep the candidate leaves is a data structure called the fringe.
The fringe (frontier) الحافة / المُقدِّمة
The fringeThe collection of nodes that have been generated but not yet expanded. The algorithm picks its next move from here.الحافة / المُقدِّمة is the set of nodes that have been generated but not yet expanded. Different orderings of the fringe yield different algorithms:
- Order by shallowest first (FIFO queue) ⇒ Breadth-First Search
- Order by lowest g(n) first (priority queue on path cost) ⇒ Uniform-Cost Search
- Order by deepest first (LIFO stack) ⇒ Depth-First Search
- Order by lowest h(n) ⇒ Greedy Best-First Search
- Order by lowest g(n) + h(n) ⇒ A* Search
Changing one data structure changes the personality of the algorithm. The rest of this companion is essentially a tour of fringe orderings.
Evaluating a Search Strategy
Before we look at any specific algorithm, we need a vocabulary for judging them. There are four standard criteria. Every algorithm in the next two sections will be evaluated against all four, and you should be able to justify the verdict from how the algorithm works, not memorise it.
1. Completeness الاكتمال
An algorithm is completeAn algorithm is complete if it is guaranteed to find a solution whenever one exists.مكتمل if it is guaranteed to find a solution whenever one exists. "Complete" is not the same as "good": an algorithm can be complete and still take a galactic amount of time.
Two ways an algorithm can fail to be complete:
- It may loop forever on an infinite path without ever reaching a goal.
- It may stop too early, returning failure when a goal does exist somewhere.
2. Optimality الأمثليّة
An algorithm is optimalAn algorithm is optimal if, when it returns a solution, the solution has the lowest possible path cost.أمثلي if, whenever it returns a solution, that solution has the lowest path cost among all solutions. Completeness asks "will you find something?"; optimality asks "will what you find be the best?"
3. Time complexity التعقيد الزمنيّ
The number of nodes generated during the search. We express this in terms of three quantities:
- b, the branching factorThe maximum number of successors any single node has. Worst-case complexity formulas like b^d assume a uniform tree in which every node has b children; the bound is an upper bound on a tree where some nodes branch less.عامل التفرّع, the maximum number of successors any single node has. Some nodes may have fewer; b is the worst case across the tree.
- d, the depth of the shallowest goal node.
- m, the maximum length of any path in the state space (may be infinite).
A uniform tree with branching factor b has exactly bk nodes at depth k. Real trees rarely are uniform, some nodes branch fully, others not at all, so the formula is an upper bound: at most bk nodes at depth k. Either way the count is exponential in d: a million nodes at depth 6 with b = 10. This is why every algorithmic improvement matters.
A related quantity, the effective branching factor b*If an algorithm expands N nodes on a problem of depth d, b* is the branching factor of the uniform tree of depth d that would contain N+1 nodes. b* between 1 and b summarises in a single number how aggressively the algorithm prunes. Closer to 1 means it expands almost only the optimal path.عامل التفرع الفعّال, appears in A*'s complexity analysis (Part Eight). It is the imagined branching factor of a uniform tree that would contain the same number of nodes the algorithm actually expanded, useful for comparing the pruning quality of different heuristics.
4. Space complexity التعقيد المكانيّ
The maximum number of nodes stored in memory at once, essentially, the largest size the fringe ever reaches, plus the closed list in graph-search. Often the binding constraint in practice: many algorithms could afford the time but not the RAM.
Uninformed Search
An uninformed agent (sometimes called blind search) knows the problem formulation and nothing else. It cannot judge which fringe node is closer to the goal, only which is shallowest, deepest, or cheapest in the path so far. Six classical strategies arise from that single constraint. We will use the same example graph for all six so you can compare their behaviour directly.
A weighted graph with start state S and goal state G. Edge labels are step costs. This is the reference graph used throughout this section; the same six algorithms are run on it below.
1. Breadth-First Search · BFS البحث بأولويّة الاتّساع
Trace · step | popped | fringe (front to back)
| Step | Popped | Visited (closed list) | Fringe (front → back) |
|---|
Why BFS has these properties
2. Uniform-Cost Search · UCS البحث بالتكلفة المنتظمة
X(g=5) to mean state X with g-value 5; informed algorithms use the analogous X(h=...) and X(g=..., h=..., f=...) forms.Trace · step | popped | fringe sorted by g(n)
| Step | Popped | Visited (closed list) | Fringe (lowest g first) |
|---|
Why UCS has these properties
3. Depth-First Search · DFS البحث بأولويّة العمق
Trace · step | popped | fringe (top of stack on the right)
| Step | Popped | Visited (closed list) | Fringe (bottom → top) |
|---|
Why DFS has these properties
4. Depth-Limited Search · DLS البحث بالعمق المحدود
When DLS exhausts the fringe without finding a goal, it can return one of two answers: failure (no solution exists within the depth limit) or cutoff (no solution within the limit, but one might exist deeper). The distinction matters for what to try next.
Trace · step | popped at depth | fringe
| Step | Popped | Visited (closed list) | Fringe |
|---|
Why DLS has these properties
5. Iterative Deepening Search · IDS البحث بالتعميق التكراريّ
For b = 10 and d = 5:
- BFS generates roughly 111,111 nodes (1 + 10 + 100 + 1000 + 10000 + 100000).
- IDS generates roughly 123,456 nodes (same nodes regenerated, plus an extra factor for the iterations).
The overhead is about 11%. For larger b the ratio gets better, not worse. And IDS uses only O(b·d) memory, far less than BFS.
Trace · iteration | popped | fringe
| Iter | Popped | Visited (this iter) | Fringe |
|---|
Why IDS has these properties
6. Bidirectional Search البحث ثنائيّ الاتّجاه
The motivation is arithmetic. BFS in one direction generates O(bd) nodes; two BFS searches that meet in the middle generate two sub-trees of depth d/2 each, giving O(bd/2) nodes total. For b = 10, d = 6, this is the difference between a million nodes and two thousand.
Trace · step | direction | both fringes
| Step | Direction | Visited (F & B) | Popped · Fringes |
|---|
Why bidirectional search has these properties
- Computing predecessors is sometimes hard. In chess, given a board, what positions could have led to it? Many, and computing them is its own problem.
- When the goal is described abstractly (such as "checkmate") rather than as a specific state, the backward search has no single starting point.
- Bookkeeping is more complex than for any of the other algorithms here, which makes implementation error-prone.
Informed Search
All six uninformed algorithms suffered from one limitation: they cannot tell which of two fringe nodes is closer to the goal. An informed agent uses a problem-specific function h(n), a heuristic, that estimates the remaining cost from n to a goal. Even a rough estimate can transform the search.
The heuristic function · h(n) دالّة الاستدلال
A heuristicA problem-specific function that estimates the cost from a state to a goal. From the Greek heuriskein, "to find out".دالّة استدلاليّة / إرشاديّة is any function the designer chooses that estimates remaining cost. The art is to choose h so that it is informative (close to the truth) but easy to compute.
For route-finding on the Romania map, the canonical heuristic is the straight-line distanceThe Euclidean distance from a city to the goal, ignoring roads.المسافة المستقيمة from each city to Bucharest:
Notice that h is not the same as the true road distance. From Arad, the straight line to Bucharest is 366 km, but the shortest road route is 418 km. That is normal: heuristics are estimates. What matters is whether they are useful, and whether they are admissible (we will see what this means in a moment).
Best-First Search · the general idea البحث بأولويّة الأفضل
Best-First SearchA family of search algorithms that selects the next node to expand by minimising an evaluation function f(n).البحث بأولويّة الأفضل is the family of algorithms that order the fringe by some evaluation function f(n). Different choices of f give different algorithms:
where g(n) is the cost from the start to n (the same g as in UCS). The next two sub-sections work through each in turn.
1. Greedy Best-First Search البحث الجشِع بأولويّة الأفضل
Trace · step | popped | fringe with h(n)
| Step | Popped | Visited (closed list) | Fringe (lowest h first) |
|---|
Why Greedy Best-First has these properties
The canonical demonstration that Greedy is incomplete. Set start = Iasi, goal = Fagaras. The straight-line distance from Neamt to Fagaras happens to be smaller than from Iasi to Fagaras (Neamt sits roughly between Iasi and Fagaras as the crow flies, even though by road it is a dead-end).
Greedy's behaviour:
- Start at Iasi. Generate neighbours: Vaslui, Neamt.
- Compare heuristics. h(Neamt → Fagaras) < h(Vaslui → Fagaras). Greedy picks Neamt.
- Expand Neamt. Its only neighbour is Iasi (a dead-end town).
- Tree-search: Iasi is regenerated and looks attractive again. Greedy picks Iasi, generates Neamt, picks Neamt, generates Iasi… an infinite loop. Tree-search Greedy is not complete.
- Graph-search: Iasi and Neamt are both closed after one pass. The fringe ran out of unexpanded options for that side of the map; Greedy returns failure or backs up to Vaslui (which is on the fringe), then continues. Graph-search prevents the loop but the algorithm still committed to a wrong side of the map first.
Lesson: a heuristic that points toward a dead-end is enough to derail Greedy. The closed list rescues termination but not optimality, and not the wasted exploration.
2. A* Search خوارزميّة A*
cost so far + estimated cost remaining = estimated total cost
Trace · step | popped | fringe with f = g + h
| Step | Popped | Visited (closed list) | Fringe (lowest f first) |
|---|
Why A* has these properties (assuming h is admissible)
In the Greedy section we saw that ordering the fringe by h alone derails on Iasi → Fagaras because Neamt's h is misleading. A* on the same problem uses f = g + h instead:
- From Iasi, A* generates Neamt and Vaslui. Neamt has a small h but g(Neamt) accumulates 87 km from Iasi. Once g is added the f-value of any path through Neamt is dominated by the round-trip back through Iasi.
- Any path passing through Neamt twice (which is the only way out, since Neamt is a dead-end) has cumulative g ≥ 2 × 87 + cost(Iasi to anywhere else). After Neamt is expanded once and provides no progress, A*'s priority queue prefers Vaslui's branch.
- A* commits to the correct side of the map after only the misleading detour through Neamt is shown not to improve the f-value, and returns the optimal route Iasi → Vaslui → Urziceni → Bucharest → Fagaras.
The g term is what saves A* here. Greedy throws this information away; A* keeps it.
Admissibility · why A* is optimal قابليّة القبول
A heuristic h(n) is admissibleA heuristic h is admissible if h(n) ≤ h*(n) for every node n, never overestimates the true remaining cost.مقبولة (لا تبالغ في التقدير) if, for every node n, h(n) does not overestimate the true cost h*(n) to reach the nearest goal:
Straight-line distance is admissible for route-finding because, by the triangle inequality, no road between two cities can be shorter than the straight line between them. The estimate cannot lie about the destination being too far away; at worst it lies about it being too close.
If h is admissible, then A* using tree-search is optimal.
Proof sketch
Suppose some suboptimal goal G2 has been generated and sits on the fringe. Let n be an unexpanded fringe node on a shortest path to an optimal goal G with cost C*. We will show that A* prefers n to G2, so it will never return G2.
- f(G2) = g(G2) + h(G2) = g(G2) since h(goal) = 0.
- g(G2) > C* since G2 is suboptimal.
- So f(G2) > C*.
- f(n) = g(n) + h(n) ≤ g(n) + h*(n) = C* by admissibility.
- Therefore f(n) ≤ C* < f(G2): A* will expand n before G2.
Since this holds for every suboptimal goal on the fringe, A* must select an optimal goal first.
Dominance الهيمنة
If two admissible heuristics h1 and h2 satisfy h2(n) ≥ h1(n) for every n, then h2 dominates h1. A* with h2 expands no more nodes than A* with h1. For admissible heuristics, bigger is better, closer to the true cost means fewer nodes explored. Russell & Norvig give the 8-puzzle numbers as a vivid illustration:
| Depth d | IDS (uninformed) | A* with h1 (misplaced tiles) | A* with h2 (Manhattan) |
|---|---|---|---|
| 12 | 3,644,035 | 227 | 73 |
| 24 | too many | 39,135 | 1,641 |
At depth 24, IDS becomes infeasible. A* with the weak heuristic explores nearly 40,000 nodes. A* with the dominant heuristic explores only 1,641. The heuristic does the work that brute force cannot.
Where do good heuristics come from? المسائل المُخفّفة
One reliable recipe: relax the problemReplace the real problem with a version that has fewer restrictions on the actions. The cost of an optimal solution to the relaxed problem is an admissible heuristic for the real problem.مسألة مُخفَّفة. Drop a restriction on the actions, compute the optimal cost for the relaxed problem, use that cost as h.
- Relax the 8-puzzle so a tile may move to any square (not just an adjacent empty one): optimal cost is h1 (misplaced tiles).
- Relax it so a tile may move to any adjacent square ignoring whether it is occupied: optimal cost is h2 (Manhattan distance).
Both are admissible because removing constraints can only make a problem easier; an optimal solution to the relaxed problem cannot cost more than an optimal solution to the original.
Pruning in action · a 5×5 maze التهذيب في العمل
The clearest way to see what the heuristic is doing for A* is to put A* next to an uninformed algorithm on the same problem and watch what each one expands. A grid maze makes this visible at a glance.
Rules of the maze
- 5 × 5 grid. S sits in the top-left cell (row 0, column 0); G sits in the bottom-right cell (row 4, column 4).
- Walls (dark navy squares) are blocked. The agent cannot enter, leave through, or pass over them. They are not just unexplored, they are unavailable as states. The walls in rows 1 and 3 split the maze into three vertical lanes joined only through column 2.
- Movement is four-directional, up / down / left / right, between adjacent open cells. No diagonals.
- Every move costs 1. Step cost is uniform. The cost of a path is its number of moves.
- The number printed inside each open cell is h(cell), the heuristic estimate (defined next). S and G are labelled by name instead of by their h-values; h(S) = 8 and h(G) = 0 by definition.
Manhattan distance, the heuristic
Manhattan distanceFor grid cells (r, c) and (r', c'), |r − r'| + |c − c'|. The minimum number of horizontal-plus-vertical steps needed if there were no walls.مسافة مانهاتن from cell (r, c) to cell (r', c') is
that is, the total number of vertical-plus-horizontal moves you would need on an obstacle-free grid. In this maze G is at row 4, column 4, so the heuristic at cell (r, c) is
Worked examples from the grid below:
- h(S) = h(0, 0) = 4 + 4 = 8. (And h*(S) = 8 as well, the true optimal cost; on this maze the heuristic is tight at S.)
- h(2, 1) = (4 − 2) + (4 − 1) = 2 + 3 = 5. But the real shortest cost from (2, 1) to G is 7 (forced detour through column 2). h underestimates by 2.
- h(G) = 0. The goal is always 0 distance from itself.
On a grid where every step costs 1, Manhattan is admissible: walls can only make the real path longer than the obstacle-free count, never shorter, so h(n) ≤ h*(n) always holds.
- Dark navy filled: the start S, or a wall (blocked, unavailable).
- Maroon filled with white letter: the goal G, or the cell the algorithm is currently popping.
- Light maroon: cells already expanded (the closed list).
- Yellow with dashed border: cells on the fringe, generated but not yet popped.
- Gold: cells on the path the algorithm finally returns.
- Plain cream background: cells the algorithm never reached. This is the pruned set, the cells A* refused to expand because the heuristic ruled them out.
Press Step on each side to advance one expansion at a time. Watch which cells each algorithm leaves untouched.
A* with h = Manhattan
BFS, uninformed
- Both find the same optimal cost. The path length is identical because the heuristic is admissible.
- A* never touches the bottom-left pocket. Cells (2, 0), (2, 1), (4, 0), (4, 1) sit at row-column positions whose g + h exceeds C*. A* refuses to expand them because it has a cheaper goal-cost upper bound from its right-lane progress.
- BFS expands them anyway. BFS knows nothing about the goal's location; from its point of view a cell at depth 6 is just as worthy of expansion as any other cell at depth 6, even if Manhattan says that cell is on the wrong side of the maze.
- The difference is the pruning. The cells A* never colours are the cells the heuristic ruled out. Make the heuristic worse and that pruned set shrinks; make it perfect and A* would touch only the path itself.
The complexity of A*, in depth التعقيد بالتفصيل
Time and space costs for A* are not single numbers. They range across a spectrum decided almost entirely by the heuristic. The same A* algorithm can be near-linear on one problem and ruinously exponential on the next.
Time complexity: three cases
A* expands nodes in order of f(n) = g(n) + h(n). What we want to count is how many nodes it expands before the goal is popped. Space tracks the same number because A* stores every generated node. The question is: how does this count grow as problems get harder (deeper solutions, larger branching factor)? The answer depends entirely on how good the heuristic is. Three cases.
Suppose h(n) = h*(n) for every node n. The heuristic returns the exact true remaining cost, with zero error.
Then for every node on an optimal path, f(n) = g(n) + h*(n) = C*; for every node off an optimal path, f(n) > C*. Since A* expands nodes in order of f, and every optimal-path node has f = C* while every off-path node has f > C*, A* expands only the nodes on the optimal path. Exactly d + 1 expansions where d is the solution depth. Linear in d, with no wasted work.
Intuitively: a perfect h is a perfect oracle. At each node it whispers "go this way, not that way" and never lies; A* just follows the breadcrumb trail.
This case is unattainable in practice (if you had a perfect h you would not need to search at all, you would just compute the answer), but it is the theoretical floor.
Suppose h(n) = 0 everywhere. Then f(n) = g(n), and A* degenerates into Uniform-Cost Search. UCS explores in every direction equally, like ripples on a pond, because no fringe node looks more promising than any other.
UCS expands roughly every node with g ≤ C*. In a tree with branching factor b and solution depth d that is about bd nodes. This is the worst case: A* with the worst possible admissible heuristic.
Concretely on Romania: with h = 0, A* would expand outward from Arad equally in every direction, toward Zerind, Timisoara, Sibiu, with no preference, and would explore the entire road network cheaper than 418 km before settling on the answer.
Real heuristics sit between perfect and useless. The question is how much do they prune? There is a precise bound. The number of nodes A* expands is bounded by the size of the set
These are the nodes A* cannot rule out. It must expand each of them before it can be certain that the goal at cost C* is optimal.
The size of this set depends on the heuristic error, the gap between h(n) and h*(n). AIMA introduces the effective branching factor b*If A* expands N nodes on a problem with solution depth d, the effective branching factor b* is the value that makes a uniform tree of depth d and branching factor b* have N+1 nodes. Closer to 1 means the heuristic prunes very well; closer to b means it barely helps.عامل التفرع الفعّال such that A* expands roughly (b*)d nodes.
- Perfect h: b* → 1, near-linear.
- Zero h: b* = b, full exponential.
- Real h: somewhere between.
For the 8-puzzle with h2 (Manhattan), the empirical b* is around 1.4, dramatically smaller than the underlying branching factor b = 3 of unrestricted moves. That is why A* solves 8-puzzles that uninformed search cannot.
Space complexity, and why it bites first
A* keeps every generated node in memory: the priority-queue fringe, the closed list, and the search-tree pointers needed to reconstruct the returned path. With bd generated nodes in the worst case, memory grows exponentially too. The space and time bounds are the same expression.
Here is the practical truth that makes A* hard on large problems: memory runs out before time does. A modern CPU does several million A* expansions per second, but every expanded node carries roughly 50 to 100 bytes of bookkeeping (state representation, parent pointer, g, h, and the priority-queue link). With 8 GB of RAM you can hold around 108 nodes. On a problem with b = 10, d = 12, A* can generate 1013 nodes in the worst case, a hundred thousand times more than fits. The algorithm crashes out of memory long before it would have finished computing.
This is why R&N call A* "memory-bound" rather than "time-bound" for hard problems, and why the memory-bounded variants RBFS and SMA* exist: they trade re-computation for a memory budget you can actually afford. They are introduced in Part Nine.
Exercise · the coordinate graph تمرين تطبيقيّ
An abstract graph with nodes A through K placed on a 2D grid. The straight-line distance from each node to goal K serves as h(n). Step through Greedy and A* in turn and compare the paths and the number of nodes expanded.
Coordinates and h(n) = straight-line distance to K
| Node | (x, y) | h(n) |
|---|
Trace
| Step | Popped | Fringe |
|---|
Exercise · 8-puzzle heuristics استدلالات لعبة الثمانية
For any 8-puzzle state, compute two admissible heuristics:
- h1 = number of tiles in the wrong position (the blank does not count).
- h2 = sum, over all tiles, of the Manhattan distance from the tile's current square to its goal square.
Below: a state with red shading on misplaced tiles and a small Manhattan badge on each tile showing its distance from home. Click any tile adjacent to the blank to slide it and watch the heuristics update in real time.
Current state
Goal
Heuristic values
h1 (misplaced tiles) = ,
h2 (Manhattan) = ,
h2(n) ≥ h1(n) for every state, so h2 dominates h1. A* with h2 will expand fewer nodes.
Memory-Bounded Heuristic Search
A*'s real-world bottleneck is space, not time. The fringe grows exponentially; modern hardware fills its memory long before the algorithm finishes computing. Before we meet the refinements that fix this, it is worth being precise about what A* is storing and why none of it can be discarded.
The problem with A* المشكلة في الذاكرة
What A* keeps, per node
For every state it has ever generated, A* stores a small record. Roughly:
- State: the configuration itself (the 3 × 3 grid for the 8-puzzle, the city name for Romania, the (row, column) cell for a grid maze).
- Parent pointer: which node it was reached from. The returned path is recovered by walking this chain back to the start.
- g(n): the actual path cost from the start to this node.
- h(n): the heuristic estimate from this node to a goal.
- f(n) = g(n) + h(n): the key used to order the priority queue.
Concretely on the 8-puzzle: the state takes about 9 bytes (one nibble per tile), the parent pointer about 8 bytes, three integers about 12 bytes, plus 16 to 32 bytes of priority-queue and hash-table overhead. Roughly 40 to 100 bytes per node, depending on the implementation.
Where the nodes live: two structures
A* keeps these records in two collections that both grow monotonically until the goal is popped. Nothing is freed in between:
- Frontier (the priority queue): every generated-but-not-yet-expanded node, ordered by f(n).
- Reached (the closed list): every state that has already been expanded, kept so repeats can be recognised and skipped.
Why none of this can be thrown away
Students usually ask: cannot A* just discard the bad-looking nodes? For unmodified A* the answer is no, and there are three independent reasons.
- The reached set is needed to avoid re-expansion. Without the closed list, the same state can be reached via a longer path, expanded a second time, its children regenerated, expanded again, and so on. State spaces with loops or many paths to the same configuration explode exponentially when the closed list is absent.
- The frontier is needed for optimality. A* always expands the lowest-f node next. That might currently be node D with f = 14. But after expanding D, every child of D might have f = 20, and the next-best candidate becomes node E with f = 17. If A* had thrown E away to save space, it would never reach E and would miss the path through it. The whole frontier is on the critical path; there is no "obviously useless" subset to drop.
- Parent pointers are needed to return the answer. When A* finally pops the goal, the only way to recover the sequence of actions is to walk the parent chain back to the start. Drop the pointers and A* knows the optimal cost but cannot report the optimal path.
How fast does the memory bill grow?
Each expansion pops one node and pushes up to b children. Memory grows by roughly (b − 1) nodes per expansion. The number of expansions itself grows roughly as (b*)d, where b* is the effective branching factor introduced in Part Eight. For a hard instance of the 15-puzzle with b ≈ 3 and depth around 50, the table looks like:
| Expansions done | Nodes in memory | RAM at ~50 bytes/node |
|---|---|---|
| 102 | ~300 | 15 KB |
| 104 | ~30,000 | 1.5 MB |
| 106 | ~3,000,000 | 150 MB |
| 108 | ~300,000,000 | ~14 GB |
A hard 15-puzzle instance typically requires 107 to 108 A* expansions. The last row exceeds the RAM of a typical laptop. A* swaps to disk, thrashes, and eventually crashes long before it would have finished computing the answer. This is the practical face of the worst-case bound from Part Eight, with concrete numbers attached.
What the variants buy
The two algorithms in this section take different routes to a memory-bounded A*.
- RBFS, next, keeps only the current depth-first path plus the second-best f-value at each ancestor. On the same 15-puzzle instance that is roughly 50 × (3 × 50 bytes) ≈ 7 KB instead of 14 GB. About two million times less memory for the same optimal answer, at the cost of regenerating nodes when the search backtracks.
- SMA*, after that, takes a different route: keep as much of A*'s memory profile as fits in a chosen budget of N nodes, evict the worst when full, regenerate later if needed. The budget is the dial; the answer quality degrades gracefully as it shrinks.
Both rest on the same trick that IDS used for uninformed search: iterate, with the cutoff defined by f-cost instead of depth. They differ in how aggressively they use whatever memory is actually available.
1. Recursive Best-First Search · RBFS البحث التكراريّ بأولويّة الأفضل
RBFS mimics A* but uses linear space, like DFS. It keeps a recursion stack of the current path. At each node, it remembers the second-best f-value among its alternative branches. If the best path under the current branch ever exceeds that second-best, RBFS abandons the branch, backs up, and resumes the alternative.
Why RBFS has these properties
2. Simplified Memory-Bounded A* · SMA* A* بذاكرة محدودة
SMA* uses all available memory, not just linear. It runs A* until memory is full; then it drops the worst leaf (highest f-value) and remembers that subtree's best f-value in the parent. When that subtree later becomes attractive again, SMA* regenerates it.
- Run A* until memory is full. With the memory budget N, store at most N nodes at any time.
- When the budget is exceeded, drop the worst leaf. The "worst" leaf is the one with the highest f-value among the leaves of the current search tree. Before dropping it, record its f-value in its parent so the parent knows "I had a forgotten descendant with f = this".
- Regenerate when the forgotten subtree becomes attractive again. If, later, every node currently in memory has an f-value worse than some ancestor's "forgotten f", SMA* regenerates that ancestor's dropped subtree, expanding it again from scratch. The price of a small memory budget is paid in re-computation.
The "deepest least-f-cost node" rule (line 4 of the pseudocode) breaks ties by preferring nodes that are deeper in the tree. This favours finishing a promising branch over scattering effort across the frontier, which helps when memory is tight.
Why SMA* has these properties
In practice, SMA* is used when an A*-style optimality guarantee is desired but the problem is too large for full A*. For very hard problems, even SMA* may degrade to thrashing between subtrees that all exceed the memory budget, at which point a different approach (such as local search, below) becomes necessary.
Local Search
Every algorithm so far has been a systematic search: it remembers the path, generates a tree, and returns the sequence of actions that solve the problem. For many real problems, scheduling, layout, n-queens, routing, the path is irrelevant. We just want a state that satisfies the goal. Local search abandons the path entirely. It keeps just the current state and tries to improve it.
Problems suited to local search المسائل المناسبة للبحث المحلّيّ
Local search shines on a particular shape of problem. Before we look at the algorithms, three ingredients have to be on the table.
1. A candidate-solution state
A state in local search is a complete candidate answer to the problem, not a partial path being grown. The size of the state is the size of the answer.
- n-queens: an array of n integers, one per column, giving the row of that column's queen. A state is an entire board configuration with all n queens placed.
- Travelling salesperson (TSP): a permutation of the cities. A state is a complete tour.
- Scheduling: an assignment of every task to a time slot and a machine. A state is a full schedule.
- VLSI placement: an x/y position for every component on the chip. A state is a complete layout.
This contrasts sharply with systematic search, where a state is a node on a path from start to goal. In local search the "path" is meaningless; only the candidate at the current step matters.
2. A neighbour function
Given the current state, the neighbour function (or successor function) returns all states reachable by a single small change.
- n-queens: move one queen to a different row in its column. n queens × (n − 1) other rows = n(n−1) neighbours. For n = 8 that is 56.
- TSP: swap two cities in the tour, or reverse a contiguous segment. Each gives O(n2) neighbours.
- Scheduling: move one task to a different slot, or swap two tasks. Many neighbours per state.
The neighbour function defines the topology of the search landscape. A larger neighbourhood gives the algorithm more options at each step, but also makes each step more expensive.
3. An objective function
The objective function (also called the fitness function in evolutionary algorithms, or the cost function when minimising) assigns a numeric quality to every state. Local search uses this number to compare candidates and decide what to do next.
- n-queens (cost view): h(state) = number of pairs of queens attacking each other. Minimise to 0.
- n-queens (fitness view): f(state) = number of non-attacking pairs out of nC2. Maximise to nC2 (28 for n = 8).
- TSP: minimise total tour length.
- Scheduling: minimise makespan, or weighted lateness.
The maximise and minimise views are interchangeable; the algorithms below are written for maximisation (higher VALUE = better), and the 8-queens example minimises h. Mentally flip the sign and the algorithms behave identically.
The state-space landscape منظر فضاء الحالات
Picture the state space as a one-dimensional landscape: the x-axis is the state, the y-axis is an objective function we want to maximise (or, equivalently, a cost we want to minimise; the two views are interchangeable by flipping the sign). Each point is a state; the height is its quality. The shape of the landscape decides how easy or hard the problem is.
Notice that the maxima (the peaks) are curves, smooth high points where the gradient briefly hits zero before reversing. They are not flat. Only the shoulder and the plateau are flat in the diagram. The wording below follows Russell & Norvig and Jarrar's Ch4 notes; the bracket markers under the diagram show the exact extent of each flat region.
- Global maximum القيمة العظمى الشاملة: the highest objective value anywhere in the state space. The destination. Several different states can share this value, but the value itself is unique.
- Local maximum القيمة العظمى المحلّيّة: a state whose objective value is higher than that of every immediate neighbour but lower than the global maximum. Hill climbing stops here because no neighbour is strictly better, even though the algorithm has not reached the true peak.
- Plateau هضبة: a flat region of the landscape where every neighbour has the same objective value. No gradient, no preferred direction. The strict-improvement rule of basic hill climbing terminates immediately. The plateau in the diagram (states 540 to 640) is genuinely flat; the curve does not dip or bump inside it.
- Shoulder كتف: a flat region that does have an exit, one edge of the flat leads to higher terrain. A hill climbing variant that allows sideways moves can cross the shoulder and continue up; the basic strict-improvement variant cannot.
A few notes on what the 1D diagram does not show.
- The valleys between peaks are local minima القيمة الصغرى المحلّيّة. When the problem is phrased as minimising a cost (the 8-queens case below, with h = attacking pairs), maxima and minima swap roles: we are descending the landscape, and the local minimum is what the algorithm gets stuck on.
- A ridge تلّ ضيّق is intrinsically two-dimensional, a long, narrow crest where moving along the crest is uphill, but every single-step neighbour move is sideways or downhill. Single-step hill climbing oscillates and makes no progress along the ridge. 1D illustrations cannot show this; it requires at least two state-space dimensions.
Worked example · 8-queens مسألة الملِكات الثماني
Throughout this section we use the 8-queens problem: place eight queens on a chessboard so that no two attack each other. Local search variants encode the state as eight numbers, one per column, each giving the row of that column's queen. This forces "one queen per column" by construction.
- State: an array of 8 row numbers, one per column. So a state is something like (1, 6, 4, 7, 0, 3, 1, 5).
- Successors: 8 × 7 = 56 neighbour states, obtained by moving one queen to a different row in its column.
- Objective (to minimise): h = number of pairs of queens attacking each other (directly or diagonally). h = 0 is a solution.
The state space has 88 ≈ 17 million states. Systematic search is feasible but slow; local search is much faster.
Reading h on the board
The board below shows one specific 8-queens state. The queens (Q in the dark circles) sit at the columns' current rows. The number printed in every other square is the h-value the state would have if the queen in that column moved to that square. So the 56 numbers represent the 56 possible single-queen moves. Hill climbing simply picks one (or all) of the cells with the smallest number, here highlighted in gold.
Notice that several different moves can yield the same minimum h. The algorithm needs a tie-breaking rule (Russell & Norvig: pick uniformly at random among the best). Notice also that no single move drops h to zero on this board, every smallest-h cell still leaves attacking pairs. Hill climbing must take several moves in sequence, and may not converge to a solution at all.
1. Hill Climbing تسلّق التلّ
Trace · step | move | h before → h after
| Step | Move | h |
|---|
Variants of hill climbing
- Stochastic hill climbing, choose at random among the uphill moves, weighted by steepness. Converges more slowly but finds better solutions on some landscapes.
- First-choice hill climbing, generate successors one at a time at random, take the first one that is better. Useful when there are thousands of successors.
- Random-restart hill climbing, if hill climbing fails, start again from a random state. If each attempt succeeds with probability p, expected number of restarts is 1/p. For 8-queens, p ≈ 0.14, so an average of seven restarts solves it.
Why hill climbing has these properties
2. Simulated Annealing المحاكاة بالصلب
The analogy is from metallurgy: heating a metal and cooling it slowly lets atoms find a low-energy crystal arrangement. The algorithm mimics this: a hot system explores high-energy states; a cool system settles into a minimum.
Theoretical steps
- Initialise. Pick a random starting state. Choose a starting temperature T0 (large enough that most worse moves are accepted at first) and a cooling schedule, typically geometric: T(t + 1) = α T(t) with α close to 1, say 0.95.
- Pick a random neighbour. Unlike hill climbing, SA does not look at all neighbours. It samples one at random.
- Compute ΔE = VALUE(neighbour) − VALUE(current).
- Decide. If ΔE > 0 accept the move (current ← neighbour). If ΔE ≤ 0 accept with probability eΔE/T; otherwise stay.
- Cool. Lower T according to the schedule.
- Repeat until the stopping criterion fires.
Trace · t | T | ΔE | accepted? | h
| t | T | ΔE | ? | h |
|---|
Why simulated annealing has these properties
Simulated annealing was developed at IBM in 1983 (Kirkpatrick et al.). It has been used for VLSI layout, factory scheduling, image processing, and protein folding. Its appeal is universality, the only problem-specific parts are the neighbour function and the objective.
3. Local Beam Search البحث الشعاعي المحليّ
Trace · iter | best h | beam states (h values)
| iter | best | beam (h values) |
|---|
Stochastic beam search, a useful variant. Rather than taking the k best successors deterministically, sample k successors with probability proportional to their value. This avoids the beam collapsing onto a single area too quickly, and the analogy is suggestive: it is essentially asexual reproduction with selection pressure. Which brings us naturally to:
4. Genetic Algorithms الخوارزميّات الجينيّة
- A chromosome is a single candidate solution, encoded as a fixed-length string. It is the GA's analogue of a "state" in hill climbing: one complete answer that can be scored by the fitness function. For 8-queens, the 8-digit string
2 4 4 1 5 1 2 4is one chromosome. - A gene is one position within the chromosome. The 8-queens chromosome has eight genes; gene i holds the row of the queen in column i. Genes are the atomic units that crossover splices between parents and that mutation may flip.
- An allele is a particular value a gene can take. For 8-queens, each gene's possible alleles are the digits in [1, 8] (or [0, 7]). If every member of the population has the same allele at some gene, only mutation can reintroduce a different one.
The five operations of a GA
Every generation runs five operations. Understanding each in isolation is the key to understanding how GAs differ from other local-search methods.
- Representation (encoding). Each candidate solution is encoded as a fixed-length string of symbols, the chromosome. Each position is a gene; each symbol value is an allele. For 8-queens the chromosome is eight digits, each in [1, 8] (or [0, 7]), giving the row of that column's queen. For a problem to suit a GA, an encoding has to exist whose substrings capture meaningful sub-solutions that crossover can preserve.
- Fitness function. Numeric measure of how good a chromosome is. For 8-queens, a natural fitness is the number of non-attacking pairs of queens, in [0, 8C2] = [0, 28]. Fitness 28 means a solution; lower values are worse.
- Selection. Pick pairs of parents from the current population, with selection probability proportional to fitness. The standard formula for individual i with fitness fi in a population of N is
Higher-fitness individuals are more likely to be selected, but lower-fitness ones still have a chance (which preserves population diversity). This scheme is called roulette-wheel selection: imagine a wheel divided into N slices, with each slice's width proportional to that individual's fitness. Spin twice to get two parents.
- Crossover (recombination). Combine two parents into two children. Pick a random crossover point k in [1, L − 1] where L is chromosome length. Child A inherits positions 1..k from parent A and positions k+1..L from parent B; child B is the reverse. The hope: if good "building blocks" of the answer sit in different parents, crossover assembles them.
- Mutation. With small probability (typically 0.01 to 0.05 per gene), randomly change one symbol in a child to keep the population from collapsing to a single area of the search space. Without mutation, lost alleles cannot return.
Worked example: 8-queens with GA
Start with a population of four chromosomes (small for illustration; production GAs typically use 20-500). Each chromosome is a candidate 8-queens board, written as an 8-digit string in which digit i gives the row of the queen in column i.
| Label | Chromosome | Fitness fi | pi = fi / Σf |
|---|---|---|---|
| A | 2 4 4 1 5 1 2 4 | 24 | 24 / 78 = 0.31 (31%) |
| B | 3 2 5 4 3 2 1 3 | 23 | 23 / 78 = 0.29 (29%) |
| C | 2 4 7 4 8 5 5 2 | 20 | 20 / 78 = 0.26 (26%) |
| D | 3 2 7 5 2 4 1 1 | 11 | 11 / 78 = 0.14 (14%) |
| Sum Σf | 78 | 1.00 | |
Selection probabilities form a proper distribution: they sum to 1. A and B are roughly twice as likely to be picked as D, which encodes the GA's "selection pressure" toward better candidates.
Suppose two spins yield the parent pairs (A, C) and (A, B). (A is fittest so most likely to be chosen; that it appears twice is consistent with its 31% slice.)
Pick crossover position k = 3 for pair (A, C). Take the first three digits from one parent and the last five from the other.
| Parent / child | Digits 1..3 | Digits 4..8 | Result |
|---|---|---|---|
| Parent A | 2 4 4 | 1 5 1 2 4 | 2 4 4 1 5 1 2 4 |
| Parent C | 2 4 7 | 4 8 5 5 2 | 2 4 7 4 8 5 5 2 |
| Child 1 (A then C) | 2 4 4 | 4 8 5 5 2 | 2 4 4 4 8 5 5 2 |
| Child 2 (C then A) | 2 4 7 | 1 5 1 2 4 | 2 4 7 1 5 1 2 4 |
The same operation is run on pair (A, B) with its own random crossover point, producing two more children. The four children replace the current population.
Walk through each gene of each child. With mutation probability around 1% per gene, occasionally swap one digit for a random replacement.
Suppose gene 7 of Child 1 mutates from 5 to 7:
Before: 2 4 4 4 8 5 5 2
After: 2 4 4 4 8 5 7 2
Mutation is the only operator that can introduce new alleles. If every member of the population has row 5 in column 2, crossover alone will never produce row 6 in column 2. Mutation can.
The four (mutated) children become the new population. Compute fitness for each, then go back to Step 2. The average fitness typically rises generation after generation; the best individual eventually reaches the target.
Trace · generation | best h | avg h | population
| Gen | Best | Avg | Operations |
|---|
GAs have been applied to circuit design (a famous early result: Adrian Thompson's evolved sound discriminator), antenna design (NASA used a GA-designed antenna on the ST5 spacecraft), schedule optimisation, and many problems where the search space is huge and the structure is poorly understood.
HC vs SA vs GA · at a glance مقارنة سريعة
Once all three local-search algorithms are on the table, comparison questions tend to pivot on a handful of axes. None of the rows below introduces new content, the answers are all derivable from each algorithm's pseudocode, but lining them up side by side makes the contrasts easy to recall.
| Axis | Hill Climbing | Simulated Annealing | Genetic Algorithm |
|---|---|---|---|
| Deterministic? | Yes in its basic form, every step picks the best neighbour (tie-breaking aside) | No, the neighbour is sampled at random, and worse moves are accepted with probability eΔE/T | No, selection, crossover point, and mutation all draw on randomness |
| Can escape local optima? | No, terminates at the first one it finds | Yes, probabilistically, especially while T is still high | Yes, via population diversity, recombination, and mutation |
| Memory at runtime | O(1) — current state only | O(1) — current state plus a temperature | O(N) — the whole population of N individuals |
| Best suited for | Smooth landscapes; or with random restarts as a baseline | Landscapes with many local optima, when neighbour evaluation is cheap | Encodings whose substrings carry meaningful sub-solutions crossover can preserve |
"Determinism" here refers to each algorithm in its basic form. HC has stochastic variants (random restarts, stochastic HC, first-choice HC) that deliberately inject randomness to mitigate the local-maxima problem.
Stopping criteria, side by side شروط التوقّف
One question every local-search algorithm has to answer: when do we stop? Each algorithm uses a different mix of natural and engineered stop conditions, and an exam answer should name them explicitly.
| Algorithm | Natural stop | Engineered safety stops |
|---|---|---|
| Hill Climbing | No neighbour strictly better than the current state. | Max steps; max consecutive sideways moves (for the plateau-tolerant variant). |
| Hill Climbing with random restarts | One of the restarts reaches the target objective value. | Max number of restarts; total step budget across all restarts. |
| Simulated Annealing | Temperature reaches zero (or some ε floor). | Step budget; target objective reached; no improvement for k iterations. |
| Genetic Algorithms | An individual reaches the target fitness. | Max generations; population diversity collapses; best fitness flat for k generations. |
In practice every implementation combines its natural stop with at least one engineered safety stop so the algorithm always terminates in bounded time even when the natural condition is never met.
When to use local search متى نستخدم البحث المحليّ
Local search is the right tool when:
- The path doesn't matter, only the final state.
- The state space is huge, too large for systematic search to enumerate.
- You are willing to accept a non-optimal but good-enough answer.
- You can evaluate a state cheaply (compute its objective quickly).
- Memory is constrained.
Local search is the wrong tool when:
- The sequence of actions is what you need to return (e.g. driving directions).
- The objective is deceptive, the landscape is full of misleading local optima with no signal toward the global one.
- You need a provable optimum.
Culminating Exercise
The point of comparing algorithms is to see them disagree. Below are two problems instrumented to make the disagreement vivid: a small weighted graph that exposes the difference between every uninformed and informed algorithm, and an 8-puzzle that contrasts informed search with local search. Run them all and compare what each one returns.
Part A · The same graph, seven algorithms الجزء أ: نفس الرسم، سبع خوارزميّات
A seven-node weighted graph with start S and goal G. Each edge has a step cost; each node has an admissible heuristic h(n) = straight-line distance to G (you can verify admissibility from the diagram). The seven uninformed and informed algorithms all run on this graph; below the diagram you will see a table of their results side by side.
Click each algorithm to highlight its path on the graph above and see its trace in the panel below. The number of nodes expanded is the count of nodes the algorithm popped from the fringe before terminating (graph-search throughout, FIFO tie-breaking).
| Algorithm | Path returned | Path cost | Nodes expanded | Optimal? |
|---|
Click a row above to see that algorithm's trace
| Step | Popped | Visited | Fringe |
|---|
- BFS finds the shortest path in edges (three: S→A→C→G), but it costs 9, not optimal.
- UCS finds the cheapest path (cost 7) by accumulating g. It explores more nodes than BFS to do so.
- DFS takes whichever branch the successor function visits first. Different orderings give different answers; none is optimal.
- Greedy picks by h, ignoring g. It finds a goal quickly but commits to a suboptimal path.
- A* matches UCS's optimum but typically explores fewer nodes, the heuristic is doing real work.
Part B · The 8-puzzle, informed versus local الجزء ب: لعبة الثمانية
The 8-puzzle is where systematic and local search part ways. A* with a good heuristic is provably optimal but uses exponential memory. Local search uses no memory but gives up optimality. Below: solve the same scrambled puzzle with four different approaches and compare what they return.
Scrambled start
Goal
Four solvers
Each solver is run on the same scrambled start. The 8-puzzle is solvable by every algorithm here, but the four return very different things.
Note: A* solvers cap exploration at 30,000 nodes for browser-runtime safety. Local solvers cap at 1000 steps.
| Algorithm | Solved? | Moves to solve | Nodes / steps used | Notes |
|---|---|---|---|---|
| A* with h1 (misplaced) | Press “Run all four solvers” to populate | |||
- A* with h1 and A* with h2 both return the optimal sequence of moves. h2 uses fewer nodes (it dominates h1).
- Hill climbing usually gets stuck at a local minimum on this objective. It typically does not solve the puzzle, the 8-puzzle's landscape is full of plateaus.
- Random-restart hill climbing solves the puzzle, but does not guarantee an optimal solution path. The "moves to solve" column can be much larger than A*'s.
Comparison & Cheat Sheet
One table to compare every algorithm in this companion. Use it for revision, not to memorise: the entries should be derivable from how each algorithm chooses what to expand next.
| Algorithm | Fringe / structure | Complete? | Optimal? | Time | Space |
|---|---|---|---|---|---|
| BFS | FIFO queue | Yes if b finite | If unit costs | O(bd+1) | O(bd+1) |
| UCS | PQ by g(n) | Yes if costs ≥ ε | Yes | O(b1+⌈C*/ε⌉) | O(b1+⌈C*/ε⌉) |
| DFS | LIFO stack | No in tree-search; Yes in graph-search on finite spaces | No | O(bm) | O(b·m) |
| DLS | LIFO stack, depth cap l | If l ≥ d | No | O(bl) | O(b·l) |
| IDS | Repeated DLS | Yes | If unit costs | O(bd) | O(b·d) |
| Bidirectional | Two FIFO queues | Yes (both BFS) | If unit costs | O(bd/2) | O(bd/2) |
| Informed | |||||
| Greedy BFS | PQ by h(n) | No (can loop) | No | O(bm) | O(bm) |
| A* | PQ by f = g + h | Yes | If h admissible | Exponential worst case | Keeps all generated nodes |
| RBFS | Recursion, with f bound | If h admissible | If h admissible | Often worse than A* | O(b·d) |
| SMA* | PQ, drops worst on overflow | If memory ≥ d | If memory enough | Depends on budget | Bounded by budget |
| Local | |||||
| Hill climbing | Current state only | No (local maxima) | No | Fast in practice; unbounded in worst case | O(1) |
| Simulated annealing | Current state + temp | Probabilistic, slow schedule | Probabilistic | Fixed by schedule | O(1) |
| Local beam | k states at once | No | No | k per step | O(k) |
| Genetic | Population + ops | No | No | Population size × generations | O(N) population |
- No heuristic, unit costs, shallow goal: BFS if memory permits, else IDS.
- No heuristic, varying costs: UCS.
- Heuristic available, optimality desired, memory permits: A*.
- Heuristic available, optimality desired, memory tight: RBFS or SMA*.
- Path doesn't matter, just need a state: Local search. Hill climbing with restarts as baseline; simulated annealing for landscape with many local minima; GA when state representation has sub-structure.
Glossary
State
A configuration of the world relevant to the problem. Same configuration via different paths is the same state.
Node
Bookkeeping record in the search tree containing a state, parent pointer, action, path-cost, and depth.
Successor function
Given a state, returns the set of <action, resulting state> pairs reachable in one step. Implicitly defines the state space and the search tree.
Goal test
A function that, given a state, returns true if it is a goal. Can be explicit (specific state) or a property (e.g. checkmate).
Path cost · g(n)
The cost of the path from the start to node n. Usually a sum of non-negative step costs. UCS and A* order the fringe by g (and g + h).
Optimal solution
A solution whose path cost is the lowest among all solutions to the problem. Distinct from "a solution" of which there may be many.
Generated
A node is generated the moment the successor function produces it. The total count of generated nodes is what time-complexity formulas count.
Reached
A state is reached if a node for that state has been generated. In graph-search the reached set blocks duplicate generation; in tree-search there is no such check.
Expanded (visited)
A node is expanded when its successor function is applied. The number of expansions is the main efficiency metric.
Fringe / frontier / open list
The collection of generated but unexpanded nodes. Different orderings give different algorithms.
Closed list / reached set
The set of already-expanded nodes. Used by graph-search to recognise repeats and avoid re-expansion.
Search strategy
The rule by which an algorithm picks which fringe node to expand next. Every algorithm in this companion is defined by its strategy.
Branching factor · b
Maximum number of successors any single node has. Formulas like bd assume a uniform tree as the worst-case upper bound.
Effective branching factor · b*
The b that a uniform tree of depth d would need to contain the same number of nodes the algorithm actually expanded. Closer to 1 means the algorithm prunes well.
Depth · d, m
d = depth of the shallowest goal; m = maximum depth of any path in the state space (may be infinite).
Heuristic · h(n)
Estimate of cost from n to the nearest goal. Problem-specific; chosen by the designer.
Evaluation function · f(n)
The key used to order the fringe in best-first search. Greedy uses f = h; A* uses f = g + h.
Admissible
h(n) ≤ h*(n) for every node. Admissible heuristics never overestimate the true remaining cost. Required for A*'s tree-search optimality.
Consistent (monotonic)
h satisfies the triangle inequality: h(n) ≤ c(n, a, n') + h(n'). Consistency implies admissibility. With a consistent h, A*'s graph-search is optimal too.
Dominance
h2 dominates h1 if h2(n) ≥ h1(n) everywhere, both admissible. A* with h2 expands no more nodes than A* with h1.
Relaxed problem
A problem with fewer restrictions on the actions. The optimal cost of a relaxed problem is an admissible heuristic for the original.
Pathmax
Adjustment used by RBFS to ensure f-values are non-decreasing along any path: f(child) ← max(f(parent), g(child) + h(child)).
Local maximum
A state whose value is higher than every immediate neighbour but lower than the global maximum. Hill climbing terminates here.
Plateau
Flat region of the state-space landscape where every neighbour has the same objective value. No gradient signal.
Shoulder
Flat region with an exit, one edge of the flat leads to higher terrain. Escapable by sideways moves.
Ridge
A long narrow crest in ≥ 2D: moving along the crest is uphill, but every single-step move is sideways or downhill. Single-step hill climbing oscillates.
Sideways move
A neighbour move with the same objective value as the current state. Allowing a bounded number of sideways moves lets hill climbing cross shoulders.
Cooling schedule
Function T(t) controlling how the temperature decreases over time in simulated annealing. Slow logarithmic schedules guarantee convergence; fast geometric schedules are typical.
Chromosome
A single candidate solution in a GA, encoded as a fixed-length string of symbols. The GA's analogue of a "state" in hill climbing.
Gene
One position within a chromosome. Crossover splices genes between parents; mutation flips one at random with small probability.
Fitness function
Objective in evolutionary algorithms. For 8-queens, the number of non-attacking pairs of queens. Genetic Algorithms maximise this.
Crossover
GA operation: pick a random split point in two parent chromosomes; child A inherits the first half of one parent and the second half of the other. Two children per crossover.
Mutation
GA operation: with small probability per gene, randomly change a symbol in a child. The only operator that can introduce alleles not present in any parent.