COMP338, Problem Solving by Search: An Interactive Companion

Part One

The Problem-Solving Agent

why we need search at all

Imagine you are on holiday in Romania, currently in Arad, and your flight leaves tomorrow from Bucharest. The towns are linked by a network of roads. How do you decide which road to take from Arad? You cannot see Bucharest from Arad. You need a plan.

This is the situation of a problem-solving agentA goal-based agent that decides what to do by finding sequences of actions that lead to desirable states.عَميل حلّ المشكلات: an agent that decides what to do by considering future actions and their outcomes. It is one specific kind of goal-based agent. The problem-solving agent operates in four phases, in order:

1. Goal formulation

صياغة الهدف

The agent decides what counts as success. "Be in Bucharest tomorrow" becomes a goal state. Goals organise behaviour by ruling out actions that don't help.

2. Problem formulation

صياغة المسألة

Given the goal, the agent decides what actions and states to consider. This abstraction step is what makes search possible.

3. Search

البحث

Examine sequences of actions that lead to states of known value, then choose the best sequence. This produces a plan.

4. Execution

التنفيذ

Carry out the actions of the chosen plan, one by one. If the environment is deterministic and observable, no further sensing is needed during execution.

KeySearch is the part of the cycle where the agent thinks. Everything before it sets the question; everything after it carries out the answer. Get the question wrong and even the cleverest search returns the wrong plan.

Where search shows up in practice

Even today, when neural networks dominate AI, the problem-solving paradigm is everywhere. Satnav uses A* on the road graph. Logistics planners (Amazon, Aramex, BRT) use search to route packages. Game engines (chess, Go, even AlphaGo's tree-search backbone) use it to look ahead. Compiler optimisation, theorem proving, robot motion planning, protein folding, all are search problems wearing different clothes. Understanding the basics here is the foundation for everything else in this course.

Part Two

Problem Formulation

turning a real situation into a search problem

A search problem must be expressed in a form the algorithm can work on. Russell and Norvig give a six-component schema that captures every search problem in this course, from finding a route to playing chess to solving the 8-puzzle.

The Six Components المُكوِّنات الستة

1

State

الحالة

A configuration of the world that is relevant to the problem. In route-finding, a state is a city. In the 8-puzzle, it is the positions of the tiles plus the blank. The choice of what counts as a state is the heart of formulation: too much detail and the space explodes; too little and the answer is meaningless.

2

Initial State

الحالة الابتدائية

The state in which the agent starts. The search begins here. For our running example: In(Arad).

3

Successor Function

دالّة الخلَف

Given a state x, returns the set of <action, successor> pairs reachable in one step. This implicitly defines the legal moves of the problem and, together with the initial state, defines the state spaceThe set of all states reachable from the initial state by any sequence of actions.فضاء الحالات.

4

Goal Test

اختبار الهدف

A function that, applied to a state, returns true if the state is a goal. The test can be explicit (a specific state, such as In(Bucharest)) or a property (such as checkmate in chess).

5

Path Cost

تكلفة المسار

A numeric cost assigned to a path. Usually it is the sum of step costs c(x, a, y) along the path. We assume each step cost is non-negative. For Romania, the step cost is the road distance in kilometres.

6

Solution

الحلّ

A sequence of actions that leads from the initial state to a goal state. An optimal solution is one whose path cost is the lowest among all solutions.

Worked Example · Romania مثال: التنقّل في رومانيا

The agent is in Arad, the goal is to reach Bucharest. The state space is the set of cities; the successor function returns neighbouring cities reachable by a labelled road; the goal test asks In(Bucharest)?; the path cost is the sum of road distances. Click any city below to see its <action, state> successor pairs.

Click a city

Click any city above to see its successor set, formatted in the <action, state> notation used in Russell & Norvig.

Worked Example · The 8-Puzzle مثال: لعبة الثمانية

A 3×3 board with eight numbered tiles and one blank square. The blank may move up, down, left or right (equivalently: an adjacent tile slides into the blank). The goal is a specific target configuration.

Start State

7

2

4

5

6

8

3

1

Goal State

1

2

3

4

5

6

7

8

Formulation

State: location of the eight tiles and the blank.
Initial state: any configuration.
Successor function: apply one of four blank moves (Up, Down, Left, Right). Returns the new state.
Goal test: does the state match the goal?
Path cost: 1 per move (each move counts equally).

There are 9!/2 = 181,440 reachable states. Brute force is feasible here, but the 15-puzzle has ≈ 10¹³ states, and the 24-puzzle ≈ 10²⁵: heuristics become essential.

Real-world problem formulations أمثلة عمليّة من الواقع

Romania and the 8-puzzle are textbook examples chosen to fit on a page. The same six-component schema scales to genuinely industrial problems. Four short formulations follow, drawn from Jarrar's Ch3 pages 10-15.

Travelling Salesperson (TSP)

مسألة البائع المتجوّل

Given a set of cities and pairwise distances, find the shortest tour that visits every city exactly once and returns to the start.

State: current city, plus the set of cities already visited.
Initial state: the starting city, no other city visited.
Successors: move to any city not yet visited.
Goal test: all cities visited AND the agent is back at the starting city.
Path cost: total tour length (sum of inter-city distances).

VLSI layout

تصميم الدوائر المتكاملة

Place electronic components on a chip and route wires between them. Each chip has billions of transistors; the layout has dramatic effects on speed, power, and yield.

State: positions of components and wires on the chip.
Initial state: either no components placed (incremental) or all components placed at random (complete-state).
Successors: place a component, route a wire (incremental); or move/swap components and re-route (complete-state).
Goal test: all components placed and connections routed without overlap, meeting timing and area constraints.
Path cost: some combination of total wire length, congestion, and number of layer changes.

Robot navigation

ملاحة الإنسان الآليّ

A mobile robot must move from a start configuration to a target configuration in an environment that may include obstacles and other moving agents.

State: robot's position (and orientation), plus the positions of its actuators.
Initial state: the robot's starting configuration (task-dependent).
Successors: apply a movement or an actuator action.
Goal test: the robot has reached the target configuration (task-dependent).
Path cost: often a weighted sum of distance travelled, energy consumed, and time taken.

Automatic assembly

التجميع الآليّ

A robot assembles a product from parts, one component at a time, respecting physical constraints (a part can only be attached when its mating surfaces are accessible).

State: which components have been placed and where.
Initial state: no components placed.
Successors: place the next compatible component in a feasible position.
Goal test: the product is fully assembled.
Path cost: number of moves, or weighted assembly time.

The same problems can be tackled by very different algorithms depending on their formulation. TSP and VLSI are typically attacked by local search (Part Ten) because the state space is gigantic and the path of intermediate states is irrelevant: we just want a good final assignment. Route finding and assembly sequencing, where the sequence of actions matters, lean on systematic search.

AsideA good formulation removes detail that does not affect the search and keeps only what is needed to make decisions. The art of problem solving begins here, before any algorithm runs.

Part Three

The Search Tree

what the algorithm constructs as it explores

Once we have a formulation, the algorithm starts from the initial state and applies the successor function repeatedly, building a tree of partial plans. This is the search tree. Each branch represents an action; each node represents a state reached by a particular sequence of actions.

States versus Nodes الحالات مقابل العقد

A common source of confusion: a stateA configuration of the world. Two different paths to the same configuration share the same state.الحالة is a configuration of the world, while a nodeA bookkeeping record in the search tree, containing a state plus the parent, the action that produced it, the depth, and the path cost.العقدة is a bookkeeping record in the search tree. A node has five fields:

STATE, the state this node corresponds to
PARENT-NODE, the node that generated this one
ACTION, the action applied to the parent's state to produce this node
PATH-COST, written g(n), the cost from the root to this node
DEPTH, the number of steps from the root

Two different nodes may correspond to the same state if there are two paths to it from the start. The state space is a graph; the search tree the algorithm builds may visit the same state through different branches. This is the source of all the trouble we are about to discuss.

Expansion توسيع العقدة

The algorithm grows the tree by expandingApplying the successor function to a node to generate its children, which are then added to the frontier.توسيع a node: it applies the successor function to that node's state, generates one new node per resulting <action, state> pair, and adds those new nodes to the collection of unexplored options. This collection is called the frontier or fringe.

KeyExpansion is the only way the search tree ever grows. Every algorithm we will see expands one node per step. They differ only in which node they choose to expand next.

Search terminology, in one place مفردات البحث

These terms recur in every algorithm below; pinning them down once avoids confusion later. The same node passes through three sets in its lifetime: it is first generated, then reached, then expanded.

Generated

مُولَّد

A node has been generated the moment the successor function produces it. The total count of generated nodes is what the time-complexity formulas count.

Reached

مُكتشف

A state is reached if a node for that state has been generated and placed somewhere we can find it (fringe or closed list). In graph-search, we generate a state at most once because we check against the reached set.

Expanded (visited)

مُوسَّع / مُزار

A node is expanded when its successor function is applied. Synonymously called visited. The number of expansions is the main metric algorithms compare on. After expansion, the node moves from the fringe to the closed list.

Fringe (frontier, open list)

الحافة / المُقدِّمة

The collection of nodes that have been generated but not yet expanded. The algorithm's choice of which fringe node to expand next is what makes each algorithm distinct (FIFO for BFS, LIFO for DFS, priority queue for UCS, A*, etc).

Closed list (reached set)

قائمة المُغلَقة

The set of nodes that have already been expanded. Used by graph-search to recognise when we are about to re-process a state and skip it. Tree-search has no closed list and may revisit states.

Search strategy

استراتيجيّة البحث

The rule by which an algorithm picks the next node to expand from the fringe. Different rules give different algorithms; all of them use the same expansion machinery.

In the interactive demos below, the trace tables show Step, Popped (the node just expanded), Visited (the closed list at that point), and Fringe (what is still queued). Track these columns and the abstract terminology becomes concrete.

Part Four

Tree-Search versus Graph-Search

why the closed list exists

The state space is almost always a graph: many states have more than one path leading to them, and some have edges that lead back to where you came from. The search tree the algorithm constructs is a tree: each node has exactly one parent. When the graph has cycles, building a tree by naive expansion can produce a tree that is exponentially larger than the graph, or even infinite. The fix is called graph-search: keep a list of states already expanded and refuse to expand them again.

Below are three worked examples. Each shows the same state space drawn as both a graph and as the search tree that tree-search would explore. The contrast makes the issue concrete.

Example 1 · A simple cycle المثال الأول: حلقة بسيطة

The smallest interesting state space: two cities, A and B, connected by a single bidirectional road. Start at A, goal is unreachable. The graph has just two nodes; the search tree, without a closed list, grows forever.

The state space (graph)

فضاء الحالات (رسم بياني)

Two states. Each successor of A is B; each successor of B is A. A finite graph.

The search tree (tree-search)

شجرة البحث بدون قائمة الإغلاق

Tree-search has no memory of where it has been. A's child is B; B's child is A; that A's child is B; on forever.

ConsequenceTree-search on this graph would expand A, B, A, B, A, B, … without ever stopping. Graph-search expands A, then B (sees A is closed, refuses to push it again), then the fringe is empty and the search terminates with failure.

Example 2 · The diamond المثال الثاني: المعيّن

Start at S, goal at G, with two parallel paths through M1 and M2. The graph has four nodes; the search tree tree-search builds has the goal G appearing twice, once from each path.

The state space (graph)

فضاء الحالات

Four states. Two paths to G, of equal or different cost.

The search tree

شجرة البحث

G appears twice: once as a child of M1, once as a child of M2. With cycles ignored, each instance is a separate node even though it is the same state.

This is wasteful but not infinite. The real problem comes when this duplication happens recursively, deeper in the tree. With d levels of diamonds stacked, the tree has 2^d goal nodes; the graph still has just one.

Why it mattersEven when tree-search terminates, it can do exponentially more work than the graph contains. Every shared state below the duplicate is also duplicated, doubling the work at every level.

Example 3 · A Romania sub-graph المثال الثالث: جزء من خريطة رومانيا

A more realistic case: the four cities around Arad in the north-west of Romania. The roads form a graph with cycles (Arad ↔ Sibiu ↔ Oradea ↔ Zerind ↔ Arad). Tree-search from Arad blows up; graph-search visits each city once.

The state space (graph)

رومانيا الشمالية الغربية

Four cities, four roads, one cycle.

The search tree from Arad

شجرة البحث من أراد

Truncated at depth 3. The cycle re-enters Arad in two ways. Each re-entry can spawn its own subtree, and so on.

The graph-search fix حلّ البحث في الرسم البياني

Graph-search adds one data structure: a closed listA set of states already expanded. Before expanding any node, check whether its state is in the closed list; if so, discard the node.قائمة الحالات المُغلقة (also called the explored set). Before expanding a node, the algorithm asks "have I already expanded this state?" If yes, the node is discarded.

function GRAPH-SEARCH(problem, strategy) returns a solution, or failure initialise the fringe with INITIAL-STATE closed ← empty set loop do if fringe is empty then return failure node ← SELECT-LEAF(fringe, strategy) if GOAL-TEST(node.STATE) then return SOLUTION(node) if node.STATE is not in closed then add node.STATE to closed add EXPAND(node) to fringe end

Three lines added; the algorithm now always terminates on a finite state space, and it never expands the same state twice. The cost is the memory for the closed list, which is bounded by the number of reachable states.

When to use whichTree-search is mainly a teaching device, and it is acceptable for genuinely tree-shaped problems where states never repeat. Graph-search is what real systems use whenever the state space has cycles or shared sub-paths, which is almost always. Every interactive demo in this companion defaults to graph-search; you can switch to tree-search with the toggle to see what would happen.

Part Five

The General Search Algorithm

one skeleton, many strategies

Every algorithm in this lecture follows the same skeleton. They all start with the initial node, expand nodes one at a time, and stop when they reach a goal. The only thing they vary is which node to expand next, and that single decision changes everything.

The skeleton الهيكل العام

function GENERAL-SEARCH(problem, strategy) returns a solution, or failure initialise the search tree using the initial state of problem loop do if there are no candidates for expansion then return failure choose a leaf node for expansion // this line is where strategies differ if the node contains a goal state then return the corresponding solution else expand the node and add the resulting nodes to the search tree end

The single line marked above is where all the disagreement happens. To turn this skeleton into a concrete algorithm, we need a rule for picking the next leaf. The natural place to keep the candidate leaves is a data structure called the fringe.

The fringe (frontier) الحافة / المُقدِّمة

The fringeThe collection of nodes that have been generated but not yet expanded. The algorithm picks its next move from here.الحافة / المُقدِّمة is the set of nodes that have been generated but not yet expanded. Different orderings of the fringe yield different algorithms:

Order by shallowest first (FIFO queue) ⇒ Breadth-First Search
Order by lowest g(n) first (priority queue on path cost) ⇒ Uniform-Cost Search
Order by deepest first (LIFO stack) ⇒ Depth-First Search
Order by lowest h(n) ⇒ Greedy Best-First Search
Order by lowest g(n) + h(n) ⇒ A* Search

Changing one data structure changes the personality of the algorithm. The rest of this companion is essentially a tour of fringe orderings.

Part Six

Evaluating a Search Strategy

four questions every algorithm must answer

Before we look at any specific algorithm, we need a vocabulary for judging them. There are four standard criteria. Every algorithm in the next two sections will be evaluated against all four, and you should be able to justify the verdict from how the algorithm works, not memorise it.

1. Completeness الاكتمال

An algorithm is completeAn algorithm is complete if it is guaranteed to find a solution whenever one exists.مكتمل if it is guaranteed to find a solution whenever one exists. "Complete" is not the same as "good": an algorithm can be complete and still take a galactic amount of time.

Two ways an algorithm can fail to be complete:

It may loop forever on an infinite path without ever reaching a goal.
It may stop too early, returning failure when a goal does exist somewhere.

2. Optimality الأمثليّة

An algorithm is optimalAn algorithm is optimal if, when it returns a solution, the solution has the lowest possible path cost.أمثلي if, whenever it returns a solution, that solution has the lowest path cost among all solutions. Completeness asks "will you find something?"; optimality asks "will what you find be the best?"

3. Time complexity التعقيد الزمنيّ

The number of nodes generated during the search. We express this in terms of three quantities:

b, the branching factorThe maximum number of successors any single node has. Worst-case complexity formulas like b^d assume a uniform tree in which every node has b children; the bound is an upper bound on a tree where some nodes branch less.عامل التفرّع, the maximum number of successors any single node has. Some nodes may have fewer; b is the worst case across the tree.
d, the depth of the shallowest goal node.
m, the maximum length of any path in the state space (may be infinite).

A uniform tree with branching factor b has exactly b^k nodes at depth k. Real trees rarely are uniform, some nodes branch fully, others not at all, so the formula is an upper bound: at most b^k nodes at depth k. Either way the count is exponential in d: a million nodes at depth 6 with b = 10. This is why every algorithmic improvement matters.

A related quantity, the effective branching factor b*If an algorithm expands N nodes on a problem of depth d, b* is the branching factor of the uniform tree of depth d that would contain N+1 nodes. b* between 1 and b summarises in a single number how aggressively the algorithm prunes. Closer to 1 means it expands almost only the optimal path.عامل التفرع الفعّال, appears in A*'s complexity analysis (Part Eight). It is the imagined branching factor of a uniform tree that would contain the same number of nodes the algorithm actually expanded, useful for comparing the pruning quality of different heuristics.

4. Space complexity التعقيد المكانيّ

The maximum number of nodes stored in memory at once, essentially, the largest size the fringe ever reaches, plus the closed list in graph-search. Often the binding constraint in practice: many algorithms could afford the time but not the RAM.

CarefulTime complexity is not what most students expect. It counts nodes generated, not seconds. Whether your machine takes 1 ms or 1 hour per node is a separate question. The complexity formulas tell you how the algorithm scales, not how long a particular run takes.

Part Seven

Uninformed Search

strategies that know only the problem definition

An uninformed agent (sometimes called blind search) knows the problem formulation and nothing else. It cannot judge which fringe node is closer to the goal, only which is shallowest, deepest, or cheapest in the path so far. Six classical strategies arise from that single constraint. We will use the same example graph for all six so you can compare their behaviour directly.

The example graph

A weighted graph with start state S and goal state G. Edge labels are step costs. This is the reference graph used throughout this section; the same six algorithms are run on it below.

Start

Goal

Expanded (popped & explored)

Just popped

On the fringe (generated, not yet popped)

How to useEach algorithm has three buttons: Reset rebuilds the search from scratch; Step advances one expansion at a time and shows in plain English what just happened; Play All runs the search to completion. Read the feedback panel after every step, that is where the narrative is.

1. Breadth-First Search · BFS البحث بأولويّة الاتّساع

Idea

Expand the shallowest unexpanded node first. Explore the tree level by level: all nodes at depth 0, then all at depth 1, then all at depth 2, and so on.

Fringe data structure: FIFO queue (first-in, first-out), nodes leave in the order they entered.

function BFS(problem) returns a solution, or failure fringe ← FIFO-queue containing only MAKE-NODE(INITIAL-STATE) loop do if fringe is empty then return failure node ← REMOVE-FRONT(fringe) if GOAL-TEST(STATE(node)) then return SOLUTION(node) fringe ← INSERT-ALL(EXPAND(node), fringe) // new nodes go to the back

Graph-search Quiz mode

Press Step to begin. The search starts with only the initial state S on the fringe. Each Step you press will pop the front of the queue and expand it.

Trace · step | popped | fringe (front to back)

Step	Popped	Visited (closed list)	Fringe (front → back)

Why BFS has these properties

Complete

Yes, provided b is finite

BFS explores level by level. If a goal exists at any finite depth d, BFS reaches depth d after generating every node at depths 0, 1, …, d. Since each level is finite (because b is finite), this terminates.

Optimal

Only when step cost is the same for all actions

BFS returns the shallowest goal. If every action costs the same, shallowest equals cheapest, so optimal. If costs differ, a deeper but cheaper path can exist, and BFS will miss it. BFS is optimal in number of steps, not in cost.

Time

O(b^d+1)

In the worst case the goal is the last node generated at depth d. By that point BFS has generated every node at depths 0 through d, plus the children of every node at depth d (b^d+1 nodes). Sum: 1 + b + b² + … + b^d + (b^d+1 − b) = O(b^d+1).

Space

O(b^d+1)

Every generated node must stay in memory: the fringe holds every node at the deepest current level, and the search tree keeps the rest for path reconstruction. Space matches time. This is BFS's real downfall: with b = 10, d = 12, a million nodes per second, BFS takes 13 days but needs 1 PB of RAM.

TakeawayBFS is intuitive and complete, but the exponential space requirement makes it impractical for any problem with non-trivial depth. Space, not time, is what kills BFS in practice.

2. Uniform-Cost Search · UCS البحث بالتكلفة المنتظمة

Idea

BFS finds the shallowest goal. UCS fixes the cost problem: expand the node with the lowest accumulated path cost g(n) first. When step costs vary, this is what we usually want.

Fringe data structure: priority queue ordered by g(n). When step costs are all equal, UCS degenerates into BFS.

Notationg(n) is the cost of the path from the start state to n, accumulated by summing step costs along the way. UCS is the first algorithm in this companion that orders the fringe by a numeric quantity, so this is where the notation appears. Later, in Part Eight, we will add h(n), an estimate of the remaining cost from n to a goal, and f(n) = g(n) + h(n). In the traces below, fringe entries are written X(g=5) to mean state X with g-value 5; informed algorithms use the analogous X(h=...) and X(g=..., h=..., f=...) forms.

CriticalThe goal test fires on pop, not on generation. A node is checked for goal-hood when it is removed from the fringe, not when it is first generated. Why? Because a cheaper path to the same goal might still be sitting on the fringe. Testing on generation would commit too early and break optimality.

UCS and loopsEven without a closed list, UCS will not get stuck in a loop when step costs are positive. A cycling path's cumulative g(n) keeps growing, so the priority queue eventually pops a non-cycling path first, the loop is "out-priced" by its own accumulating cost. The closed list (graph-search, §VI) is what prevents wasted re-expansion of states reached by a worse path; it is not what prevents the infinite loop in the first place.

Graph-search Quiz mode

Press Step to begin. Watch the fringe reorder itself after each expansion. UCS may leave a goal node sitting on the fringe rather than returning it immediately, because a cheaper path might still be coming.

Trace · step | popped | fringe sorted by g(n)

Step	Popped	Visited (closed list)	Fringe (lowest g first)

Why UCS has these properties

Complete

Yes, provided every step cost is at least some ε > 0

If step costs can shrink toward zero, UCS could explore an infinite sequence of ever-cheaper actions and never reach a goal. With a positive lower bound on step costs, the cost of any path of length k is at least k·ε, so UCS reaches any finite-depth goal in finite time.

Optimal

Yes

UCS pops nodes in order of increasing g. When it pops a goal, no fringe node has lower g, so no cheaper path to any goal exists. This is precisely why goal-test on pop matters.

Time

O(b^{1+⌈C*/ε⌉})

Worst case: UCS expands every node with path cost ≤ C* (the optimal cost). The deepest such path has length ⌈C*/ε⌉ steps. So it can be much worse than BFS when step costs are small relative to C*.

Space

O(b^{1+⌈C*/ε⌉})

Same reasoning as time. Every node with cost up to C* sits on or above the fringe.

3. Depth-First Search · DFS البحث بأولويّة العمق

Idea

Expand the deepest unexpanded node first. Plunge into one branch as far as it goes; only when you hit a dead end (or a goal) do you back up and try another branch.

Fringe data structure: LIFO stack, last-in, first-out.

function DFS(problem) returns a solution, or failure fringe ← LIFO-stack containing only MAKE-NODE(INITIAL-STATE) loop do if fringe is empty then return failure node ← POP(fringe) // most recently added node if GOAL-TEST(STATE(node)) then return SOLUTION(node) push EXPAND(node) onto fringe

Graph-search Quiz mode

Press Step to begin. DFS plunges down a single branch until it hits a dead end, then backtracks. Watch how it can return a long path even when a much shorter one exists.

Trace · step | popped | fringe (top of stack on the right)

Step	Popped	Visited (closed list)	Fringe (bottom → top)

Why DFS has these properties

Complete

No in tree-search · Yes in graph-search on a finite state space

Tree-search DFS can loop forever down an infinite path (or repeatedly visit the same state via different paths). Graph-search DFS, which discards repeated states, terminates on any finite state space because there are only finitely many states to visit.

Optimal

No

DFS returns the first goal it stumbles upon, which is whichever happens to lie down the branch it explores first. There is no preference for short paths or cheap ones, order of expansion is determined entirely by how the successor function lists children.

Time

O(b^m)

In the worst case DFS explores the entire tree to its maximum depth m. If m is much larger than d (the depth of the shallowest goal), DFS can take vastly longer than BFS. If m is infinite, it never terminates.

Space

O(b·m), DFS's one great strength

DFS only needs to remember the path from the root to the current node (m nodes), plus the unexpanded siblings of each ancestor (at most b at each level). This is linear in depth, exponentially smaller than BFS or UCS.

TakeawayDFS trades guarantees for memory. When memory is the binding constraint and the problem is finite, DFS with graph-search becomes very attractive. Linear space, complete on a finite graph, but no optimality.

4. Depth-Limited Search · DLS البحث بالعمق المحدود

Idea

DFS's incompleteness comes from infinite descent. Fix it by imposing a depth limit l: nodes at depth l are treated as having no successors. The algorithm explores down to depth l, then gives up on that branch.

Fringe data structure: LIFO stack with a depth cap.

When DLS exhausts the fringe without finding a goal, it can return one of two answers: failure (no solution exists within the depth limit) or cutoff (no solution within the limit, but one might exist deeper). The distinction matters for what to try next.

Depth limit l: Graph-search

Try a small limit first. The shallowest goal in the reference graph is at depth 4. With l = 1, 2, or 3 DLS will report cutoff. Set the limit to 4 or higher to see DLS find the goal, but note that it may not be the cheapest path.

Trace · step | popped at depth | fringe

Step	Popped	Visited (closed list)	Fringe

Why DLS has these properties

Complete

Only when l ≥ d

If the shallowest goal sits below the depth limit, DLS will never see it. With l smaller than d, completeness fails. Choosing l is hard precisely because d is usually unknown.

Optimal

No

DLS is just bounded DFS. It returns the first goal within depth l, which need not be the shallowest goal even when l ≥ d.

Time

O(b^l)

Worst case: DLS explores every node in the tree down to depth l. The geometric series sums to (b^l+1−1)/(b−1) = O(b^l).

Space

O(b·l)

Same linear-in-depth property as DFS, capped at l. DLS keeps DFS's space advantage.

5. Iterative Deepening Search · IDS البحث بالتعميق التكراريّ

Idea

Run depth-limited search with l = 0, then l = 1, then l = 2, and so on, until a goal is found. IDS combines DFS's linear space with BFS's completeness and shallowest-goal guarantee.

Fringe data structure: LIFO stack · outer loop increases the depth limit.

function IDS(problem) returns a solution, or failure for depth = 0 to ∞ do result ← DLS(problem, depth) if result ≠ cutoff then return result

SurpriseAt first glance, IDS looks wasteful: each iteration discards what the previous one found. It is not. Because the nodes at the deepest level dominate the count, regenerating shallow nodes is a constant-factor overhead. The numbers work out beautifully.

For b = 10 and d = 5:

BFS generates roughly 111,111 nodes (1 + 10 + 100 + 1000 + 10000 + 100000).
IDS generates roughly 123,456 nodes (same nodes regenerated, plus an extra factor for the iterations).

The overhead is about 11%. For larger b the ratio gets better, not worse. And IDS uses only O(b·d) memory, far less than BFS.

iteration: l = 0

Press Step to begin. Each iteration restarts from the root with a deeper limit. Watch the iteration counter advance, and notice how the same shallow nodes keep being regenerated.

Trace · iteration | popped | fringe

Iter	Popped	Visited (this iter)	Fringe

Why IDS has these properties

Complete

Yes

If a goal exists at any finite depth d, the iteration with l = d will find it. The outer loop guarantees that every finite depth is eventually tried.

Optimal

Only when step cost is the same for all actions

Same caveat as BFS. IDS returns the shallowest goal; that is optimal in cost only when all step costs are equal.

Time

O(b^d)

Node at depth 0 generated d+1 times, depth 1 generated d times, …, depth d generated once. Total = (d+1) + d·b + (d−1)·b² + … + 1·b^d. The deepest term dominates: this is O(b^d). The constant factor for b ≥ 10 is small.

Space

O(b·d)

Each iteration is DLS, which uses linear space. Between iterations, nothing is kept. IDS achieves BFS's completeness at DFS's memory cost.

TakeawayIDS is generally the preferred uninformed strategy when there is a large search space and the depth of the solution is not known. It is the default choice in many real systems.

6. Bidirectional Search البحث ثنائيّ الاتّجاه

Idea

Run two breadth-first searches at once: one forward from the start, one backward from the goal. Stop as soon as the two fringes meet. The two half-trees together cover the path.

Two fringes (FIFO each) · termination on intersection of the two visited sets.

The motivation is arithmetic. BFS in one direction generates O(b^d) nodes; two BFS searches that meet in the middle generate two sub-trees of depth d/2 each, giving O(b^d/2) nodes total. For b = 10, d = 6, this is the difference between a million nodes and two thousand.

Press Step to begin. Odd steps expand the forward fringe (red, from S); even steps the backward fringe (green, from G). The search terminates as soon as a node appears in both visited sets.

Trace · step | direction | both fringes

Step	Direction	Visited (F & B)	Popped · Fringes

Why bidirectional search has these properties

Complete

Yes, provided both searches are BFS

Each half is a BFS in a finite-branching graph; both terminate. Once the two fringes intersect, a path exists through the meeting node.

Optimal

Optimal in number of steps when both halves use BFS and step costs are equal

Returns the shallowest meeting node. If costs vary, the meeting node may give a path that is not the cheapest. UCS-style bidirectional search is much harder to implement.

Time

O(b^d/2)

Two sub-trees each of depth d/2. The reduction is enormous: square root in the worst case.

Space

O(b^d/2)

At least one of the two fringes must be held entirely in memory to detect intersections. The space requirement is the main practical limitation.

Three practical limitations

Computing predecessors is sometimes hard. In chess, given a board, what positions could have led to it? Many, and computing them is its own problem.
When the goal is described abstractly (such as "checkmate") rather than as a specific state, the backward search has no single starting point.
Bookkeeping is more complex than for any of the other algorithms here, which makes implementation error-prone.

Part Eight

Informed Search

strategies that use a clue about the goal

All six uninformed algorithms suffered from one limitation: they cannot tell which of two fringe nodes is closer to the goal. An informed agent uses a problem-specific function h(n), a heuristic, that estimates the remaining cost from n to a goal. Even a rough estimate can transform the search.

The heuristic function · h(n) دالّة الاستدلال

A heuristicA problem-specific function that estimates the cost from a state to a goal. From the Greek heuriskein, "to find out".دالّة استدلاليّة / إرشاديّة is any function the designer chooses that estimates remaining cost. The art is to choose h so that it is informative (close to the truth) but easy to compute.

For route-finding on the Romania map, the canonical heuristic is the straight-line distanceThe Euclidean distance from a city to the goal, ignoring roads.المسافة المستقيمة from each city to Bucharest:

Romania · h_SLD(n) = straight-line distance to Bucharest (km)

Notice that h is not the same as the true road distance. From Arad, the straight line to Bucharest is 366 km, but the shortest road route is 418 km. That is normal: heuristics are estimates. What matters is whether they are useful, and whether they are admissible (we will see what this means in a moment).

Best-First Search · the general idea البحث بأولويّة الأفضل

Best-First SearchA family of search algorithms that selects the next node to expand by minimising an evaluation function f(n).البحث بأولويّة الأفضل is the family of algorithms that order the fringe by some evaluation function f(n). Different choices of f give different algorithms:

f(n) = h(n) ⇒ Greedy Best-First Search

f(n) = g(n) + h(n) ⇒ A* Search

where g(n) is the cost from the start to n (the same g as in UCS). The next two sub-sections work through each in turn.

1. Greedy Best-First Search البحث الجشِع بأولويّة الأفضل

Idea

Ignore the cost already paid; expand the node whose estimated remaining cost is smallest. f(n) = h(n). It is the spatial analogue of "head straight for the destination".

Fringe data structure: priority queue ordered by h(n).

Click any city to see its h. The dashed line is the straight-line heuristic to Bucharest; greedy picks whichever fringe city has the smallest such dashed line. Click again to clear.

Press Step to begin. Watch greedy expand Arad → Sibiu → Fagaras → Bucharest. It picks Fagaras over Rimnicu because Fagaras has smaller h. The total path cost will be 450 km. The true optimum is 418 km, greedy will miss it.

Trace · step | popped | fringe with h(n)

Step	Popped	Visited (closed list)	Fringe (lowest h first)

Why Greedy Best-First has these properties

Complete

No, can get stuck in loops

Classic trap: starting from Iasi heading to Fagaras, h is minimised by Neamt, but Neamt is a dead end. Without a closed list, greedy bounces Iasi ↔ Neamt forever. With graph-search it terminates but on a graph with no path it may report failure even when a path exists in some embeddings.

Optimal

No

Greedy makes locally optimal choices without considering the cost of getting to the node. It returns the first goal it reaches, which need not be the cheapest. The Arad-to-Bucharest example above is the classic illustration.

Time

O(b^m)

In the worst case greedy is no better than DFS. A good heuristic can give massive practical speedups, but the worst-case bound is the same.

Space

O(b^m)

Greedy keeps every generated node on the fringe (unlike DFS). Space matches time.

Worked failure: Iasi to Fagaras

The canonical demonstration that Greedy is incomplete. Set start = Iasi, goal = Fagaras. The straight-line distance from Neamt to Fagaras happens to be smaller than from Iasi to Fagaras (Neamt sits roughly between Iasi and Fagaras as the crow flies, even though by road it is a dead-end).

Greedy's behaviour:

Start at Iasi. Generate neighbours: Vaslui, Neamt.
Compare heuristics. h(Neamt → Fagaras) < h(Vaslui → Fagaras). Greedy picks Neamt.
Expand Neamt. Its only neighbour is Iasi (a dead-end town).
Tree-search: Iasi is regenerated and looks attractive again. Greedy picks Iasi, generates Neamt, picks Neamt, generates Iasi… an infinite loop. Tree-search Greedy is not complete.
Graph-search: Iasi and Neamt are both closed after one pass. The fringe ran out of unexpanded options for that side of the map; Greedy returns failure or backs up to Vaslui (which is on the fringe), then continues. Graph-search prevents the loop but the algorithm still committed to a wrong side of the map first.

Lesson: a heuristic that points toward a dead-end is enough to derail Greedy. The closed list rescues termination but not optimality, and not the wasted exploration.

2. A* Search خوارزميّة A*

Idea

Greedy's flaw is that it ignores cost so far. A* fixes it by adding the cost already paid to the estimated cost remaining: f(n) = g(n) + h(n). This is the estimated total cost of the cheapest solution through n.

Fringe data structure: priority queue ordered by f(n) = g(n) + h(n).

f(n) = g(n) + h(n)
cost so far + estimated cost remaining = estimated total cost

Click any city on the map to inspect that city's g, h, and f. The solid gold lines trace the cheapest road route from Arad; the dashed line is the straight-line heuristic to Bucharest. Click the same city again to clear.

Press Step to begin. A* finds Arad → Sibiu → Rimnicu → Pitesti → Bucharest, total 418 km. Compare to greedy's 450. A* explores a few extra nodes, but returns the optimal solution. Watch the f-values in the trace as the fringe reorders.

Trace · step | popped | fringe with f = g + h

Step	Popped	Visited (closed list)	Fringe (lowest f first)

Why A* has these properties (assuming h is admissible)

Complete

Yes, unless infinitely many nodes have f(n) ≤ f(goal)

A* expands nodes in increasing order of f. As long as only finitely many nodes have f-value below the optimum, A* reaches a goal. With positive step costs and a finite branching factor, this condition holds.

Optimal

Yes, provided h is admissible

See the theorem and proof sketch below. Admissibility means h never overestimates: A* therefore never commits to a suboptimal goal because some better fringe node always has lower f.

Time

Exponential in the worst case

A* expands every node with f(n) ≤ C* (the optimal cost). For a poor heuristic this can be most of the tree; for a perfect heuristic, only the optimal path. In practice the heuristic determines whether A* is fast or hopeless.

Space

Keeps every generated node in memory

Like UCS, A* must remember every generated node to check whether a better path has been found. This is A*'s real weakness: for hard problems, A* typically runs out of memory long before it runs out of time. See the memory-bounded variants below.

Contrast: A* on the same Iasi-to-Fagaras problem

In the Greedy section we saw that ordering the fringe by h alone derails on Iasi → Fagaras because Neamt's h is misleading. A* on the same problem uses f = g + h instead:

From Iasi, A* generates Neamt and Vaslui. Neamt has a small h but g(Neamt) accumulates 87 km from Iasi. Once g is added the f-value of any path through Neamt is dominated by the round-trip back through Iasi.
Any path passing through Neamt twice (which is the only way out, since Neamt is a dead-end) has cumulative g ≥ 2 × 87 + cost(Iasi to anywhere else). After Neamt is expanded once and provides no progress, A*'s priority queue prefers Vaslui's branch.
A* commits to the correct side of the map after only the misleading detour through Neamt is shown not to improve the f-value, and returns the optimal route Iasi → Vaslui → Urziceni → Bucharest → Fagaras.

The g term is what saves A* here. Greedy throws this information away; A* keeps it.

Admissibility · why A* is optimal قابليّة القبول

A heuristic h(n) is admissibleA heuristic h is admissible if h(n) ≤ h*(n) for every node n, never overestimates the true remaining cost.مقبولة (لا تبالغ في التقدير) if, for every node n, h(n) does not overestimate the true cost h*(n) to reach the nearest goal:

h(n) ≤ h*(n) for every node n ⇒ h is admissible (optimistic)

Straight-line distance is admissible for route-finding because, by the triangle inequality, no road between two cities can be shorter than the straight line between them. The estimate cannot lie about the destination being too far away; at worst it lies about it being too close.

Theorem · A* is optimal

If h is admissible, then A* using tree-search is optimal.

Proof sketch

Suppose some suboptimal goal G₂ has been generated and sits on the fringe. Let n be an unexpanded fringe node on a shortest path to an optimal goal G with cost C*. We will show that A* prefers n to G₂, so it will never return G₂.

f(G₂) = g(G₂) + h(G₂) = g(G₂) since h(goal) = 0.
g(G₂) > C* since G₂ is suboptimal.
So f(G₂) > C*.
f(n) = g(n) + h(n) ≤ g(n) + h*(n) = C* by admissibility.
Therefore f(n) ≤ C* < f(G₂): A* will expand n before G₂.

Since this holds for every suboptimal goal on the fringe, A* must select an optimal goal first.

Dominance الهيمنة

If two admissible heuristics h₁ and h₂ satisfy h₂(n) ≥ h₁(n) for every n, then h₂ dominates h₁. A* with h₂ expands no more nodes than A* with h₁. For admissible heuristics, bigger is better, closer to the true cost means fewer nodes explored. Russell & Norvig give the 8-puzzle numbers as a vivid illustration:

Typical 8-puzzle search costs · R&N Figure 4.8

Depth d	IDS (uninformed)	A* with h₁ (misplaced tiles)	A* with h₂ (Manhattan)
12	3,644,035	227	73
24	too many	39,135	1,641

At depth 24, IDS becomes infeasible. A* with the weak heuristic explores nearly 40,000 nodes. A* with the dominant heuristic explores only 1,641. The heuristic does the work that brute force cannot.

Where do good heuristics come from? المسائل المُخفّفة

One reliable recipe: relax the problemReplace the real problem with a version that has fewer restrictions on the actions. The cost of an optimal solution to the relaxed problem is an admissible heuristic for the real problem.مسألة مُخفَّفة. Drop a restriction on the actions, compute the optimal cost for the relaxed problem, use that cost as h.

Relax the 8-puzzle so a tile may move to any square (not just an adjacent empty one): optimal cost is h₁ (misplaced tiles).
Relax it so a tile may move to any adjacent square ignoring whether it is occupied: optimal cost is h₂ (Manhattan distance).

Both are admissible because removing constraints can only make a problem easier; an optimal solution to the relaxed problem cannot cost more than an optimal solution to the original.

Pruning in action · a 5×5 maze التهذيب في العمل

The clearest way to see what the heuristic is doing for A* is to put A* next to an uninformed algorithm on the same problem and watch what each one expands. A grid maze makes this visible at a glance.

Rules of the maze

5 × 5 grid. S sits in the top-left cell (row 0, column 0); G sits in the bottom-right cell (row 4, column 4).
Walls (dark navy squares) are blocked. The agent cannot enter, leave through, or pass over them. They are not just unexplored, they are unavailable as states. The walls in rows 1 and 3 split the maze into three vertical lanes joined only through column 2.
Movement is four-directional, up / down / left / right, between adjacent open cells. No diagonals.
Every move costs 1. Step cost is uniform. The cost of a path is its number of moves.
The number printed inside each open cell is h(cell), the heuristic estimate (defined next). S and G are labelled by name instead of by their h-values; h(S) = 8 and h(G) = 0 by definition.

Manhattan distance, the heuristic

Manhattan distanceFor grid cells (r, c) and (r', c'), |r − r'| + |c − c'|. The minimum number of horizontal-plus-vertical steps needed if there were no walls.مسافة مانهاتن from cell (r, c) to cell (r', c') is

|r − r'| + |c − c'|

that is, the total number of vertical-plus-horizontal moves you would need on an obstacle-free grid. In this maze G is at row 4, column 4, so the heuristic at cell (r, c) is

h(r, c) = (4 − r) + (4 − c)

Worked examples from the grid below:

h(S) = h(0, 0) = 4 + 4 = 8. (And h*(S) = 8 as well, the true optimal cost; on this maze the heuristic is tight at S.)
h(2, 1) = (4 − 2) + (4 − 1) = 2 + 3 = 5. But the real shortest cost from (2, 1) to G is 7 (forced detour through column 2). h underestimates by 2.
h(G) = 0. The goal is always 0 distance from itself.

On a grid where every step costs 1, Manhattan is admissible: walls can only make the real path longer than the obstacle-free count, never shorter, so h(n) ≤ h*(n) always holds.

Colour key on the cells

Dark navy filled: the start S, or a wall (blocked, unavailable).
Maroon filled with white letter: the goal G, or the cell the algorithm is currently popping.
Light maroon: cells already expanded (the closed list).
Yellow with dashed border: cells on the fringe, generated but not yet popped.
Gold: cells on the path the algorithm finally returns.
Plain cream background: cells the algorithm never reached. This is the pruned set, the cells A* refused to expand because the heuristic ruled them out.

Tie-breakingMany cells in this maze have the same f-value (the heuristic is tight on the optimal trajectory, so f = 8 along the whole way). When the priority queue has multiple cells with equal f, A* breaks ties by insertion order (first in, first out). That is why A* can pop one f = 8 cell, then a non-adjacent f = 8 cell next: the second cell was already on the fringe, queued behind the first. The algorithm is not "jumping"; it is draining its tied-priority bucket.

Press Step on each side to advance one expansion at a time. Watch which cells each algorithm leaves untouched.

A* with h = Manhattan

BFS, uninformed

What to see

Both find the same optimal cost. The path length is identical because the heuristic is admissible.
A* never touches the bottom-left pocket. Cells (2, 0), (2, 1), (4, 0), (4, 1) sit at row-column positions whose g + h exceeds C*. A* refuses to expand them because it has a cheaper goal-cost upper bound from its right-lane progress.
BFS expands them anyway. BFS knows nothing about the goal's location; from its point of view a cell at depth 6 is just as worthy of expansion as any other cell at depth 6, even if Manhattan says that cell is on the wrong side of the maze.
The difference is the pruning. The cells A* never colours are the cells the heuristic ruled out. Make the heuristic worse and that pruned set shrinks; make it perfect and A* would touch only the path itself.

The complexity of A*, in depth التعقيد بالتفصيل

Time and space costs for A* are not single numbers. They range across a spectrum decided almost entirely by the heuristic. The same A* algorithm can be near-linear on one problem and ruinously exponential on the next.

Time complexity: three cases

A* expands nodes in order of f(n) = g(n) + h(n). What we want to count is how many nodes it expands before the goal is popped. Space tracks the same number because A* stores every generated node. The question is: how does this count grow as problems get harder (deeper solutions, larger branching factor)? The answer depends entirely on how good the heuristic is. Three cases.

Case 1 · perfect heuristic → linear time

Suppose h(n) = h*(n) for every node n. The heuristic returns the exact true remaining cost, with zero error.

Then for every node on an optimal path, f(n) = g(n) + h*(n) = C*; for every node off an optimal path, f(n) > C*. Since A* expands nodes in order of f, and every optimal-path node has f = C* while every off-path node has f > C*, A* expands only the nodes on the optimal path. Exactly d + 1 expansions where d is the solution depth. Linear in d, with no wasted work.

Intuitively: a perfect h is a perfect oracle. At each node it whispers "go this way, not that way" and never lies; A* just follows the breadcrumb trail.

This case is unattainable in practice (if you had a perfect h you would not need to search at all, you would just compute the answer), but it is the theoretical floor.

Case 2 · zero heuristic → exponential time

Suppose h(n) = 0 everywhere. Then f(n) = g(n), and A* degenerates into Uniform-Cost Search. UCS explores in every direction equally, like ripples on a pond, because no fringe node looks more promising than any other.

UCS expands roughly every node with g ≤ C*. In a tree with branching factor b and solution depth d that is about b^d nodes. This is the worst case: A* with the worst possible admissible heuristic.

Concretely on Romania: with h = 0, A* would expand outward from Arad equally in every direction, toward Zerind, Timisoara, Sibiu, with no preference, and would explore the entire road network cheaper than 418 km before settling on the answer.

Case 3 · real heuristic → somewhere between

Real heuristics sit between perfect and useless. The question is how much do they prune? There is a precise bound. The number of nodes A* expands is bounded by the size of the set

{ n : f(n) ≤ C* }

These are the nodes A* cannot rule out. It must expand each of them before it can be certain that the goal at cost C* is optimal.

The size of this set depends on the heuristic error, the gap between h(n) and h*(n). AIMA introduces the effective branching factor b*If A* expands N nodes on a problem with solution depth d, the effective branching factor b* is the value that makes a uniform tree of depth d and branching factor b* have N+1 nodes. Closer to 1 means the heuristic prunes very well; closer to b means it barely helps.عامل التفرع الفعّال such that A* expands roughly (b*)^d nodes.

Perfect h: b* → 1, near-linear.
Zero h: b* = b, full exponential.
Real h: somewhere between.

For the 8-puzzle with h₂ (Manhattan), the empirical b* is around 1.4, dramatically smaller than the underlying branching factor b = 3 of unrestricted moves. That is why A* solves 8-puzzles that uninformed search cannot.

Space complexity, and why it bites first

A* keeps every generated node in memory: the priority-queue fringe, the closed list, and the search-tree pointers needed to reconstruct the returned path. With b^d generated nodes in the worst case, memory grows exponentially too. The space and time bounds are the same expression.

Here is the practical truth that makes A* hard on large problems: memory runs out before time does. A modern CPU does several million A* expansions per second, but every expanded node carries roughly 50 to 100 bytes of bookkeeping (state representation, parent pointer, g, h, and the priority-queue link). With 8 GB of RAM you can hold around 10⁸ nodes. On a problem with b = 10, d = 12, A* can generate 10¹³ nodes in the worst case, a hundred thousand times more than fits. The algorithm crashes out of memory long before it would have finished computing.

This is why R&N call A* "memory-bound" rather than "time-bound" for hard problems, and why the memory-bounded variants RBFS and SMA* exist: they trade re-computation for a memory budget you can actually afford. They are introduced in Part Nine.

A quiet caveatThe bounds above assume tree-search. In graph-search A* benefits from skipping repeated states, but the worst-case complexity in terms of generated nodes is unchanged: the state space could still have b^d distinct states. Graph-search saves you from re-exploring loops, not from the underlying combinatorial explosion.

Exercise · the coordinate graph تمرين تطبيقيّ

An abstract graph with nodes A through K placed on a 2D grid. The straight-line distance from each node to goal K serves as h(n). Step through Greedy and A* in turn and compare the paths and the number of nodes expanded.

Click any node to inspect its g, h, and f. The solid gold lines are the cheapest route from A; the dashed line is the straight-line heuristic to K.

Switch tabs and compare. Run greedy first, then A*, on the same graph. Both find the same path A → D → F → I → K, but the order in which they expand intermediate nodes differs.

Coordinates and h(n) = straight-line distance to K

Node	(x, y)	h(n)

Trace

Step	Popped	Fringe

Exercise · 8-puzzle heuristics استدلالات لعبة الثمانية

For any 8-puzzle state, compute two admissible heuristics:

h₁ = number of tiles in the wrong position (the blank does not count).
h₂ = sum, over all tiles, of the Manhattan distance from the tile's current square to its goal square.

Below: a state with red shading on misplaced tiles and a small Manhattan badge on each tile showing its distance from home. Click any tile adjacent to the blank to slide it and watch the heuristics update in real time.

Current state

Goal

1

2

3

4

5

6

7

8

Heuristic values

h₁ (misplaced tiles) = ,

h₂ (Manhattan) = ,

h₂(n) ≥ h₁(n) for every state, so h₂ dominates h₁. A* with h₂ will expand fewer nodes.

Part Nine · Advanced

Memory-Bounded Heuristic Search

when A* runs out of RAM

A*'s real-world bottleneck is space, not time. The fringe grows exponentially; modern hardware fills its memory long before the algorithm finishes computing. Before we meet the refinements that fix this, it is worth being precise about what A* is storing and why none of it can be discarded.

The problem with A* المشكلة في الذاكرة

What A* keeps, per node

For every state it has ever generated, A* stores a small record. Roughly:

State: the configuration itself (the 3 × 3 grid for the 8-puzzle, the city name for Romania, the (row, column) cell for a grid maze).
Parent pointer: which node it was reached from. The returned path is recovered by walking this chain back to the start.
g(n): the actual path cost from the start to this node.
h(n): the heuristic estimate from this node to a goal.
f(n) = g(n) + h(n): the key used to order the priority queue.

Concretely on the 8-puzzle: the state takes about 9 bytes (one nibble per tile), the parent pointer about 8 bytes, three integers about 12 bytes, plus 16 to 32 bytes of priority-queue and hash-table overhead. Roughly 40 to 100 bytes per node, depending on the implementation.

Where the nodes live: two structures

A* keeps these records in two collections that both grow monotonically until the goal is popped. Nothing is freed in between:

Frontier (the priority queue): every generated-but-not-yet-expanded node, ordered by f(n).
Reached (the closed list): every state that has already been expanded, kept so repeats can be recognised and skipped.

Why none of this can be thrown away

Students usually ask: cannot A* just discard the bad-looking nodes? For unmodified A* the answer is no, and there are three independent reasons.

The reached set is needed to avoid re-expansion. Without the closed list, the same state can be reached via a longer path, expanded a second time, its children regenerated, expanded again, and so on. State spaces with loops or many paths to the same configuration explode exponentially when the closed list is absent.
The frontier is needed for optimality. A* always expands the lowest-f node next. That might currently be node D with f = 14. But after expanding D, every child of D might have f = 20, and the next-best candidate becomes node E with f = 17. If A* had thrown E away to save space, it would never reach E and would miss the path through it. The whole frontier is on the critical path; there is no "obviously useless" subset to drop.
Parent pointers are needed to return the answer. When A* finally pops the goal, the only way to recover the sequence of actions is to walk the parent chain back to the start. Drop the pointers and A* knows the optimal cost but cannot report the optimal path.

How fast does the memory bill grow?

Each expansion pops one node and pushes up to b children. Memory grows by roughly (b − 1) nodes per expansion. The number of expansions itself grows roughly as (b*)^d, where b* is the effective branching factor introduced in Part Eight. For a hard instance of the 15-puzzle with b ≈ 3 and depth around 50, the table looks like:

Expansions done	Nodes in memory	RAM at ~50 bytes/node
10²	~300	15 KB
10⁴	~30,000	1.5 MB
10⁶	~3,000,000	150 MB
10⁸	~300,000,000	~14 GB

A hard 15-puzzle instance typically requires 10⁷ to 10⁸ A* expansions. The last row exceeds the RAM of a typical laptop. A* swaps to disk, thrashes, and eventually crashes long before it would have finished computing the answer. This is the practical face of the worst-case bound from Part Eight, with concrete numbers attached.

What the variants buy

The two algorithms in this section take different routes to a memory-bounded A*.

RBFS, next, keeps only the current depth-first path plus the second-best f-value at each ancestor. On the same 15-puzzle instance that is roughly 50 × (3 × 50 bytes) ≈ 7 KB instead of 14 GB. About two million times less memory for the same optimal answer, at the cost of regenerating nodes when the search backtracks.
SMA*, after that, takes a different route: keep as much of A*'s memory profile as fits in a chosen budget of N nodes, evict the worst when full, regenerate later if needed. The budget is the dial; the answer quality degrades gracefully as it shrinks.

Both rest on the same trick that IDS used for uninformed search: iterate, with the cutoff defined by f-cost instead of depth. They differ in how aggressively they use whatever memory is actually available.

1. Recursive Best-First Search · RBFS البحث التكراريّ بأولويّة الأفضل

RBFS mimics A* but uses linear space, like DFS. It keeps a recursion stack of the current path. At each node, it remembers the second-best f-value among its alternative branches. If the best path under the current branch ever exceeds that second-best, RBFS abandons the branch, backs up, and resumes the alternative.

function RBFS(node, f_limit) returns solution, new f_limit if GOAL-TEST(node) then return solution successors ← EXPAND(node) if successors is empty then return failure, ∞ for each s in successors: f(s) ← max(g(s) + h(s), f(node)) // pathmax loop do best ← lowest-f node in successors if f(best) > f_limit then return failure, f(best) alternative ← second-lowest f-value among successors result, f(best) ← RBFS(best, min(f_limit, alternative)) if result ≠ failure then return result

Why RBFS has these properties

Complete

Yes, provided h is admissible

RBFS eventually explores every node whose f-value is below the optimum. With admissible h and a finite state space, the search terminates with a goal.

Optimal

Yes, provided h is admissible

RBFS inherits A*'s optimality argument. The pathmax step ensures f-values are monotonically non-decreasing along any path, which preserves the proof.

Time

Exponential, often worse than A*

RBFS regenerates the same nodes many times as it switches between branches. Each "change of mind" can re-explore an entire subtree. With more memory it has no way to do better.

Space

O(b·d)

Just the recursion stack plus, at each level, the second-best f-value. Linear in depth, just like DFS.

IntuitionRBFS keeps changing its mind. As exploration nears the goal, f = g + h becomes less optimistic, so old branches start to look attractive again and the algorithm hops back. The hopping is the price of using almost no memory.

2. Simplified Memory-Bounded A* · SMA* A* بذاكرة محدودة

SMA* uses all available memory, not just linear. It runs A* until memory is full; then it drops the worst leaf (highest f-value) and remembers that subtree's best f-value in the parent. When that subtree later becomes attractive again, SMA* regenerates it.

SMA* in three rulesThe pseudocode below packs a lot in; the mental model is simpler.

Run A* until memory is full. With the memory budget N, store at most N nodes at any time.
When the budget is exceeded, drop the worst leaf. The "worst" leaf is the one with the highest f-value among the leaves of the current search tree. Before dropping it, record its f-value in its parent so the parent knows "I had a forgotten descendant with f = this".
Regenerate when the forgotten subtree becomes attractive again. If, later, every node currently in memory has an f-value worse than some ancestor's "forgotten f", SMA* regenerates that ancestor's dropped subtree, expanding it again from scratch. The price of a small memory budget is paid in re-computation.

The "deepest least-f-cost node" rule (line 4 of the pseudocode) breaks ties by preferring nodes that are deeper in the tree. This favours finishing a promising branch over scattering effort across the frontier, which helps when memory is tight.

function SMA*(problem) returns a solution Queue ← MAKE-QUEUE({MAKE-NODE(INITIAL-STATE[problem])}) loop do if Queue is empty then return failure n ← deepest least-f-cost node in Queue if GOAL-TEST(n) then return success s ← NEXT-SUCCESSOR(n) if s is not a goal and s is at maximum depth then f(s) ← ∞ else f(s) ← max(f(n), g(s) + h(s)) if all of n's successors have been generated then update n's f-cost and ancestors as needed if SUCCESSORS(n) all in memory then remove n from Queue if memory is full then delete shallowest highest-f-cost node in Queue remove it from its parent's successor list insert its parent back on Queue if necessary insert s in Queue

Why SMA* has these properties

Complete

If memory ≥ depth of shallowest solution

SMA* needs enough memory to hold a complete path from root to a goal. If the budget is smaller, it cannot return any solution.

Optimal

If memory ≥ depth of shallowest optimal solution

Same condition as completeness, but for the optimal path. If the budget is even smaller, SMA* returns the best reachable solution that fits in memory, graceful degradation rather than failure.

Time

Exponential in the worst case

If the memory budget is too small, SMA* may thrash: re-expanding the same subtree repeatedly because everything keeps getting evicted. With enough memory it matches A*; with very little it can be far worse.

Space

Bounded by the given budget

By construction. SMA* simply never lets the queue exceed the budget. This is its whole purpose.

In practice, SMA* is used when an A*-style optimality guarantee is desired but the problem is too large for full A*. For very hard problems, even SMA* may degrade to thrashing between subtrees that all exceed the memory budget, at which point a different approach (such as local search, below) becomes necessary.

Part Ten

Local Search

when the path doesn't matter, only the destination

Every algorithm so far has been a systematic search: it remembers the path, generates a tree, and returns the sequence of actions that solve the problem. For many real problems, scheduling, layout, n-queens, routing, the path is irrelevant. We just want a state that satisfies the goal. Local search abandons the path entirely. It keeps just the current state and tries to improve it.

Key ideaLocal search algorithms operate using only a single current node (rather than multiple paths) and generally move only to neighbours of that node. They use very little memory, usually a constant amount, and they can find reasonable solutions in large or infinite state spaces where systematic algorithms are unsuitable.

Problems suited to local search المسائل المناسبة للبحث المحلّيّ

Local search shines on a particular shape of problem. Before we look at the algorithms, three ingredients have to be on the table.

1. A candidate-solution state

A state in local search is a complete candidate answer to the problem, not a partial path being grown. The size of the state is the size of the answer.

n-queens: an array of n integers, one per column, giving the row of that column's queen. A state is an entire board configuration with all n queens placed.
Travelling salesperson (TSP): a permutation of the cities. A state is a complete tour.
Scheduling: an assignment of every task to a time slot and a machine. A state is a full schedule.
VLSI placement: an x/y position for every component on the chip. A state is a complete layout.

This contrasts sharply with systematic search, where a state is a node on a path from start to goal. In local search the "path" is meaningless; only the candidate at the current step matters.

2. A neighbour function

Given the current state, the neighbour function (or successor function) returns all states reachable by a single small change.

n-queens: move one queen to a different row in its column. n queens × (n − 1) other rows = n(n−1) neighbours. For n = 8 that is 56.
TSP: swap two cities in the tour, or reverse a contiguous segment. Each gives O(n²) neighbours.
Scheduling: move one task to a different slot, or swap two tasks. Many neighbours per state.

The neighbour function defines the topology of the search landscape. A larger neighbourhood gives the algorithm more options at each step, but also makes each step more expensive.

3. An objective function

The objective function (also called the fitness function in evolutionary algorithms, or the cost function when minimising) assigns a numeric quality to every state. Local search uses this number to compare candidates and decide what to do next.

n-queens (cost view): h(state) = number of pairs of queens attacking each other. Minimise to 0.
n-queens (fitness view): f(state) = number of non-attacking pairs out of ⁿC₂. Maximise to ⁿC₂ (28 for n = 8).
TSP: minimise total tour length.
Scheduling: minimise makespan, or weighted lateness.

The maximise and minimise views are interchangeable; the algorithms below are written for maximisation (higher VALUE = better), and the 8-queens example minimises h. Mentally flip the sign and the algorithms behave identically.

When local search winsLocal search is the right tool when (a) the path to the answer is irrelevant, only the answer matters; (b) the state space is too large to enumerate systematically; (c) you have a cheap-to-evaluate objective function; (d) memory is constrained. It is the wrong tool when you need a sequence of actions (driving directions, planning a sequence of moves), when the objective is deceptive (no signal toward the global optimum), or when you need a provable optimum.

The state-space landscape منظر فضاء الحالات

Picture the state space as a one-dimensional landscape: the x-axis is the state, the y-axis is an objective function we want to maximise (or, equivalently, a cost we want to minimise; the two views are interchangeable by flipping the sign). Each point is a state; the height is its quality. The shape of the landscape decides how easy or hard the problem is.

Notice that the maxima (the peaks) are curves, smooth high points where the gradient briefly hits zero before reversing. They are not flat. Only the shoulder and the plateau are flat in the diagram. The wording below follows Russell & Norvig and Jarrar's Ch4 notes; the bracket markers under the diagram show the exact extent of each flat region.

Global maximum القيمة العظمى الشاملة: the highest objective value anywhere in the state space. The destination. Several different states can share this value, but the value itself is unique.
Local maximum القيمة العظمى المحلّيّة: a state whose objective value is higher than that of every immediate neighbour but lower than the global maximum. Hill climbing stops here because no neighbour is strictly better, even though the algorithm has not reached the true peak.
Plateau هضبة: a flat region of the landscape where every neighbour has the same objective value. No gradient, no preferred direction. The strict-improvement rule of basic hill climbing terminates immediately. The plateau in the diagram (states 540 to 640) is genuinely flat; the curve does not dip or bump inside it.
Shoulder كتف: a flat region that does have an exit, one edge of the flat leads to higher terrain. A hill climbing variant that allows sideways moves can cross the shoulder and continue up; the basic strict-improvement variant cannot.

A few notes on what the 1D diagram does not show.

The valleys between peaks are local minima القيمة الصغرى المحلّيّة. When the problem is phrased as minimising a cost (the 8-queens case below, with h = attacking pairs), maxima and minima swap roles: we are descending the landscape, and the local minimum is what the algorithm gets stuck on.
A ridge تلّ ضيّق is intrinsically two-dimensional, a long, narrow crest where moving along the crest is uphill, but every single-step neighbour move is sideways or downhill. Single-step hill climbing oscillates and makes no progress along the ridge. 1D illustrations cannot show this; it requires at least two state-space dimensions.

Maximise or minimise, pick oneThe algorithms below are written for maximisation (higher VALUE is better). The 8-queens example throughout this section is written for minimisation (h = number of attacking pairs, we want zero). The two views are identical up to a sign flip. Russell & Norvig use "VALUE" for the maximised version and "h" or "cost" for the minimised one; Jarrar's Ch4 slides do the same.

Worked example · 8-queens مسألة الملِكات الثماني

Throughout this section we use the 8-queens problem: place eight queens on a chessboard so that no two attack each other. Local search variants encode the state as eight numbers, one per column, each giving the row of that column's queen. This forces "one queen per column" by construction.

State: an array of 8 row numbers, one per column. So a state is something like (1, 6, 4, 7, 0, 3, 1, 5).
Successors: 8 × 7 = 56 neighbour states, obtained by moving one queen to a different row in its column.
Objective (to minimise): h = number of pairs of queens attacking each other (directly or diagonally). h = 0 is a solution.

The state space has 8⁸ ≈ 17 million states. Systematic search is feasible but slow; local search is much faster.

Reading h on the board

The board below shows one specific 8-queens state. The queens (Q in the dark circles) sit at the columns' current rows. The number printed in every other square is the h-value the state would have if the queen in that column moved to that square. So the 56 numbers represent the 56 possible single-queen moves. Hill climbing simply picks one (or all) of the cells with the smallest number, here highlighted in gold.

Notice that several different moves can yield the same minimum h. The algorithm needs a tie-breaking rule (Russell & Norvig: pick uniformly at random among the best). Notice also that no single move drops h to zero on this board, every smallest-h cell still leaves attacking pairs. Hill climbing must take several moves in sequence, and may not converge to a solution at all.

1. Hill Climbing تسلّق التلّ

Idea

Look at all neighbours of the current state. If any neighbour is better, move to the best one. If no neighbour is strictly better, stop, you are on a peak (which may be local or global). The algorithm is greedy and has no memory.

No fringe. The algorithm holds only the current state.

function HILL-CLIMBING(problem) returns a state that is a local maximum current ← MAKE-NODE(INITIAL-STATE[problem]) loop do neighbour ← a highest-valued successor of current if VALUE(neighbour) ≤ VALUE(current) then return current.STATE current ← neighbour

h = ?

Press Step to climb. The algorithm looks at all 56 neighbours of the current board (each one moves one queen within its column) and jumps to the one with the lowest h. It stops when no neighbour is strictly better.

Trace · step | move | h before → h after

Step	Move	h

The local-maxima problemHill climbing from a random 8-queens start gets stuck at a local maximum about 86% of the time (R&N, Ch. 4). It only solves 14% of problem instances. The fixes, random restarts, simulated annealing, beam search, and so on, are precisely what the rest of this section is about.

Variants of hill climbing

Stochastic hill climbing, choose at random among the uphill moves, weighted by steepness. Converges more slowly but finds better solutions on some landscapes.
First-choice hill climbing, generate successors one at a time at random, take the first one that is better. Useful when there are thousands of successors.
Random-restart hill climbing, if hill climbing fails, start again from a random state. If each attempt succeeds with probability p, expected number of restarts is 1/p. For 8-queens, p ≈ 0.14, so an average of seven restarts solves it.

Stopping criterionBasic hill climbing terminates as soon as no neighbour is strictly better than the current state. That state may be the global maximum (success), a local maximum (stuck), a plateau (stuck), or a shoulder (stuck unless sideways moves are allowed). Implementations sometimes add safety caps: a maximum number of steps, or a maximum number of consecutive sideways moves on a plateau before giving up.

Why hill climbing has these properties

Complete

No

Hill climbing terminates as soon as it reaches a state with no better neighbour. That state may be a local maximum, a plateau, or a ridge, not necessarily a global maximum. So the algorithm can return a non-solution.

Optimal

No

Even when it does find a solution, the solution is whichever local optimum it happened to climb to. There is no guarantee it is the best.

Time

Often fast in practice

Each step examines all successors and picks the best. The number of steps until termination is usually small (a few dozen for 8-queens). But there is no worst-case bound, with bad luck, a plateau can take exponential time.

Space

Constant

Only the current state is kept. The biggest practical advantage of local search.

2. Simulated Annealing المحاكاة بالصلب

Idea

Hill climbing never goes downhill, so it gets stuck. Simulated annealing sometimes accepts a worse move, with a probability that depends on (a) how much worse the move is and (b) a "temperature" parameter that starts high and cools down. Early in the search, the algorithm explores freely; later, it behaves like hill climbing.

Current state + a cooling schedule T(t) that decreases over time.

The analogy is from metallurgy: heating a metal and cooling it slowly lets atoms find a low-energy crystal arrangement. The algorithm mimics this: a hot system explores high-energy states; a cool system settles into a minimum.

function SIMULATED-ANNEALING(problem, schedule) returns a solution state current ← MAKE-NODE(INITIAL-STATE[problem]) for t = 1 to ∞ do T ← schedule(t) if T = 0 then return current next ← a randomly selected successor of current ΔE ← VALUE(next) − VALUE(current) if ΔE > 0 then current ← next // improvement, always take it else current ← next with probability e^ΔE/T // worse move, accept sometimes

The acceptance ruleIf the move is uphill (ΔE > 0), always take it. If the move is downhill (ΔE < 0), take it with probability e^ΔE/T. When T is high, this probability is near 1, almost anything is accepted. When T is low, even small downhill moves are almost never accepted, the algorithm becomes hill climbing.

Theoretical steps

Initialise. Pick a random starting state. Choose a starting temperature T₀ (large enough that most worse moves are accepted at first) and a cooling schedule, typically geometric: T(t + 1) = α T(t) with α close to 1, say 0.95.
Pick a random neighbour. Unlike hill climbing, SA does not look at all neighbours. It samples one at random.
Compute ΔE = VALUE(neighbour) − VALUE(current).
Decide. If ΔE > 0 accept the move (current ← neighbour). If ΔE ≤ 0 accept with probability e^ΔE/T; otherwise stay.
Cool. Lower T according to the schedule.
Repeat until the stopping criterion fires.

Worked exampleSuppose h(current) = 5, T = 2.0, and the random neighbour has h = 7 (worse, ΔE = −2 in maximisation language). Acceptance probability = e^−2/2 = e⁻¹ ≈ 0.37. Roll a uniform-random number r in [0, 1]; if r < 0.37, accept the worse move. Later, with T cooled to 0.2, the same ΔE = −2 gives e^−2/0.2 = e⁻¹⁰ ≈ 0.00005, almost never accepted. As temperature falls, SA grows pickier; the late-stage trajectory looks like hill climbing.

Stopping criterionThree common rules, often combined. (a) Temperature floor: T drops below a tiny threshold or reaches zero on the schedule. (b) Step budget: a fixed maximum number of iterations. (c) Solution found: the objective reaches the target (e.g., 8-queens h = 0). Some implementations also stop early if the best-seen state has not improved for some number of steps.

T = ? · h = ?

Press Step to anneal. Each step picks a random neighbour and decides whether to accept it. The temperature T starts high and decays geometrically. Worse moves are accepted often at high T, rarely at low T.

Trace · t | T | ΔE | accepted? | h

t	T	ΔE	?	h

Why simulated annealing has these properties

Complete

Probabilistically yes, with a slow enough schedule

If the cooling schedule decreases T slowly enough, logarithmically, then SA finds the global optimum with probability approaching 1. In practice, faster schedules are used, and the guarantee weakens to "very likely good".

Optimal

Same condition

SA's optimality is asymptotic and probabilistic. For any finite run, there is a non-zero chance of returning a sub-optimal state.

Time

Depends entirely on the schedule

The number of steps is fixed by the schedule. A slow schedule gives better answers but takes longer. Tuning the schedule is the main practical art.

Space

Constant

Same as hill climbing, just the current state and the current temperature.

Simulated annealing was developed at IBM in 1983 (Kirkpatrick et al.). It has been used for VLSI layout, factory scheduling, image processing, and protein folding. Its appeal is universality, the only problem-specific parts are the neighbour function and the objective.

3. Local Beam Search البحث الشعاعي المحليّ

Not examined in COMP338Local Beam Search is not part of Jarrar's Ch4 syllabus and is not required for the COMP338 exam. It is included here for completeness because it bridges naturally between hill climbing and genetic algorithms. Read it for context, skip it if pressed for time.

Idea

Hill climbing throws away everything except the single current state. Local beam search keeps k states at once. At each iteration, generate all successors of all k current states; from this combined pool, pick the k best to become the new generation.

k current states (the "beam"); successor pool combined across all of them.

Not k parallel hill climbsThis is the subtle distinction. Running k independent hill climbs is just hill climbing k times. Beam search is different: information passes between threads. If one of the k states finds a much better neighbour, its descendants quickly crowd out the others. The beam "concentrates" on promising areas.

function LOCAL-BEAM-SEARCH(problem, k) returns a solution state states ← k randomly generated states loop do if any of states is a goal then return that state successors ← ⋃ { successors of s : s ∈ states } states ← the k highest-valued successors

Beam k: best h = ?

Press Step to advance one beam generation. Each step: generate every successor of every state in the beam, then keep the k best across all of them. Watch what happens to the diversity of the beam as steps pass.

Trace · iter | best h | beam states (h values)

iter	best	beam (h values)

Stochastic beam search, a useful variant. Rather than taking the k best successors deterministically, sample k successors with probability proportional to their value. This avoids the beam collapsing onto a single area too quickly, and the analogy is suggestive: it is essentially asexual reproduction with selection pressure. Which brings us naturally to:

4. Genetic Algorithms الخوارزميّات الجينيّة

Idea

Like beam search, keep a population of states. Unlike beam search, generate the next generation by combining pairs of parents: choose two from the population (weighted by fitness), splice their state representations together (crossover), and occasionally flip a random gene (mutation). The hope is that recombination assembles good sub-solutions into a globally good solution.

Population of N individuals; fitness function; selection, crossover and mutation operators.

function GENETIC-ALGORITHM(population, FITNESS) returns an individual repeat new_population ← empty set for i = 1 to SIZE(population) do x ← RANDOM-SELECTION(population, FITNESS) y ← RANDOM-SELECTION(population, FITNESS) child ← REPRODUCE(x, y) if (small random probability) then child ← MUTATE(child) add child to new_population population ← new_population until some individual is fit enough, or enough time has elapsed return best individual in population

Representation mattersFor 8-queens, a state is naturally an 8-digit string (each digit is a row from 0-7). Crossover at position k means: child = first k digits of parent A, then last 8-k digits of parent B. The hope: good column-row pairings from each parent are preserved if the crossover point falls between them.

Vocabulary · chromosome, gene, alleleThe GA borrows three terms from biology to describe its representation. The metaphor is loose, but the terminology is universal in the literature and on exams.

A chromosome is a single candidate solution, encoded as a fixed-length string. It is the GA's analogue of a "state" in hill climbing: one complete answer that can be scored by the fitness function. For 8-queens, the 8-digit string 2 4 4 1 5 1 2 4 is one chromosome.
A gene is one position within the chromosome. The 8-queens chromosome has eight genes; gene i holds the row of the queen in column i. Genes are the atomic units that crossover splices between parents and that mutation may flip.
An allele is a particular value a gene can take. For 8-queens, each gene's possible alleles are the digits in [1, 8] (or [0, 7]). If every member of the population has the same allele at some gene, only mutation can reintroduce a different one.

The five operations of a GA

Every generation runs five operations. Understanding each in isolation is the key to understanding how GAs differ from other local-search methods.

Representation (encoding). Each candidate solution is encoded as a fixed-length string of symbols, the chromosome. Each position is a gene; each symbol value is an allele. For 8-queens the chromosome is eight digits, each in [1, 8] (or [0, 7]), giving the row of that column's queen. For a problem to suit a GA, an encoding has to exist whose substrings capture meaningful sub-solutions that crossover can preserve.
Fitness function. Numeric measure of how good a chromosome is. For 8-queens, a natural fitness is the number of non-attacking pairs of queens, in [0, ⁸C₂] = [0, 28]. Fitness 28 means a solution; lower values are worse.
Selection. Pick pairs of parents from the current population, with selection probability proportional to fitness. The standard formula for individual i with fitness f_i in a population of N is

p_i = f_i / Σ_j=1..N f_j

Higher-fitness individuals are more likely to be selected, but lower-fitness ones still have a chance (which preserves population diversity). This scheme is called roulette-wheel selection: imagine a wheel divided into N slices, with each slice's width proportional to that individual's fitness. Spin twice to get two parents.

Crossover (recombination). Combine two parents into two children. Pick a random crossover point k in [1, L − 1] where L is chromosome length. Child A inherits positions 1..k from parent A and positions k+1..L from parent B; child B is the reverse. The hope: if good "building blocks" of the answer sit in different parents, crossover assembles them.
Mutation. With small probability (typically 0.01 to 0.05 per gene), randomly change one symbol in a child to keep the population from collapsing to a single area of the search space. Without mutation, lost alleles cannot return.

Worked example: 8-queens with GA

Start with a population of four chromosomes (small for illustration; production GAs typically use 20-500). Each chromosome is a candidate 8-queens board, written as an 8-digit string in which digit i gives the row of the queen in column i.

Step 1: population & fitness (non-attacking pairs, max 28)

Label	Chromosome	Fitness f_i	p_i = f_i / Σf
A	`2 4 4 1 5 1 2 4`	24	24 / 78 = 0.31 (31%)
B	`3 2 5 4 3 2 1 3`	23	23 / 78 = 0.29 (29%)
C	`2 4 7 4 8 5 5 2`	20	20 / 78 = 0.26 (26%)
D	`3 2 7 5 2 4 1 1`	11	11 / 78 = 0.14 (14%)
Sum Σf		78	1.00

Selection probabilities form a proper distribution: they sum to 1. A and B are roughly twice as likely to be picked as D, which encodes the GA's "selection pressure" toward better candidates.

Step 2: selection (roulette wheel, two pairs picked at random by p_i)

Suppose two spins yield the parent pairs (A, C) and (A, B). (A is fittest so most likely to be chosen; that it appears twice is consistent with its 31% slice.)

Step 3: crossover at a random point

Pick crossover position k = 3 for pair (A, C). Take the first three digits from one parent and the last five from the other.

Parent / child	Digits 1..3	Digits 4..8	Result
Parent A	`2 4 4`	`1 5 1 2 4`	`2 4 4 1 5 1 2 4`
Parent C	`2 4 7`	`4 8 5 5 2`	`2 4 7 4 8 5 5 2`
Child 1 (A then C)	`2 4 4`	`4 8 5 5 2`	`2 4 4 4 8 5 5 2`
Child 2 (C then A)	`2 4 7`	`1 5 1 2 4`	`2 4 7 1 5 1 2 4`

The same operation is run on pair (A, B) with its own random crossover point, producing two more children. The four children replace the current population.

Step 4: mutation (small probability per gene)

Walk through each gene of each child. With mutation probability around 1% per gene, occasionally swap one digit for a random replacement.

Suppose gene 7 of Child 1 mutates from 5 to 7:

Before: 2 4 4 4 8 5 5 2

After: 2 4 4 4 8 5 7 2

Mutation is the only operator that can introduce new alleles. If every member of the population has row 5 in column 2, crossover alone will never produce row 6 in column 2. Mutation can.

Step 5: replace the population, repeat

The four (mutated) children become the new population. Compute fitness for each, then go back to Step 2. The average fitness typically rises generation after generation; the best individual eventually reaches the target.

Stopping criterionA GA terminates on one of: (a) a satisfactory individual, one whose fitness meets the target (for 8-queens, fitness = 28 = 0 attacking pairs); (b) a generation cap, a fixed maximum number of generations; (c) convergence, the population's diversity has collapsed or the best fitness has not improved for some number of generations. R&N and Jarrar both phrase this as "until some individual is fit enough, or enough time has elapsed".

gen 0 · best h = ?

Press Step to advance one generation. Six random 8-queens states are shown as rows. Each generation: pick parents by fitness, splice them, mutate occasionally. The best individuals tend to dominate the population after a few generations.

Trace · generation | best h | avg h | population

Gen	Best	Avg	Operations

GAs have been applied to circuit design (a famous early result: Adrian Thompson's evolved sound discriminator), antenna design (NASA used a GA-designed antenna on the ST5 spacecraft), schedule optimisation, and many problems where the search space is huge and the structure is poorly understood.

Honest assessmentThe theoretical case for genetic algorithms over other local search methods is weak. The schema theorem, the original justification for why crossover should help, turns out to apply only under restrictive assumptions. In practice GAs are useful when the state representation has natural sub-structures that crossover can exploit, but they are not magic, and a well-tuned simulated annealing often beats them.

HC vs SA vs GA · at a glance مقارنة سريعة

Once all three local-search algorithms are on the table, comparison questions tend to pivot on a handful of axes. None of the rows below introduces new content, the answers are all derivable from each algorithm's pseudocode, but lining them up side by side makes the contrasts easy to recall.

Axis	Hill Climbing	Simulated Annealing	Genetic Algorithm
Deterministic?	Yes in its basic form, every step picks the best neighbour (tie-breaking aside)	No, the neighbour is sampled at random, and worse moves are accepted with probability e^ΔE/T	No, selection, crossover point, and mutation all draw on randomness
Can escape local optima?	No, terminates at the first one it finds	Yes, probabilistically, especially while T is still high	Yes, via population diversity, recombination, and mutation
Memory at runtime	O(1) — current state only	O(1) — current state plus a temperature	O(N) — the whole population of N individuals
Best suited for	Smooth landscapes; or with random restarts as a baseline	Landscapes with many local optima, when neighbour evaluation is cheap	Encodings whose substrings carry meaningful sub-solutions crossover can preserve

"Determinism" here refers to each algorithm in its basic form. HC has stochastic variants (random restarts, stochastic HC, first-choice HC) that deliberately inject randomness to mitigate the local-maxima problem.

Stopping criteria, side by side شروط التوقّف

One question every local-search algorithm has to answer: when do we stop? Each algorithm uses a different mix of natural and engineered stop conditions, and an exam answer should name them explicitly.

Algorithm	Natural stop	Engineered safety stops
Hill Climbing	No neighbour strictly better than the current state.	Max steps; max consecutive sideways moves (for the plateau-tolerant variant).
Hill Climbing with random restarts	One of the restarts reaches the target objective value.	Max number of restarts; total step budget across all restarts.
Simulated Annealing	Temperature reaches zero (or some ε floor).	Step budget; target objective reached; no improvement for k iterations.
Genetic Algorithms	An individual reaches the target fitness.	Max generations; population diversity collapses; best fitness flat for k generations.

In practice every implementation combines its natural stop with at least one engineered safety stop so the algorithm always terminates in bounded time even when the natural condition is never met.

When to use local search متى نستخدم البحث المحليّ

Local search is the right tool when:

The path doesn't matter, only the final state.
The state space is huge, too large for systematic search to enumerate.
You are willing to accept a non-optimal but good-enough answer.
You can evaluate a state cheaply (compute its objective quickly).
Memory is constrained.

Local search is the wrong tool when:

The sequence of actions is what you need to return (e.g. driving directions).
The objective is deceptive, the landscape is full of misleading local optima with no signal toward the global one.
You need a provable optimum.

Part Eleven

Culminating Exercise

every algorithm on the same problem

The point of comparing algorithms is to see them disagree. Below are two problems instrumented to make the disagreement vivid: a small weighted graph that exposes the difference between every uninformed and informed algorithm, and an 8-puzzle that contrasts informed search with local search. Run them all and compare what each one returns.

Part A · The same graph, seven algorithms الجزء أ: نفس الرسم، سبع خوارزميّات

A seven-node weighted graph with start S and goal G. Each edge has a step cost; each node has an admissible heuristic h(n) = straight-line distance to G (you can verify admissibility from the diagram). The seven uninformed and informed algorithms all run on this graph; below the diagram you will see a table of their results side by side.

Start S

Goal G

On the optimal path

h = n heuristic value at each node

Comparative results

Click each algorithm to highlight its path on the graph above and see its trace in the panel below. The number of nodes expanded is the count of nodes the algorithm popped from the fringe before terminating (graph-search throughout, FIFO tie-breaking).

Algorithm	Path returned	Path cost	Nodes expanded	Optimal?

Click a row above to see that algorithm's trace

Step	Popped	Visited	Fringe

Discussion Five things to notice from the comparison table:

BFS finds the shortest path in edges (three: S→A→C→G), but it costs 9, not optimal.
UCS finds the cheapest path (cost 7) by accumulating g. It explores more nodes than BFS to do so.
DFS takes whichever branch the successor function visits first. Different orderings give different answers; none is optimal.
Greedy picks by h, ignoring g. It finds a goal quickly but commits to a suboptimal path.
A* matches UCS's optimum but typically explores fewer nodes, the heuristic is doing real work.

Part B · The 8-puzzle, informed versus local الجزء ب: لعبة الثمانية

The 8-puzzle is where systematic and local search part ways. A* with a good heuristic is provably optimal but uses exponential memory. Local search uses no memory but gives up optimality. Below: solve the same scrambled puzzle with four different approaches and compare what they return.

Scrambled start

Goal

1

2

3

4

5

6

7

8

Four solvers

Each solver is run on the same scrambled start. The 8-puzzle is solvable by every algorithm here, but the four return very different things.

Note: A* solvers cap exploration at 30,000 nodes for browser-runtime safety. Local solvers cap at 1000 steps.

Results

Algorithm	Solved?	Moves to solve	Nodes / steps used	Notes
A* with h₁ (misplaced)	Press “Run all four solvers” to populate

What you should see

A* with h₁ and A* with h₂ both return the optimal sequence of moves. h₂ uses fewer nodes (it dominates h₁).
Hill climbing usually gets stuck at a local minimum on this objective. It typically does not solve the puzzle, the 8-puzzle's landscape is full of plateaus.
Random-restart hill climbing solves the puzzle, but does not guarantee an optimal solution path. The "moves to solve" column can be much larger than A*'s.

The lesson: for problems where the path matters, A* with a good heuristic is the right tool. For problems where only the final state matters, local search is much cheaper in memory but you give up the path-length guarantee.

Part Twelve

Comparison & Cheat Sheet

all twelve algorithms, side by side

One table to compare every algorithm in this companion. Use it for revision, not to memorise: the entries should be derivable from how each algorithm chooses what to expand next.

Algorithm	Fringe / structure	Complete?	Optimal?	Time	Space
BFS	FIFO queue	Yes if b finite	If unit costs	O(b^d+1)	O(b^d+1)
UCS	PQ by g(n)	Yes if costs ≥ ε	Yes	O(b^{1+⌈C*/ε⌉})	O(b^{1+⌈C*/ε⌉})
DFS	LIFO stack	No in tree-search; Yes in graph-search on finite spaces	No	O(b^m)	O(b·m)
DLS	LIFO stack, depth cap l	If l ≥ d	No	O(b^l)	O(b·l)
IDS	Repeated DLS	Yes	If unit costs	O(b^d)	O(b·d)
Bidirectional	Two FIFO queues	Yes (both BFS)	If unit costs	O(b^d/2)	O(b^d/2)
Informed
Greedy BFS	PQ by h(n)	No (can loop)	No	O(b^m)	O(b^m)
A*	PQ by f = g + h	Yes	If h admissible	Exponential worst case	Keeps all generated nodes
RBFS	Recursion, with f bound	If h admissible	If h admissible	Often worse than A*	O(b·d)
SMA*	PQ, drops worst on overflow	If memory ≥ d	If memory enough	Depends on budget	Bounded by budget
Local
Hill climbing	Current state only	No (local maxima)	No	Fast in practice; unbounded in worst case	O(1)
Simulated annealing	Current state + temp	Probabilistic, slow schedule	Probabilistic	Fixed by schedule	O(1)
Local beam	k states at once	No	No	k per step	O(k)
Genetic	Population + ops	No	No	Population size × generations	O(N) population

Decision guide

No heuristic, unit costs, shallow goal: BFS if memory permits, else IDS.
No heuristic, varying costs: UCS.
Heuristic available, optimality desired, memory permits: A*.
Heuristic available, optimality desired, memory tight: RBFS or SMA*.
Path doesn't matter, just need a state: Local search. Hill climbing with restarts as baseline; simulated annealing for landscape with many local minima; GA when state representation has sub-structure.

Reference

Glossary

key terms in English and Arabic

State

حالة

A configuration of the world relevant to the problem. Same configuration via different paths is the same state.

Node

عقدة

Bookkeeping record in the search tree containing a state, parent pointer, action, path-cost, and depth.

Successor function

دالّة الخلَف

Given a state, returns the set of <action, resulting state> pairs reachable in one step. Implicitly defines the state space and the search tree.

Goal test

اختبار الهدف

A function that, given a state, returns true if it is a goal. Can be explicit (specific state) or a property (e.g. checkmate).

Path cost · g(n)

تكلفة المسار

The cost of the path from the start to node n. Usually a sum of non-negative step costs. UCS and A* order the fringe by g (and g + h).

Optimal solution

الحلّ الأمثل

A solution whose path cost is the lowest among all solutions to the problem. Distinct from "a solution" of which there may be many.

Generated

مُولَّد

A node is generated the moment the successor function produces it. The total count of generated nodes is what time-complexity formulas count.

Reached

مُكتشف

A state is reached if a node for that state has been generated. In graph-search the reached set blocks duplicate generation; in tree-search there is no such check.

Expanded (visited)

مُوسَّع / مُزار

A node is expanded when its successor function is applied. The number of expansions is the main efficiency metric.

Fringe / frontier / open list

الحافة / المُقدِّمة

The collection of generated but unexpanded nodes. Different orderings give different algorithms.

Closed list / reached set

قائمة المغلَقة

The set of already-expanded nodes. Used by graph-search to recognise repeats and avoid re-expansion.

Search strategy

استراتيجيّة البحث

The rule by which an algorithm picks which fringe node to expand next. Every algorithm in this companion is defined by its strategy.

Branching factor · b

عامل التفرّع

Maximum number of successors any single node has. Formulas like b^d assume a uniform tree as the worst-case upper bound.

Effective branching factor · b*

عامل التفرع الفعّال

The b that a uniform tree of depth d would need to contain the same number of nodes the algorithm actually expanded. Closer to 1 means the algorithm prunes well.

Depth · d, m

العمق

d = depth of the shallowest goal; m = maximum depth of any path in the state space (may be infinite).

Heuristic · h(n)

دالّة استدلاليّة

Estimate of cost from n to the nearest goal. Problem-specific; chosen by the designer.

Evaluation function · f(n)

دالّة التقييم

The key used to order the fringe in best-first search. Greedy uses f = h; A* uses f = g + h.

Admissible

مقبولة

h(n) ≤ h*(n) for every node. Admissible heuristics never overestimate the true remaining cost. Required for A*'s tree-search optimality.

Consistent (monotonic)

متّسقة

h satisfies the triangle inequality: h(n) ≤ c(n, a, n') + h(n'). Consistency implies admissibility. With a consistent h, A*'s graph-search is optimal too.

Dominance

الهيمنة

h₂ dominates h₁ if h₂(n) ≥ h₁(n) everywhere, both admissible. A* with h₂ expands no more nodes than A* with h₁.

Relaxed problem

مسألة مُخفّفة

A problem with fewer restrictions on the actions. The optimal cost of a relaxed problem is an admissible heuristic for the original.

Pathmax

أعظم على المسار

Adjustment used by RBFS to ensure f-values are non-decreasing along any path: f(child) ← max(f(parent), g(child) + h(child)).

Local maximum

القيمة العظمى المحلّيّة

A state whose value is higher than every immediate neighbour but lower than the global maximum. Hill climbing terminates here.

Plateau

هضبة

Flat region of the state-space landscape where every neighbour has the same objective value. No gradient signal.

Shoulder

كتف

Flat region with an exit, one edge of the flat leads to higher terrain. Escapable by sideways moves.

Ridge

تلّ ضيّق

A long narrow crest in ≥ 2D: moving along the crest is uphill, but every single-step move is sideways or downhill. Single-step hill climbing oscillates.

Sideways move

حركة جانبيّة

A neighbour move with the same objective value as the current state. Allowing a bounded number of sideways moves lets hill climbing cross shoulders.

Cooling schedule

جدول التبريد

Function T(t) controlling how the temperature decreases over time in simulated annealing. Slow logarithmic schedules guarantee convergence; fast geometric schedules are typical.

Chromosome

صبغي / كروموسوم

A single candidate solution in a GA, encoded as a fixed-length string of symbols. The GA's analogue of a "state" in hill climbing.

Gene

مُورِّثة / جين

One position within a chromosome. Crossover splices genes between parents; mutation flips one at random with small probability.

Fitness function

دالّة اللياقة

Objective in evolutionary algorithms. For 8-queens, the number of non-attacking pairs of queens. Genetic Algorithms maximise this.

Crossover

التهجين

GA operation: pick a random split point in two parent chromosomes; child A inherits the first half of one parent and the second half of the other. Two children per crossover.

Mutation

الطفرة

GA operation: with small probability per gene, randomly change a symbol in a child. The only operator that can introduce alleles not present in any parent.

Lecture slides · Mustafa Jarrar (Birzeit)

The Problem-Solving Agent

1. Goal formulation

2. Problem formulation

3. Search

4. Execution

Where search shows up in practice

Problem Formulation

The Six Components المُكوِّنات الستة

State

Initial State

Successor Function

Goal Test

Path Cost

Solution

Worked Example · Romania مثال: التنقّل في رومانيا

Worked Example · The 8-Puzzle مثال: لعبة الثمانية

Start State

Goal State

Formulation

Real-world problem formulations أمثلة عمليّة من الواقع

Travelling Salesperson (TSP)

VLSI layout

Robot navigation

Automatic assembly

The Search Tree

States versus Nodes الحالات مقابل العقد

Expansion توسيع العقدة

Search terminology, in one place مفردات البحث

Generated

Reached

Expanded (visited)

Fringe (frontier, open list)

Closed list (reached set)

Search strategy

Tree-Search versus Graph-Search

Example 1 · A simple cycle المثال الأول: حلقة بسيطة

The state space (graph)

The search tree (tree-search)

Example 2 · The diamond المثال الثاني: المعيّن

The state space (graph)

The search tree

Example 3 · A Romania sub-graph المثال الثالث: جزء من خريطة رومانيا

The state space (graph)

The search tree from Arad

The graph-search fix حلّ البحث في الرسم البياني

The General Search Algorithm

The skeleton الهيكل العام

The fringe (frontier) الحافة / المُقدِّمة

Evaluating a Search Strategy

1. Completeness الاكتمال

2. Optimality الأمثليّة

3. Time complexity التعقيد الزمنيّ

4. Space complexity التعقيد المكانيّ

Uninformed Search

1. Breadth-First Search · BFS البحث بأولويّة الاتّساع

Trace · step | popped | fringe (front to back)

Why BFS has these properties

2. Uniform-Cost Search · UCS البحث بالتكلفة المنتظمة

Trace · step | popped | fringe sorted by g(n)

Why UCS has these properties

3. Depth-First Search · DFS البحث بأولويّة العمق

Trace · step | popped | fringe (top of stack on the right)

Why DFS has these properties

4. Depth-Limited Search · DLS البحث بالعمق المحدود

Trace · step | popped at depth | fringe

Why DLS has these properties

5. Iterative Deepening Search · IDS البحث بالتعميق التكراريّ

Trace · iteration | popped | fringe

Why IDS has these properties

6. Bidirectional Search البحث ثنائيّ الاتّجاه

Trace · step | direction | both fringes

Why bidirectional search has these properties

Informed Search

The heuristic function · h(n) دالّة الاستدلال

Best-First Search · the general idea البحث بأولويّة الأفضل

1. Greedy Best-First Search البحث الجشِع بأولويّة الأفضل

Trace · step | popped | fringe with h(n)

Why Greedy Best-First has these properties

2. A* Search خوارزميّة A*