Lab 2: Breadth-First Search
Objective
To create Python script to execute breadth first search algorithm.
Problem to be solved
The search problem we will focus on during this lab is Nick’s route-finding problem in Romania, starting in Arad to reach Bucharest. The road map of Romania, which is the state space of the problem is given as follows:
Translating the state space
-
First we need to define the state space in our problem.
-
The important information from the state space is the connections between states and the step costs between connected states.
-
Notice that in this problem the connections are reversible. Therefore in our state space the connection from Arad to Zerind and the connection from Zerind to Arad are identical, hence only one instance of that connection is needed.
-
The most straightforward way of defining the state space is by using a nested array, in which each inner array consists of the two connected states and its cost. Open a script file and define the variable
state_space
. The following code shows the first three elements in the nested array. Complete the definition of the variable by referring to the state space provided.state_space = [ ['Arad', 'Zerind', 75], ['Arad', 'Timisoara', 118], ['Timisoara', 'Lugoj', 111], ... ]
-
Define two variables
initial_state
andgoal_state
to hold the information from the problem.initial_state = 'Arad' goal_state = 'Bucharest'
Actions and transition model
-
Actions and transition model provide us the way to identify the children of a node in a search tree.
-
In this problem, the actions and transition model are straightforward. The actions are limited by the states connected to node and the transition model is directly defined by the action. Therefore we can create a function to search through the state space to find the children of a node, which would be suffice as actions and transition model.
-
To achieve this, we will loop through the
state_space
to check for any connection linked to the node, the other state on the connection will be the child of the node in the search tree. In the same script file, define the following function:def expandAndReturnChildren(state_space, node): children = [] for [m,n,c] in state_space: if m == node: children.append(n) elif n == node: children.append(m) return children
-
This function provides the
children
of anode
given thestate_space
of the problem.
Breadth-first search algorithm
-
Next we will create a function to execute the search algorithm. The first algorithm we will use is the breadth-first search (BFS).
-
As we want to separate the problem definition from the algorithm definition, the algorithm function will have the inputs of the state space, initial state and goal state.
def bfs(state_space, initial_state, goal_state):
-
In this function, we need two empty arrays to store the frontier and the explored states.
def bfs(state_space, initial_state, goal_state): frontier = [] explored = []
-
At initial stage, the initial state is our frontier.
def bfs(state_space, initial_state, goal_state): frontier = [] explored = [] frontier.append(initial_state)
-
When we are generating the search tree, we will continually expanding the first node (FIFO) in the frontier until we generate a node with goal state. Therefore we need to have a loop to repeatedly expanding the first node in the frontier until the goal node is generated.
def bfs(state_space, initial_state, goal_state): frontier = [] explored = [] found_goal = False frontier.append(initial_state) while not found_goal: ...
-
In the loop we will first expand the first node in the frontier to obtain the children, and move the expanded node from frontier to the explored set.
while not found_goal: # expand the first in the frontier children = expandAndReturnChildren(state_space, frontier[0]) # copy the node to the explored set explored.append(frontier[0]) # remove the expanded node from the frontier del frontier[0]
-
Check if the children generated should be added to the frontier or discarded. If any child is expanded previously, i.e. in the explored set, or if it is generated previously but not expanded yet, i.e. in the frontier, then it should be discarded. Else, it should be added to the frontier.
del frontier[0] # loop through the children for child in children: # check if a node was expanded or generated previously if not (child in explored) and not (child in frontier): # add child to the frontier frontier.append(child)
-
Before a child is added to the frontier, it should be tested for goal. If it has the goal state, the BFS algorithm should in fact be terminated and return the solution.
if not(child in explored) and not (child in frontier): # goal test if child == goal_state: found_goal = True break # add child to the frontiers frontier.append(child)
Oops, something is not right
-
Now, if you are following, you may notice that we have a way to end the algorithm but we have failed to store our solution.
-
One way to store the solution while generating the search tree is to store the frontier as the paths from the initial state to the leaf nodes instead of just the leaf nodes. Therefore we would be able to retain the memory of the explored section of the search tree.
-
First we need to modify the
expandAndReturnChildren
function.4. Thedef expandAndReturnChildren(state_space, path_to_leaf_node): children = [] for [m,n,c] in state_space: if m == path_to_leaf_node[-1]: children.append(path_to_leaf_node + [n]) elif n == path_to_leaf_node[-1]: children.append(path_to_leaf_node + [m]) return children
bfs
function is also modified to accommodate to this change.def bfs(state_space, initial_state, goal_state): frontier = [] explored = [] found_goal = False solution = [] frontier.append([initial_state]) while not found_goal: # expand the first in the frontier children = expandAndReturnChildren(state_space, frontier[0]) # copy the node to the explored set explored.append(frontier[0][-1]) # remove the expanded frontier del frontier[0] # loop through the children for child in children: # check if a node was expanded or generated previously if not (child[-1] in explored) and not (child[-1] in [f[-1] for f in frontier]): # goal test if child[-1] == goal_state: found_goal = True solution = child # add children to the frontier frontier.append(child) print("Explored: ") print(explored) print("Frontier:") for f in frontier: print(f) print("Children: ") print(children) print("") return solution
-
Noted that in line
12
we are only saving the leaf node that has just been expanded to the explored set since the information of the path is unnecessary to be stored in the explored set. -
In line
18
,[f[-1] for f in frontier]
is used to obtain a list of the leaf nodes in the frontier. This is the pythonic way of extracting from a list to form another list. The one-liner is equivalent toleaf_nodes = [] for f in frontier: leaf_nodes.append(f[-1])
f[-1]
is used since currently we are saving the path from initial state to the respective leaf node in the frontier.
Running the algorithm
-
Now we have two functions
expandAndReturnChildren
andbfs
, alongside with the variablesstate_space
,initial_state
, andgoal_state
. -
To run a script to execute the
bfs
function, we can have the script file structured as such:def expandAndReturnChildren(...): ... def bfs(...): ... state_space = [ ... ] initial_state = 'Arad' goal_state = 'Bucharest' print('Solution: ' + str(bfs(state_space, initial_state, goal_state)))
-
Beware that in Python, a
.py
file can also be used to define a Python library/module. To prevent the commands outside the function to be executed when the file is used as a library instead of a script, we can implement an extra condition check.def expandAndReturnChildren(...): ... def bfs(...): ... if __name__ == '__main__': state_space = [ ... ] initial_state = 'Arad' goal_state = 'Bucharest' print('Solution: ' + str(bfs(state_space, initial_state, goal_state)))
-
__name__
is a special variable in Python that evaluates to the name of the current module. This variable has the value of'__main__'
if it's called as the main program rather than a module or library. -
This part is essentially the
main
function in other programming languages like C++ and Java. -
Compile the program and resolve any error.
Redundant storing of states?
- When you run the script from previous section, you may wonder, if there is a more efficient way of memory usage instead of saving the path to each leaf node in the frontier. Currently there is a lot of redundant states being stored in the variable
frontier
.
Using a class
-
The alternative and more space-efficient way would be to declare each node in the search tree to be an object that stores its name, parent, and children. To do so, we need to first define a class.
class Node: def __init__(self, state=None, parent=None): self.state = state self.parent = parent self.children = [] def addChildren(self, children): self.children.extend(children)
In Python, to define a class, the class needs to have the function
__init__
with at least one inputself
. This function is called when an object is initiated with this class. The internal variableself
defines the properties of the object. The input arguments apart fromself
for the function__init__
are the parameters to be passed when initiating a new object.In a class, additional functions can also be defined, and these will be the methods for the object of this class.
-
An example of initiating a new object for this class and use the function is:
a_distinct_node = Node('state name', 'parent of this node') a_distinct_node.addChildren(['child 1', 'child 2'])
-
With the new class, we can update the function
expandAndReturnChildren
to return the children as a list ofNode
objects.def expandAndReturnChildren(state_space, node): children = [] for [m,n,c] in state_space: if m == node.state: children.append(Node(n, node.state)) elif n == node.state: children.append(Node(m, node.state)) return children
-
With the availability of the
Node
class, we can save the nodes asNode
objects in the frontier instead of paths to the leaf nodes. When the goal state is found, the goal node is used to trace back to its predecessor (a.k.a. parent, which will be now in the explored set) which is accessible using the property.parent
of the goal node. Then the predecessor to the parent of the goal node can be traced similarly. This process is thus repeated until the initial state (with no parent) to obtain the solution and its cost.def bfs(state_space, initial_state, goal_state): frontier = [] explored = [] found_goal = False goalie = Node() solution = [] # add initial state to frontier frontier.append(Node(initial_state, None)) while not found_goal: # expand the first in the frontier children = expandAndReturnChildren(state_space, frontier[0]) # add children list to the expanded node frontier[0].addChildren(children) # add to the explored list explored.append(frontier[0]) # remove the expanded frontier del frontier[0] # add children to the frontier for child in children: # check if a node was expanded or generated previously if not (child.state in [e.state for e in explored]) and not (child.state in [f.state for f in frontier]): # goal test if child.state == goal_state: found_goal = True goalie = child frontier.append(child) print("Explored:", [e.state for e in explored]) print("Frontier:", [f.state for f in frontier]) print("Children:", [c.state for c in children]) print("") solution = [goalie.state] path_cost = 0 while goalie.parent is not None: solution.insert(0, goalie.parent) for e in explored: if e.state == goalie.parent: path_cost += getCost(state_space, e.state, goalie.state) goalie = e break return solution, path_cost def getCost(state_space, state0, state1): for [m,n,c] in state_space: if [state0,state1] == [m,n] or [state1,state0] == [m,n]: return c
-
The compiled program will be structured as such. Compile and run the program.
class Node: ... def expandAndReturnChildren(...): ... def bfs(...): ... def getCost(...): ... if __name__ == '__main__': state_space = [ ... ] initial_state = 'Arad' goal_state = 'Bucharest' [solution, cost] = bfs(state_space, initial_state, goal_state) print("Solution:", solution) print("Path Cost:", cost)
Preparation for next week
-
Fully understand the code as you will have to modify the code for other problem/search algorithms in next labs.
-
How would you modify the code to run other uninformed search algorithms such as uniform-cost search, depth-first search, etc.? Which part(s) of the code do you need to modify? Think about it before you work on that in Lab 3 next week.