Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
179 changes: 179 additions & 0 deletions non-heuristic-BeFS-lab4.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,179 @@
#non-heuristic BeFS or UCS practical lab assignment

# City mapping
cities = {
0: "Chicago",
1: "Detroit",
2: "Cleveland",
3: "Indianapolis",
4: "Columbus",
5: "Pittsburgh",
6: "Buffalo",
7: "Syracuse",
8: "New York",
9: "Philadelphia",
10: "Baltimore",
11: "Boston",
12: "Providence",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are extensive comments regarding inconsistencies or questions about the edge data (e.g., "Chi 345 mismatch? Q1.py says Cle-Chi 345, Chi-Cle 354? Asymmetric? Map is usually symmetric."). While it's good to note potential issues during development, it's best practice to resolve these data discrepancies and present a clean, verified set of edges in the final code.

Consider clarifying the source of truth for the edge weights or ensuring that the edges_clean list is the definitive, corrected dataset, and remove the commented-out edges list entirely to avoid confusion.

13: "Portland"
}

n = 14

# adjacency matrix
adj = [[0 for _ in range(n)] for _ in range(n)]

edges = [
(0,1,283),(0,2,354),(0,3,182), # Chicago: Det 283, Cle 354 (corrected from 345 in snippet to match prompt data), Ind 182
(1,2,169),(1,6,256), # Detroit: Cle 169, Buf 256
(2,6,189),(2,5,134),(2,4,144), # Cleveland: Buf 189, Pit 134, Col 144
(3,4,176), # Indy: Col 176
(4,5,185), # Col: Pit 185
(5,6,215),(5,9,305),(5,10,247),# Pit: Buf 215, Phi 305, Bal 247
(6,7,150), # Buf: Syr 150
(7,8,254),(7,11,312), # Syr: NY 254, Bos 312
(8,9,97),(8,12,181),(8,11,215),# NY: Phi 97, Pro 181, Bos 215 (Added Bos connection)
(9,10,101), # Phi: Bal 101 (Removed NY double)
(11,12,50),(11,13,107), # Bos: Pro 50, Por 107
(12,8,181) # Pro: NY 181 (Already covered)
]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The adj adjacency matrix is initialized twice. The first initialization on line 40 is redundant and can be removed, as it's re-initialized immediately after the edges and edges_clean definitions.

--- a/non-heuristic-BeFS-lab4.py
+++ b/non-heuristic-BeFS-lab4.py
@@ -37,9 +37,6 @@
     13: "Portland"
 }
 
-n = 14
-
-# adjacency matrix
-adj = [[0 for _ in range(n)] for _ in range(n)]
-
 edges = [
     (0,1,283),(0,2,354),(0,3,182), # Chicago: Det 283, Cle 354 (corrected from 345 in snippet to match prompt data), Ind 182
     (1,2,169),(1,6,256),           # Detroit: Cle 169, Buf 256

edges_clean = [
(0,1,283), (0,2,354), (0,3,182), # Chi: Det 283, Cle 354, Ind 182
(1,6,256), (1,2,169), # Det: Buf 256, Cle 169
(2,6,189), (2,5,134), (2,4,144), # Cle: Buf 189, Pit 134, Col 144 (Chi 345 mismatch? Q1.py says Cle-Chi 345, Chi-Cle 354? Asymmetric? Map is usually symmetric. I'll use 354 as in snippet or check LAB-1 dict 354)
(3,4,176), # Ind: Col 176
(4,5,185), # Col: Pit 185
(5,6,215), (5,9,305), (5,10,247), # Pit: Buf 215, Phi 305, Bal 247
(6,7,150), # Buf: Syr 150
(7,8,254), (7,11,312), # Syr: NY 254, Bos 312
(8,9,97), (8,12,181), (8,11,215), # NY: Phi 97, Pro 181, Bos 215
(9,10,101), # Phi: Bal 101
(11,12,50), (11,13,107) # Bos: Pro 50, Por 107

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action field in the Node class is set to s_prime, which is the state of the child node. This makes action redundant as it holds the same information as the state of the child node. For path reconstruction, node.parent and node.state are sufficient.

While it doesn't cause a bug, it can be confusing. If action is meant to describe how we transitioned from parent.state to node.state, it might be better to remove it if unused, or make its purpose clearer (e.g., if it were a specific type of 'move' or 'edge label' beyond just the destination city).

]

# Clear adj
adj = [[0 for _ in range(n)] for _ in range(n)]

for u, v, w in edges_clean:
adj[u][v] = w
adj[v][u] = w

class Node:
def __init__(self, state, parent=None, action=None, path_cost=0):
self.state = state # Current City Index
self.parent = parent # Parent Node
self.action = action # Next City Index
self.path_cost = path_cost # Total Distance from start node to current node

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PriorityQueue implementation is inefficient. The add method sorts the entire list (O(N log N)), and the pop method uses list.pop(0) (O(N)). For a graph with V vertices and E edges, this can lead to a time complexity significantly worse than optimal, especially for dense graphs or many expansions.

Python's built-in heapq module provides an efficient implementation of a min-heap, which is suitable for a priority queue. Using heapq.heappush and heapq.heappop would reduce the complexity of add and pop to O(log N), greatly improving performance.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance: Inefficient Priority Queue Implementation

The current PriorityQueue implementation has significant performance drawbacks, especially for larger graphs or more complex search problems.

  1. add method: self.data.sort(key=self.f) sorts the entire list every time a new node is added. This operation has a time complexity of O(N log N), where N is the number of elements in the queue. For a priority queue, insertion should ideally be O(log N).
  2. pop method: self.data.pop(0) removes the first element from a list, which requires shifting all subsequent elements. This operation has a time complexity of O(N).

Combined, these lead to a much slower search algorithm than necessary. Python's built-in heapq module provides an efficient min-heap implementation, which is ideal for priority queues. Using heapq.heappush for add and heapq.heappop for pop would bring the complexity down to O(log N) for both operations, significantly improving performance.

Consider replacing the custom PriorityQueue with heapq for better scalability and efficiency.


class PriorityQueue:
def __init__(self, f):
self.data = []
self.f = f

def add(self, node):
self.data.append(node)
self.data.sort(key=self.f) # Sort by f(node) = path_cost

def pop(self):
return self.data.pop(0)

def top(self):
return self.data[0]

def is_empty(self):
return len(self.data) == 0

class Problem:
def __init__(self, initial, goal):
self.INITIAL = initial # Start City Index
self.goal = goal # Goal City Index

def is_goal(self, state):
return state == self.goal

def ACTIONS(self, state):
actions = []
for i in range(n):
if adj[state][i] != 0:
actions.append(i)
return actions # List of next city indices

def RESULT(self, state, action):
return action # In this graph, action is the next state index

def ACTION_COST(self, s, action, s_prime):
return adj[s][s_prime]

def EXPAND(problem, node):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintainability/Clarity: Ambiguous action variable

The action variable in the Node class and Problem methods (specifically RESULT) is defined as the "Next City Index" or the action to take. However, in RESULT(self, state, action), it directly returns action, meaning action is the s_prime (the next state). While functionally correct for this specific graph representation where an 'action' to move to a city is simply the city itself, it can be a bit confusing.

For improved clarity and maintainability, especially if the problem domain were to evolve (e.g., actions could be 'take bus', 'fly', etc., with different costs/results), it might be clearer to explicitly separate the concept of an action from the state it results in. For this simple graph, it's acceptable, but it's a point to be aware of for future extensions.

s = node.state
for action in problem.ACTIONS(s):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BEST_FIRST_SEARCH implementation, due to the custom PriorityQueue not supporting efficient updates or removal of existing elements, can lead to duplicate nodes (representing the same state but with different path_costs) being present in the frontier. While the reached array ensures that only the path with the lowest cost is eventually stored and used for path reconstruction, the algorithm might still extract and expand suboptimal duplicate nodes from the frontier before the optimal one is processed.

This is a common inefficiency in simpler UCS implementations. An optimized approach would involve using a heapq (as suggested earlier) and a mechanism to either update the priority of a node already in the frontier or mark older entries as invalid so they are skipped when popped.

s_prime = problem.RESULT(s, action)
cost = node.path_cost + problem.ACTION_COST(s, action, s_prime)
yield Node(state=s_prime, parent=node, action=action, path_cost=cost)

def BEST_FIRST_SEARCH(problem, f):
node = Node(problem.INITIAL) # Start Node

frontier = PriorityQueue(f)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor Inefficiency: Duplicate adj matrix initialization

The adj matrix is initialized twice:

  1. adj = [[0 for _ in range(n)] for _ in range(n)] on line 21.
  2. adj = [[0 for _ in range(n)] for _ in range(n)] on line 51.

The first initialization is immediately overwritten by the second one after the edges and edges_clean lists are defined. While this doesn't cause a bug, it's redundant and can be removed, making the code slightly cleaner.

frontier.add(node)

# reach dict stores state -> node
reached = [None] * n
reached[problem.INITIAL] = node
explored = 0

while not frontier.is_empty():
node = frontier.pop()
explored += 1

if problem.is_goal(node.state):
return node, explored, reached

for child in EXPAND(problem, node):
s = child.state
if reached[s] is None or child.path_cost < reached[s].path_cost:
reached[s] = child
frontier.add(child)

return None, explored, reached

# Path Cost (UCS)
# Using path cost as per the snippet's logic f(node) => node.path_cost (Uniform Cost Search)
def f(node):
return node.path_cost

def get_path(node):
path = []
while node:
path.append(cities[node.state])
node = node.parent
return path[::-1]

if __name__ == "__main__":
# Driver code
start = 7 # Syracuse
goal = 0 # Chicago

problem = Problem(start, goal)
solution, explored_count, reached_table = BEST_FIRST_SEARCH(problem, f)

if solution:
print("Uniform Cost Search Path:")
print(" -> ".join(get_path(solution)))
print("Total cost:", solution.path_cost)
print("Number of paths explored:", explored_count)

print("\nLookup Table (Reached States):")
print(f"{'City Index':<12} | {'City Name':<15} | {'Cost':<6} | {'Parent'}")
print("-" * 55)
for i in range(len(reached_table)):
node = reached_table[i]
city_name = cities[i]
if node:
parent_name = cities[node.parent.state] if node.parent else "None"
print(f"{i:<12} | {city_name:<15} | {node.path_cost:<6} | {parent_name}")
else:
print(f"{i:<12} | {city_name:<15} | {'None':<6} | -")
else:
print("No path found.")