Getting Stuck: The Case of the Elusive Cycle
This is another chapter in my Getting Stuck series, where I document the messy, often frustrating process of solving computer science problems. This episode is about graph theory — and how I got trapped trying to detect cycles in a directed graph.
Sounds easy, right? Well… let’s see.
Step 1: The Obvious Attempt
The problem:
Given a directed graph with
nnodes andmedges, detect if it contains a cycle.
Immediately, I thought: Easy, just do a depth-first search (DFS) and track visited nodes.
So I wrote:
from collections import defaultdict
class Graph:
def __init__(self, n):
self.n = n
self.adj = defaultdict(list)
def add_edge(self, u, v):
self.adj[u].append(v)
def has_cycle(graph):
visited = [False] * graph.n
def dfs(u):
if visited[u]:
return True
visited[u] = True
for v in graph.adj[u]:
if dfs(v):
return True
visited[u] = False
return False
for u in range(graph.n):
if dfs(u):
return True
return False
I tested it on a simple cycle:
g = Graph(3)
g.add_edge(0, 1)
g.add_edge(1, 2)
g.add_edge(2, 0)
print(has_cycle(g)) # Expected: True
It worked. Problem solved? Not quite.
Step 2: The Counterexample Disaster
Later I tested on a graph with two disjoint components:
g = Graph(4)
g.add_edge(0, 1)
g.add_edge(2, 3)
print(has_cycle(g)) # Expected: False
My function returned True.
I stared at the code for half an hour. The bug was subtle: I was reusing the visited array incorrectly. I needed two states — one for "visited at all" and another for "currently in recursion stack."
This was my first real stuck moment.
Step 3: First Red Herring — Overengineering
Instead of fixing the visited-state bug, I thought: Why not avoid DFS entirely? Let’s detect cycles with topological sorting.
from collections import deque
def has_cycle_kahn(graph):
indegree = [0] * graph.n
for u in graph.adj:
for v in graph.adj[u]:
indegree[v] += 1
q = deque([u for u in range(graph.n) if indegree[u] == 0])
count = 0
while q:
u = q.popleft()
count += 1
for v in graph.adj[u]:
indegree[v] -= 1
if indegree[v] == 0:
q.append(v)
return count != graph.n
This worked for many cases — but when I tested with self-loops, things got weird.
g = Graph(1)
g.add_edge(0, 0)
print(has_cycle_kahn(g)) # Expected: True, got False
I had forgotten that Kahn’s algorithm assumes no self-loops in its initial indegree reduction. Another dead end.
Step 4: Second Red Herring — Tarjan’s Algorithm Detour
At this point, I remembered Tarjan’s algorithm for strongly connected components (SCCs). Any SCC with more than one node (or a self-loop) must contain a cycle.
I dived in and implemented it:
def tarjans_scc(graph):
index = 0
stack = []
on_stack = [False] * graph.n
indices = [-1] * graph.n
lowlink = [-1] * graph.n
sccs = []
def strongconnect(v):
nonlocal index
indices[v] = index
lowlink[v] = index
index += 1
stack.append(v)
on_stack[v] = True
for w in graph.adj[v]:
if indices[w] == -1:
strongconnect(w)
lowlink[v] = min(lowlink[v], lowlink[w])
elif on_stack[w]:
lowlink[v] = min(lowlink[v], indices[w])
if lowlink[v] == indices[v]:
scc = []
while True:
w = stack.pop()
on_stack[w] = False
scc.append(w)
if w == v:
break
sccs.append(scc)
for v in range(graph.n):
if indices[v] == -1:
strongconnect(v)
return sccs
def has_cycle_tarjan(graph):
sccs = tarjans_scc(graph)
return any(len(scc) > 1 or (len(scc) == 1 and list(graph.adj[scc[0]]).count(scc[0]) > 0)
for scc in sccs)
This worked — but it felt like bringing a tank to kill a mosquito. I only needed cycle detection, not a full SCC decomposition. Overkill. Another trap.
Step 5: The Real Fix
Finally, I went back to DFS. I realized I needed two arrays:
-
visited[u] = True→ node fully processed -
rec_stack[u] = True→ node is in current DFS path
Corrected code:
def has_cycle(graph):
visited = [False] * graph.n
rec_stack = [False] * graph.n
def dfs(u):
visited[u] = True
rec_stack[u] = True
for v in graph.adj[u]:
if not visited[v] and dfs(v):
return True
elif rec_stack[v]:
return True
rec_stack[u] = False
return False
for u in range(graph.n):
if not visited[u] and dfs(u):
return True
return False
Now it worked for all cases: self-loops, disconnected graphs, cycles hidden deep inside.
Step 6: Reflection
This problem was deceptively hard. Along the way, I:
-
Broke the logic by conflating “visited at all” with “currently in recursion.”
-
Got lost in topological sorting quirks with self-loops.
-
Overengineered with Tarjan’s SCC.
-
Finally returned to a corrected DFS with two arrays.
It felt like circling the globe to reach the house next door.
Lessons
-
Don’t overcomplicate when the bug is simple. My original DFS was almost right; I just needed a recursion stack array.
-
Red herrings teach depth. I now know three different ways to detect cycles in directed graphs.
-
Correctness before performance. I obsessed over efficiency before getting the logic right.
And that’s another Getting Stuck. What looked trivial — cycle detection — turned into hours of debugging, rewrites, and algorithm rabbit holes. But I came out with a stronger grasp of graph theory, and that made the pain worth it.
Comments
Post a Comment