Getting Stuck: The Threads That Wouldn’t Behave
Concurrency has always fascinated me. The idea that multiple tasks can run “simultaneously” — or at least give the illusion of simultaneity — feels like wizardry. But when I tried to actually implement something non-trivial, I got stuck in ways I never expected.
The Task
I wanted to simulate a bank with multiple accounts. Each account supports deposit and withdrawal. Multiple users (threads) perform transactions at the same time.
Sounds simple.
Attempt 1: Naive Multithreading
I started with Python’s threading module:
import threading
class Account:
def __init__(self, balance=0):
self.balance = balance
def deposit(self, amount):
self.balance += amount
def withdraw(self, amount):
if self.balance >= amount:
self.balance -= amount
return True
return False
account = Account(100)
def transaction():
for _ in range(1000):
account.deposit(1)
account.withdraw(1)
threads = [threading.Thread(target=transaction) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print("Final balance:", account.balance)
My reasoning:
-
Ten threads each deposit and withdraw the same amount.
-
Net effect: balance should stay 100.
But when I ran it, sometimes the final balance was 100, sometimes 98, sometimes 102.
I was baffled.
Attempt 2: Suspecting Python’s GIL
Python has a Global Interpreter Lock (GIL), so I thought: Maybe the GIL prevents true concurrency. That must be the issue.
Wrong. The GIL only ensures that bytecode execution is atomic at the interpreter level, not that higher-level operations (self.balance += amount) are atomic.
I realized self.balance += amount actually decomposes into multiple steps:
-
Load
self.balance. -
Add
amount. -
Store result back into
self.balance.
If two threads interleave here, one update can overwrite another. Classic race condition.
Attempt 3: Locks Everywhere
The standard fix is a lock. So I added one:
class Account:
def __init__(self, balance=0):
self.balance = balance
self.lock = threading.Lock()
def deposit(self, amount):
with self.lock:
self.balance += amount
def withdraw(self, amount):
with self.lock:
if self.balance >= amount:
self.balance -= amount
return True
return False
Now the final balance was always 100.
Victory? Not quite.
The Red Herring: Over-Locking
I got overconfident. I extended my simulation to include transfers between accounts.
def transfer(a1, a2, amount):
with a1.lock:
if a1.withdraw(amount):
a2.deposit(amount)
I ran a workload of random transfers between multiple accounts.
The program froze.
I had created a deadlock.
Why? Because if one thread locked a1 while another locked a2, and both tried to transfer in opposite directions, each would wait forever for the other’s lock.
Attempt 4: Global Lock (and Its Failure)
My “quick fix” was to add a global lock, forcing all accounts to be locked together:
global_lock = threading.Lock()
def transfer(a1, a2, amount):
with global_lock:
if a1.withdraw(amount):
a2.deposit(amount)
No deadlocks now. But the system slowed to a crawl. The whole point of concurrency was lost — I had serialized everything.
I was stuck again.
Attempt 5: Ordering Locks
The standard fix in operating systems textbooks is: always acquire locks in a consistent global order.
So I wrote:
def transfer(a1, a2, amount):
first, second = sorted([a1, a2], key=id)
with first.lock:
with second.lock:
if a1.withdraw(amount):
a2.deposit(amount)
Now deadlocks were gone, and concurrency was preserved.
At last, it worked.
Reflection
This “Getting Stuck” was especially humbling because:
-
At first, I didn’t even see the race condition.
-
Then I underestimated the complexity of locks.
-
I went from bugs → race conditions → deadlocks → performance collapse.
-
Only then did I rediscover the OS principle of lock ordering.
Concurrency problems look innocent — just a few lines of code. But the number of ways to go wrong is staggering. And yet, getting stuck here taught me more than any textbook could.
Comments
Post a Comment