Getting Stuck: When Sorting Became a Trap

This is another entry in my Getting Stuck series. The title says it: I get stuck, I chase red herrings, I learn, and I climb out. This time the trap was a classical one: sort a file that’s larger than RAM. On paper, “just sort” sounds trivial. In practice, it became an odyssey through I/O, memory, representation, parallelism, and algorithm engineering.

Problem statement (the kind that looks easy)

You have a file with N integers, one per line. N is huge (tens or hundreds of millions). You cannot load all of them into memory. Produce a sorted file (same format).

This is textbook external sorting territory, but I wanted to implement it myself for two reasons: (1) to understand performance pitfalls in practice, (2) because I enjoy learning the gritty details. Predictably, I got stuck several times.

First attempts: tiny code, big assumptions

I started with the naive in-memory approach (because why not check the obvious):

# nai...
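
For orientation, here is roughly what the textbook external sort from the problem statement looks like: split the input into chunks that fit in memory, sort each chunk into a temporary "run" file, then merge the runs. This is a minimal sketch and not the code this post goes on to build; the function name `external_sort`, the `chunk_size` parameter, and the temp-file layout are all assumptions made for illustration. It also assumes every line holds exactly one integer, with no blank lines.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(src_path, dst_path, chunk_size=1_000_000):
    """Sort a one-integer-per-line file using bounded memory (illustrative sketch)."""
    run_paths = []
    try:
        # Phase 1: read up to chunk_size lines at a time, sort them in memory,
        # and write each sorted chunk out as its own temporary run file.
        with open(src_path) as src:
            while True:
                chunk = [int(line) for line in islice(src, chunk_size)]
                if not chunk:
                    break
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".run", text=True)
                with os.fdopen(fd, "w") as run:
                    run.writelines(f"{n}\n" for n in chunk)
                run_paths.append(path)

        # Phase 2: k-way merge. heapq.merge consumes the sorted streams lazily,
        # so only one pending value per run is held in memory at a time.
        runs = [open(p) for p in run_paths]
        try:
            streams = [(int(line) for line in f) for f in runs]
            with open(dst_path, "w") as dst:
                dst.writelines(f"{n}\n" for n in heapq.merge(*streams))
        finally:
            for f in runs:
                f.close()
    finally:
        # Clean up the temporary run files even if something failed.
        for p in run_paths:
            os.remove(p)
```

Calling `external_sort("numbers.txt", "sorted.txt", chunk_size=5_000_000)` would sort a hypothetical `numbers.txt` in chunks of five million lines. The sketch opens every run at once during the merge, so a very small `chunk_size` on a very large input could hit the operating system's open-file limit; a multi-pass merge would be the usual way around that.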