BTW, reading over this thread, it seems worth mentioning some previous research into reflowing scanned text for small devices:
http://pubs.iupr.org/DATA/2002-breuel-wdabook.pdf
In that paper they first identify text line segments, images, and column boundaries, much like this program does, but then they segment the text into words. Once you've broken the document down into word-sized chunks and know how they aggregate into paragraphs and columns, there are numerous ways to reflow the document; one they describe is embedding all the images into HTML so that the scanned document reflows when you resize the browser window. They go on to talk about how to output this most compactly for a PDA. Interesting stuff.
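
To make the idea concrete, here's a rough sketch of what reflowing word-sized chunks could look like. This is not the paper's code: `WordBox`, `reflow`, `to_html`, the `gap` parameter, and the image file names are all made up for illustration, and it assumes the words have already been cropped out of the scan in reading order.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    # Hypothetical record for one word image cropped from the scan.
    src: str     # path to the cropped word image (assumed to exist)
    width: int   # width in pixels
    height: int  # height in pixels


def reflow(words, page_width, gap=6):
    """Greedy line filling: place words left to right and start a new
    line whenever the next word would overflow page_width."""
    lines, current, x = [], [], 0
    for w in words:
        needed = w.width if not current else gap + w.width
        if current and x + needed > page_width:
            lines.append(current)
            current, x = [], 0
            needed = w.width
        current.append(w)
        x += needed
    if current:
        lines.append(current)
    return lines


def to_html(words):
    """Emit the word images as inline <img> tags separated by whitespace,
    so the browser does the line breaking when the window is resized."""
    imgs = "\n".join(
        f'<img src="{w.src}" width="{w.width}" height="{w.height}" alt="">'
        for w in words
    )
    return f"<p>\n{imgs}\n</p>"


if __name__ == "__main__":
    words = [WordBox(f"w{i}.png", wd, 20) for i, wd in enumerate([50, 60, 40, 80])]
    for line in reflow(words, page_width=120):
        print([w.src for w in line])
    print(to_html(words))
```

`reflow` is what you'd want if you were rendering to a fixed-width screen yourself; `to_html` is the lazier route of handing the words to a browser and letting it wrap them.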