Quote:
Originally Posted by siulayhumga
I am trying to convert a lot of old text files to epub so I can read them on my Sony 505.
Most of them were scanned in the mid-90s and have a lot of formatting errors, like extra line breaks in the middle of lines. I guess OCR technology for the PC was weak back then.
I don't want to use a text editor to remove all the CRLFs and line-wrap everything, because that would give me a "wall of text".
Does anyone know a program that will do this kind of text paragraph "reflow"?
Thanks
If the file is reasonably regular, apart from the erroneous line breaks in the middle of paragraphs, my very much under-construction Python script might be able to get it into slightly better shape.
Try running it with:
pacify.py -i filename.txt -p
or with:
pacify.py -i filename.txt -rp
It writes the results to output.txt in the directory you run it from.
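For anyone curious about the basic idea behind this kind of reflow, here is a rough sketch in Python. It is not pacify.py's actual code, only a minimal illustration. It assumes that real paragraphs are separated by at least one blank line and that any single line break inside a paragraph is an error. The function name reflow is made up for this example.

```python
import re


def reflow(text: str) -> str:
    """Join lines broken mid-paragraph; keep blank lines as paragraph breaks.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    # Normalize CRLF and bare CR line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split wherever one or more blank (or whitespace-only) lines occur.
    paragraphs = re.split(r"\n\s*\n", text)
    out = []
    for para in paragraphs:
        # Collapse every run of whitespace, including single newlines,
        # into one space so the paragraph becomes a single line.
        joined = " ".join(para.split())
        if joined:
            out.append(joined)
    # Put exactly one blank line between paragraphs.
    return "\n\n".join(out)
```

Text without blank lines between paragraphs would still collapse into one block with this approach, and words hyphenated across a line break are not rejoined, so files like that need smarter heuristics.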
- Ahi