06-07-2009, 10:03 PM   #1
ahi
Removing Line-breaks / Preserving Paragraphs

Please find attached repartee, a Python script that, I believe, should do a fairly good job of automatically removing line breaks without interfering with paragraph breaks.

I just finished throwing it together, so it doubtless leaves much to be desired. However, I would be grateful if people could either test it or point me to some unconventionally line-broken or paragraph-broken files that I could try the program on myself.

The script doesn't touch the input file (unless you deliberately give the input file's name as the output file as well) and is programmed not to output anything if it can't tell line breaks apart from paragraph breaks.

If you find a file that the script should fix (i.e., it has both line breaks and paragraph breaks) but it refuses, saying "Unable to find a clear and/or consistent line break / paragraph break pattern.", please send the file (or a portion of it) my way for analysis.
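For anyone curious what "telling line breaks apart from paragraph breaks" might involve, here is a minimal sketch. It is not the attached script, whose internals aren't described in this post. It assumes only two common conventions (paragraphs separated by blank lines, or paragraphs marked by an indented first line), and the names `unwrap` and `NoPatternError` are made up for illustration. It borrows the refusal message quoted above.

```python
class NoPatternError(ValueError):
    """Raised when no usable line-break / paragraph-break pattern is found."""


MESSAGE = ("Unable to find a clear and/or consistent "
           "line break / paragraph break pattern.")


def unwrap(text):
    """Join hard-wrapped lines into paragraphs separated by one blank line.

    Hypothetical heuristic: blank lines mark paragraph breaks if any exist;
    otherwise an indented line marks the start of a paragraph.
    """
    lines = [ln.rstrip() for ln in text.replace("\r\n", "\n").split("\n")]
    # Ignore blank lines at the very start and end of the text.
    while lines and not lines[0]:
        lines.pop(0)
    while lines and not lines[-1]:
        lines.pop()

    nonblank = [ln for ln in lines if ln]
    if not nonblank:
        raise NoPatternError(MESSAGE)

    has_blank = any(not ln for ln in lines)
    indented = [ln for ln in nonblank if ln[0] in " \t"]

    if has_blank:
        # A paragraph starts at the first line or right after a blank line.
        def is_start(i):
            return i == 0 or not lines[i - 1]
    elif 0 < len(indented) < len(nonblank):
        # A paragraph starts at every indented line.
        def is_start(i):
            return lines[i][0] in " \t"
    else:
        raise NoPatternError(MESSAGE)

    paragraphs = []
    for i, ln in enumerate(lines):
        if not ln:
            continue
        if is_start(i) or not paragraphs:
            paragraphs.append([ln.strip()])
        else:
            paragraphs[-1].append(ln.strip())

    # Refuse unless the text shows both kinds of break: more than one
    # paragraph, and at least one paragraph spanning several lines.
    if len(paragraphs) < 2 or len(paragraphs) == len(nonblank):
        raise NoPatternError(MESSAGE)

    return "\n\n".join(" ".join(p) for p in paragraphs)
```

Calling `unwrap("The quick brown\nfox jumps.\n\nA second\nparagraph here.\n")` returns the two paragraphs joined onto single lines with one blank line between them, while a text with no blank lines and no indentation raises `NoPatternError` instead of guessing. A real tool would likely need more signals (line-length distribution, sentence-ending punctuation) than this sketch uses.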

Keep in mind, though, that the script is meant to be used on full-length plaintext novels or reasonably long short stories. It is more likely to break on very short pieces of text, almost certainly won't do anything useful with flash fiction, and may behave erratically with complexly formatted text files (e.g., language textbooks and similar non-novel material).

- Ahi

P.S.: In particular, I would be grateful, Gideon, if you tried it on the file you recently had trouble with and let me know the results.
Attached Files
File Type: zip repartee.zip (1.4 KB, 268 views)