10-12-2009, 06:18 AM   #7
tbergman
When repairing files like this, I've found a technique that works fairly well. It depends on the assumption that a proper paragraph will always end with a period, a question mark, or a closing quotation mark.

For the purposes of this discussion, I'll write a paragraph mark as ^p. Of course, the file may actually use carriage returns or line feeds, so you'll need to know which one it uses before doing the repair.

With that logic in mind, and using your favorite editor, replace all instances of .^p with some marker; I usually use [.]. Do the same with "^p and ?^p, using ["] and [?].
Now replace all the ^p left in the doc with either a space or nothing. The choice depends on whether you need a space where the lines get joined.
Finally, replace [.] with .^p, ["] with "^p, and [?] with ?^p.
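
If you'd rather script it than do it by hand in an editor, the three steps above can be sketched in Python roughly like this. This is only a minimal sketch, not a polished tool: the function name rejoin_paragraphs, the joiner parameter, and the placeholder strings are all my own inventions for illustration, and it assumes the text uses straight double quotes and that the placeholder strings never occur in the file.

```python
def rejoin_paragraphs(text, joiner=" "):
    """Join hard-wrapped lines, keeping breaks after '.', '?' and '"'."""
    # Normalise line endings so that "\n" plays the role of ^p.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Hypothetical placeholders standing in for [.], ["] and [?].
    # Assumption: these strings never appear in the original text.
    markers = {
        ".\n": "\x00DOT\x00",
        '"\n': "\x00QUOTE\x00",
        "?\n": "\x00QMARK\x00",
    }

    # Step 1: protect the line breaks that look like real paragraph ends.
    for ending, marker in markers.items():
        text = text.replace(ending, marker)

    # Step 2: every remaining line break is a mid-paragraph wrap; join it.
    text = text.replace("\n", joiner)

    # Step 3: turn the placeholders back into punctuation plus line break.
    for ending, marker in markers.items():
        text = text.replace(marker, ending)

    return text
```

Pass joiner="" instead of the default space if the lines in your file should be joined with nothing between them.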

You will sometimes get an extra paragraph break, because some lines end with a period without originally being the end of a paragraph, but in my experience this technique makes the document quite readable.

Tom