My add-in's Post-OCR procedure handles this, and it tries to be smart about it. A line can end with a period without being the last line of its paragraph. In that case the procedure checks whether the first word of the next line would have fit after the period without overrunning the line width. If it would have fit, the line break was probably intentional, so the line is most likely the end of a paragraph. If not, then the paragraph continues on the next line.
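To make the fit test concrete, here is a minimal sketch in Python. It is not the add-in's actual code: the function names, the `max_width` parameter, and measuring width in characters are all assumptions for illustration (a real implementation would more likely measure rendered text width).

```python
from typing import List


def is_paragraph_end(line: str, next_line: str, max_width: int) -> bool:
    """Guess whether `line` closes a paragraph (hypothetical helper)."""
    stripped = line.rstrip()
    if not stripped.endswith("."):
        return False  # only lines ending in a period are judged here
    words = next_line.split()
    if not words:
        return True  # nothing follows, so the paragraph ends here
    # Would "<line> <first word of next line>" still fit on the line?
    # If so, the break after the period was probably intentional.
    return len(stripped) + 1 + len(words[0]) <= max_width


def join_paragraphs(lines: List[str], max_width: int) -> List[str]:
    """Merge OCR lines into paragraphs using the fit test above."""
    paragraphs: List[str] = []
    current: List[str] = []
    for i, line in enumerate(lines):
        if not line.strip():
            # A blank line always closes the running paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(line.strip())
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        if is_paragraph_end(line, next_line, max_width):
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

With `max_width` set to the longest line on the page, a line such as "short end." followed by "New paragraph ..." is treated as a paragraph end, while a full-width line ending in a period and followed by a long word is not.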
After this, a few uncertain cases usually remain (e.g. a heading usually does not end with a period). The last dubious line ends are then reviewed and fixed with the Search&Replace procedure. That step is manual: for each match it asks whether the replacement should be made.
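The ask-before-replacing step could look roughly like the sketch below. Again this is only an illustration, not the add-in's Search&Replace code: `confirm_replace`, `DUBIOUS_BREAK`, `ask_on_console`, and the choice of which line ends count as dubious (a line break not preceded by sentence punctuation) are all assumptions.

```python
import re
from typing import Callable

# Assumed definition of a "dubious" line end: a single line break that is
# not preceded by . ! ? : (or another line break) and is followed by text.
DUBIOUS_BREAK = r"(?<![.!?:\n])\n(?=[^\n])"


def confirm_replace(text: str, pattern: str, replacement: str,
                    ask: Callable[[str], bool]) -> str:
    """Replace each match of `pattern` only when `ask(context)` is True."""
    def decide(match: "re.Match[str]") -> str:
        # Show up to 30 characters on either side of the match.
        start = max(0, match.start() - 30)
        context = text[start:match.end() + 30]
        if ask(context):
            return match.expand(replacement)
        return match.group(0)  # leave this occurrence unchanged

    return re.sub(pattern, decide, text)


def ask_on_console(context: str) -> bool:
    """Interactive prompt; answering 'y' approves the replacement."""
    answer = input(f"Replace here? {context!r} [y/N] ")
    return answer.strip().lower().startswith("y")
```

Calling `confirm_replace(text, DUBIOUS_BREAK, " ", ask_on_console)` would then step through the remaining line ends one at a time, so a heading can be left alone while a wrongly split sentence is joined.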
This saves me a lot of time. For an average book the Post-OCR and these specific S&R commands take no more than 2-5 minutes.