Quote:
Originally Posted by purcelljf
After scanning a book and exporting it to html, I frequently have separate paragraphs where the pages break in the document. ... I thought maybe there is a trick to this, so it doesn't take so much time?
There are lots of tricks.
We just need to know what software you are using and what your skills are.
Do you use OpenOffice.org Writer, or MS Office, or something else?
Do you know what a regular expression is?
As the previous poster said, looking for paragraphs that begin with a lowercase letter would find the vast majority of such paragraphs.
You can also start looking for paragraphs that do not end with
. ? ! ." ?" !" .' ?' !' ... you get the idea.
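If you are comfortable with a little scripting, here is a rough sketch of both checks in Python. It assumes you have already pulled the paragraph text out of the HTML into a plain list of strings; the function names are made up for illustration, and the patterns only cover plain ASCII letters and straight quotes. Headings and captions will also get flagged because they usually lack a final period, so review the suspects before merging anything.

```python
import re

# A paragraph "looks finished" if it ends with . ? or ! optionally
# followed by a closing straight quote (" or ').
ENDS_SENTENCE = re.compile(r"""[.?!]["']?\s*$""")
# A paragraph "looks like a continuation" if it starts with a lowercase letter.
STARTS_LOWER = re.compile(r"^\s*[a-z]")


def find_suspect_breaks(paragraphs):
    """Return indexes i where paragraph i may be a continuation of i - 1."""
    suspects = []
    for i in range(1, len(paragraphs)):
        prev, cur = paragraphs[i - 1], paragraphs[i]
        if STARTS_LOWER.search(cur) or not ENDS_SENTENCE.search(prev):
            suspects.append(i)
    return suspects


def merge_suspects(paragraphs):
    """Join every suspect paragraph onto the one before it (hypothetical helper)."""
    merged = []
    for p in paragraphs:
        if merged and (STARTS_LOWER.search(p) or not ENDS_SENTENCE.search(merged[-1])):
            merged[-1] = merged[-1].rstrip() + " " + p.lstrip()
        else:
            merged.append(p)
    return merged


if __name__ == "__main__":
    sample = [
        "He opened the door and",
        "walked into the room.",
        '"Who is there?"',
        "Nobody answered.",
    ]
    print(find_suspect_breaks(sample))
    for paragraph in merge_suspects(sample):
        print(paragraph)
```

In a word processor the same idea works from the Find & Replace dialog with regular expressions turned on: search for a paragraph start followed by a lowercase letter, something like ^[a-z], and step through the hits by hand.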