I'm possibly veering somewhat off-topic here, but when you have both the OCR'ed text and the scanned image of each page, I like to make a two-column HTML table with the images on the right and the text on the left. Then I can import that file into OpenOffice Writer, proofread and edit it, mark it up as XHTML, and save it as plain text. I enclose the bash script I use on the PDF files of public-domain works available from the Norwegian National Library, and a screen dump of what a file looks like in OpenOffice. It could work on books from Google too, I think, though it'll probably have to be tweaked a bit.
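For anyone who only wants the general idea, a minimal sketch of the table-building step might look like the following. It is not the enclosed script: the file layout (page-001.png sitting next to page-001.txt in one directory) and the function name make_proof_table are assumptions made up for illustration, and getting the page images and text out of the PDF in the first place is left out entirely.

```shell
#!/usr/bin/env bash
# Sketch only: build a two-column HTML table with the OCR text on the left
# and the page scan on the right, one table row per page.
# Assumed (hypothetical) layout: DIR holds page-001.png, page-001.txt,
# page-002.png, page-002.txt, and so on.

make_proof_table() {
    local dir=$1 img base txt
    printf '<html>\n<head>\n'
    printf '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
    printf '</head>\n<body>\n<table border="1">\n'
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue      # no images found: the glob stayed literal
        base=${img%.png}               # strip the extension
        txt=$base.txt                  # matching OCR text file, if any
        printf '<tr>\n<td valign="top"><pre>\n'
        if [ -f "$txt" ]; then
            # Escape the characters that are special in HTML.
            sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' "$txt"
        fi
        # The image is referenced by bare file name, so the HTML file
        # has to be saved in the same directory as the scans.
        printf '</pre></td>\n<td valign="top"><img src="%s" width="600"></td>\n</tr>\n' "${img##*/}"
    done
    printf '</table>\n</body>\n</html>\n'
}
```

After sourcing the file, something like `make_proof_table scans > scans/proof.html` should give a file that can be opened in Writer. The fixed image width and the `<pre>` wrapper are just guesses at what is comfortable for proofreading; adjust to taste.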
<Grumble> Why are Windows users allowed to upload their .bat files, while Linux users must zip their .sh files to upload them?</Grumble>