05-16-2011, 04:57 AM   #3
DDHarriman
Guru
Posts: 860
Karma: 4380
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hello

You can, for example, “cut out” the headers and the page numbers in the original file(s) used for the OCR (if that is the way you are doing it).

Let’s imagine you scan your book and create an image PDF (a single file) with all the pages in the correct order.
Let’s also imagine you use FineReader Pro to do the OCR…

Do this:

1 - make a copy of your PDF under another name (this protects the original file if something goes wrong);

2 - open the new file in FineReader and use the “crop” option in the “edit page image” section to draw a rectangular selection that leaves the headers and page numbers outside it, then apply the crop (to that page or to all of them). Be careful: this cannot be undone;

3 - OCR the result. Presto: no headers and no page numbers.
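
If you prefer to script step 1, here is a minimal sketch in Python. The file name `book_scan.pdf`, the `_cropped` suffix and the helper name `make_working_copy` are only assumptions for illustration; it just makes the safety copy and does not touch FineReader at all.

```python
import shutil
from pathlib import Path


def make_working_copy(original: str, suffix: str = "_cropped") -> Path:
    """Copy the scanned PDF next to the original under a new name.

    The suffix is a hypothetical naming choice, not something any
    OCR program requires.
    """
    src = Path(original)
    dst = src.with_name(src.stem + suffix + src.suffix)
    # Refuse to overwrite an earlier working copy by accident.
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; refusing to overwrite it")
    # copy2 copies the file contents and, where possible, its timestamps.
    shutil.copy2(src, dst)
    return dst
```

Calling `make_working_copy("book_scan.pdf")` would create `book_scan_cropped.pdf` in the same folder and leave the original untouched, so the crop in step 2 is only ever applied to the copy.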

Alternative - if you have, for example, Acrobat Pro, go to the margin settings and redefine the top and bottom margins so that the headers and page numbers fall outside them, then save the file under a new name. Open it in your OCR program and apply step (3) above.
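
To get a feeling for the numbers involved in redefining the margins, here is a rough sketch in Python. It only does the arithmetic: PDF page coordinates are measured in points (1/72 inch) from the bottom-left corner, so cutting the header means lowering the top edge and cutting the page number means raising the bottom edge. The function name `crop_box` and the cut sizes in the usage note are assumptions for illustration; how you enter the resulting values depends on the program you use.

```python
def crop_box(page_width, page_height, top_cut, bottom_cut):
    """Return (left, bottom, right, top) of the area to keep, in points.

    top_cut    -- height of the header band to drop (hypothetical value)
    bottom_cut -- height of the page-number band to drop (hypothetical value)
    """
    if top_cut < 0 or bottom_cut < 0:
        raise ValueError("cuts must not be negative")
    if top_cut + bottom_cut >= page_height:
        raise ValueError("cuts would remove the whole page")
    # The origin is the bottom-left corner, so the kept area starts
    # bottom_cut above the bottom edge and ends top_cut below the top edge.
    return (0.0, float(bottom_cut), float(page_width), float(page_height - top_cut))
```

For an A4 page of roughly 595 x 842 points, `crop_box(595, 842, 50, 40)` gives `(0.0, 40.0, 595.0, 792.0)`: the full width is kept, the top 50 points and the bottom 40 points are left out. Measure a few pages first, because headers on left and right pages do not always sit at the same height.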

You can do all of the above with other programs too; just look for functions similar to the ones described here.

Best regards,
DDHarriman