01-09-2011, 04:18 AM | #1 |
Zealot
Posts: 108
Karma: 10
Join Date: Dec 2010
Location: United States
Device: iPad Mini; iPhone; Kindle Paperwhite (10th gen)
|
new to scanning books
I would like to convert some of my books into Gutenberg-style text files that are easy to read on my iPod.
I just got my scanner and began experimenting with ABBYY Finereader. Very quickly a question came to my mind: Is there an easy way to remove the page numbers at the bottom and the "chapter titles" sometimes displayed at the top of each book page? ABBYY Finereader has an option called "remove headers and footers," which I thought might do the trick, but when I tried using that option, I got no results. Then there are the page breaks. I noticed that regardless of whether I checked the option for "keep page breaks," ABBYY was unable to correctly join or separate text from two consecutive pages. The text is always disjointed in some way where a page break occurs, and I must manually rejoin a paragraph which has been split in two, or separate two paragraphs which have gotten stuck together. Is there any easier way to deal with this issue? Thanks for your help. |
01-09-2011, 04:52 AM | #2 |
Wizard
Posts: 1,462
Karma: 6061516
Join Date: May 2008
Location: Cascais, Portugal
Device: Kindle PW, Samsung Galaxy Note Pro 12.2", OnePlus 6
|
If I were you, when scanning the pages, I would crop the header and the footer. You can still do it with your raw images and an image editor.
|
01-09-2011, 07:57 AM | #3 |
Fanatic
Posts: 551
Karma: 1121392
Join Date: May 2008
Location: USA
Device: HTC One M8
|
If you use Finereader's "save to Word" option, you'll find a "remove headers and footers" option cleverly concealed in the Word save options. It's a lot easier than cropping.
Joining up consecutive pages is best done manually, in my opinion. You may sometimes need to check the original to see whether a new paragraph begins on the new page, or whether there is a scene break. Removing the page dividers then becomes part of skimming through the Word file to clean up gross formatting errors. Another advantage of using Word is that its spellchecker is easier to use than ABBYY's, and it has excellent search and replace capabilities that you can use to fix common errors quickly. By saving the corrected file as filtered HTML, you get a file that Calibre can convert to various formats. That said, everyone has their own workflow (and mine is in a state of flux at the moment since I've switched to reading .epub). Last edited by wayrad; 01-09-2011 at 08:01 AM. |
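For anyone who wants help with the obvious cases before the manual pass, here is a rough Python sketch. It is not a FineReader or Word feature; `join_pages` and its rules are hypothetical and assume you already have one text string per page. It joins a page break automatically only when the previous page clearly stops mid-word or mid-sentence, and lists every other break for checking against the original.

```python
import re

# Assumption: a page that ends a sentence finishes with . ! or ?,
# optionally followed by closing quotes or a bracket.
SENTENCE_END = re.compile(r"[.!?][\"')\u201d\u2019]*$")


def join_pages(pages):
    """Join per-page OCR text; return (text, pages whose start needs review)."""
    text = ""
    review = []
    for number, page in enumerate(pages, start=1):
        page = page.strip()
        if not page:
            continue  # skip blank pages
        if not text:
            text = page
        elif text.endswith("-") and page[0].islower():
            # Looks like a word hyphenated across the break: rejoin it.
            # This is wrong for real compounds such as "well-known".
            text = text[:-1] + page
        elif not SENTENCE_END.search(text):
            # Previous page stops mid-sentence: same paragraph continues.
            text = text + " " + page
        else:
            # Ambiguous: could be a new paragraph or a scene break.
            text = text + "\n\n" + page
            review.append(number)
    return text, review
```

The numbers in the returned list are 1-based page numbers; each one marks a page whose opening lines are worth comparing with the printed book.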
01-09-2011, 07:58 AM | #4 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
It's also generally pretty easy to remove headers and footers with a regex search and replace.
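As a minimal sketch of what such a search and replace could look like, here is a Python version. The patterns are assumptions, not something taken from FineReader: a footer is taken to be a line holding only a page number, and a running header is a line holding only a title you supply, with an optional page number beside it. `strip_headers_and_footers` is a hypothetical helper name.

```python
import re

# Assumption: a footer line is just a page number, optionally wrapped
# in dashes, e.g. "12" or "- 12 -".
PAGE_NUMBER = re.compile(r"^[ \t]*[-\u2013]?[ \t]*\d{1,4}[ \t]*[-\u2013]?[ \t]*$")


def strip_headers_and_footers(text, running_headers=()):
    """Drop page-number lines and known running-header lines from OCR text."""
    header_patterns = [
        re.compile(
            r"^[ \t]*(?:\d{1,4}[ \t]+)?" + re.escape(title) + r"(?:[ \t]+\d{1,4})?[ \t]*$",
            re.IGNORECASE,
        )
        for title in running_headers
    ]
    kept = []
    for line in text.splitlines():
        if PAGE_NUMBER.match(line):
            continue  # bare page number
        if any(p.match(line) for p in header_patterns):
            continue  # running header, with or without a page number
        kept.append(line)
    return "\n".join(kept)


sample = "THE LONG ROAD\nIt was a dark night.\n12\n13 THE LONG ROAD\nThe storm grew.\n- 14 -"
print(strip_headers_and_footers(sample, ["The Long Road"]))
```

Note that any body line consisting only of a short number (a year on its own line, say) would also be removed, and a chapter title listed as a running header would lose its genuine heading too, so the output still needs a read-through.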
ANY scanned book, though, is going to need a reasonable amount of manual editing to make it readable; there's no avoiding that. |
01-09-2011, 04:52 PM | #5 |
Zealot
Posts: 108
Karma: 10
Join Date: Dec 2010
Location: United States
Device: iPad Mini; iPhone; Kindle Paperwhite (10th gen)
|
Thanks to everyone for your helpful replies.
Wayrad, I discovered the button for "Remove headers and footers." In my first attempts, it removed them on some pages but missed them on others. I suspect this is because I need to place the book carefully so that its edges meet the edges of the scanner. I will keep experimenting. Last edited by sovre; 01-09-2011 at 05:09 PM. |
01-09-2011, 06:50 PM | #6 | |
Fanatic
Posts: 551
Karma: 1121392
Join Date: May 2008
Location: USA
Device: HTC One M8
|