01-09-2011, 04:18 AM   #1
sovre
Connoisseur
Posts: 86
Karma: 10
Join Date: Dec 2010
Location: California
Device: iPod Touch; PRS-950
new to scanning books

I would like to convert some of my books into Gutenberg-style text files that are easy to read on my iPod.

I just got my scanner and began experimenting with ABBYY Finereader. Very quickly a question came to my mind: Is there an easy way to remove the page numbers at the bottom and the "chapter titles" sometimes displayed at the top of each book page? ABBYY Finereader has an option called "remove headers and footers," which I thought might do the trick, but when I tried using that option, I got no results.

Then there are the page breaks. I noticed that regardless of whether I checked the option for "keep page breaks," ABBYY was unable to correctly join or separate text from two consecutive pages. The text is always disjointed in some way where a page break occurs, and I must manually rejoin a paragraph which has been split in two, or separate two paragraphs which have gotten stuck together. Is there any easier way to deal with this issue?

Thanks for your help.
01-09-2011, 04:52 AM   #2
Over
Wizard
Posts: 1,449
Karma: 3949068
Join Date: May 2008
Location: Cascais, Portugal
Device: Cybook Gen3, Kindle DXi, Kindle 3, iPad and iPhone 4
If I were you, when scanning the pages, I would crop the header and the footer. You can still do it with your raw images and an image editor.
01-09-2011, 07:57 AM   #3
wayrad
Fanatic
Posts: 547
Karma: 1121392
Join Date: May 2008
Location: USA
Device: Galaxy Nexus
If you use Finereader's "save to Word" option, you'll find a "remove headers and footers" option cleverly concealed in the Word save options. It's a lot easier than cropping.

Joining up consecutive pages is best done manually, in my opinion. You may sometimes need to check the original to see whether a new paragraph begins on the new page, or whether there is a scene break. Removing the page dividers then becomes part of the process of skimming through the Word file to clean up gross formatting errors.
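
For anyone working from plain text rather than Word, here is a minimal Python sketch of a helper for that manual pass. It does not replace checking the original: it makes a first guess at each page break and lists the breaks that still need to be checked by eye. The function name `join_pages`, the assumption that each page is already a separate string, and the punctuation rule are illustrative choices, not anything FineReader provides.

```python
# Assumption: a page whose last character is one of these probably ends a sentence,
# so the break after it may or may not also be a paragraph break.
SENTENCE_END = (".", "!", "?", '."', '!"', '?"')


def join_pages(pages):
    """Join page texts and report which page breaks need a manual check.

    Returns (text, review). `review` holds the 1-based numbers of the pages
    whose join with the previous page was only a guess.
    """
    text = ""
    review = []
    for number, page in enumerate(pages, start=1):
        page = page.strip()
        if not page:
            continue  # blank scan, nothing to join
        if not text:
            text = page  # first non-empty page starts the book
        elif text.endswith("-"):
            # Word probably split across the break: glue the halves together.
            # Flagged, because the hyphen may be a real one ("well-known").
            text = text[:-1] + page
            review.append(number)
        elif text.endswith(SENTENCE_END):
            # Sentence ended at the page bottom: guess a new paragraph, but
            # only the printed page can confirm it.
            text += "\n\n" + page
            review.append(number)
        else:
            # Page ended mid-sentence, so the paragraph continues.
            text += " " + page
    return text, review


pages = [
    "It was a dark and",
    "stormy night. The wind rose.",
    "Nobody slept. They were rest-",
    "less until dawn.",
]
text, review = join_pages(pages)
print(review)  # pages 3 and 4 begin after breaks that were guessed
```

Only the mid-sentence case is joined without a flag; everything else is left for a look at the original, which matches the advice above that scene breaks and new paragraphs have to be confirmed against the page.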

Another advantage of using Word is that its spellchecker is easier to use than Abbyy's, and it has excellent search and replace capabilities that you can use to fix common errors quickly.

By saving the corrected file to filtered HTML, you get a file that Calibre can convert to various formats.
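
If you prefer to script that last step, calibre ships a command-line tool, `ebook-convert`, that takes an input file and an output file and picks the formats from the file extensions. A small sketch of calling it from Python follows; the file names are placeholders and the wrapper functions are hypothetical helpers, not part of calibre.

```python
import shutil
import subprocess


def build_convert_command(source, target):
    """Return the argument list for calibre's ebook-convert: input, then output."""
    return ["ebook-convert", source, target]


def convert(source, target):
    """Run the conversion, failing early if calibre's tool is not on PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is calibre installed?")
    subprocess.run(build_convert_command(source, target), check=True)


# Placeholder file names: the filtered HTML saved from Word, and the desired EPUB.
print(build_convert_command("book.htm", "book.epub"))
```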

That said, everyone has their own workflow (and mine is in a state of flux at the moment since I've switched to reading .epub).

Last edited by wayrad; 01-09-2011 at 08:01 AM.
01-09-2011, 07:58 AM   #4
HarryT
eBook Enthusiast
Posts: 63,498
Karma: 41548799
Join Date: Nov 2006
Location: UK
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Onyx T68, N7,
It's also generally pretty easy to remove headers and footers with a regex search and replace.
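
As a rough illustration of that kind of search and replace, here is a Python sketch. It assumes the OCR output is plain text in which a page number sits alone on its own line and a running head is a line consisting only of a title you supply, optionally with a page number beside it. The function name `strip_headers_footers` and both patterns are assumptions for the example; real scans will need the patterns adjusted.

```python
import re

# Assumption: a footer page number is a line holding nothing but 1-4 digits.
PAGE_NUMBER = re.compile(r"^[ \t]*\d{1,4}[ \t]*$")


def strip_headers_footers(text, running_heads):
    """Drop lone page-number lines and lines that match a known running head."""
    # One case-insensitive pattern built from the running heads you pass in,
    # allowing an optional page number on either side ("12  CHAPTER ONE").
    heads = "|".join(re.escape(h) for h in running_heads)
    head_line = re.compile(
        rf"^[ \t]*(?:\d{{1,4}}[ \t]+)?(?:{heads})(?:[ \t]+\d{{1,4}})?[ \t]*$",
        re.IGNORECASE,
    )
    kept = []
    for line in text.splitlines():
        if PAGE_NUMBER.match(line):
            continue  # bare page number
        if running_heads and head_line.match(line):
            continue  # running head, with or without a page number
        kept.append(line)
    return "\n".join(kept)


sample = "THE LONG ROAD\nHe walked on\n\n12\n12  The Long Road\nuntil dusk.\nIn 1912 he left."
print(strip_headers_footers(sample, ["The Long Road"]))
```

Note that a pattern this simple also deletes any legitimate line that is only a number, such as a chapter heading "1", so the result still needs to be read through afterwards.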

ANY scanned book, though, is going to need a reasonable amount of manual editing to make it readable; there's no avoiding that.
01-09-2011, 04:52 PM   #5
sovre
Connoisseur
Thanks to everyone for your helpful replies.

Wayrad,

I discovered the button for "Remove headers and footers." In my first attempts it removed them on some pages but missed them on others. I suspect this is because the book needs to be placed carefully, with its edges flush against the edges of the scanner.

I will keep experimenting.

Last edited by sovre; 01-09-2011 at 05:09 PM.
01-09-2011, 06:50 PM   #6
wayrad
Fanatic
I find that it occasionally misses, but those places are easy to fix when joining up pages. More often than not it works for me, though. That's an interesting observation about page placement - I will keep an eye out to see if I can confirm it with my scanner (Opticbook).
All times are GMT -4.