01-13-2011, 06:56 AM | #1 |
Connoisseur
Posts: 92
Karma: 630
Join Date: Sep 2008
Location: Melbourne, Australia
Device: Kindle Voyage, obv.
|
Scanning/OCR going okay...converting to a final format isn't
Hello!
I'm partway through scanning my first book, and I'm having some issues. I've got the scanning and OCR worked out now, to the point that I can get the text into a Word document with the formatting I like. However, here's where I run into trouble.
The ultimate intended destination for these files is my Kindle. I have tried importing the Word file directly into Mobipocket Creator, but that doesn't seem to work as I'd like. Page breaks seem to be recognised, but line breaks don't. For example, if I've put in a heading like "Chapter One", I leave a blank line underneath it. In the file created by Mobipocket Creator, though, the blank line has vanished. Also, one paragraph (and a sentence in another paragraph) displays as bold, for no apparent reason.
Should I be trying to send the text from Word into another format first, and then convert to mobi later? Should I even be sending the text from the OCR program to Word? Any advice warmly received! |
01-14-2011, 12:17 AM | #2 | |
space cadet
Posts: 330
Karma: 2963633
Join Date: Aug 2007
Location: Seattle area
Device: Rocket PRO, gen3, Pocketbook360
|
1. Get it into a *reasonable* format in Word. Don't try to be too specific about things, and it's best if you can be both simple and specific about formats such as quoted text and chapter headings.
2. Download HarryT's procedures on using BookDesigner and Mobipocket Creator. These tools are available from MR, and questions here tend to get some help.
3. Save the Word file as RTF. (Incidentally, my OCR program has the option of saving into more than one version of RTF. Always choose the simplest version. I use the one aimed at WordPad, not Word, since WordPad has a more limited RTF feature set.)
4. Import the file into BookDesigner, and clean up everything. Make your chapter headings, build your table of contents, and make sure to search for all the silly little places where a space got set to a different format, and such.
5. Save the file from BookDesigner as HTML0 (which is the native format for the tool). Don't try to have BookDesigner build the mobi file.
6. Following Harry's instructions, use Mobipocket Creator to generate the .mobi file.
7. Archive the HTML0 and mobi files. If you find you want to tweak things in a simple manner, go back to BookDesigner. If you have complicated formatting requirements, ask for more help, 'cause you're beyond me. |
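For anyone who would rather script the "make your chapter headings" part of step 4 when starting from plain OCR text, here is a rough sketch in Python. It is not part of BookDesigner or Mobipocket Creator, and it does not replace HarryT's procedure. The function name `text_to_html` and the heading pattern (a line that reads "Chapter" plus one word or number) are assumptions made up for illustration. The idea is that a heading is marked up as a heading rather than spaced out with a blank line, since blank lines are likely to be collapsed once the text ends up as HTML.

```python
import html
import re

# Assumption: a chapter heading sits on its own line and looks like
# "Chapter One" or "Chapter 12". Adjust the pattern for your own book.
HEADING_RE = re.compile(r"^chapter\s+\w+$", re.IGNORECASE)


def text_to_html(text: str, title: str = "Untitled") -> str:
    """Convert blank-line-separated plain text into minimal HTML.

    Hypothetical helper for illustration only.
    """
    # Split the text into blocks wherever one or more blank lines occur.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    body = []
    for block in blocks:
        # Re-join lines that were hard-wrapped inside a single paragraph.
        line = " ".join(part.strip() for part in block.splitlines())
        escaped = html.escape(line)  # protect &, <, > and quotes
        if HEADING_RE.match(line):
            body.append(f"<h2>{escaped}</h2>")
        else:
            body.append(f"<p>{escaped}</p>")
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + "\n".join(body) + "\n</body></html>\n"
    )


if __name__ == "__main__":
    sample = "Chapter One\n\nIt was a dark\nand stormy night."
    print(text_to_html(sample, title="Sample"))
```

The resulting HTML file is deliberately plain, so there is very little stray formatting left to hunt down before the conversion step.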
|
01-14-2011, 07:12 AM | #3 | |
Connoisseur
Posts: 92
Karma: 630
Join Date: Sep 2008
Location: Melbourne, Australia
Device: Kindle Voyage, obv.
|
To be honest, I don't know why I didn't think of using BD - I did use it a few times, ages ago, when I first got my Sony Reader, but I eventually drifted over to Sigil for ePub stuff. I'll definitely give the above a whirl, though - thanks! |
|
01-14-2011, 12:17 PM | #4 |
Connoisseur
Posts: 92
Karma: 630
Join Date: Sep 2008
Location: Melbourne, Australia
Device: Kindle Voyage, obv.
|
I've been testing this method today, and I'm very happy with my initial results. Soon I'll have an entirely digital collection of books, something I've wanted for a very long time!
Thanks again for the assistance, Darqref! |
01-15-2011, 12:56 AM | #5 |
space cadet
Posts: 330
Karma: 2963633
Join Date: Aug 2007
Location: Seattle area
Device: Rocket PRO, gen3, Pocketbook360
|
My pleasure. My problem is at the pre-OCR stage. I'm using a digital camera, and the hassle of setting up the book to take a picture of each page makes it more work than I want, most of the time. I have a flatbed scanner, but not a page feeder, and I find the camera gives me a more accurate scan anyway (since I don't have to bend a spine or otherwise have the page misaligned).
|