11-02-2010, 04:57 PM | #1 |
Member
Posts: 18
Karma: 126
Join Date: Oct 2010
Location: California, U.S.A.
Device: Kindle 3, iPhone 3GS, iPod touch 1st gen, iPad 2 (on order)
|
Question on Margin issues with OCR’d text processed for Kindle
First, THANKS for all the great information in this forum!
I’m a new Kindle (K3), Calibre and BookCreator user. I have EXTENSIVE OCR and scanning knowledge. I’m using MS Word 2007 on Windows Vista (32-bit) and have begun using the BookCreator template.

As a test, I OCR’d a couple of chapters of an old paperback book on an Epson flatbed scanner, saved them as RTF and imported the file into Word 2007 with the BookCreator DOT loaded. Although I’m just starting to work through the formatting options in the BookCreator template, I was able to do some basic paragraph and font size formatting. I THOUGHT I had removed all the forced carriage returns inside paragraphs so that the text would reflow to the Kindle’s standard margins. I saved the result as RTF and converted the RTF “book” with Calibre. I sent it to my K3 and it arrived promptly as a Mobi file.

The text is an appropriate size, but it still appears in a thin column, similar to the layout of the original paperback. How can I format my OCR’d text to remove this margin restriction? Is this an OCR output format, Calibre or Kindle issue?

Also, what is the best format in which to save my Word document for Calibre? RTF? HTML? Other?

Thanks!
Tim
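For anyone hitting the same forced line breaks: below is a minimal sketch of one way to strip them from a plain-text OCR export before it goes into Word or Calibre. It assumes the OCR output separates paragraphs with blank lines and breaks lines inside a paragraph with single CR/LF characters; the function name `unwrap_paragraphs` is made up for illustration, and words hyphenated across line ends are not rejoined.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph.

    Assumption: blank lines mark paragraph boundaries, and any other
    line break is a forced break left over from the scanned page.
    """
    # Normalise Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    paragraphs = []
    for block in re.split(r"\n\s*\n", text):
        # Drop empty lines and surrounding whitespace, then rejoin
        # the remaining lines of this paragraph with single spaces.
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))

    # Keep one blank line between paragraphs.
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    sample = "It was a dark\r\nand stormy night.\r\n\r\nThe rain fell\r\nin torrents."
    print(unwrap_paragraphs(sample))
```

This only covers line breaks in the text itself. A narrow column can also come from margin or indent settings carried over from the RTF, which a script like this does not touch.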
11-05-2010, 02:05 PM | #2 |
Guru
Posts: 860
Karma: 4380
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
|
Hello Tim
First get Mobipocket Creator (it’s free) from the Mobipocket website: http://www.mobipocket.com/en/Downloa...ilsCreator.asp

Now, to build a Mobipocket eBook (the Kindle reads this format):

1 - save your OCR result as plain text (*.txt), not RTF. It’s better to always begin with unformatted text than to “unformat” an OCR result;
2 - open the text in Word and format it at will. Use styles, at least “normal” for the text and “heading 1” (2, 3 etc… if you have multilevel headings) for your chapters… this guarantees that Mobipocket Creator can generate the table of contents automatically when building the eBook;
3 - save the result as one HTML file - you can import other formats, but in my experience HTML gives the fewest problems;
4 - open that file in Mobipocket Creator, fill in the metadata, add a cover image and make sure it builds a table of contents from the headings of the HTML file you have created - in the table of contents option of the program choose “create” and enter “h1” (without the quotes) if you have one level of headers (heading 1), “h2” if you have a second level of headers too (heading 2), etc;
5 - do not forget to save your work so you can come back to it if needed;
6 - when everything is complete choose “build”; it will generate your eBook with a *.prc extension;
7 - load the file onto your Kindle and enjoy.

Note: if you have doubts, just browse the website you downloaded Mobipocket Creator from for tips on how to proceed.

Best regards,
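If you would rather script steps 2 and 3 than do them in Word, here is a rough sketch of the idea: turn the plain-text OCR result into a single HTML file whose chapter titles are `h1` elements, so a table of contents can be built from them as in step 4. This is an alternative to the Word route, not what the steps above use. It assumes paragraphs are separated by blank lines and that chapter headings are standalone lines such as "Chapter 1"; the function name `text_to_html` and the heading pattern are made up for illustration.

```python
import html
import re

# Assumed heading convention: a block consisting only of "Chapter <word>".
CHAPTER_RE = re.compile(r"chapter\s+\w+\.?", re.IGNORECASE)


def text_to_html(text: str, title: str = "Untitled") -> str:
    """Convert blank-line-separated plain text into one simple HTML page."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    body = []
    for block in re.split(r"\n\s*\n", text):
        # Collapse line breaks and runs of spaces inside the block.
        content = " ".join(block.split())
        if not content:
            continue
        # Chapter titles become <h1>, everything else a <p>.
        tag = "h1" if CHAPTER_RE.fullmatch(content) else "p"
        body.append(f"<{tag}>{html.escape(content)}</{tag}>")
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + "\n".join(body) + "\n</body></html>\n"
    )


if __name__ == "__main__":
    sample = "Chapter 1\n\nIt was a dark\nand stormy night.\n\nTom & Jerry."
    print(text_to_html(sample, title="Demo"))
```

The output has no styling at all, so fonts and margins are left to the reader device or to whatever the conversion tool applies.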
11-06-2010, 04:31 PM | #3 |
Member
Posts: 18
Karma: 126
Join Date: Oct 2010
Location: California, U.S.A.
Device: Kindle 3, iPhone 3GS, iPod touch 1st gen, iPad 2 (on order)
|
Thanks, that seems straightforward
Thanks for the detailed reply, DDHarriman.
I'll try Mobipocket Creator to add formatting, in place of the BookCreator tool that I tried. BTW, I DID find a workaround to force margins, which relieved the issue I was having. I worked through an 'Ebook Formatting Tutorial' linked from the Calibre site and found I could just put some CSS in the 'Extra CSS' box in Calibre to force margins. However, I can see the advantages of correctly formatting the source book in advance with Mobipocket Creator or BookCreator.

Thanks again for the community support!
Tim