05-23-2013, 07:24 AM | #1 |
Member
Posts: 10
Karma: 10
Join Date: May 2011
Location: Khon Kaen, Thailand
Device: HTC Flier
|
Junk characters when importing HTML to Calibre
Thu 23 May 2013, 6:05 pm
Hi All, I scan a hard-copy book into MS Word, save it as "Filtered HTML," and then import it to Calibre. When I convert the ZIP file to EPUB, quotation marks and apostrophes (and possibly other characters) display as "wingdings" or some such strange font, specifically a black diamond with a question mark. Any suggestions for a fix? Thanks! Rex in Thailand [Promotional links deleted - MODERATOR] Last edited by Dr. Drib; 12-06-2013 at 08:03 AM. |
05-23-2013, 09:19 AM | #2 |
creator of calibre
Posts: 43,853
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
05-23-2013, 11:18 AM | #3 |
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
When saving from MS Word as "Web Page, Filtered", it should help to make sure the output HTML always uses UTF-8 encoding. In my rather ancient (version 10) copy of MS Word, that setting is found under Tools - Options - General - Web Options - Encoding - Save this document as - Unicode (UTF-8). I'm afraid I don't know where Word has moved this option in more recent versions.
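If the HTML has already been saved without that setting, the same effect can probably be had after the fact. The sketch below is only an illustration and assumes the file was written in Windows-1252 (a common Word default on Western-language systems, but not confirmed for this file); the function name `reencode_to_utf8` and the sample bytes are made up for the example. It first reproduces the symptom (curly quotes read as UTF-8 turn into U+FFFD, the black diamond with a question mark), then re-encodes the bytes and updates the charset declaration.

```python
import re


def reencode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode HTML bytes from source_encoding and return them as UTF-8.

    source_encoding is an assumption; check the file's charset declaration first.
    """
    text = raw.decode(source_encoding)
    # Rewrite any existing charset declaration so it matches the new encoding.
    text = re.sub(r'(charset=["\']?)[\w-]+', r"\g<1>utf-8", text, flags=re.IGNORECASE)
    return text.encode("utf-8")


# Hypothetical sample: curly quotes and an apostrophe stored as Windows-1252 bytes.
sample = (
    b'<meta http-equiv=Content-Type content="text/html; charset=windows-1252">'
    b"<p>\x93Don\x92t\x94</p>"
)

# The symptom: reading those bytes as UTF-8 yields U+FFFD replacement characters.
broken = sample.decode("utf-8", errors="replace")

# The fix: decode with the real source encoding, then write UTF-8.
fixed = reencode_to_utf8(sample)
```

For a real file, read it with `open(path, "rb").read()`, pass the bytes through the function, and write the result back in binary mode before zipping and importing it.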
|
05-23-2013, 06:45 PM | #4 | |
null operator (he/him)
Posts: 20,567
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
@rexall - if you still have the problem and you have Word 2007/10/13, then try saving the file as a DOCX and using the DOCX_Input plug-in. I've been using it for a couple of weeks. I can produce much cleaner output using DOCX_Input to convert DOCX to EPUB than I was ever able to get by converting RTF or Filtered HTML to EPUB. Consequently, editing the EPUB in Sigil is now a viable proposition, primarily because I don't get any MS-specific HTML in the EPUB. Plus the DOCX files are up to 90% smaller than RTFs, and the conversions are about 30-40% faster. For what more could one ask, I ask. The documents I convert are very vanilla; if a PDF has a complex layout, images, tables, infographics, etc., then I put up with the PDF. That is less of a hardship since Moz put a FAQ PDF reader in Firefox. You might want to adjust the plug-in source code by hand to set the default option values to suit your needs, as I did - see post #38 in the plug-in thread. BR Last edited by BetterRed; 05-23-2013 at 06:57 PM. Reason: Where to find Web Options in Word 2007 |
|