05-23-2013, 07:24 AM   #1
rexall
Junk characters importing HTML to Calibre

Thu 23 May 2013, 6:05 pm

Hi All,

I scan a hard-copy book into MS Word, save it as "Filtered HTML," and then import it into Calibre. When I convert the ZIP file to EPUB, quotation marks and apostrophes (and possibly other characters) display in "wingdings" or some such strange font, specifically as a black diamond with a question mark.

Any suggestions for a fix?

Thanks!

Rex in Thailand

[Promotional links deleted - MODERATOR]


Last edited by Dr. Drib; 12-06-2013 at 08:03 AM.
05-23-2013, 09:19 AM   #2
kovidgoyal
creator of calibre
http://manual.calibre-ebook.com/faq....r-smart-quotes
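
For anyone who prefers the command line: calibre ships a converter called ebook-convert, and it accepts an --input-encoding option that tells it how to decode the source file. Below is a minimal Python sketch of calling it. The file names are hypothetical, and the cp1252 default is only an assumption about how Word saved the HTML; try utf-8 or another encoding if that guess turns out to be wrong.

```python
import shutil
import subprocess


def build_convert_command(src, dst, input_encoding="cp1252"):
    """Return the argument list for calibre's ebook-convert tool.

    The cp1252 default is an assumption: it is a common encoding for
    HTML saved by Word on Western-language Windows systems.
    """
    return ["ebook-convert", src, dst, "--input-encoding", input_encoding]


def convert(src, dst, input_encoding="cp1252"):
    """Run the conversion, provided ebook-convert is on the PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is calibre installed?")
    subprocess.run(build_convert_command(src, dst, input_encoding), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(build_convert_command("book.zip", "book.epub"))
```

If the quotes still come out wrong, the encoding guess is the first thing to change.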
05-23-2013, 11:18 AM   #3
jackie_w
When saving from MS Word as "Web Page, Filtered", it should help to make sure the output HTML always uses UTF-8 encoding. In my rather ancient (version 10) copy of MS Word, that particular setting is found under Tools - Options - General - Web Options - Encoding - Save this document as - Unicode (UTF-8). But I'm afraid I don't know where Word has moved this option to in more recent versions.
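
If the HTML has already been saved in another encoding, it can also be re-encoded after the fact. The black diamond with a question mark is the Unicode replacement character, which typically shows up when bytes in one encoding are read as UTF-8. Here is a rough Python sketch that decodes the file and writes it back as UTF-8, updating the charset declaration to match. It assumes the file was saved as Windows-1252 (cp1252), which is only a guess about the Word setup; the function name and file names are made up for illustration.

```python
import re
from pathlib import Path


def reencode_html_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode HTML bytes from source_encoding and return UTF-8 bytes.

    source_encoding defaults to cp1252, which is an assumption about
    how Word saved the file. Decoding raises UnicodeDecodeError if the
    bytes are not valid in that encoding.
    """
    text = raw.decode(source_encoding)
    # Rewrite any charset=... declaration so it matches the new bytes.
    text = re.sub(
        r'(charset\s*=\s*["\']?)[\w-]+',
        r"\g<1>utf-8",
        text,
        flags=re.IGNORECASE,
    )
    return text.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    src = Path("book.htm")
    if src.exists():
        Path("book-utf8.htm").write_bytes(reencode_html_to_utf8(src.read_bytes()))
```

Writing to a new file rather than overwriting the original keeps the Word output around in case the encoding guess was wrong.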
05-23-2013, 06:45 PM   #4
BetterRed
Quote:
Originally Posted by jackie_w
But I'm afraid I don't know where Word has moved this option to in more recent versions.

In Word 2007 it is at Word Options -> Advanced -> Web Options (scroll down to see it) -> Encoding.

@rexall - if you still have the problem and you have Word 2007/2010/2013, then try saving the file as a DOCX and using the DOCX_Input plug-in.

I've been using it for a couple of weeks. I can produce much cleaner output using DOCX_Input to convert DOCX to EPUB than I was ever able to get by converting RTF or Filtered HTML to EPUB.

Consequently editing the EPUB in Sigil is now a viable proposition, primarily because I don't get any MS specific HTML in the EPUB.

Plus the DOCX files are up to 90% smaller than RTFs, and the conversions are about 30-40% faster. For what more could one ask, I ask.

The documents I convert are very vanilla. If a PDF has a complex layout, images, tables, infographics, etc., then I put up with the PDF, which is less of a hardship since Moz put a FAQ PDF reader in Firefox.

You might want to adjust the plugin source code by hand to set the default option values to suit your needs as I did - see post #38 in the plug-in thread.

BR

Last edited by BetterRed; 05-23-2013 at 06:57 PM. Reason: Where to find Web Options in Word 2007