|12-08-2010, 12:14 PM||#1|
Join Date: Jul 2010
Why do html entities get replaced upon import?
I notice that when I import an htm file into calibre (not conversion - just importing it into the library), calibre alters my document. For example, all of the character entities such as &ldquo;, &rdquo; and &mdash; get replaced with the literal characters “, ” and — in the file.
Is this by design, or is it a bug? This should not occur - these entities are used in place of the characters for a reason. Simply importing a book shouldn't mess with the internals of it.
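To illustrate the kind of substitution being described, here is a minimal Python sketch. It is not calibre's import code, just the standard library's `html.unescape` doing a comparable decode, plus a hypothetical helper, `encode_entities`, that turns non-ASCII characters back into named entities afterwards. Both function names are made up for this example.

```python
import html
from html.entities import codepoint2name


def decode_entities(text: str) -> str:
    """Replace character references such as &ldquo; with the literal characters."""
    return html.unescape(text)


def encode_entities(text: str) -> str:
    """Hypothetical inverse: write every non-ASCII character as an entity.

    Named entities are used where the standard table has one; anything else
    falls back to a numeric reference. ASCII is left untouched, so markup
    characters like < and & are not re-escaped here.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)


sample = "&ldquo;Hello&rdquo; &mdash; world"
decoded = decode_entities(sample)    # entities become literal curly quotes and a dash
restored = encode_entities(decoded)  # literal characters become entities again
```

Running something like `encode_entities` over a file after the fact is one way to get the entities back, assuming the document did not deliberately mix literal characters and entities.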
|12-08-2010, 12:21 PM||#2|
creator of calibre
Join Date: Oct 2006
Location: Mumbai, India
It's by design, and it isn't going to change. I lack the time to tell you why, so you will just have to take my word for it.
EDIT: And if you don't want this behavior, just put your html files in a zip file before importing them.
Last edited by kovidgoyal; 12-08-2010 at 12:24 PM.
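The zip workaround can be scripted. The following is a rough sketch using Python's standard `zipfile` module. The helper name `zip_html_for_import` and the default output path are assumptions for illustration, not anything calibre provides. It stores the file's bytes unchanged inside the archive.

```python
import zipfile
from pathlib import Path


def zip_html_for_import(html_path, zip_path=None):
    """Wrap a single HTML file in a zip archive and return the archive path.

    Hypothetical helper: by default the archive sits next to the HTML file
    and has the same name with a .zip suffix.
    """
    html_path = Path(html_path)
    zip_path = Path(zip_path) if zip_path else html_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps only the file name, so no directory structure is stored
        zf.write(html_path, arcname=html_path.name)
    return zip_path
```

For example, `zip_html_for_import("book.htm")` would write `book.zip` beside the original, and that archive is what gets added to the library. If the HTML links to images or stylesheets, those would presumably need to go into the same archive; this sketch handles only the one file.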