09-23-2010, 01:02 PM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
HTML eBook punctuation
I have an eBook in HTML format that uses numeric tags for many kinds of punctuation. Here is an example:
The driver<92>s side door blew wide
Where <92> is supposed to be an apostrophe. This doesn't seem to follow any standard, and the HTML document does not specify an encoding. Naturally, when I convert this to MOBI using Calibre, there is no punctuation in the output. Has anyone seen this before? And how should I handle conversion? |
09-23-2010, 01:10 PM | #2 |
Addict
Posts: 292
Karma: 24688
Join Date: Aug 2009
Device: Sony PRS-505, iPad
|
Open the HTML file in your favorite text editor and do a simple find and replace. If you are comfortable with a straight single quote, find <92>, replace it with ', and hit replace all. Save the file and then run it through Calibre.
If you want a curly apostrophe, replace the <92> with "& # 1 4 6 ;" (without the quotes, and remove the spaces between the characters that I included) and you should be good to go. Note that 146 is the CP1252 byte value rather than a Unicode code point, so "& # 8 2 1 7 ;" is the standards-based equivalent if a converter rejects the first form. For a handy table of HTML codes for various extended ASCII characters, go here.
Last edited by cmdahler; 09-23-2010 at 01:12 PM. |
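The find-and-replace described above could also be scripted. This is only a minimal sketch, and it assumes the file literally contains the four characters `<92>`. If the editor is merely displaying a raw 0x92 byte that way, nothing will match. The function name `replace_marker` and the file names in the comment are hypothetical.

```python
def replace_marker(data: bytes, curly: bool = False) -> bytes:
    """Replace the literal text <92> with an apostrophe.

    Assumption: the marker is stored as the characters '<', '9', '2', '>'.
    With curly=True the numeric reference for U+2019 is used instead of
    a straight ASCII apostrophe.
    """
    replacement = b"&#8217;" if curly else b"'"
    return data.replace(b"<92>", replacement)


# Hypothetical usage (file names are placeholders):
#   with open("book.html", "rb") as f:
#       fixed = replace_marker(f.read(), curly=True)
#   with open("book_fixed.html", "wb") as f:
#       f.write(fixed)
```

Working on bytes avoids having to guess the file's encoding before the replacement is made.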
09-23-2010, 01:11 PM | #3 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
Quote:
|
|
09-23-2010, 01:33 PM | #4 |
creator of calibre
Posts: 44,336
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's an apostrophe encoded in cp1252. See http://calibre-ebook.com/user_manual...r-smart-quotes for how to handle these kinds of files. |
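If the file really holds raw cp1252 bytes, one way to sidestep the problem before conversion is to transcode it to UTF-8. The sketch below is an illustration under that assumption, not the procedure from the linked manual page, and `cp1252_to_utf8` is a hypothetical helper name.

```python
def cp1252_to_utf8(raw: bytes) -> bytes:
    """Decode bytes as cp1252 and re-encode them as UTF-8.

    errors="replace" substitutes U+FFFD for the few byte values that
    cp1252 leaves undefined (for example 0x81) instead of raising.
    """
    text = raw.decode("cp1252", errors="replace")
    return text.encode("utf-8")


sample = b"The driver\x92s side door blew wide"
converted = cp1252_to_utf8(sample).decode("utf-8")
# The 0x92 byte has become U+2019, the right single quotation mark.
```

The resulting file should then be declared as UTF-8 so the converter does not have to guess again.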
09-23-2010, 01:43 PM | #5 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
Quote:
|
|
09-23-2010, 02:57 PM | #6 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
It's the code for a "smart" apostrophe (the right single quotation mark, which curls to the left) in several character encodings. ASCII uses 27 (hex) for the generic straight apostrophe, but CP1252 uses 92 (hex) for the curly one.
|
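Since the original file uses such codes for many kinds of punctuation, it may help to list what the neighbouring CP1252 bytes stand for. The sketch below builds that table with Python's built-in codec. The function name `punctuation_table` is hypothetical, and the range 0x91 to 0x97 is just the block of quotes, bullet, and dashes chosen for illustration.

```python
def punctuation_table() -> dict:
    """Map CP1252 bytes 0x91..0x97 to (character, numeric HTML reference)."""
    table = {}
    for byte in range(0x91, 0x98):
        ch = bytes([byte]).decode("cp1252")
        table[byte] = (ch, f"&#{ord(ch)};")
    return table


for byte, (ch, ref) in punctuation_table().items():
    # One row per byte: hex value, Unicode code point, HTML reference.
    print(f"0x{byte:02X}  U+{ord(ch):04X}  {ref}")
```

The straight apostrophe stays at hex 27 in both ASCII and CP1252, so only the curly variants need this kind of mapping.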