#1 |
Member
Posts: 16
Karma: 10
Join Date: Jan 2009
Device: iPod Touch
|
Certain hyphens being removed on HTML to ePub
Hi,
I've been converting HTML files to ePub using Calibre, and then transferring them to my iPod Touch to read in Stanza. The problem is that certain dashes are being removed. Hyphenated words such as "one-hundred" seem to make it through OK, but where a long dash breaks up a sentence, the dashes are dropped. For example: "When you recover - and there is no "if"; you wouldn't be there if they didn't know they could fix you - you're still in the army". If I take the original HTML file and convert it to ePub with Stanza's desktop converter, all of the dashes survive the conversion.

I sent the ePub file that Calibre created to Lexcycle, and this was their response:

"I don't see the dashes when I open the xhtml document in Safari. Since Stanza uses the same renderer as Safari, that's a good browser to preview how documents will look in Stanza. The problem is that your dashes are being represented by decimal 151, but you have declared that your document's encoding is UTF-8. 151 is em dash only for the windows-1252 (i.e. "latin1") encoding. You could fix this by using the proper UTF-8 encoding for the em dash (decimal 8212). But the easiest solution would be to just represent it using the HTML entity encoding of "&mdash;", which will allow you to bypass any character encoding issues altogether."

Hopefully that will help with fixing the conversion issue? I don't know... I've also attached the original HTML file for analysis...
|
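
To illustrate Lexcycle's point, here is a minimal Python sketch of how the same byte behaves under the two encodings. It assumes the source file stores the dash as the raw byte 151 (0x97), which the thread does not confirm, and the sample text is only an excerpt of the sentence quoted above.

```python
import html

# Assumption: the dash is stored as the single raw byte 151 (0x97).
raw = b"When you recover \x97 and there is no \"if\""

# Declared as UTF-8: 0x97 is not a valid sequence on its own, so a lenient
# decoder substitutes the replacement character U+FFFD (a strict one fails).
as_utf8 = raw.decode("utf-8", errors="replace")

# Read as windows-1252: 0x97 maps to the em dash, U+2014 (decimal 8212).
as_cp1252 = raw.decode("cp1252")

# A real em dash occupies three bytes in UTF-8.
em_dash_utf8 = "\u2014".encode("utf-8")

# The entity route: plain ASCII markup that does not depend on the
# declared encoding at all.
entity_text = as_cp1252.replace("\u2014", "&mdash;")

print(repr(as_utf8))
print(repr(as_cp1252))
print(em_dash_utf8)
print(entity_text)
```

The entity version is pure ASCII, which is why it sidesteps the encoding declaration entirely.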
#2 |
creator of calibre
Posts: 45,156
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
In the calibre conversion options for this book, specify the encoding as cp1252.
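
As a rough picture of what telling a converter "this input is cp1252" amounts to, the sketch below decodes the bytes as windows-1252 and re-encodes them as UTF-8. This is not calibre's code; `cp1252_to_utf8` is a hypothetical helper name, and the sample bytes assume the dash is stored as the raw byte 0x97.

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Hypothetical helper: read bytes as windows-1252, write them as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


# Assumed sample: the dash stored as the single byte 0x97.
sample = b"fix you \x97 you're still in the army"
converted = cp1252_to_utf8(sample)

# The one-byte dash becomes the three-byte UTF-8 em dash.
print(converted)
```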
|
#3 |
Member
Posts: 16
Karma: 10
Join Date: Jan 2009
Device: iPod Touch
|
OK, I did that, and the hyphens are back, but it also added a new character.
Before every one of those hyphens I'm getting a capital A with a caret (circumflex) over it. It looks like this: "When you recoverA- and there is no "if"; you wouldn't be there if they didn't know they could fix youA- you're still in the army". NOTE that the example above does not show the little caret accent over the capital A, since I can't type it in this text box. Thanks for the help...
|
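
One way a circumflexed capital A can end up in front of an em dash is sketched below in Python. This is an assumption about the mechanism, not a confirmed diagnosis of what calibre did here: if code point 151 is written out as UTF-8 (two bytes, C2 97) and those bytes are then read as cp1252, the first byte shows up as the accented A and the second as the dash.

```python
# Assumption: the dash survived as the control character U+0097
# (code point 151) rather than as a real em dash.
stray = "\u0097"

# Encoded as UTF-8, that code point becomes two bytes.
utf8_bytes = stray.encode("utf-8")

# Read back as windows-1252: 0xC2 is A-circumflex (U+00C2) and
# 0x97 is the em dash (U+2014).
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes)
print([hex(ord(ch)) for ch in misread])
```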
#4 |
creator of calibre
Posts: 45,156
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Will be fixed in the 0.6 release
|
#5 |
Member
Posts: 16
Karma: 10
Join Date: Jan 2009
Device: iPod Touch
|
You're the man!
|