Old 12-13-2012, 10:56 AM   #1
Steubie
Junior Member
Posts: 6
Karma: 10
Join Date: Mar 2012
Device: Nook
accents and entities in an epub

Currently I am finishing an index and ebook of a scholarly work -- seven languages, 467 footnotes, etc. The only remaining item prior to customer acceptance is to get the accented characters in French and German showing correctly.

My standard opening for XML files is:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

With this standard opening, all of the &Eacute;, &ocirc;, and similar entities show up literally in the text.

I have tried following up the DOCTYPE lines above with
<!ENTITY HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin 1 for XHTML//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">

FlightCrew and EPubCheck are very unhappy with this.

Any suggestions?
Old 12-13-2012, 12:54 PM   #2
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
If your file is actually encoded in UTF-8 (and not merely declared as such), it should not be a problem. Are you saying that the HTML entities remain in the rendered text, or only in the code? The latter is not an issue; the former is quite peculiar.
Also don't touch the DOCTYPE.
Old 12-13-2012, 02:36 PM   #3
Steubie
The file is entirely printable ASCII characters 0x20 through 0x7e along with 0x0a. This lets me create and manipulate data in a word processor.
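
A minimal Python sketch of the check described above, assuming the file contents are available as bytes. The function name `is_plain_ascii_text` is made up for illustration. It also shows why an ASCII-only file is already valid UTF-8, so the encoding declaration itself is not the problem here.

```python
def is_plain_ascii_text(data: bytes) -> bool:
    """Return True if every byte is printable ASCII (0x20-0x7e) or a line feed (0x0a)."""
    return all(0x20 <= b <= 0x7E or b == 0x0A for b in data)


sample = b"Cl&eacute;ment de Rome: &Eacute;p&icirc;tre aux Corinthiens\n"

# Entities keep the file inside the ASCII range.
print(is_plain_ascii_text(sample))

# ASCII is a subset of UTF-8, so both decodings give the same text.
print(sample.decode("utf-8") == sample.decode("ascii"))

# A raw accented letter encoded as UTF-8 falls outside the range.
print(is_plain_ascii_text("Clément".encode("utf-8")))
```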

The rendered text looks like this example: "Hippolyte Hemmer, Cl&eacute;ment de Rome: &Eacute;p&icirc;tre aux Corinthiens..." etc.
Old 12-13-2012, 03:21 PM   #4
DiapDealer
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Check to be certain those entities aren't being xml escaped. If you entered (pasted, typed, whatever...) all the data from your Word Processor document into a WYSIWYG editor such as Sigil's Book View, that's likely to happen. Entities need to be pasted/typed into Code View (speaking strictly about Sigil here)... because they're, well... code. Otherwise &Eacute; becomes &amp;Eacute;. Just like <p> becomes &lt;p&gt;. What, if anything, are you using to build/create the ePub from your word processor document?
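
The double-escaping problem can be sketched in Python. This is only an illustration: the source of the build tool is not shown in this thread, so both function names below are hypothetical. The first function escapes every ampersand, which is the failure mode described above. The second leaves ampersands alone when they already start a named or numeric reference.

```python
import html
import re

# An ampersand that does NOT already begin a named or numeric reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")


def escape_everything(text: str) -> str:
    """Naive escaping: every '&' becomes '&amp;', which breaks existing entities."""
    return html.escape(text, quote=False)


def escape_bare_ampersands(text: str) -> str:
    """Escape only ampersands that are not already part of a reference."""
    return BARE_AMP.sub("&amp;", text)


source = "Cl&eacute;ment & &Eacute;p&icirc;tre"
print(escape_everything(source))       # entities get double-escaped
print(escape_bare_ampersands(source))  # entities survive, the lone '&' is escaped
```

Note that `escape_bare_ampersands` deliberately does nothing about `<` and `>`; whether those need escaping depends on whether the input is text or markup.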

Last edited by DiapDealer; 12-13-2012 at 06:01 PM.
Old 12-14-2012, 06:37 AM   #5
Steubie
I posted a response yesterday, but do not see it here.

DiapDealer -- You were correct. My MakeEpub program was changing ampersands to %amp;. I made changes in the source code. The ebook now shows accents correctly. Many thanks.
Old 12-14-2012, 06:38 AM   #6
Steubie
Correction to typo: Make that &amp;
Old 12-14-2012, 07:13 AM   #7
DiapDealer
All's well that ends well.
Old 12-19-2012, 04:36 PM   #8
dgatwood
Curmudgeon
Posts: 629
Karma: 1623086
Join Date: Jan 2012
Device: iPad, iPhone, Nook Simple Touch
Warning: You should *not* use HTML entities like &eacute;. EPUB is based on XHTML, not HTML, and XHTML does not define any entities other than &amp;, &lt;, &gt;, &apos;, and &quot; (that is, &, <, >, ', and ", respectively).

That means that other HTML entities are not technically legal in an EPUB file, and a reader would be within its rights to barf if it encounters them. You should always replace those entities with numeric character references, e.g. & #233; (decimal) or & #xe9; (hexadecimal) instead of &eacute;. (Omit the space after the & in both cases; I can't type them that way because this forum keeps translating them into é.)

I originally tried to provide an incomplete list of some common substitutions in the form of Perl regular expressions, but the forum ate those, too. Here's the same list as text.


prime -> #8242
Prime -> #8243
ldquo -> #8220
rdquo -> #8221
lsquo -> #8216
rsquo -> #8217
mdash -> #8212 (suggest following this with character #8203, a zero-width space, as a wrap hint)
ndash -> #8211 (again, suggest adding a zero-width space afterwards)
copy -> #169
trade -> #8482
deg -> #176
aacute -> #225
eacute -> #233
oacute -> #243
ntilde -> #241
iuml -> #239
ecirc -> #234
nbsp -> #160


For a full list, see http://www.fileformat.info/format/w3c/htmlentity.htm.
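
The substitutions listed above could be automated along these lines. This is a sketch in Python rather than the Perl regular expressions mentioned earlier, and it assumes the standard library's `html.entities.name2codepoint` table (the HTML 4.01 entity names) covers the names in use. The function name `named_to_numeric` is hypothetical. The five entities predefined by XML are left untouched, as are names the table does not know.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself; these stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text: str) -> str:
    """Replace HTML named entities with decimal character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: leave unchanged
        return f"&#{name2codepoint[name]};"

    return NAMED_ENTITY.sub(replace, text)


print(named_to_numeric("Cl&eacute;ment &amp; &Eacute;p&icirc;tre &mdash; fin"))
```

The zero-width space after dashes suggested above is not added by this sketch; that is a typographic choice better made separately.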

Last edited by dgatwood; 12-19-2012 at 04:55 PM.
Old 12-20-2012, 12:56 PM   #9
Jellby
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by dgatwood
Warning: You should *not* use HTML entities like &eacute;. EPUB is based on XHTML, not HTML, and XHTML does not define any entities other than &amp;, &lt;, &gt;, &apos;, and &quot; (that is, &, <, >, ', and ", respectively).
Are you sure? I think that applies to XML, but XHTML adds some things on top of XML (or enforces XML syntax on HTML), among them, I believe, the definition of a good deal of entities.

See also http://en.wikipedia.org/wiki/List_of...cters_in_XHTML
Old 12-20-2012, 02:47 PM   #10
mrmikel
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
At w3.org, there is this list of entities:
http://www.w3.org/2000/07/8378/xhtml...s/entities.xml
Old 12-20-2012, 03:43 PM   #11
DiapDealer
It all depends on the parser and the XHTML DTD. I've never run into an ePub parser that couldn't handle them (assuming proper declarations), but I suppose it's possible. Perhaps someone is confusing XHTML 1.1 and ePub2 with XHTML5 and ePub3? Named entities are no longer technically valid in that situation.
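
To illustrate the parser dependence, here is a small Python sketch using the standard library's `xml.etree.ElementTree`, which parses the fragments below without any DTD. It is only a stand-in for how a DTD-less XML parser behaves and says nothing about any particular e-reader. The helper name `parses` is made up for this example.

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if the fragment is accepted by a DTD-less XML parse."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


print(parses("<p>Cl&eacute;ment</p>"))  # named entity with no DTD declaring it
print(parses("<p>Cl&#233;ment</p>"))    # decimal character reference
print(parses("<p>Cl&#xe9;ment</p>"))    # hexadecimal character reference
```

With no DTD in play, the named entity is reported as undefined while both numeric forms parse, which is why numeric references are the safe choice when the declarations cannot be relied on.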

Last edited by DiapDealer; 12-20-2012 at 04:01 PM.
Old 12-20-2012, 09:36 PM   #12
dgatwood
Quote:
Originally Posted by Jellby
Are you sure? I think that applies to XML, but XHTML adds some things on top of XML (or enforces XML syntax on HTML), among them, I believe, the definition of a good deal of entities.

See also http://en.wikipedia.org/wiki/List_of...cters_in_XHTML
Apparently I misremembered. Never mind.
Old 04-15-2013, 09:31 PM   #13
SusanM
Bemused by possibilities
Posts: 58
Karma: 480244
Join Date: Jul 2012
Device: iPad3, Kobo
Kobo requests that you use decimal entities and not character entities. I assume that it would be the same for other retailers.

List of entities
http://www.derby.co.nz/web-development/entities.html
Old 04-16-2013, 12:32 AM   #14
davidfor
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by SusanM
Kobo requests that you use decimal entities and not character entities. I assume that it would be the same for other retailers.
Is that for books to be converted to their kepub format?
Old 04-16-2013, 01:19 AM   #15
Toxaris
I haven't seen that request from Kobo, and it sounds a bit silly. It is much easier to type (and remember...) the named HTML entities than their numeric equivalents.