#1
Evangelist
Posts: 412
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650, kobo Glo HD liseuses
Named entities or not?
I expect this is an elementary noob question, but I've seen different opinions voiced on this:
When preparing HTML text prior to creating an epub, should I use named entities (e.g. &ldquo; or &aelig;), or am I OK to use the 'real' symbols? My first reaction was that I should use the named entities, but then I saw a suggestion that, provided everything is UTF-8, all would be well. My context is that I'm slowly learning how to convert .txt format text into .epub files for reading on my 505, and I'm working towards a consistent editing process. At the moment I'm not expecting to produce any other format. Thanks in advance.

#2
hopeless n00b
Posts: 5,110
Karma: 19597086
Join Date: Jan 2009
Location: in the middle of nowhere
Device: PW4, PW3, Libra H2O, iPad 10.5, iPad 11, iPad 12.9
I use HTML source files so I prefer UTF-8 encoding. Not sure how you'll handle character set recognition if working from plain text, though.
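On the character-set question for plain text, one low-tech approach is to try strict UTF-8 first and fall back to a legacy encoding only when that fails. The following is a minimal sketch, not a general detector: the Windows-1252 fallback is an assumption about what old .txt files tend to use, and `decode_text` is a hypothetical helper name.

```python
from typing import Tuple


def decode_text(raw: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Decode raw bytes and return (text, encoding_used).

    Strict UTF-8 is tried first: legacy 8-bit text that contains accented
    letters or curly quotes almost never happens to be valid UTF-8.
    The cp1252 fallback is an assumption; choose what fits your sources.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # errors="replace" avoids a crash on the few bytes cp1252 leaves undefined.
        return raw.decode(fallback, errors="replace"), fallback


# The same sentence stored in two different encodings.
utf8_bytes = "caf\u00e9 \u201cquoted\u201d".encode("utf-8")
legacy_bytes = "caf\u00e9 \u201cquoted\u201d".encode("cp1252")

text_a, enc_a = decode_text(utf8_bytes)
text_b, enc_b = decode_text(legacy_bytes)
```

Once the text is a proper Unicode string, it can be written back out as UTF-8 regardless of what the source file used.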

#3
frumious Bandersnatch
Posts: 7,542
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I use a mixture. I encode the files in UTF-8, but I still use named entities for characters I cannot easily input with the keyboard or which may be difficult to distinguish in my preferred editor and font. I type "á", "æ", "â", "ñ", etc. directly, but write "&lsquo;", "&mdash;", "&nbsp;", etc. as entities.
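A mixed approach like this can also be applied mechanically to text that already contains the real characters. The sketch below uses Python's standard `html.entities` table; the set of "hard to see" characters and the `encode_selected` name are illustrative assumptions, not anyone's actual workflow from this thread.

```python
from html.entities import codepoint2name

# Characters that are hard to type or to tell apart in an editor.
# This particular selection is an assumption made for the example:
# curly quotes, en and em dashes, and the non-breaking space.
HARD_TO_SEE = "\u2018\u2019\u201c\u201d\u2013\u2014\u00a0"


def encode_selected(text: str, targets: str = HARD_TO_SEE) -> str:
    """Replace only the characters in `targets` with named entities."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch)) if ch in targets else None
        out.append(f"&{name};" if name else ch)
    return "".join(out)


# Accented letters stay as typed; quotes, dash and nbsp become entities.
sample = "\u2018Ma\u00f1ana\u2019\u00a0\u2014 \u00e6on"
encoded = encode_selected(sample)
```

The output is still meant to be saved as UTF-8, since the accented letters are left as literal characters.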

#4
sleepless reader
Posts: 4,763
Karma: 615547
Join Date: Jan 2008
Location: Germany, near Stuttgart
Device: Sony PRS-505, PB 360° & 302, nook wi-fi, Kindle 3
I prefer using named entities even with UTF-8. Why? Because for some characters, such as "<", ">" and "&", named (or numeric) entities are necessary anyway, and I prefer using one method for all entities.
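The three markup characters are the one case where escaping is not optional, whatever is decided for the rest. A minimal sketch of that step for a .txt to HTML conversion, using `html.escape` from the Python standard library (`to_paragraph` is a hypothetical helper name):

```python
from html import escape


def to_paragraph(line: str) -> str:
    """Wrap one line of plain text in <p>, escaping only &, < and >."""
    # quote=False leaves " and ' alone, which is fine for element content.
    return f"<p>{escape(line, quote=False)}</p>"


markup_chars = to_paragraph("Tom & Jerry: 1 < 2 > 0")
# Non-ASCII characters pass through untouched; only markup is escaped.
curly_quotes = to_paragraph("\u201cHi\u201d")
```

Anything beyond these three characters is a matter of preference as long as the file is declared and saved as UTF-8.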

#5
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
I use named entities for all special characters (&mdash;, &ldquo;, etc., but also &agrave;, &oelig;, &eacute;, etc.); it's a habit from coding for the web. I have noticed that when creating an XHTML file in Dreamweaver, if you type a special character in the "design" box, it is automatically encoded as an entity in the code. However, if you are working with the epub DTD, most special characters are not automatically encoded, which may mean that encoding is not required with that doctype. But since I am wary of (bad) surprises, I think it's safer to use the named entities even with UTF-8 encoding. Your question makes me realise I have not verified what the epub standard specifically says about this; interesting question.

#6
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
Technically, it is best to use numeric entities (&#1234;) because they are the most compatible. You may not realize it, but named entities need to be defined in the document's DTD, which prevents you from using the same representation in XHTML and, say, plain XML. In practice, though, I still use named entities, mainly because they are readable in plain text: if I see &ldquo; I know what it represents, unlike &#8220;.
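The DTD point is easy to see with a plain XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which does not load the XHTML DTD, so it stands in for "plain XML" here; how a given e-reader behaves is a separate question that this does not answer.

```python
import xml.etree.ElementTree as ET

numeric = "<p>&#8220;Hello&#8221;</p>"
named = "<p>&ldquo;Hello&rdquo;</p>"

# Numeric character references are part of XML itself, so this parses.
numeric_text = ET.fromstring(numeric).text

# With no DTD declaring &ldquo;, a plain XML parser rejects the named form.
try:
    ET.fromstring(named)
    named_ok = True
except ET.ParseError:
    named_ok = False
```

Only `&lt;`, `&gt;`, `&amp;`, `&quot;` and `&apos;` are predefined in XML; every other name has to come from a DTD.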

#7
frumious Bandersnatch
Posts: 7,542
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I've said this before, but I also use a mix of named and numbered entities for quotes and apostrophes. I use &rsquo; for a curly right single quote, and &#8217; for a curly apostrophe. They are exactly the same character, but it's nice to have them different in the source files if I want to search and replace or something.
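A short sketch of why this works: both references decode to the same character, yet they stay distinguishable in the source. The sample sentence is made up for illustration.

```python
from html import unescape

# Closing quote written as &rsquo;, apostrophe written as &#8217;.
source = "&lsquo;Don&#8217;t go,&rsquo; she said."

# Both references decode to U+2019, so a reader shows identical glyphs.
same_char = unescape("&rsquo;") == unescape("&#8217;") == "\u2019"

# In the source they remain distinct, so each can be counted on its own.
closing_quotes = source.count("&rsquo;")
apostrophes = source.count("&#8217;")

# Example: switch apostrophes to straight ones without touching closing quotes.
straightened = source.replace("&#8217;", "'")
```

The distinction only survives as long as the source is kept in entity form; once a tool decodes the entities, the two uses become indistinguishable again.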

#8
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?

#9
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Quote:
Code:
<q> </q>
Sadly, this doesn't work for me in ADE/505.

#10
frumious Bandersnatch
Posts: 7,542
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Another problem: how does it work with multi-paragraph quotes? I believe the usual English practice is to add an opening quote character before each new paragraph, while in Spanish it's the closing quote. And what about verses or letters inside a quoted text?
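For the hard-coded alternative, the English convention described here is simple to apply at conversion time. This is only a sketch under that assumption: `quote_paragraphs` is a hypothetical helper, and it does not handle the Spanish variant, nested quotes, or verse.

```python
def quote_paragraphs(paragraphs, open_q="&ldquo;", close_q="&rdquo;"):
    """Hard-code quotes English-style.

    Every paragraph starts with an opening quote; only the last one
    gets a closing quote.
    """
    last = len(paragraphs) - 1
    return [
        f"<p>{open_q}{text}{close_q if i == last else ''}</p>"
        for i, text in enumerate(paragraphs)
    ]


speech = ["First paragraph.", "Second paragraph.", "Last paragraph."]
html_paragraphs = quote_paragraphs(speech)
```

The result is deliberately unbalanced (three opening quotes, one closing), which is exactly what automatic quote pairing has trouble with.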

#11
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O

#12
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Thanks Jellby, although I still don't see the logic behind the omission of that specific CSS property from the ePub spec. I would not mind treating that as an exception and reverting to hard-coding quotes into the text. Those situations are rare, right?

#13
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O

#14
frumious Bandersnatch
Posts: 7,542
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I tried a solution with custom named entities, but it doesn't seem to work as I expected.

#15
Evangelist
Posts: 412
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650, kobo Glo HD liseuses
Thanks for all the responses - I *will* stick with named entities (and continue to specify UTF-8).
Neat trick re. the separation of curly right quote and apostrophe - I hadn't thought of that, so thanks again.
Maybe it's just the sort of books I read, but many of the books I've been playing with present the dreaded multi-paragraph unmatched-quote problem (Buchan, Kipling and Maupassant, to name some recent examples). Sadly, the solution of making that wretched character 'A' a non-person doesn't seem to be optimal.
Nested quotes: I've often encountered the solution of keeping the outer quotes as proper double quotes and using single quotes for the inner section. So far I haven't encountered a triple-decker quote sandwich. (OK, I know, vast swathes of triply-nested quotes now sweeping in from the west...)