Old 07-20-2009, 02:17 AM   #1
alecE
Evangelist
Posts: 412
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650, kobo Glo HD liseuses
Named entities or not?

I expect this is an elementary noob question, but I've seen different opinions voiced on this:

When preparing HTML text prior to creating an epub, should I use named entities (e.g. &ldquo; or &aelig;), or am I OK to use the 'real' symbols? My first reaction was that I should use the named entities, then I saw a suggestion that, provided everything was UTF-8, all would be well.

My context is that I'm slowly learning how to convert .txt format text into .epub files for reading on my 505 and I'm working towards a consistent editing process. At the moment I'm not expecting to produce any other format.

Thanks in advance
Old 07-20-2009, 02:46 AM   #2
ilovejedd
hopeless n00b
Posts: 5,126
Karma: 19597086
Join Date: Jan 2009
Location: in the middle of nowhere
Device: PW4, PW3, Libra H2O, iPad 10.5, iPad 11, iPad 12.9
I use HTML source files so I prefer UTF-8 encoding. Not sure how you'll handle character set recognition if working from plain text, though.
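On the character set recognition point, a minimal Python sketch of one possible approach: try UTF-8 first, then fall back to Windows-1252, which is a common encoding for older plain-text files. The function name `read_text_guessing_encoding` and the order of fallbacks are assumptions for illustration, not a reliable detector.

```python
def read_text_guessing_encoding(raw: bytes) -> str:
    """Decode bytes by trying a short, assumed list of likely encodings."""
    # utf-8-sig also strips a leading byte order mark if one is present.
    for encoding in ("utf-8-sig", "cp1252"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never fails.
    return raw.decode("latin-1")


if __name__ == "__main__":
    # Bytes 0x93 and 0x94 are curly double quotes in Windows-1252.
    print(read_text_guessing_encoding(b"\x93hello\x94"))
```

Once decoded, the text can be written back out as UTF-8 so that every later step works from a single known encoding.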
Old 07-20-2009, 06:19 AM   #3
Jellby
frumious Bandersnatch
Posts: 7,570
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I use a mixture. I encode the files in UTF-8, but I still use named entities for characters I cannot easily input with the keyboard or which may be difficult to distinguish with my preferred editor and font. I type "á", "æ", "â", "ñ", etc. directly, but write entities for "‘" (&lsquo;), "—" (&mdash;), " "...
Old 07-20-2009, 07:07 AM   #4
netseeker
sleepless reader
Posts: 4,763
Karma: 615547
Join Date: Jan 2008
Location: Germany, near Stuttgart
Device: Sony PRS-505, PB 360° & 302, nook wi-fi, Kindle 3
I prefer using named entities even with UTF-8. Why? Because for some characters like "<", ">" and "&", named (or numeric) entities are necessary anyway, and I prefer using one method for all entities.
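As an illustration of the characters that need escaping regardless of encoding, here is a small Python sketch using the standard library's `xml.sax.saxutils.escape`. The wrapper name `escape_text` and the sample string are made up for the example; accented letters and curly quotes pass through untouched, which is fine as long as the file is saved as UTF-8.

```python
from xml.sax.saxutils import escape


def escape_text(text: str) -> str:
    """Escape the markup-significant characters &, < and > in body text."""
    # escape() handles "&" first, so the ampersands it introduces
    # for &lt; and &gt; are not escaped a second time.
    return escape(text)


if __name__ == "__main__":
    print(escape_text("Fish & chips cost < 5 pounds, caf\u00e9 prices are > that"))
```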
Old 07-20-2009, 07:27 AM   #5
zelda_pinwheel
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
i use named entities for all special characters (&mdash;, &ldquo; etc., but also &agrave;, &oelig;, &eacute; etc.); it's a habit from coding for the web. i have noticed that when creating an xhtml file in dreamweaver, if you write the special character in the "design" box, it will automatically be encoded with the entity in the code; however, if you are working with the epub dtd, most special characters will not be automatically encoded, which may mean that it's not specified as necessary with that doctype. but since i am wary of (bad) surprises, i think it's safer to use the named entities even with the utf-8 encoding. however, your question makes me realise i have not verified what the epub standard specifically says about this; interesting question.
Old 07-20-2009, 07:49 AM   #6
pepak
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
Technically, it is best to use numeric entities (&#1234;) because they are the most compatible (you may not realize it, but named entities need to be defined in the document's DTD, which prevents you from using the same representation in XHTML and, say, plain XML). But in reality I still use named entities, mainly because they are readable in plain text: if I see &ldquo; I know what it represents, unlike &#8220;.
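For anyone who wants the readability of named entities while editing and the portability of numeric references in the finished file, a conversion pass is one option. The following Python sketch uses the standard `html.entities.name2codepoint` table (HTML 4 names); the function name `named_to_numeric` is hypothetical, and the five entities predefined by XML itself are deliberately left alone.

```python
import re
from html.entities import name2codepoint

# Predefined by XML itself, so these are valid without any DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def named_to_numeric(markup: str) -> str:
    """Rewrite known named entities as decimal numeric references."""

    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unrecognised names exactly as found.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#{};".format(name2codepoint[name])

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)


if __name__ == "__main__":
    print(named_to_numeric("&ldquo;Caf&eacute;&rdquo; &amp; bar &mdash; open"))
```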
Old 07-20-2009, 08:01 AM   #7
Jellby
frumious Bandersnatch
Posts: 7,570
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I've said that before, but I also use a mix of named and numbered entities for quotes and apostrophes. I use &rsquo; for a curly right single quote, and &#8217; for a curly apostrophe. They are exactly the same character, but it's nice to have them different in the source files if I want to search and replace or something.
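A short Python sketch of why this works: both spellings decode to the same character (U+2019), yet they stay distinguishable as strings in the source. The sample sentence and the helper name `straighten_apostrophes` are invented for the example.

```python
import html

SOURCE = "He said &lsquo;don&#8217;t&rsquo; twice."


def straighten_apostrophes(markup: str) -> str:
    """Replace only the numeric form, leaving closing quotes untouched."""
    return markup.replace("&#8217;", "'")


if __name__ == "__main__":
    # Both forms render as the same character.
    print(html.unescape("&rsquo;") == html.unescape("&#8217;"))
    # But a search can still target apostrophes alone.
    print(straighten_apostrophes(SOURCE))
```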
Old 07-20-2009, 08:03 AM   #8
zelda_pinwheel
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
Quote:
Originally Posted by Jellby View Post
I've said that before, but I also use a mix of named and numbered entities for quotes and apostrophes. I use &rsquo; for a curly right single quote, and &#8217; for a curly apostrophe. They are exactly the same character, but it's nice to have them different in the source files if I want to search and replace or something.
ah, that is a clever trick...
Old 07-20-2009, 09:39 AM   #9
Ankh
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Quote:
Originally Posted by Jellby View Post
I've said that before, but I also use a mix of named and numbered entities for quotes and apostrophes. I use &rsquo; for a curly right single quote, and &#8217; for a curly apostrophe. They are exactly the same character, but it's nice to have them different in the source files if I want to search and replace or something.
The best possible thing would be to use xhtml quote tags:
Code:
<q> </q>
, then define the quote characters in css, the way it is intended to be. Such a solution works for nested quotes, too.

Sadly, this doesn't work for me in ADE/505.
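One possible workaround, sketched in Python under stated assumptions: keep the <q> tags in the working source and flatten them into literal curly quotes before building the epub. The function name `flatten_q` and the quote pairs (double outside, single inside) are assumptions, and the sketch only recognises bare <q> and </q> tags without attributes.

```python
import re

# Assumed convention: double quotes at the outer level, single quotes inside.
QUOTE_PAIRS = [("\u201c", "\u201d"), ("\u2018", "\u2019")]


def flatten_q(markup: str) -> str:
    """Replace <q>...</q> with literal quote characters chosen by nesting depth."""
    depth = 0
    pieces = []
    # The capturing group makes re.split keep the tags as separate tokens.
    for token in re.split(r"(</?q>)", markup):
        if token == "<q>":
            pieces.append(QUOTE_PAIRS[depth % len(QUOTE_PAIRS)][0])
            depth += 1
        elif token == "</q>":
            depth -= 1
            pieces.append(QUOTE_PAIRS[depth % len(QUOTE_PAIRS)][1])
        else:
            pieces.append(token)
    return "".join(pieces)


if __name__ == "__main__":
    print(flatten_q("<p><q>He said <q>no</q> again</q></p>"))
```

Alternating between the two pairs by depth covers nested quotes; it does not handle quotes that are opened but never closed.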
Old 07-20-2009, 10:21 AM   #10
Jellby
frumious Bandersnatch
Posts: 7,570
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by Ankh View Post
The best possible thing would be to use xhtml quote tags:
Code:
<q> </q>
, then define the quote characters in css, the way it is intended to be. Such a solution works for nested quotes, too.
That would be using the "quotes" property, right? Unfortunately, it appears "quotes" is not supported in the current ePUB spec.

Another problem: How does it work with multi-paragraph quotes? I believe the usual English practice is to add an opening quote character before each new paragraph, while in Spanish it's the closing quote. And verses or letters inside a quoted text?
Old 07-20-2009, 10:22 AM   #11
pepak
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
Quote:
Originally Posted by Ankh View Post
The best possible thing would be to use xhtml quote tags:
Code:
<q> </q>
, then define the quote characters in css, the way it is intended to be. Such a solution works for nested quotes, too.
It wouldn't work for non-paired quotes, though, which are quite common in many American books:
Quote:
Originally Posted by example
"Some long paragraph spoken by A.
"A still talks.
"Even more talking from A.
"Finally A concludes his lengthy statement."
Old 07-20-2009, 11:29 AM   #12
Ankh
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Quote:
Originally Posted by Jellby View Post
That would be using the "quotes" property, right? Unfortunately, it appears "quotes" is not supported in the current ePUB spec.
Darn! RTFM, Ankh, RTFM.

Thanks Jellby, although I still don't see the logic behind the omission of that specific CSS property from the ePub spec.

Quote:
Originally Posted by Jellby View Post
Another problem: How does it work with multi-paragraph quotes? I believe the usual English practice is to add an opening quote character before each new paragraph, while in Spanish it's the closing quote. And verses or letters inside a quoted text?
I would not mind treating that as an exception and reverting to hard-coding quotes into the text. Those situations are rare, right?
Old 07-20-2009, 11:43 AM   #13
pepak
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
Quote:
Originally Posted by Ankh View Post
I would not mind treating that as an exception and reverting to hard-coding quotes into the text. Those situations are rare, right?
Quite common with graphomaniac authors.
Old 07-20-2009, 12:09 PM   #14
Jellby
frumious Bandersnatch
Posts: 7,570
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by Ankh View Post
I would not mind treating that as an exception and reverting to hard-coding quotes into the text. Those situations are rare, right?
Not that rare. It may not occur in every chapter, but it tends to happen at least a few times in every one of the books I've made.

I tried a solution with custom named entities, but it doesn't seem to work as I expected.
Old 07-20-2009, 04:41 PM   #15
alecE
Evangelist
Posts: 412
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650, kobo Glo HD liseuses
Thanks for all the responses - I *will* stick with named entities (and continue to specify utf8).
Neat trick re. the separation of curly right quote & apostrophe - hadn't thought of that so thanks again.
Maybe it's just the sort of books I read, but many of the books I've been playing with present the dreaded multi-paragraph un-matched quote problem (Buchan, Kipling, Maupassant just to name some recent examples). Sadly the solution of making that wretched character 'A' a non-person doesn't seem to be optimal.
Nested quotes - I've often encountered the solution of maintaining the outer quotes as proper double quotes, and then using single quotes for the inner section. So far I haven't encountered a triple-decker quote sandwich. (OK, I know, vast swathes of triply-nested quotes now sweeping in from the west...)
MobileRead.com is a privately owned, operated and funded community.