07-13-2013, 09:27 AM | #1 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
Sigil doesn't like named entities
I am revising a book published a few years ago, when I still trusted Word's export to html. When I open it in Sigil, I get an error at the first named entity, which was an &ndash;, and when I fixed that, it choked at the very next one, with many more waiting down the line.
Since Sigil has never before complained about my use of named entities (I prefer them to numerical entities because I can read them as I go), I'm guessing that this is because Microsoft has its own peculiar way of rendering html entities, which show on my text editor as ^R, ^V, and so forth. Is there an easy fix for this, such as by telling Sigil there are two entity modes in use? Mixing Word export and named entities has never caused me a problem in uploading html to Amazon or Barnes & Noble for conversion. Indeed, the very html file I have opened is already on sale as a Kindle and an epub book. |
07-13-2013, 10:46 AM | #2 |
Well trained by Cats
Posts: 29,790
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I use them all the time in Sigil
(recently used &bull; &deg; 10 &deg; &bull; as a chapter # decoration) Something else is wrong. Codepage? |
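One possible reading of the codepage hint, offered as a guess rather than a diagnosis: an editor that shows ^R and ^V may be displaying the raw Windows-1252 bytes 0x92 and 0x96, which that codepage uses for the right single quote and the en dash. A minimal Python sketch, assuming the file really is Windows-1252 (the sample bytes are invented for illustration):

```python
# Hypothetical sample: bytes as a Word export might store them in Windows-1252.
raw = b"It\x92s 1914\x961918"

# Decoding with the matching codepage recovers the intended characters.
text = raw.decode("cp1252")

# Re-encode as UTF-8, the usual encoding for EPUB content files.
utf8_bytes = text.encode("utf-8")

print(text)
```

If the file turns out to use a different encoding, the same two steps apply with that codec name in place of `cp1252`.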
07-13-2013, 11:42 AM | #3 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
Perhaps Sigil wants to see two codepages, one for Word-generated html, the other for named entities?
I ran the Word doc through word2cleanhtml.com online, and now all is fine, with the minor cavil that all the entities (Word's and mine) have become numerical entities. In any event, Sigil has no problem opening the book. |
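For anyone who would rather do that conversion locally, here is a rough Python sketch of the same idea: swap named entities for numeric ones. It is not what word2cleanhtml.com actually runs (that is unknown here); the function name `named_to_numeric` is made up, and the lookup table is Python's standard `html.entities.name2codepoint`.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML are safe everywhere, so leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def named_to_numeric(markup: str) -> str:
    """Replace HTML named entities such as &ndash; with numeric ones (&#8211;)."""
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: keep as written
        return f"&#{name2codepoint[name]};"
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, markup)

print(named_to_numeric("1914&ndash;1918 &amp; after &bogus;"))
```

Unknown names are left untouched on purpose, so a typo in an entity stays visible instead of being silently dropped.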
07-13-2013, 11:48 AM | #4 |
Well trained by Cats
Posts: 29,790
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
NJ, did you think to validate one of the failing files before feeding it to Sigil?
That might give a clue. |
07-13-2013, 12:50 PM | #5 |
Grand Sorcerer
Posts: 27,547
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Gotta be the declaration.
Named entities work fine, for the most part, but not unless the doctype and all the namespace declarations allow for them. |
07-13-2013, 03:41 PM | #6 | |
Bookmaker & Cat Slave
Posts: 11,461
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Hitch |
|
07-13-2013, 03:47 PM | #7 | |
Well trained by Cats
Posts: 29,790
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Which makes me wonder: WHY? EPUB3 supposedly does not allow named entities. Try and remember a frickin' number! They call them (named entities) mnemonics for a reason. |
|
07-13-2013, 05:08 PM | #8 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
I wish I had a mnemonic so I would remember where that word came from!
Yes, "declaration" must be the term I was looking for. I just use a generic <!DOCTYPE HTML>. Odd that Microsoft's ^R would pass muster, and the named entity not, when both are in the same file. I'd be less surprised if it were the other way around. Speaking of declarations, does MathML work in any e-book format? |
07-13-2013, 05:45 PM | #9 | |
Bookmaker & Cat Slave
Posts: 11,461
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
With regard to your second: even the big fancy iBooks volume on Algebra uses all images of the equations and formulae. Maybe ePUB3 does, but I admit: I knoweth not if it does. Hitch |
|
07-14-2013, 03:08 AM | #10 |
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
If I remember correctly, the DOCTYPE is not needed in ePub. But if it's present it must be correct (which implies you cannot have SVG code, for instance). It may be that Sigil wants it, but according to the spec, files without a DOCTYPE should be XHTML anyway (which allows normal named entities).
The above is valid only for HTML-like files, of course. You cannot use named entities in XML files, such as the OPF or NCX; you must use numerical entities or the actual characters there. This means that you'd better not use named entities in titles, I guess (maybe Sigil could automatically convert entities to characters when creating the TOC, if it doesn't already). |
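The point about plain XML files can be checked with any non-validating XML parser. The sketch below uses Python's `xml.etree.ElementTree` as a stand-in (an assumption: reading systems and Sigil use their own parsers, which may differ in detail), and `parses_as_xml` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML for this parser."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# A named entity with no DTD to define it is rejected.
print(parses_as_xml("<title>1914&ndash;1918</title>"))
# A numeric entity needs no declaration.
print(parses_as_xml("<title>1914&#8211;1918</title>"))
# The actual character is fine as well.
print(parses_as_xml("<title>1914\u20131918</title>"))
```

Only the first fragment fails here, which matches the advice to use numeric entities or real characters in the OPF and NCX.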
07-14-2013, 10:35 AM | #11 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2011
Device: none
|
Quote:
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
Replace :
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
After that Sigil doesn't complain any more! |
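For a book with many files, the same prolog could be applied by script. This is only a sketch under the assumption that each file starts with an optional XML declaration and/or DOCTYPE; the helper name `set_xhtml_prolog` is made up, and whether Sigil then accepts the file is this poster's report, not something the code checks.

```python
import re

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="no" ?>'
XHTML11_DOCTYPE = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"\n'
                   '  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">')

def set_xhtml_prolog(markup: str) -> str:
    """Drop a leading XML declaration and DOCTYPE, then prepend the pair above."""
    body = re.sub(r"^\s*<\?xml[^>]*\?>", "", markup, count=1)
    body = re.sub(r"^\s*<!DOCTYPE[^>]*>", "", body, count=1, flags=re.IGNORECASE)
    return f"{XML_DECL}\n{XHTML11_DOCTYPE}\n{body.lstrip()}"

sample = "<!DOCTYPE HTML>\n<html><body><p>1914&ndash;1918</p></body></html>"
print(set_xhtml_prolog(sample))
```

The rest of the markup is passed through unchanged, so the file still has to be well-formed XHTML on its own.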
|
07-14-2013, 10:36 AM | #12 | |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
Quote:
I can handle some Cyrillic because I once noticed that RESTAURANT was spelled PECTOPAH. |
|
07-14-2013, 10:47 AM | #13 | |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
Quote:
|
|
07-14-2013, 11:12 PM | #14 | |
Guru
Posts: 696
Karma: 150000
Join Date: Feb 2010
Device: none
|
Quote:
No flames please; this is just MHO. ::grin, duck, run:: Albert |
|
|