|  07-13-2013, 09:27 AM | #1 | 
| mostly an observer            Posts: 1,519 Karma: 996810 Join Date: Dec 2012 Device: Kindle | 
				
				Sigil doesn't like named entities
			 
			
			I am revising a book published a few years ago when I still trusted Word's export to html, but when I open it in Sigil, I get an error at the first named entity, which was an &ndash;, and when I fixed that, it choked at the very next one, with many more waiting down the line. Since Sigil has never before complained about my use of named entities (I prefer them to numerical entities because I can read them as I go), I'm guessing that this is because Microsoft has its own peculiar way of rendering html entities, which show in my text editor as ^R, ^V, and so forth. Is there an easy fix for this, such as by telling Sigil there are two entity modes in use? Mixing Word export and named entities has never caused me a problem in uploading html to Amazon or Barnes & Noble for conversion. Indeed, the very html file I have opened is already on sale as a Kindle and an epub book. | 
|   |   | 
|  07-13-2013, 10:46 AM | #2 | 
| Well trained by Cats            Posts: 31,238 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | 
			
			I use them all the time in Sigil (recently used &bul; ° 10 ° &bul; as a chapter # decoration). Something else is wrong. Codepage? | 
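On the codepage question, one possible reading (a guess from the ^R and ^V notation, not something confirmed in this thread) is that the editor is showing raw Windows-1252 bytes such as 0x92 and 0x96 with the high bit stripped. A minimal Python sketch of what those bytes decode to; `describe_cp1252` and `WORD_BYTES` are made-up names for illustration:

```python
# Hypothetical helper: shows what a few Windows-1252 "smart punctuation"
# bytes decode to, and the numeric character reference for each.
# Assumption: the ^R / ^V in the editor are bytes 0x92 / 0x96.
WORD_BYTES = bytes([0x91, 0x92, 0x93, 0x94, 0x96, 0x97])


def describe_cp1252(raw: bytes) -> list[tuple[str, str, str]]:
    """Return (hex byte, decoded character, numeric entity) for each byte."""
    rows = []
    for b in raw:
        ch = bytes([b]).decode("cp1252")
        rows.append((f"0x{b:02X}", ch, f"&#{ord(ch)};"))
    return rows


if __name__ == "__main__":
    for row in describe_cp1252(WORD_BYTES):
        print(*row)
```

If that guess is right, the file mixes raw Windows-1252 bytes with named entities, and the bytes are only valid when the file is actually read as Windows-1252 rather than UTF-8.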
|   |   | 
|  07-13-2013, 11:42 AM | #3 | 
| mostly an observer            Posts: 1,519 Karma: 996810 Join Date: Dec 2012 Device: Kindle | 
			
			Perhaps Sigil wants to see two codepages, one for Word-generated html, the other for named entities? I ran the Word doc through word2cleanhtml.com online, and now all is fine, with the minor cavil that all the entities (Word's and mine) have become numerical entities. In any event, Sigil has no problem opening the book. | 
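For anyone who wants the same named-to-numerical conversion without an online service, here is a rough sketch using Python's standard `html.entities` table. It is not what word2cleanhtml.com does internally (that is unknown here); `named_to_numeric` is a hypothetical helper, and it deliberately leaves the five XML-predefined entities and any unrecognised names untouched:

```python
import re
from html.entities import name2codepoint

# Entities that every XML parser understands; leave these alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches references such as &ndash; or &frac12; (name, then semicolon).
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text: str) -> str:
    """Rewrite known HTML named entities as decimal character references."""

    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep as-is rather than guess
        return f"&#{name2codepoint[name]};"

    return ENTITY_RE.sub(repl, text)


if __name__ == "__main__":
    print(named_to_numeric("Tom &amp; Jerry &ndash; a study"))
```

Unknown names are passed through unchanged so that a typo in an entity name stays visible instead of being silently dropped.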
|   |   | 
|  07-13-2013, 11:48 AM | #4 | 
| Well trained by Cats            Posts: 31,238 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | 
			
			NJ did you think to Validate one of the failing files before feeding it to Sigil?  That might give a clue. | 
|   |   | 
|  07-13-2013, 12:50 PM | #5 | 
| Grand Sorcerer            Posts: 28,844 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | 
			
			Gotta be the declaration. Named entities work fine, for the most part, but not unless the doctype and all the namespace declarations allow for them. | 
|   |   | 
|  07-13-2013, 03:41 PM | #6 | |
| Bookmaker & Cat Slave            Posts: 11,503 Karma: 158448243 Join Date: Apr 2010 Location: Phoenix, AZ Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2 | Quote: 
 Hitch | |
|   |   | 
|  07-13-2013, 03:47 PM | #7 | |
| Well trained by Cats            Posts: 31,238 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | Quote: 
  Which makes me wonder: why? EPUB3 supposedly does not allow named entities. Try and remember a frickn' number! They call them (named entities) mnemonics for a reason. | |
|   |   | 
|  07-13-2013, 05:08 PM | #8 | 
| mostly an observer            Posts: 1,519 Karma: 996810 Join Date: Dec 2012 Device: Kindle | 
			
			I wish I had a mnemonic so I would remember where that word came from! Yes, "declaration" must be the term I was looking for. I just use a generic <!DOCTYPE HTML>. Odd that Microsoft's ^R would pass muster, and the named entity not, when both are in the same file. I'd be less surprised if it were the other way around. Speaking of declarations, does MathML work in any e-book format? | 
|   |   | 
|  07-13-2013, 05:45 PM | #9 | |
| Bookmaker & Cat Slave            Posts: 11,503 Karma: 158448243 Join Date: Apr 2010 Location: Phoenix, AZ Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2 | Quote: 
 With regard to your second question: even the big fancy iBooks volume on Algebra uses all images of the equations and formulae. Maybe ePUB3 does, but I admit: I knoweth not if it does. Hitch | |
|   |   | 
|  07-14-2013, 03:08 AM | #10 | 
| frumious Bandersnatch            Posts: 7,570 Karma: 20150435 Join Date: Jan 2008 Location: Spaniard in Sweden Device: Cybook Orizon, Kobo Aura | 
			
			If I remember correctly, the DOCTYPE is not needed in ePub. But if it's present it must be correct (which implies you cannot have SVG code, for instance). It may be that Sigil wants it, but according to the spec, files without a DOCTYPE should be XHTML anyway (which allows normal named entities). The above is valid only for HTML-like files, of course; you cannot use named entities in XML files, such as the OPF or NCX. You must use numerical entities or the actual characters there. This means that you'd better not use named entities in titles, I guess (maybe Sigil could automatically convert entities to characters when creating the TOC, if it doesn't already). | 
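The point about plain XML files is easy to check with any XML parser. A small sketch with Python's standard `xml.etree.ElementTree`, which loads no DTD and so knows only the five predefined entities; `parses_as_xml` is a made-up helper name, and the `<title>` fragments are invented examples rather than real OPF or NCX content:

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML for a DTD-less parser."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    samples = [
        "<title>War &ndash; Peace</title>",   # named entity, no DTD
        "<title>War &#8211; Peace</title>",   # numerical entity
        "<title>War \u2013 Peace</title>",    # the actual character
    ]
    for sample in samples:
        print(parses_as_xml(sample), sample)
```

Only the named-entity version is rejected, which matches the advice to use numerical entities or the actual characters in the OPF and NCX.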
|   |   | 
|  07-14-2013, 10:35 AM | #11 | |
| Junior Member  Posts: 3 Karma: 10 Join Date: Nov 2011 Device: none | Quote: 
 Code:
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
 Replace:
 <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
 After that Sigil doesn't complain any more! | |
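If there are many files to fix, the swap can be scripted. A minimal sketch, assuming each file starts with the generic `<!DOCTYPE html>` mentioned earlier in the thread; `swap_doctype` and `XHTML11_PROLOG` are hypothetical names, and this only changes the prolog, so the rest of the markup still has to be well-formed XHTML:

```python
import re

# The XML declaration and XHTML 1.1 DOCTYPE quoted in the post above.
XHTML11_PROLOG = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no" ?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
    '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">\n'
)

# Matches only a bare <!DOCTYPE html> (any letter case) at the very start.
GENERIC_DOCTYPE = re.compile(r"\A\s*<!DOCTYPE\s+html\s*>\s*", re.IGNORECASE)


def swap_doctype(source: str) -> str:
    """Replace a leading generic DOCTYPE with the XHTML 1.1 prolog."""
    return GENERIC_DOCTYPE.sub(lambda m: XHTML11_PROLOG, source, count=1)


if __name__ == "__main__":
    print(swap_doctype("<!DOCTYPE HTML>\n<html><body/></html>"))
```

Files that already carry a full `PUBLIC` DOCTYPE do not match the pattern and are returned unchanged.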
|   |   | 
|  07-14-2013, 10:36 AM | #12 | |
| mostly an observer            Posts: 1,519 Karma: 996810 Join Date: Dec 2012 Device: Kindle | Quote: 
 I can handle some Cyrillic because I once noticed that RESTAURANT was spelled PECTOPAH. | |
|   |   | 
|  07-14-2013, 10:47 AM | #13 | |
| Grand Sorcerer            Posts: 11,470 Karma: 13095790 Join Date: Aug 2007 Location: Grass Valley, CA Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7 | Quote: 
 | |
|   |   | 
|  07-14-2013, 11:12 PM | #14 | |
| Guru            Posts: 698 Karma: 150000 Join Date: Feb 2010 Device: none | Quote: 
 No flames please; this is just MHO. ::grin, duck, run::  Albert | |
|   |   | 