Old 11-18-2019, 01:36 AM   #1
odamizu
just an egg
Posts: 1,587
Karma: 4300000
Join Date: Mar 2015
Device: Kindle, iOS
Entities oddities / 0.9.991 bug?

I may have found another bug in 0.9.991, but I'm struggling with how to describe it.

I have Preferences set to Mend on Open and Preserve Entities only for #160.

I have one epub (so far) in which, when it is loaded into 0.9.991, all the character entities (quotes, apostrophes, etc.) persist, even though Prefs are set to preserve only #160. Running "Mend All HTML Files" fixes this: all the character entities are properly converted, and all is well.

I have another epub where the quotes and apostrophes are converted, but the non-breaking spaces show up as #x00A0. Again, running Mend fixes this: #x00A0 gets converted to #160 and all is well.
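
For reference, #x00A0 and #160 are the hexadecimal and decimal spellings of the same character reference (U+00A0, no-break space), so the two forms differ only in how they are written in the source. A quick check, using Python's standard html module rather than anything from Sigil itself:

```python
import html

hex_form = "&#x00A0;"
dec_form = "&#160;"

# Both numeric character references decode to U+00A0 (no-break space).
decoded_hex = html.unescape(hex_form)
decoded_dec = html.unescape(dec_form)

same_character = decoded_hex == decoded_dec == "\u00a0"
print(same_character)
```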

Now, when I load the exact same epubs into 0.9.18, all the character entities (except #160) are automatically and properly converted without my having to do anything extra.

So why is 0.9.991 struggling with these character entities, requiring me to run Mend manually, when 0.9.18 is handling it all seamlessly and automatically right off the bat?

Note1: the first epub where none of the character entities converted had <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> which was corrected on Mend along with the character entities.

Note2: Both of these epubs originated as AZW3, brought into Sigil via the KindleImport plugin. At first I thought it was a KindleImport plugin problem, but when I saved the epubs and then re-opened them, the character entities continued to persist. Manually running Mend fixed things. So it seems like Mend wasn't being run on Open, despite the Preference settings?

I will play with this more tomorrow to see if I can find more clues, but I wanted to throw this out there. Also, if anyone has suggestions for what I should look for, let me know.
Old 11-18-2019, 08:22 AM   #2
KevinH
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
Sigil 0.9.991 is not struggling with character entities. Remember the goal of Sigil 0.9.991 is to load any epub "as-is", in other words make no changes. What you are seeing is how that epub currently handles its entities.

So as long as the xhtml files are well-formed (or you do not have Mend on Open set), Sigil will not touch the files. The Preserve Entities setting is only used by Mend, because the gumbo parser that Sigil uses to mend removes all entities and converts them to their character equivalents. After mending, your Preserve Entities settings are used to determine which ones should be converted back to entities.
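
The two-step behaviour described above (decode every entity, then re-encode only the preserved ones) can be sketched roughly as follows. This is an illustration in Python, not Sigil's actual code, which is C++ built around gumbo. The names PRESERVE, decode_entities, restore_preserved and mend_entities are hypothetical, and keeping the markup-significant characters escaped is an assumption made so the output stays well-formed.

```python
import html
import re

# Hypothetical preserve list: code points to keep as numeric entities.
PRESERVE = {160}

# Matches decimal, hexadecimal and named character references.
_ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")
_MARKUP_CHARS = set("&<>\"'")


def decode_entities(text):
    """Step 1: replace entities with their literal characters."""
    def repl(match):
        decoded = html.unescape(match.group(0))
        # Leave markup-significant characters escaped so the XHTML stays well-formed.
        return match.group(0) if decoded in _MARKUP_CHARS else decoded
    return _ENTITY.sub(repl, text)


def restore_preserved(text, preserve=PRESERVE):
    """Step 2: turn characters on the preserve list back into decimal entities."""
    return "".join(f"&#{ord(ch)};" if ord(ch) in preserve else ch for ch in text)


def mend_entities(text, preserve=PRESERVE):
    return restore_preserved(decode_entities(text), preserve)


sample = "<p>&#8220;Hi&#8221;&#x00A0;&amp;&nbsp;bye</p>"
result = mend_entities(sample)
print(result)
```

With only 160 on the preserve list, the curly quotes come out as literal characters while both spellings of the no-break space end up as &#160;, which matches what was reported after running Mend manually.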

So all of this is expected/desired behaviour ... in other words we do not want Sigil to touch or alter valid html source code. So if you want to only use your entities set, just run Mend as you discovered.

Mend is also run to update xhtml links, so any rename or move will effectively do the same thing, as will standardizing to Sigil norm.

Hope this helps,

Kevin
Old 11-18-2019, 01:31 PM   #3
odamizu
I am confused now. I have Preferences set to Mend on Open.

But you're saying there is a difference between Preferences > Mend on Open and Tools > Reformat HTML > Mend All HTML Files?

Having Preferences set to Mend on Open will not convert entities? The two Mend commands are different?

Thank you
Old 11-18-2019, 01:56 PM   #4
KevinH
No.
We test each file on import, and only if it is not well-formed do we run Mend on it (and then if and only if you allow that in Preferences).
If it is valid xhtml we do not touch it on import.

Again, the point here is not to mess up a dev's existing code, unless really necessary.
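
The import decision described above can be sketched as below. This is a rough Python illustration under stated assumptions, not Sigil's code: is_well_formed and should_mend_on_open are hypothetical names, and Python's ElementTree parser is only a stand-in for whatever check Sigil performs (for instance, ElementTree rejects named entities such as &nbsp; unless a DTD defines them, so the samples use numeric references only).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xhtml):
    """Return True if the text parses as XML (a stand-in for Sigil's own check)."""
    try:
        ET.fromstring(xhtml)
        return True
    except ET.ParseError:
        return False


def should_mend_on_open(xhtml, mend_on_open_pref):
    """Mend only when the file is broken AND the preference allows it."""
    return mend_on_open_pref and not is_well_formed(xhtml)


valid = "<html><body><p>caf&#233;&#160;ok</p></body></html>"
broken = "<html><body><p>unclosed</body></html>"

decisions = [
    should_mend_on_open(valid, True),    # well-formed: left untouched
    should_mend_on_open(broken, True),   # broken and allowed: mended
    should_mend_on_open(broken, False),  # broken but not allowed: left as-is
]
print(decisions)
```

Under this reading, a well-formed file that still contains unwanted entities is never passed to Mend on open, which is why its entities persist until Mend is run manually.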

Last edited by KevinH; 11-18-2019 at 02:13 PM.
Old 11-18-2019, 03:49 PM   #5
odamizu
Ah. Okay. So the Mend function is the same, whether it's through Preferences > Mend on Open or Tools > Reformat HTML > Mend All HTML Files.

And checking "Mend on Open" in Preferences only causes Mend to run on Open if the file is not well-formed. If the file is well-formed, then Mend doesn't happen on Open, even if Preferences > Mend on Open is checked.

Am I understanding correctly now?

Thank you
Old 11-18-2019, 04:44 PM   #6
KevinH
Yes, in other words we do not mend what isn't broken anymore unless you invoke Mend yourself.
Old 11-18-2019, 05:17 PM   #7
odamizu
Thank you for taking the time to explain. I think I get it now
Old 11-18-2019, 06:42 PM   #8
DiapDealer
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Perhaps we should alter the wording slightly to clarify what's happening when opening/saving or when manually executing Mend? I can see how it might be a bit confusing to some. Maybe "Attempt to Fix XHTML Errors on Open" or something?

Or... since it seems that many of Sigil's other main features (Save [if Mend on Save is checked], Rename, Split, Merge, Restructure, etc...) will trigger the entity substitution/preservation anyway, can we just make Mend on Open behave like Mend on Save does regarding entity preservation?

I have no idea what that would entail--and I'm certainly not trying to cause extra work--but there would be a certain symmetry/logic to Mend on Open/Save following the same roadmap RE entities when checked/unchecked, no?

In case that's not clear, what I'm suggesting is:

1) Mend on Save/Open - handle entity preservation/substitution based on whether the option is checked or not in Preferences (that's how Mend on Save behaves now, for what it's worth)
2) The manual Reformat HTML->Mend would continue to behave as it does.

That's all "if possible" of course.

Last edited by DiapDealer; 11-18-2019 at 06:46 PM.
Old 11-18-2019, 07:06 PM   #9
DiapDealer
Also--and this may be my fault--0.9.991 is converting all unicode no-break-space characters to &#160; upon opening if the Preserve Entities list is completely empty (EPUB2).

EDIT: maybe not my fault. I was thinking about my suggestion to do this. I think.
https://github.com/Sigil-Ebook/Sigil...339e73d8e9d0ef
Old 11-18-2019, 07:33 PM   #10
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
Yes, we could change that option description to Mend XHTML Errors on Open.

We could also just run Preserve Entities by itself, without involving Mend, on every file imported, but that seems to go against the as-is adjustment.

As for Mend on Save, we could modify it not to pass everything to mend and instead parse for errors, and only mend the files with errors just like we do on importing an epub.

I will look into that tomorrow.
Old 11-18-2019, 08:30 PM   #11
DiapDealer
I'm OK with either, actually. Consistency is what I'm thinking of mostly. So if modifying Mend on Save to skip entity substitution (like Mend on Open does now) is feasible, that would serve just as well, I think.
Old 11-18-2019, 11:45 PM   #12
odamizu
Quote:
Originally Posted by KevinH View Post
Yes, we could change that option description to Mend XHTML Errors on Open.
I like the idea of Mend XHTML Errors on Open / Mend XHTML Errors on Close.

That is clear, consistent, and also alerts users like me that this function has changed since 0.9.18.

As long as the manual Reformat HTML > Mend continues to run preserve entities, I am happy and can adjust

Thank you!
Old 11-19-2019, 07:24 AM   #13
DiapDealer
Quote:
Originally Posted by DiapDealer View Post
Also: 0.9.991 is converting all unicode no-break-space characters to &#160; upon opening if the Preserve Entities list is completely empty (EPUB2).
Has anyone else been able to duplicate this issue? Is the no-break-space being special-cased here?
Old 11-19-2019, 08:52 AM   #14
KevinH
I will look into it. As I remember, the default Preserve Entities list in Settings has only the 160 in it. I did not think it was ever special-cased after the code to visually show the spaces went in.

I will check.
Old 11-19-2019, 10:11 AM   #15
KevinH
I have now pushed the following to master ...

- rewording of Prefs to make it clear only broken xhtml files will be mended on open and save if selected

- changed Save to only run mend on broken xhtml files (if selected in prefs) to match how we handle it on open.