12-14-2017, 09:11 PM | #1 |
Gregg Bell
Posts: 2,255
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
What does saving a "poorly formed" epub do?
I was working on an epub, making a lot of changes, running the 'mend and prettify' option, and saving often. Everything was going along nicely. Then, on one save, I got a warning window saying the epub was not properly formatted (or something like that, maybe 'properly formed'). It said something about fixing it and that I might lose some data. I Xed out of the window, thinking that would give me time to see what the problem was, but when I did, the bottom of the screen said something like 'epub saved but you may have lost a small amount of data.'
Questions:
1) Why did it save when I just Xed out of the window?
2) What should I have chosen to avoid saving if I get that window again?
3) If I get that window again and get out of it without saving, will the Tools>'Well-formed check epub' tool tell me what's wrong?
4) (And I realize this is unanswerable in a concrete way.) What are the chances I lost a lot of data?
5) What is the typical data loss in that situation?
Thanks. |
12-14-2017, 10:04 PM | #2 |
Sigil Developer
Posts: 7,521
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Are you sure there was any data loss at all? That warning is really a leftover from when Tidy was used. Are you sure any save was made at all? Have you checked the modification date/time on that file to be sure? Do you have Mend on Save set in your preferences?
Wasn't there a Cancel button on that warning dialog? If you use Preview, it will show any well-formedness errors as you go. Yes, you can use the well-formed check to find errors. As for typical loss, I have seen nothing at all lost, and at worst a short piece of HTML code gets XML-escaped and treated as text. Typically this is very easy to see. |
12-14-2017, 10:23 PM | #3 |
Sigil Developer
Posts: 7,521
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Is this the message you saw:
Code:
if (ss.cleanOn() & CLEANON_SAVE) {
    if (not_well_formed) {
        QApplication::restoreOverrideCursor();
        bool auto_fix = QMessageBox::Yes == QMessageBox::warning(this,
                        tr("Sigil"),
                        tr("This EPUB has HTML files that are not well formed and "
                           "your current Clean Source preferences are set to automatically mend on Save. "
                           "Saving a file that is not well formed will cause it to be automatically "
                           "fixed, which very rarely may result in data loss.\n\n"
                           "Do you want to automatically mend the files before saving?"),
                        QMessageBox::Yes|QMessageBox::No);
        QApplication::setOverrideCursor(Qt::WaitCursor);
        if (auto_fix) {
            CleanSource::ReformatAll(resources, CleanSource::Mend);
            not_well_formed = false;
        }
    } else {
        CleanSource::ReformatAll(resources, CleanSource::Mend);
    }
}

Auto cleaning on save is equivalent to running Mend (with no prettify) on every xhtml file. If you typically use that anyway, I would simply look at the very last xhtml you edited without running mend to make sure it looks okay after reopening the epub.
Last edited by KevinH; 12-14-2017 at 10:29 PM. |
12-14-2017, 11:11 PM | #4 |
Well trained by Cats
Posts: 29,703
Karma: 54369092
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I see this a lot. I think there is a Calibre plugin (Polish or Embed) that uses a header that Sigil does not like.
Code:
<!--?xml version="1.0" encoding="utf-8"?--><html xmlns:epub="http://www.idpf.org/2007/ops" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
I also use Modifyepub (kiwidude's version) and never used to see a complaint. |
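For anyone who wants to scan files for the header shown above, here is a minimal sketch in plain C++. It only checks whether a file's first non-whitespace content is an XML declaration wrapped in a comment. The function name `has_commented_xml_declaration` is made up for illustration and is not part of Sigil or calibre, and the check says nothing about which tool produced the comment.

```cpp
#include <string>

// Hypothetical helper (not a Sigil or calibre function).
// Returns true when the first non-whitespace content of the file is an
// XML declaration that has been turned into a comment, e.g.
//   <!--?xml version="1.0" encoding="utf-8"?-->
bool has_commented_xml_declaration(const std::string& xhtml) {
    const std::string marker = "<!--?xml";
    // Skip leading spaces, tabs and line breaks.
    const std::size_t start = xhtml.find_first_not_of(" \t\r\n");
    if (start == std::string::npos) {
        return false;  // empty or whitespace-only input
    }
    // Compare the marker against the text at the first real character.
    return xhtml.compare(start, marker.size(), marker) == 0;
}
```

A normal `<?xml version="1.0"?>` declaration returns false, so the function can be run over every xhtml file in an unzipped epub to list only the affected ones.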
12-15-2017, 01:51 AM | #5 | |
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
The actual chances are rather slim, but there is a theoretical possibility that you might lose some text. |
|
12-15-2017, 02:51 AM | #6 |
Unicycle Daredevil
Posts: 13,923
Karma: 185041098
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
|
A related question: It has happened to me once or twice that after a code clean-up there would be some paragraphs of text without any tags around them. Is there an easy, regex-y way of spotting them? In my cases they were always at chapter starts, so weren't hard to find, but there might be other situations.
|
12-15-2017, 08:37 AM | #7 |
Sigil Developer
Posts: 7,521
Karma: 5433388
Join Date: Nov 2009
Device: many
|
As for data loss using Gumbo, I have never actually seen it lose anything I have typed. What I have seen is some text losing its parent tag (because the tag was never properly closed), or a short snippet where the offending tag gets XHTML-escaped and becomes text itself. Other fixes include adding a missing doctype, or converting an XML declaration to a comment because the file was missing a doctype.
So at worst, I had to fix up the broken tag or add a tag around text. That said, I have not tried all combinations of poorly formed XHTML, and some strange combination of errors might make Gumbo freak out. But Gumbo is much, much more forgiving than Tidy ever was. For me Preview is the way to go: in XHTML mode it detects the first parsing error as you type and tells you immediately that there is a problem, so you can fix it right away and avoid weird combinations of mistakes piling up in the same file later. |
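On the earlier question about a regex-y way to spot text that has lost its parent tag, here is a rough sketch using `std::regex`. It flags text that starts directly after `<body ...>` or directly after a closing block-level tag. The function name `find_untagged_text` and the list of block tags are assumptions made for illustration, not anything Sigil uses, and a regex is only a heuristic here: text that sits legitimately inside an outer `<div>` after a nested closing tag is flagged too.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns runs of text that begin right after <body ...>
// or right after a closing block-level tag, i.e. text that most likely has no
// <p>, <div> or heading of its own. Heuristic only; the tag list is assumed.
std::vector<std::string> find_untagged_text(const std::string& xhtml) {
    static const std::regex stray(
        R"((?:<body[^>]*>|</(?:p|div|h[1-6]|blockquote|ul|ol|table)>)\s*([^<\s][^<]*))",
        std::regex::icase);

    std::vector<std::string> hits;
    for (std::sregex_iterator it(xhtml.begin(), xhtml.end(), stray), end;
         it != end; ++it) {
        std::string text = (*it)[1].str();
        // Drop trailing whitespace and line breaks from the captured run.
        text.erase(text.find_last_not_of(" \t\r\n") + 1);
        hits.push_back(text);
    }
    return hits;
}
```

The same pattern should also work in a regex search inside an editor, though the supported syntax depends on the regex engine in use.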
12-15-2017, 09:38 AM | #8 | |
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Code:
ERROR(RSC-005): Error while parsing file: text not allowed here; expected the element end-tag or element "address", "blockquote", "del", "div", "dl", "h1", "h2", "h3", "h4", "h5", "h6", "hr", "ins", "noscript", "ns:svg", "ol", "p", "pre", "script", "table" or "ul" (with xmlns:ns="http://www.w3.org/2000/svg")
Code:
Error schema not satisfied : no character data is allowed by content model near column 1 |
|
12-15-2017, 01:59 PM | #9 | |
Gregg Bell
Posts: 2,255
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
Quote:
Yes, I'm sure the save was made because at the bottom of the screen it said something like: 'EPUB saved but possibly with loss of data.' (I tried to re-create the incident on another file but was unable to.) I saved a bunch after that incident so checking the date/modification time won't help. I'm attaching my preferences and confess I don't know what the most beneficial settings would be. As I recall there was a Cancel button on the window. I guess I should've chosen that, but I thought Xing out of the window was the safest option and would essentially have the same effect as hitting the Cancel button. And I was using Preview and saw no warnings there. Good to know I can use the Well-formed Epub check to check for errors. And good to know the data loss was minimal. And as I said, that's what the Save message said as well. |
|
12-15-2017, 02:12 PM | #10 | |
Gregg Bell
Posts: 2,255
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
Quote:
No, I just saw a window with the buttons and the ability to X out of it. There were no extensive details like you've shown. As I recall the window gave the option to "fix" the poorly formed stuff but I avoided that. And there were probably three buttons. Maybe "Yes," "No" and "Cancel."
I'd be curious to know what you think of my preference settings (I'll attach them again) and if there's anything I can do to make them work better for me. (I am a novelist, and besides the cover and some images of other covers at the back of the book, the epubs I make are exclusively text.) Honestly, this is getting away from me a bit with regard to what my optimal preference settings should be and whether I should be using the
Code:
Tools>Reformat HTML>Mend and Prettify all HTML files
I never use the
Code:
Tools>Reformat HTML>Mend all HTML files
Thanks. |
|
12-15-2017, 02:14 PM | #11 | |
Gregg Bell
Posts: 2,255
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
Quote:
|
|
12-15-2017, 02:17 PM | #12 | |
Gregg Bell
Posts: 2,255
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
Quote:
And good to know I probably didn't lose a lot of data. |
|
12-15-2017, 02:25 PM | #13 | |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
Quote:
And my workflow is always an initial epub-to-epub convert in calibre before opening for the first time in Sigil. That is to have calibre apply some basic CSS filters and generate the original epub fallback copy in case I then mess up an edit. So I agree with the hypothesis that calibre output is doing something that Sigil input does not like. I just let Sigil fix what it wants to fix, keep calm and carry on. Curious to know what it might be though. If there is anything worth saving next time it happens to assist with post mortem. ... |
|
12-15-2017, 05:37 PM | #14 |
Gregg Bell
Posts: 2,255
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
I found the window that came up. And how it plays out:
Yes: It mends it and saves it
No: It saves with errors
X out: It saves with errors |
12-15-2017, 05:50 PM | #15 |
Grand Sorcerer
Posts: 27,479
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Yes. The No is the default answer. You're not being offered a chance to abort the save--saving is a given. The only thing you can change is whether or not it Mends first. It's giving you a final chance to override your Save preference setting and not Mend the code first.
Last edited by DiapDealer; 12-15-2017 at 05:56 PM. |
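The behaviour described in this thread can be summed up in a small model. This is only a sketch in plain C++ without Qt. The names `DialogResult`, `SaveOutcome` and `resolve_save` are invented for illustration and do not exist in Sigil. It assumes, as reported above, that closing the window with X behaves like No.

```cpp
// Hypothetical model of the three ways to leave the warning dialog.
enum class DialogResult { Yes, No, Closed };

struct SaveOutcome {
    bool mended;  // was Mend run on the files before writing?
    bool saved;   // was the EPUB written to disk?
};

// Mirrors the flow of the snippet quoted earlier in the thread:
// - Mend on Save off: nothing is mended, the file is saved as-is.
// - Mend on Save on, files well formed: Mend runs silently, no dialog.
// - Mend on Save on, files not well formed: the dialog is shown and only an
//   explicit Yes mends. No and closing the window both save unmended.
// In every case the save itself still happens.
SaveOutcome resolve_save(bool mend_on_save, bool not_well_formed,
                         DialogResult answer) {
    if (!mend_on_save) {
        return {false, true};
    }
    if (!not_well_formed) {
        return {true, true};
    }
    return {answer == DialogResult::Yes, true};
}
```

For example, `resolve_save(true, true, DialogResult::Closed)` gives `mended == false` and `saved == true`, which matches the "X out: It saves with errors" observation.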