MobileRead Forums

MobileRead Forums (https://www.mobileread.com/forums/index.php)
-   Sigil (https://www.mobileread.com/forums/forumdisplay.php?f=203)
-   -   Sigil-0.9.0 Release (https://www.mobileread.com/forums/showthread.php?t=267232)

KevinH 11-11-2015 02:25 PM

Hi,

If there are parse errors in an xhtml file, then trying to parse the xhtml to change a link location can in fact mess it up even more. The only way to change the link location is by parsing the xhtml.

And adding a css link does not cause parse errors unless you add it improperly. So please post a test case where adding a css link causes previously good xhtml to become bad.

Please note that Sigil 0.9.0 no longer uses Tidy, so losing large pieces of text should no longer be an issue. The official xhtml parsing rules are applied via gumbo, and they autocorrect just like a web browser does.

KevinH

CalibUser 11-11-2015 02:44 PM

2 Attachment(s)
Hi DiapDealer and KevinH

I am attaching one of the test files that I use that generates the error and the css file.
The css file would not upload directly so I have put it in a zip file.
To reproduce the error in Sigil 0.9 using Python 3.4:

Uncheck "Use bundled Python" option in Sigil on the Preferences (Plugins) dialog.
Run the plugin.
When the main window for the plugin shows, select the css file by clicking on "Select files" and selecting the required css file (attached).
Uncheck all the boxes in the main window for the plugin except "Insert CSS file"
Select the xhtml file from the list box in the main window for the plugin.
Click "Process text"
After running the plugin, switching from the Code View window to the Book View window will show the error message.

Repeat the same procedure in Sigil 0.87 - the error does not appear.

Thanks.

CalibUser 11-11-2015 03:03 PM

PS I forgot to mention that if you run the plugin without ticking "Insert CSS file" then the error does not occur.

DiapDealer 11-11-2015 03:24 PM

Something's definitely botching up the self-closing tags, but I'm having a hard time following what all is going on in the plugin. I may need to make a test plugin that just adds a css file.

CalibUser 11-11-2015 04:14 PM

1 Attachment(s)
Quote:

Originally Posted by DiapDealer (Post 3204162)
I may need to make a test plugin that just adds a css file.

I am attaching a plugin that will do this - you will need to edit the filename & path in the plugin (this did not corrupt the ePub when it ran).

DiapDealer 11-11-2015 05:01 PM

It seems to me that a plugin that adds a file to an epub is fine. And a plugin that writes to a file in an epub is fine. But for whatever reason ... a plugin that tries to do both is exposing some sort of crack that broken xhtml leaks out of.

Thanks for the report.

KevinH 11-12-2015 09:32 AM

Hi DiapDealer,

What is the exact sequence of actions that causes the issue? Perhaps there is a bug in the launcher code here someplace?

Is it?
1) add a css file
2) write to an xhtml file the new css link

Or is it?
1) write to an existing css file
2) add an xhtml file that references it

Or some other plugin sequence?

KevinH

DiapDealer 11-12-2015 10:01 AM

As far as I can see, the file type isn't really relevant. Nor is the sequence. Just the simple act of adding a file to an epub AND writing to an existing xhtml file in the same plugin seems to be enough to make things go pear-shaped. Doing one or the other alone doesn't.

Your test3 plugin (the one you used to verify that the hunspell/gumbo stuff was working in plugins) causes the issue for me as well, at least on Windows, using either the bundled Python 3 or an external Python 3.

I'm pretty sure there is a launcher-code bug here somewhere, but I haven't been able to track down where.

EDIT: Whatever it ends up being, it seems that writing to an existing file with a plugin seems to bypass the Preserve Entities logic.

KevinH 11-12-2015 10:16 AM

Hi DiapDealer,

Wow, that is strange. I just ran my testme3 plugin which both adds a file and writes to a file and everything worked just fine with no warnings about anything (Mac OS X with Embedded Python Interpreter)

For the testme3 plugin, exactly what error do you get? Is it the file you wrote to or the file you added that gets messed up?

KevinH

DiapDealer 11-12-2015 10:27 AM

Quote:

Originally Posted by KevinH (Post 3204608)
Hi DiapDealer,

Wow, that is strange. I just ran my testme3 plugin which both adds a file and writes to a file and everything worked just fine with no warnings about anything (Mac OS X with Embedded Python Interpreter)

For the testme3 plugin, exactly what error do you get? Is it the file you wrote to or the file you added that gets messed up?

KevinH

I used Calibuser's attached Test3.epub for the test:

1) Launch Sigil
2) Open Test3.epub
3) Verify that you can switch from CV to BV (Section0001.xhtml) with no errors.
4) leave Section0001.xhtml open in Code View.
5) run the testme3 plugin.
6) the Section0001.xhtml file becomes corrupted. When switching to BV a "not Well formed" error is issued (at or above line 5: html, head ...)
7) Entities that were set to be preserved are lost.

KevinH 11-12-2015 10:33 AM

Hi DiapDealer,

Quote:

EDIT: Whatever it ends up being, it seems that writing to an existing file with a plugin seems to bypass the Preserve Entities logic.
Yes, I assumed it was the plugin's job not to mess up any existing entities. If you try to validate anything with named entities via gumbo, it may barf about the named entities, as html5 only allows numeric entities, but I don't think they should create a validation error.

I need to check that. But if you are seeing testme3 cause problems, then I doubt it is an entity issue, as it only reads a file and then writes it back unchanged.
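
If named entities did turn out to be a problem for a given parser, one way a plugin could sidestep the question is to rewrite them as numeric character references before writing the file back. The following is only a sketch under that assumption, using Python's standard html.entities table; the function name and the sample string are made up for illustration and are not part of Sigil or its plugin framework.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches references such as &nbsp; or &mdash; (name followed by a semicolon).
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace known named entities with decimal numeric references.

    Hypothetical helper: unknown names and the XML predefined
    entities are returned unchanged.
    """
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#{};".format(name2codepoint[name])

    return _ENTITY.sub(repl, text)


# Sample input is invented for illustration.
converted = named_to_numeric("a&nbsp;b &amp; c &mdash; d &bogus; e")
print(converted)
```

Whether this is desirable depends on the book: it trades away the Preserve Entities behaviour mentioned earlier in the thread in exchange for markup that any XML parser accepts without a DTD.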

KevinH

ps. Just saw your post above, I will give it a try.

KevinH 11-12-2015 10:48 AM

Hi DiapDealer,

Quote:

Originally Posted by DiapDealer (Post 3204613)
I used Calibuser's attached Test3.epub for the test:

1) Launch Sigil
2) Open Test3.epub
3) Verify that you can switch from CV to BV (Section0001.xhtml) with no errors.
4) leave Section0001.xhtml open in Code View.
5) run the testme3 plugin.
6) the Section0001.xhtml file becomes corrupted. When switching to BV a "not Well formed" error is issued (at or above line 5: html, head ...)
7) Entities that were set to be preserved are lost.

Yes, I see the error now. It appears that the link element, which is a void (self-closing) tag, has somehow been written as a pair, <link></link>, which makes BV barf (and correctly so).
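
To illustrate how a void tag can end up as a pair: a generic XML serializer treats both spellings of an empty element as equivalent, so it is free to emit either one. The sketch below uses Python's standard xml.etree.ElementTree purely as a stand-in. It is not Sigil's cleaning code, and the href value is invented for the example.

```python
import xml.etree.ElementTree as ET

# A stylesheet link with no children; the href is a made-up example path.
link = ET.Element(
    "link",
    {"href": "../Styles/style.css", "rel": "stylesheet", "type": "text/css"},
)

# Same element, serialized two ways. Both are well-formed XML.
self_closed = ET.tostring(link, encoding="unicode", short_empty_elements=True)
paired = ET.tostring(link, encoding="unicode", short_empty_elements=False)

print(self_closed)  # ends with "/>"
print(paired)       # ends with "></link>"
```

To an XML tool the two results are interchangeable, which is why running xhtml through an xml-oriented repair step can change the spelling of void tags without the tool considering it a change at all.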

I will try to see what is going on here. The file being modified is being changed when all the plugin does is read from it and write it back, so something is broken in the launcher/pluginrunner code that is improperly cleaning the file when it should not.

Kevin

KevinH 11-12-2015 11:04 AM

Hi DiapDealer,
I modified testme3 to print out exactly what it read in, then write it out, and then reread it and print it again.

Everything was completely valid and byte for byte identical.

So whatever is creating the problem is either in PluginRunner after completion of the plugin or somewhere else inside Sigil when those files are copied back to the real resource before being reloaded.
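
The read, write, reread check described above can be sketched outside of Sigil as follows. This is a minimal stand-alone version that works on a plain file path rather than through the plugin API, and the function name and sample markup are hypothetical.

```python
import os
import tempfile


def roundtrip_is_identical(path):
    """Read a file, write the same bytes back, reread, and compare."""
    with open(path, "rb") as f:
        before = f.read()
    with open(path, "wb") as f:
        f.write(before)
    with open(path, "rb") as f:
        after = f.read()
    return before == after


# Demo on a throwaway file; the markup is invented for illustration.
sample = (
    b'<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    b'<link href="../Styles/style.css" rel="stylesheet" type="text/css"/>'
    b"</head><body><p>Test&#160;text</p></body></html>"
)
fd, tmp_path = tempfile.mkstemp(suffix=".xhtml")
os.close(fd)
try:
    with open(tmp_path, "wb") as f:
        f.write(sample)
    identical = roundtrip_is_identical(tmp_path)
finally:
    os.remove(tmp_path)

print("byte-for-byte identical:", identical)
```

Working in binary mode keeps encodings and newline translation out of the comparison, so a positive result means the plugin side really left the bytes alone and any later change must come from whatever handles the file afterwards.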

Interesting bug ...

KevinH

KevinH 11-12-2015 11:19 AM

Hi DiapDealer,

The bug is in:

bool PluginRunner::checkIsWellFormed()

In this snippet ...

Code:

    if (!xmlFilesToCheck.isEmpty()) {
        foreach (QString href, xhtmlFilesToCheck) {
            // can't really validate without a full dtd so
            // auto repair any xml file changes to be safe
            QString filePath = m_outputDir + "/" + href;
            ui.statusLbl->setText("Status: checking " + href);
            QString data = Utility::ReadUnicodeTextFile(filePath);
            QString newdata = CleanSource::ProcessXML(data);
            Utility::WriteUnicodeTextFile(newdata, filePath);
        }
    }

It should have been only processing xml files, not xhtml files:

Code:

    if (!xmlFilesToCheck.isEmpty()) {
        foreach (QString href, xmlFilesToCheck) {
            // can't really validate without a full dtd so
            // auto repair any xml file changes to be safe
            QString filePath = m_outputDir + "/" + href;
            ui.statusLbl->setText("Status: checking " + href);
            QString data = Utility::ReadUnicodeTextFile(filePath);
            QString newdata = CleanSource::ProcessXML(data);
            Utility::WriteUnicodeTextFile(newdata, filePath);
        }
    }

Notice the change in the foreach. This was one of my own bug fixes, which was itself broken. I will fix this and push the fix to master asap.
We still might want to check out entity handling.
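
The shape of the mistake (guarding on one list but looping over another) can be modelled in a few lines. This is a simplified Python model, not the C++ code; the function names, the file names, and the idea that adding a file puts an xml file such as the OPF on the xml list are all assumptions made for illustration.

```python
def buggy_pass(xml_files, xhtml_files, repair):
    """Model of the broken snippet: guard on xml, loop over xhtml."""
    touched = []
    if xml_files:
        for href in xhtml_files:
            repair(href)
            touched.append(href)
    return touched


def fixed_pass(xml_files, xhtml_files, repair):
    """Model of the fix: guard and loop use the same xml list."""
    touched = []
    if xml_files:
        for href in xml_files:
            repair(href)
            touched.append(href)
    return touched


def noop(href):
    """Stand-in for the real repair step."""


# Hypothetical file names.
only_xhtml = buggy_pass([], ["Text/Section0001.xhtml"], noop)
both = buggy_pass(["content.opf"], ["Text/Section0001.xhtml"], noop)
after_fix = fixed_pass(["content.opf"], ["Text/Section0001.xhtml"], noop)

print(only_xhtml, both, after_fix)
```

In this model the buggy version leaves xhtml alone as long as the xml list is empty and rewrites it as soon as any xml file is queued. Under the stated assumption, that would be consistent with the earlier observation that the problem needed both an added file and a written xhtml file in the same plugin run.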

KevinH

ps: fix now pushed to master
pps: rebuilt, and I can confirm this fixes the issue on my test platform, Mac OS X

DiapDealer 11-12-2015 11:20 AM

Quote:

Originally Posted by KevinH (Post 3204639)
Interesting bug ...

If they're going to be present, they may as well be interesting. ;)


All times are GMT -4.