MobileRead Forums


Thasaidon 09-30-2020 01:30 AM

Help Please -Problem
 
I am on Windows 10 Enterprise and am using Calibre 5.0 and Sigil 1.10

I was tidying up an old ePub. I had just finished in Calibre Edit and the ePub did not show any errors in Epubcheck or the Check Book tool.

I then tried to load the ePub into Sigil and got the following message

Quote:

Traceback (most recent call last):
  File "C:/Prog1/Sigil/python3lib\opf_newparser.py", line 284, in get_package
    (ver, uid, attr) = self.package
TypeError: cannot unpack non-iterable NoneType object

and Sigil would not open.

I then tried with another ePub in the series and got the same message BUT this time after I cancelled the message Sigil DID open the ePub. It did however give the message that there were problems opening the book.

Calibre Edit opens the files without problems.

Any Ideas?

DiapDealer 09-30-2020 07:32 AM

There's been some refactoring done with opf_newparser.py (and changes to opf parsing in general) since Sigil 1.1.0. But it looks to me that Sigil can't quite determine whether it's an epub2 or epub3. Without a sample epub that exhibits the issue, that's about all I've got.

Thasaidon 09-30-2020 08:58 AM

Quote:

Originally Posted by DiapDealer (Post 4041286)
There's been some refactoring done with opf_newparser.py (and changes to opf parsing in general) since Sigil 1.1.0. But it looks to me that Sigil can't quite determine whether it's an epub2 or epub3. Without a sample epub that exhibits the issue, that's about all I've got.

Thanks for that DiapDealer.
I am in the Philippines and there are problems with the internet in my area, so I do not know if I can manage to send you a copy of one of the files that is causing this problem.

I thought it was probably the OPF and have been fiddling. I found that if I save the text files, stylesheet and cover and import them into a blank epub created in Sigil, I have no further problems. So at least I can carry on working.

I will keep a copy of one of the books for when they fix the internet here and send it to you by PM.

KevinH said there would be a new version of Sigil soon so I was waiting for that before I upgraded.

KevinH 09-30-2020 10:57 AM

No big changes in reading in from the opf ... but Sigil depends on knowing the opf version and the error does refer to the package tag being not-parseable. One quick thing that would help is to open it in Calibre and screenshot the opf in the Calibre Editor and post those screenshots here so that we can work on getting this fixed in time for the next release. Or even more simply, copy the bad epub and unzip that copy and manually extract the opf and post it here. That would probably be best.
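For anyone who would rather script the "unzip and extract the opf" step, here is a minimal Python sketch. It assumes the epub is an ordinary zip with a standard META-INF/container.xml and that the opf is UTF-8 encoded; the function name `read_opf` is made up for this example and is not part of Sigil or Calibre.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in an epub.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

def read_opf(epub_path):
    """Return (opf_path_inside_zip, opf_text) for the epub at epub_path."""
    with zipfile.ZipFile(epub_path) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        if rootfile is None:
            raise ValueError("no <rootfile> entry in META-INF/container.xml")
        opf_path = rootfile.get("full-path")
        # Assumption: the opf is UTF-8; adjust if the file declares otherwise.
        return opf_path, zf.read(opf_path).decode("utf-8")
```

Calling `read_opf("copy-of-book.epub")` on a copy of the book (the file name is only a placeholder) returns the opf's path inside the zip and its text, which can then be saved and posted.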

DiapDealer 09-30-2020 12:58 PM

Another thing that almost slipped my mind, with the Calibre version being brand new and all: please verify that you're not trying to open these problem epubs in Sigil using any of calibre's Open With functionality (inherent or plugin).

Or if you are, please try to open one of these epubs with calibre completely out of the equation.

Thasaidon 10-01-2020 01:39 AM

1 Attachment(s)
Quote:

Originally Posted by DiapDealer (Post 4041397)
Or if you are, please try to open one of these epubs with calibre completely out of the equation.

Checked and it makes no difference.

As requested I have exported an OPF from one of the books and attached it to the post. If I have done it right.:)

DNSB 10-01-2020 02:14 AM

From a quick look, the content.opf doesn't look quite right. Too many ns0:'s and other oddments that don't seem to belong there.

KevinH 10-01-2020 05:55 AM

Quote:

Originally Posted by Thasaidon (Post 4041658)
Checked and it makes no difference.

As requested I have exported an OPF from one of the books and attached it to the post. If I have done it right.:)

Yes, but I think it is still valid, so this may be a bug in Sigil's opf parser.

In that opf they define a new namespace prefix ns0 that they assign for the default opf namespace and then therefore have to use it everywhere. Certainly not the most concise representation and notation but not incorrect xml as far as I can see.

This confuses the internal Sigil opf parser so I will have to fix that.
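To illustrate how a prefixed package tag can end in the reported TypeError, here is a small Python sketch. It is not Sigil's actual parser: the class `TinyOpfParser` and its regex are hypothetical and only mimic a parser that recognises an unprefixed package tag and otherwise leaves its package field as None.

```python
import re

class TinyOpfParser:
    """Hypothetical stand-in for an opf parser; not Sigil's implementation."""

    # Assumption for illustration: only an unprefixed <package ...> tag matches.
    _PACKAGE_RE = re.compile(r'<package\b[^>]*\bversion="([^"]*)"[^>]*>')

    def __init__(self, opf_text):
        match = self._PACKAGE_RE.search(opf_text)
        # self.package stays None when no <package> tag was recognised.
        self.package = (match.group(1),) if match else None

    def get_version(self):
        (ver,) = self.package  # raises TypeError if self.package is None
        return ver
```

With a default-namespace tag such as `<package xmlns="http://www.idpf.org/2007/opf" version="2.0">` the version comes back normally, while `<ns0:package xmlns:ns0="http://www.idpf.org/2007/opf" version="2.0">` is never matched, so unpacking fails with "cannot unpack non-iterable NoneType object".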

You can work around this bug by doing the following to a copy of the opf file in any text editor. Ignore the single quotes; they are there just to delimit the strings.

1. Remove the unneeded prefix definition in the package tag by replacing

'xmlns:ns0="http://www.idpf.org/2007/opf"'

with:

'xmlns="http://www.idpf.org/2007/opf"'

2. Remove the no-longer-needed prefix from the tag names by replacing all:

'<ns0:'

with:

'<'

and, for the closing tags, all:

'</ns0:'

with:

'</'

3. Remove the no-longer-needed prefix from tag attributes by replacing all (notice the leading blank):

' ns0:'

with a single blank:

' '

Then save the changed opf and use it to replace the original one inside the epub zip.
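The same replacements can be scripted. The following is a minimal Python sketch, assuming the opf text has already been read into a string; the function name `strip_ns0_prefix` is made up for this example. It also strips the prefix from closing tags (`</ns0:`), which the result needs in order to stay well formed. Because these are plain text replacements, any body text that happens to contain " ns0:" would be changed too, so check the output before using it.

```python
OPF_NS = "http://www.idpf.org/2007/opf"

def strip_ns0_prefix(opf_text, prefix="ns0"):
    """Apply the textual replacements described above to opf source text."""
    # 1. Turn the prefixed namespace declaration into the default one.
    text = opf_text.replace(f'xmlns:{prefix}="{OPF_NS}"', f'xmlns="{OPF_NS}"')
    # 2. Drop the prefix from opening and closing tag names.
    text = text.replace(f"<{prefix}:", "<").replace(f"</{prefix}:", "</")
    # 3. Drop the prefix from attribute names (note the leading blank).
    text = text.replace(f" {prefix}:", " ")
    return text
```

Run it on a copy of the opf, save the returned text, and put that file back into the epub as described.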

I will work on tracking down why the lxml parser we use for pure xml files like the opf does not seem to simplify things properly.

Thasaidon 10-01-2020 06:39 AM

Quote:

Originally Posted by KevinH (Post 4041714)
Yes, but I think it is still valid, so this may be a bug in Sigil's opf parser.

In that opf they define a new namespace prefix ns0 that they assign for the default opf namespace and then therefore have to use it everywhere. Certainly not the most concise representation and notation but not incorrect xml as far as I can see.

This confuses the internal Sigil opf parser so I will have to fix that.

Thank you. :thanks:

I will give your solution a try immediately.

KevinH 10-01-2020 04:58 PM

Okay, I have tracked down how this case got through lxml without it cleaning up the namespaces.

The problem is that the cleaning routine skipped xml that was already well formed (and your content.opf was actually well formed, just using strange namespace prefixes), so it passed the file through untouched and its strange namespace usage never got cleaned up.

I will modify the xmlprocessor.py code so that it also cleans up namespaces in xml files that are opfs, to prevent this problem from occurring again.
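As a rough illustration of the kind of check involved, and not the actual xmlprocessor.py change, the Python sketch below uses the standard library's ElementTree to report which prefixes an opf binds to the opf namespace. The function names are made up for this example, and the text is assumed to be UTF-8 compatible.

```python
import io
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

def opf_namespace_prefixes(opf_text):
    """Return the set of prefixes bound to the opf namespace ('' = default)."""
    source = io.BytesIO(opf_text.encode("utf-8"))
    return {
        prefix
        # "start-ns" events carry a (prefix, uri) pair per namespace declaration.
        for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",))
        if uri == OPF_NS
    }

def needs_namespace_cleanup(opf_text):
    """True when the opf namespace is bound to a named prefix such as ns0."""
    return any(prefix for prefix in opf_namespace_prefixes(opf_text))
```

A well-formed opf that declares `xmlns:ns0="http://www.idpf.org/2007/opf"` would be flagged here even though it parses without error, which is the case that slipped through.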

I will push this fix to master tomorrow after some more testing and it will appear in the very next release.

Thank you for your bug report!

KevinH

KevinH 10-01-2020 08:51 PM

Just pushed a hopeful fix to master.

Thasaidon, when you get some time and better internet access, please pm me with a link to your problem epub to make sure my "fix" actually works.

Thanks.

Thasaidon 10-01-2020 10:34 PM

Quote:

Originally Posted by KevinH (Post 4042129)
Just pushed a hopeful fix to master.

Thasaidon, when you get some time and better internet access, please pm me with a link to your problem epub to make sure my "fix" actually works.

Thanks.

Thank you.

Will do.

Another bug squashed. :D

Thasaidon 10-02-2020 10:21 AM

1 Attachment(s)
Quote:

Originally Posted by KevinH (Post 4042129)
Just pushed a hopeful fix to master.

Thasaidon, when you get some time and better internet access, please pm me with a link to your problem epub to make sure my "fix" actually works.

Thanks.

Hi Kevin

Please find attached a copy of one of the problem books.

I thought I would give uploading a problem book a try. N.B. I have scrambled it but Sigil still gives the same error message when it opens it.

I tried to send it by PM but could not work out how to attach it.


Thank you.

DiapDealer 10-02-2020 11:40 AM

Thanks! The scrambled version is much appreciated.

KevinH 10-02-2020 11:45 AM

As DiapDealer said. Thank you for your scrambled test case. Using the very latest build, I was easily able to open this epub in Sigil.

BTW, it still gave warnings but these are real warnings related to wrong media types for ttf fonts that have nothing to do with the extra namespace ns0: in the OPF.

The opf is now properly cleaned up on load.

So we should be good to go with the next release.

Thanks,

KevinH

ps here are those warnings:
Quote:

The OPF uses an unrecognized media type "application/octet-stream" for file "LiberationSerif-Bold.ttf" - A temporary media type of "font/ttf" has been assigned. You should edit your OPF file to fix this problem.

The OPF uses an unrecognized media type "application/octet-stream" for file "LiberationSerif-BoldItalic.ttf" - A temporary media type of "font/ttf" has been assigned. You should edit your OPF file to fix this problem.

The OPF uses an unrecognized media type "application/octet-stream" for file "LiberationSerif-Italic.ttf" - A temporary media type of "font/ttf" has been assigned. You should edit your OPF file to fix this problem.

The OPF uses an unrecognized media type "application/octet-stream" for file "LiberationSerif-Regular.ttf" - A temporary media type of "font/ttf" has been assigned. You should edit your OPF file to fix this problem.
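Editing the OPF to clear these warnings could be scripted along these lines. This is a minimal Python sketch, assuming the manifest entries are ordinary `<item ...>` tags with double-quoted attributes; the function name `fix_ttf_media_types` is made up for this example, and the default of "font/ttf" is simply the type named in the warnings. Which media type is appropriate can depend on the epub version, so check the result with Epubcheck.

```python
import re

def fix_ttf_media_types(opf_text, new_type="font/ttf"):
    """Rewrite manifest items for .ttf files declared as application/octet-stream."""
    # \b keeps this from matching <itemref ...> entries in the spine.
    item_re = re.compile(r"<item\b[^>]*>")

    def fix_item(match):
        tag = match.group(0)
        # Only touch items whose href points at a .ttf file.
        if re.search(r'href="[^"]*\.ttf"', tag, re.IGNORECASE):
            tag = tag.replace(
                'media-type="application/octet-stream"',
                f'media-type="{new_type}"',
            )
        return tag

    return item_re.sub(fix_item, opf_text)
```

Items that are not .ttf files keep whatever media type they already had.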


All times are GMT -4.