#166 |
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
No problem. And yes... I did "download for USB transfer" when I got the .mobi. Didn't feel like charging up the ol' K2 for 8 hours just to get it that way.
#167 |
Grand Sorcerer
Posts: 7,094
Karma: 91592869
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
|
This may not matter to you, but I have found that POBI files pulled from a Kindle have B/W images. If you use Download & transfer via USB you instead get color images for the same magazine.
|
#168 |
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Interesting. The problem stems from null characters in the kindle metadata in the opf file. Sigil can't parse the opf file with them present, so it barfs. Hence why it thinks everything is unmanifested. Epubcheck flags the raw output of KindleImport (for this magazine) as invalid as well. I just can't figure out why newer versions of Sigil can't cope with the null characters where older versions of Sigil (pre-0.9.10) could. It's a puzzler. Probably somewhere in our embedded python opf parser. That may have been introduced back around 0.9.10, but I can't be sure.
But at least there's a path forward! KindleImport IS turning the Kindle magazine into a relatively complete epub. I just need to massage it a bit more before the final hand-off to Sigil, I think.
Last edited by DiapDealer; 11-17-2020 at 05:15 PM.
#169 |
Sigil Developer
Posts: 8,841
Karma: 6120478
Join Date: Nov 2009
Device: many
|
The parsing on Import is done by the Qt Xml parser. Having nulls inside the file probably freaks it out. You may want to tweak the KindleUnpack opf generation code to escape or encode them in some way.
|
#170 |
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
It must've just been a fluke that the same nulls didn't freak out whatever version of Qt we were using back in 0.9.9.
I've got to parse the opf a bit after KindleUnpack gets done anyway, so I may just massage those weird characters away in the plugin. I might see about doing something about it in KindleUnpack eventually, but I've got a pretty quick and easy proof-of-concept fix working right now in the plugin. Unless someone specifically opts to keep the original mobi/amazon metadata, all of the metadata where the bad characters typically occur is stripped out of the resulting epub anyway (opf comments don't survive).
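A minimal sketch of what "massaging those characters away" could look like, assuming the goal is simply to drop the control characters that XML 1.0 forbids before the OPF reaches a parser. The helper name `strip_xml_illegal` and the sample OPF fragment are hypothetical, not taken from the plugin or from KindleUnpack.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 forbids (tab, newline and carriage return are allowed).
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_xml_illegal(text: str) -> str:
    """Return text with XML-1.0-illegal control characters removed."""
    return _XML_ILLEGAL.sub("", text)


# Hypothetical OPF fragment with a NUL and a \x01 embedded in a metadata value.
dirty_opf = (
    '<package><metadata>'
    '<meta name="version" content="\x00\x01"/>'
    '</metadata></package>'
)

clean_opf = strip_xml_illegal(dirty_opf)
root = ET.fromstring(clean_opf)  # parses once the control characters are gone
```

Removing the characters outright loses whatever they were meant to encode, so this only suits metadata that is going to be discarded anyway.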
#171 |
Sigil Developer
Posts: 8,841
Karma: 6120478
Join Date: Nov 2009
Device: many
|
It may also be an encoding issue as Null chars are not legal utf-8. I can try to see if we can work around it in ImportEPUB inside Sigil if you can supply a test case.
|
#172 |
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
The confusing part is that I'm reading the KindleUnpack OPF file in as utf-8, and I'm writing it back out as utf-8 with String.encode('utf-8'). Since encode defaults to raising an error ("strict") for characters it can't deal with, I don't understand how the bad characters are getting written to the file in the first place. Yet whenever I open the file in a text editor, there's the warning--bigger than life--that it contains characters that are incompatible with the encoding. .encoding(str, 'replace') has no effect, and .encoding(str, 'error') causes no abend. So I'm really at a loss as to how these illegal characters are ending up in the opf.
I can replace the \0 and \1 characters easily enough (they seem to be the principal offenders), but that doesn't strike me as a very thorough or future-proof approach.
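One possible explanation, offered as a sketch rather than a confirmed diagnosis: U+0000 and U+0001 are ordinary code points as far as Python's UTF-8 codec is concerned, so a strict encode has nothing to reject, while XML 1.0 forbids both characters. The example uses Python's bundled `xml.etree.ElementTree` as a stand-in; Sigil's import path uses a different parser, so its exact behaviour is an assumption here. The helper `parses_as_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET

text = "version\x00\x01"

# Strict encoding succeeds: each control character becomes a single UTF-8 byte.
encoded = text.encode("utf-8", "strict")
round_tripped = encoded.decode("utf-8", "strict")


def parses_as_xml(payload: str) -> bool:
    """Return True if payload is accepted by Python's bundled XML parser."""
    try:
        ET.fromstring(payload)
        return True
    except (ET.ParseError, ValueError):
        return False


xml_ok = parses_as_xml("<meta>1</meta>")
xml_with_controls_ok = parses_as_xml("<meta>\x00\x01</meta>")
```

If that reading is right, the 'replace' handler would have no effect because no encoding error ever occurs for it to act on.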
#173 |
Sigil Developer
Posts: 8,841
Karma: 6120478
Join Date: Nov 2009
Device: many
|
Is this with Python 3.7, 3.8, or 3.9? Could that be what changed? Perhaps newer Pythons are allowing control characters to be escaped in some way on reading in internally and then converting them back when writing them out.
I thought I had seen some change related to that to allow arbitrary bytes in file names for Linux somewhere in python. Maybe there is a command-line flag that now controls this behaviour? All a wag on my part but it really does seem strange!
#174 |
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Python 3.8 right now. I'm seeing it in our Windows bundled Python 3.8, and Python 3.8.6 on Arch. They've not updated to Python 3.9 quite yet (or hadn't as of this morning, anyway).
It sure acts like the "surrogateescape" unicode error-handling strategy on decoding/encoding, but "strict" is supposed to be the default strategy, so I don't get it.
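For reference, a short sketch of how the two strategies differ on a byte that is genuinely invalid in UTF-8. The sample bytes are made up for illustration and are not from the magazine's OPF.

```python
raw = b"caf\xff"  # \xff can never appear in valid UTF-8

# "strict" (the default) refuses the byte outright.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "surrogateescape" carries the byte through as a lone surrogate code point...
text = raw.decode("utf-8", "surrogateescape")

# ...and restores the original byte when encoding with the same handler.
restored = text.encode("utf-8", "surrogateescape")
```

A round trip like this leaves invalid bytes in the output file untouched, which is the behaviour described above.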
|
|
#175 |
Sigil Developer
Posts: 8,841
Karma: 6120478
Join Date: Nov 2009
Device: many
|
From a websearch, it appears that the default error handlers for stdin and stdout are now surrogateescape.
Perhaps we are piping or redirecting the output somehow? |
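One way to check which handlers are actually in effect is to ask the streams themselves. This is a small sketch using the `encoding` and `errors` attributes of Python's text streams; the helper name `stream_error_handlers` is hypothetical, and the values it reports depend on the platform, locale and how the interpreter was launched.

```python
import sys


def stream_error_handlers():
    """Report the encoding and error handler of each standard stream."""
    report = {}
    for name in ("stdin", "stdout", "stderr"):
        stream = getattr(sys, name)
        # A stream may be missing or replaced, so fall back to None.
        report[name] = (
            getattr(stream, "encoding", None),
            getattr(stream, "errors", None),
        )
    return report


for name, (encoding, errors) in stream_error_handlers().items():
    print(f"{name}: encoding={encoding} errors={errors}")
```

Running this inside the plugin's interpreter would show whether its standard streams use a handler other than "strict".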
#176 |
Zealot
Posts: 110
Karma: 1133068
Join Date: Sep 2007
Device: ipaq
|
Wow! Looks like I opened a can of worms ...
|
#177 |
Sigil Developer
Posts: 8,841
Karma: 6120478
Join Date: Nov 2009
Device: many
|
Yes, but we *think* we have this one tracked down. Kindle EXTH item 114 Version_Number should actually be read in as a numeric value and not a string. Given the lack of documentation about this metadata item number we obviously interpreted its type incorrectly. Look for a new version of KindleUnpack with this and other fixes soon.
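To illustrate why reading a numeric field as a string would produce exactly the \0 and \1 characters seen earlier, here is a sketch under two assumptions that the thread does not confirm: that the value is stored as a 4-byte big-endian unsigned integer, and that the version in question is 1. The payload is hypothetical.

```python
import struct

# Hypothetical 4-byte payload for a version-number record holding the value 1.
payload = b"\x00\x00\x00\x01"

# Treated as text, the payload turns into three NULs followed by \x01.
as_string = payload.decode("utf-8")

# Treated as a big-endian unsigned 32-bit integer (an assumption), it is simply 1.
(as_number,) = struct.unpack(">L", payload)
```

Writing `str(as_number)` into the OPF instead of `as_string` would keep the control characters out of the metadata altogether.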
|
#178 |
Resident Curmudgeon
Posts: 79,803
Karma: 146918083
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
#179 |
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
If there's a new version of KindleUnpack that affects the Calibre and/or Sigil plugin versions, then yes... I will ALWAYS release new versions of the plugins. I always have. There's no need to ask. Both Calibre and the Sigil plugin have notification abilities to tell you of new versions. So when there's a new version, you'll know. Let the process work and stop fishing in the Sigil plugin threads for info about calibre plugins.
|
#180 |
Zealot
Posts: 110
Karma: 1133068
Join Date: Sep 2007
Device: ipaq
|
Quote:
|
|