Old 11-17-2020, 01:12 PM   #166
DiapDealer
Grand Sorcerer
Posts: 28,645
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
No problem. And yes... I did "download for USB transfer" when I got the .mobi. Didn't feel like charging up the ol' K2 for 8 hours just to get it that way.
Old 11-17-2020, 01:29 PM   #167
jhowell
Grand Sorcerer
Posts: 7,094
Karma: 92190113
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
Quote:
Originally Posted by jmurphy View Post
If I "download for USB transfer" or whatever it's called, I get mobis. If I download directly to the eink kindle (without using a PC), and then browse the kindle from the PC, it is a pobi. For "reasons", my workflow is to pull the pobi directly from the kindle.
This may not matter to you, but I have found that POBI files pulled from a Kindle have B/W images. If you use Download & transfer via USB you instead get color images for the same magazine.
Old 11-17-2020, 05:12 PM   #168
DiapDealer
Grand Sorcerer
Interesting. The problem stems from null characters in the kindle metadata in the opf file. Sigil can't parse the opf file with them present, so it barfs. That's why it thinks everything is unmanifested. Epubcheck flags the raw output of KindleImport (for this magazine) as invalid as well. I just can't figure out why newer versions of Sigil can't cope with the null characters where older versions of Sigil (pre-0.9.10) could. It's a puzzler. The culprit is probably somewhere in our embedded python opf parser, which may have been introduced back around 0.9.10, but I can't be sure.

But at least there's a path forward! KindleImport IS turning the Kindle magazine into a relatively complete epub. I just need to massage it a bit more before the final hand-off to Sigil, I think.

Last edited by DiapDealer; 11-17-2020 at 05:15 PM.
Old 11-17-2020, 05:23 PM   #169
KevinH
Sigil Developer
Posts: 8,841
Karma: 6120478
Join Date: Nov 2009
Device: many
The parsing on Import is done by the Qt Xml parser. Having nulls inside the file probably freaks it out. You may want to tweak the KindleUnpack opf generation code to escape or encode them in some way.
Old 11-17-2020, 08:49 PM   #170
DiapDealer
Grand Sorcerer
It must've just been a fluke that the same nulls didn't freak out whatever version of Qt we were using back in 0.9.9.

I've got to parse the opf a bit after KindleUnpack gets done anyway, so I may just massage those weird characters away in the plugin. I might see about doing something about it in KindleUnpack eventually, but I've got a pretty quick and easy proof-of-concept fix working right now in the plugin. Unless someone specifically opts to keep the original mobi/amazon metadata, all of the metadata where the bad characters typically occur is stripped out of the resulting epub anyway (opf comments don't survive).
Old 11-17-2020, 09:27 PM   #171
KevinH
Sigil Developer
It may also be an encoding issue, as null chars are not legal utf-8. I can try to see if we can work around it in ImportEPUB inside Sigil, if you want to supply a test case.
Old 11-18-2020, 10:13 AM   #172
DiapDealer
Grand Sorcerer
The confusing part is that I'm reading the KindleUnpack OPF file in as utf-8 encoded, and I'm writing it back out utf-8 encoded with str.encode('utf-8'). Since encode defaults to "strict" for characters it can't deal with, I don't understand how the bad characters are getting written to the file in the first place. Yet whenever I open the file in a text editor, there's the warning--bigger than life--that it contains characters that are incompatible with the encoding. .encode('utf-8', 'replace') has no effect, and .encode('utf-8', 'strict') causes no abend. So I'm really at a loss as to how these illegal characters are getting written to the opf in the first place.

I can replace the \0 and \1 characters easily enough (they seem to be the principal offenders), but that doesn't strike me as a very thorough or future-proof approach.
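
One possible explanation, sketched below: U+0000 and U+0001 are ordinary code points with valid one-byte UTF-8 encodings, so a strict encode has nothing to complain about. It is the XML 1.0 character rules that exclude them, which would be why an XML parser chokes while Python does not. The helper name `strip_xml_illegal` and the sample string are made up for illustration (this is not code from KindleUnpack or the plugin); the character class follows the XML 1.0 "Char" production, so it covers more than just \0 and \1.

```python
import re
import xml.etree.ElementTree as ET

# Characters outside the XML 1.0 "Char" production: C0 controls other than
# tab/LF/CR, lone surrogates, and U+FFFE/U+FFFF.
XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]")


def strip_xml_illegal(text):
    """Hypothetical helper: drop every character that XML 1.0 forbids."""
    return XML_ILLEGAL.sub("", text)


# Made-up metadata fragment containing NUL and SOH characters.
opf = "<meta>Version \x00\x00\x00\x01</meta>"

# A strict UTF-8 encode succeeds: both characters are valid UTF-8.
encoded = opf.encode("utf-8")

# The XML parser is the component that rejects them.
try:
    ET.fromstring(opf)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# After stripping, the same fragment parses cleanly.
cleaned = strip_xml_illegal(opf)
root = ET.fromstring(cleaned)
```

Stripping by character class rather than by a fixed list of two characters is one way to make the cleanup less dependent on which control characters happen to show up next time.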
Old 11-18-2020, 11:30 AM   #173
KevinH
Sigil Developer
Is this with Python 3.7, 3.8, or 3.9? Could that be what changed? Perhaps newer Pythons escape control characters in some way internally when reading them in, and then convert them back when writing them out.

I thought I had seen some change related to that somewhere in python, to allow arbitrary bytes in file names on Linux. Maybe there is a command line flag that now controls this behaviour?

All a wag on my part but it really does seem strange!
Old 11-18-2020, 01:27 PM   #174
DiapDealer
Grand Sorcerer
Python 3.8 right now. I'm seeing it in our Windows bundled Python 3.8, and Python 3.8.6 on Arch. They've not updated to Python 3.9 quite yet (or hadn't as of this morning, anyway).

It sure acts like the "surrogateescape" unicode error-handling strategy on decoding/encoding, but "strict" is supposed to be the default strategy, so I don't get it.

Quote:
strict: this is the default error handler that just raises UnicodeDecodeError for decoding problems and UnicodeEncodeError for encoding problems.

surrogateescape: this is the error handler that Python uses for most OS facing APIs to gracefully cope with encoding problems in the data supplied by the OS. It handles decoding errors by squirreling the data away in a little used part of the Unicode code point space (For those interested in more detail, see PEP 383). When encoding, it translates those hidden away values back into the exact original byte sequence that failed to decode correctly. Just as this is useful for OS APIs, it can make it easier to gracefully handle encoding problems in other contexts.

backslashreplace: this is an encoding error handler that converts code points that can’t be represented in the target encoding to the equivalent Python string numeric escape sequence. It makes it easy to ensure that UnicodeEncodeError will never be thrown, but doesn’t lose much information while doing so (since we don’t want encoding problems hiding error output, this error handler is enabled on sys.stderr by default).
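
A minimal sketch of how the three handlers quoted above behave, using only the standard `bytes.decode` and `str.encode` methods. The sample bytes are made up; 0xFF is used because it can never occur in valid UTF-8.

```python
raw = b"caf\xff"  # 0xFF can never appear in valid UTF-8

# strict (the default): undecodable bytes raise UnicodeDecodeError.
try:
    raw.decode("utf-8")
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

# surrogateescape: the bad byte is tucked away as a lone surrogate (U+DCFF)...
smuggled = raw.decode("utf-8", errors="surrogateescape")
# ...and comes back byte-for-byte when encoding with the same handler.
round_trip = smuggled.encode("utf-8", errors="surrogateescape")

# A strict encode of the smuggled string fails, because lone surrogates
# are not encodable as UTF-8.
try:
    smuggled.encode("utf-8")
    strict_encode_raised = False
except UnicodeEncodeError:
    strict_encode_raised = True

# backslashreplace: unencodable characters become escape sequences.
escaped = "caf\u00e9".encode("ascii", errors="backslashreplace")
```

Note that surrogateescape only comes into play for bytes that fail to decode. A NUL byte decodes fine under strict, so it would pass through either handler unchanged.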
Old 11-18-2020, 01:54 PM   #175
KevinH
Sigil Developer
From a websearch, it appears that the default error handlers for stdin and stdout are now surrogateescape.

Perhaps we are piping or redirecting the output somehow?
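
A quick way to check this hypothesis on a given setup, sketched under the assumption that it is run inside the same interpreter that launches the plugin. The function name `stream_error_handlers` is made up. The values for the standard streams depend on platform, locale, and how the process was started, so no expected output is shown for them.

```python
import io
import sys


def stream_error_handlers():
    """Report the error handler attached to each standard stream (None if absent)."""
    return {
        name: getattr(getattr(sys, name, None), "errors", None)
        for name in ("stdin", "stdout", "stderr")
    }


handlers = stream_error_handlers()

# For comparison: a text wrapper created without an explicit errors= argument
# reports the "strict" handler, as files opened with open() do.
file_handler = io.TextIOWrapper(io.BytesIO(), encoding="utf-8").errors

print(handlers, file_handler)
```

If the file is written through an ordinary text-mode file object rather than through stdout, the handlers on the standard streams would not apply to it.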
Old 11-19-2020, 03:08 PM   #176
jmurphy
Zealot
Posts: 110
Karma: 1133068
Join Date: Sep 2007
Device: ipaq
Wow! Looks like I opened a can of worms ...
Old 11-19-2020, 03:21 PM   #177
KevinH
Sigil Developer
Yes, but we *think* we have this one tracked down. Kindle EXTH item 114 Version_Number should actually be read in as a numeric value and not a string. Given the lack of documentation about this metadata item number we obviously interpreted its type incorrectly. Look for a new version of KindleUnpack with this and other fixes soon.
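
To illustrate why the type matters, here is a rough sketch, not the actual KindleUnpack code. The function `decode_exth_payload` and the four sample bytes are assumptions made up for this example; the thread does not give the real payload or the real fix. The point is that a big-endian integer read as text turns into exactly the kind of \0 and \1 characters discussed earlier.

```python
def decode_exth_payload(payload, numeric):
    """Hypothetical decoder: interpret raw EXTH bytes as a number or as text."""
    if numeric:
        # Treat the bytes as an unsigned big-endian integer.
        return str(int.from_bytes(payload, "big"))
    # Treat the bytes as UTF-8 text (the misinterpretation described above).
    return payload.decode("utf-8")


payload = b"\x00\x00\x00\x01"  # made-up bytes standing in for a version number

as_string = decode_exth_payload(payload, numeric=False)
as_number = decode_exth_payload(payload, numeric=True)

print(repr(as_string))  # three NULs followed by U+0001
print(as_number)
```

Decoded as text, the value is well-formed UTF-8 but not legal XML content. Decoded as a number, it becomes a plain digit string that is safe to write into the opf.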
Old 11-19-2020, 03:46 PM   #178
JSWolf
Resident Curmudgeon
Posts: 79,808
Karma: 146918083
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by KevinH View Post
Yes, but we *think* we have this one tracked down. Kindle EXTH item 114 Version_Number should actually be read in as a numeric value and not a string. Given the lack of documentation about this metadata item number we obviously interpreted its type incorrectly. Look for a new version of KindleUnpack with this and other fixes soon.
Is there going to be a new KindleUnpack plugin for Calibre?
Old 11-19-2020, 04:14 PM   #179
DiapDealer
Grand Sorcerer
If there's a new version of KindleUnpack that affects the Calibre and/or Sigil plugin versions, then yes... I will ALWAYS release new versions of the plugins. I always have. There's no need to ask. Both Calibre and the Sigil plugin have notification abilities to tell you of new versions. So when there's a new version, you'll know. Let the process work and stop fishing in the Sigil plugin threads for info about calibre plugins.
Old 11-29-2020, 04:39 PM   #180
jmurphy
Zealot
Quote:
Originally Posted by KevinH View Post
Yes, but we *think* we have this one tracked down. Kindle EXTH item 114 Version_Number should actually be read in as a numeric value and not a string. Given the lack of documentation about this metadata item number we obviously interpreted its type incorrectly. Look for a new version of KindleUnpack with this and other fixes soon.
Cool! Looking forward to seeing this and new versions of the software that uses KindleUnpack. Thanks, folks.