#16 |
Bibliophagist
Posts: 47,469
Karma: 171313058
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
For what it's worth, I prefer using the output of the KindleUnpack plugin rather than calibre's convert ebook when converting azw3/KF8 to epub. If nothing else, KindleUnpack modifies the output file much less and comes a lot closer to duplicating the input epub (quite a few books are supplied to Amazon in epub format). I have seen too many issues with calibre's convert ebook, which does not remove epub3 bits and bobs when converting to epub2 and gets very enthused about splitting files (take a look at the footnotes files in the KindleUnpack version compared to the calibre conversion version).
I tried opening your scrambled azw3 with calibre's editor, and it came up with multiple errors when I ran the editor's built-in check. Also, it took ~3 seconds to convert your scrambled azw3 on my system.
#17 |
Guru
Posts: 792
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Sage
|
As several people suggested, I turned off Heuristic Processing. The conversion worked fine that way. Since this is the first time I've run into such a problem, I've got to wonder what this book is doing that's different enough to cause problems (of course, a hint might be that EpubCheck even fails to run on it at all). I guess I'll work my way through the list of heuristics and see if any individual one causes a problem. Of course, it's not vital to do that since I was able to get and edit the book as an epub3 via KindleUnpack. But, it might be interesting.
#18 |
Guru
Posts: 792
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Sage
|
Interesting. The heuristic option causing the problem is "Ensure scene breaks are consistently formatted" (from the manual -- "With this option calibre will attempt to detect common scene-break markers and ensure that they are center aligned. ‘Soft’ scene break markers, i.e. scene breaks only defined by extra white space, are styled to ensure that they will not be displayed in conjunction with page breaks."). If I turn that option off, the AZW3 converts fine. With it on, regardless of the other options, it doesn't convert.
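For anyone who wants to script this rather than untick the box in the GUI, here is a rough sketch. It assumes the `ebook-convert` command-line flags `--enable-heuristics` and `--disable-format-scene-breaks` as listed in the calibre manual; check `ebook-convert --help` for your version. The helper names, file names and the timeout value are made up for illustration.

```python
import subprocess


def build_convert_command(src, dst, skip_scene_breaks=True):
    """Build an ebook-convert argument list with heuristic processing on."""
    cmd = ["ebook-convert", src, dst, "--enable-heuristics"]
    if skip_scene_breaks:
        # Keep the other heuristics, skip only the scene-break formatting step.
        cmd.append("--disable-format-scene-breaks")
    return cmd


def convert(src, dst, timeout=120):
    """Run the conversion; the timeout (an arbitrary value) guards against a hang."""
    return subprocess.run(build_convert_command(src, dst), check=True, timeout=timeout)
```

Calling `convert("book.azw3", "book.epub")` would then raise `subprocess.TimeoutExpired` instead of sitting at 1% indefinitely if the conversion stalls.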
#19 |
Wizard
Posts: 1,642
Karma: 9500498
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
A bit late to the party, but I gave the conversion a try with Heuristics enabled (which I have never used).
It seems to have got stuck on 1%, using about 10% of my i7-10700 CPU. This is my log... https://paste.kodi.tv/ovameledid Once I disabled heuristics, it converted in about 3 seconds. Trying to run EpubCheck on the azw3 version, I get an "AttributeError"... Spoiler:
#20 | ||
Wizard
Posts: 1,642
Karma: 9500498
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Strangely, I was able to make a minor edit to the book that fixes the hang.
If you go to part0035.html, lines 234, 236 and 238, you will see... Quote:
Move the italic span so that it no longer wraps the no-break spaces, and the conversion then works. Quote:
#21 |
Guru
Posts: 792
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Sage
|
Your line numbers are different from mine, but I'm almost positive those are the nbsp things I had wondered about earlier. Interesting that simply removing the italic formatting from them would allow the book to convert properly. Searching through the book, even though that code for an nbsp is used a LOT throughout the book, that clump in part0035 seems to be the only one styled with an italic.
EDIT: I can also confirm your finding. I removed all five (?) italic-based spans around those nbsps in part0035, and the resulting AZW3 converts to EPUB3 without problem (even with heuristic processing on and the "Ensure scene breaks are consistently formatted" option enabled). Good catch.
Last edited by enuddleyarbl; 01-03-2023 at 01:33 PM.
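If you want to hunt for this kind of markup in other books, a minimal sketch follows. It assumes the troublesome pattern is an inline element (`span`, `i` or `em`) whose entire content is no-break spaces; the exact markup and class names in this book are not shown in the thread, so the `class="italic"` in the usage note is purely hypothetical, as are the function and pattern names.

```python
import re

# Inline element whose whole content is one or more no-break spaces,
# written literally (U+00A0) or as &nbsp; / &#160; / &#xa0; entities.
NBSP_ONLY = re.compile(
    r'<(?P<tag>span|i|em)\b(?P<attrs>[^>]*)>'
    r'(?:\u00a0|&nbsp;|&#160;|&#[xX][aA]0;)+'
    r'</(?P=tag)>'
)


def find_nbsp_only_inlines(html):
    """Return (line_number, tag, attrs) for every nbsp-only inline element."""
    hits = []
    for m in NBSP_ONLY.finditer(html):
        # Line numbers are 1-based, counted from the start of the text.
        line = html.count('\n', 0, m.start()) + 1
        hits.append((line, m.group('tag'), m.group('attrs').strip()))
    return hits
```

Run it over the text of each HTML file in the book and look at the reported attributes to see which hits carry an italic style; for example, a line such as `<span class="italic">&#160;</span>` would be reported with its line number, tag and `class` attribute.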
#26 |
creator of calibre
Posts: 45,509
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Heuristics work using regexps, and these sometimes become very slow depending on the markup. In this case it will be the detect_soft_breaks() function, whose regex is
Code:
(?P<initline><(?P<outer>p|div)[^>]*>\s*(<(?P<inner1>font|span|[ibu])[^>]*>)?\s*(<(?P<inner2>font|span|[ibu])[^>]*>)?\s*(<(?P<inner3>font|span|[ibu])[^>]*>)?\s*\s*(?P<init_content>.*?)(</(?P=inner3)>)?\s*(</(?P=inner2)>)?\s*(</(?P=inner1)>)?\s*</(?P=outer)>)\s*<div[^>]*>\s*</div>\s*(?P<line_two><(?P<linetwo_ter>p|div)[^>]*>\s*(<(?P<linetwo_ner1>font|span|[ibu])[^>]*>)?\s*(<(?P<linetwo_ner2>font|span|[ibu])[^>]*>)?\s*(<(?P<linetwo_ner3>font|span|[ibu])[^>]*>)?\s*\s*(?P<line_two_content>.*?)(</(?P=linetwo_ner3)>)?\s*(</(?P=linetwo_ner2)>)?\s*(</(?P=linetwo_ner1)>)?\s*</(?P=linetwo_ter)>)
Last edited by kovidgoyal; 01-03-2023 at 11:55 PM.
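To get a feel for what that pattern looks for, here is a small sketch that compiles the regex quoted above with Python's `re` module and runs it on a toy fragment. Whether calibre compiles it with extra flags is not stated in the thread, so none are assumed here; the sample markup and the `find_soft_break` helper are invented for illustration and are not taken from the book. The pattern combines three optional opening tags, a lazy `.*?`, and three optional closing tags, which gives the engine many ways to retry on markup that almost matches. That would be consistent with the slowdown described, though this sketch does not reproduce the hang.

```python
import re

# The regex from the post, split across lines for readability only.
SOFT_BREAK = re.compile(
    r'(?P<initline><(?P<outer>p|div)[^>]*>\s*'
    r'(<(?P<inner1>font|span|[ibu])[^>]*>)?\s*'
    r'(<(?P<inner2>font|span|[ibu])[^>]*>)?\s*'
    r'(<(?P<inner3>font|span|[ibu])[^>]*>)?\s*\s*'
    r'(?P<init_content>.*?)'
    r'(</(?P=inner3)>)?\s*(</(?P=inner2)>)?\s*(</(?P=inner1)>)?\s*'
    r'</(?P=outer)>)'
    r'\s*<div[^>]*>\s*</div>\s*'
    r'(?P<line_two><(?P<linetwo_ter>p|div)[^>]*>\s*'
    r'(<(?P<linetwo_ner1>font|span|[ibu])[^>]*>)?\s*'
    r'(<(?P<linetwo_ner2>font|span|[ibu])[^>]*>)?\s*'
    r'(<(?P<linetwo_ner3>font|span|[ibu])[^>]*>)?\s*\s*'
    r'(?P<line_two_content>.*?)'
    r'(</(?P=linetwo_ner3)>)?\s*(</(?P=linetwo_ner2)>)?\s*(</(?P=linetwo_ner1)>)?\s*'
    r'</(?P=linetwo_ter)>)'
)


def find_soft_break(html):
    """Return (text_before, text_after) of the first soft break, or None."""
    m = SOFT_BREAK.search(html)
    if m is None:
        return None
    return m.group('init_content'), m.group('line_two_content')


# Toy input: two paragraphs separated by an empty div (a "soft" scene break).
sample = '<p class="a"><span class="b">one</span></p>\n<div class="c"></div>\n<p>two</p>'
result = find_soft_break(sample)
```

Wrapping `find_soft_break` in `time.perf_counter()` calls on a suspect HTML file is a cheap way to check whether a given file is the one that makes the pattern crawl.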