01-19-2014, 04:15 AM | #1 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
azw3 to epub - structure detection bug ?
I converted a retail azw3 to epub with my usual preference settings (heuristics OFF), but the conversion took a very long time and the resulting epub took even longer to open in Sigil. The reason seemed to be that some chapters had been split into many, many XHTML files, typically with only 1 sentence per file.
I re-ran the conversion with all structure detection settings blanked out and it was fine. i.e. I removed this:
//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\s+', 'i')) or @class = 'chapter']
and removed this:
//*[name()='h1' or name()='h2']
With those bits removed I ended up with 1 file per chapter and "normal" conversion/load times.
I don't do a lot of azw3 to epub, so I don't know if this was a specific issue with just one book or a more general problem. I don't know how to inspect the source format to see what could have caused this. I used calibre 1.19, 64-bit version. I am just flagging this as something that may need more investigation, if anyone else has similar conversion experiences. |
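For anyone wondering what the regex part of that first expression actually accepts, here is a rough sketch using Python's `re` module. It assumes `re:test` does a search anywhere in the element text, as Python's `re.search` does; the heading texts are made up for illustration and are not from the book in question.

```python
import re

# The regex from the first XPath expression, copied as-is from the post.
# The 'i' flag in re:test is assumed to correspond to re.IGNORECASE.
PATTERN = re.compile(r"chapter|book|section|part|prologue|epilogue\s+", re.IGNORECASE)

# Hypothetical heading texts, invented for this example.
headings = [
    "Chapter 1",
    "Epilogue",
    "The Department Store",
    "A Notebook",
    "Epilogue ",
    "Introduction",
]


def is_chapter_heading(text):
    """Return True if the pattern matches anywhere in the text."""
    return PATTERN.search(text) is not None


matches = {h: is_chapter_heading(h) for h in headings}

for heading, hit in matches.items():
    print(repr(heading), "->", hit)
```

Two things stand out under these assumptions: the trailing `\s+` binds only to `epilogue`, so a bare "Epilogue" is not matched, and the other words match as substrings, so "Department" and "Notebook" both count. Neither explains one-sentence files on its own, but it shows the pattern is looser than it looks.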
01-19-2014, 04:41 AM | #2 |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
|
01-19-2014, 05:23 AM | #3 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
I don't know how to create and post an extract from an azw3, so I will leave it for now. Happy to arrange to provide a copy via PM for investigation. It could be that the whole book is needed to reproduce the "bug".
Also, I am not 100% sure that my structure detection XPath expressions match the defaults or whether I have previously tinkered with them, but they have never caused anything like this before. |
01-19-2014, 05:41 AM | #4 | |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Quote:
|
|
01-19-2014, 05:49 AM | #5 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Yes, I know, and that means creating a new bug reporting account, as I've lost my previous credentials, then finding that I probably can't attach the book anyway because of the size restriction.
Frankly it is far too much hassle for something that may be a 1-book-only glitch. I went to the trouble of posting what happened in case anyone else found that helpful. I am happy to arrange to forward the book as previously stated, but that's it. If I have cause to convert another AZW3 that gives the same problems then I'll reconsider, but otherwise, life's too short.... |
02-02-2014, 05:08 PM | #6 |
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2014
Device: Tolino Shine
|
Hi there.
New to this forum and had the same problem; I guess it's actually not a bug. I previously converted books from AZW3 to EPUB perfectly, but the latest one gave me over 3000 pages instead of ~700. I'm not experienced in HTML, but comparing the HTML of a book that worked fine with the faulty one, it appears that the AZW3 downloaded from Amazon has the tag <p class="chapter"> many times throughout the text. Each one therefore produces a 'pagebreak' on the reader, both in the calibre viewer and on the physical e-book reader. However, thx cybmole, the solution from your first post worked fine for me... |
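That observation lines up with the last clause of the first expression, `or @class = 'chapter'`, which matches any element with that class and not only headings. A minimal sketch with Python's standard library, assuming a made-up XHTML fragment (the markup below is invented for illustration, not taken from the book):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one real heading, then body paragraphs that the
# publisher happened to style with class="chapter".
SAMPLE = """
<body>
  <h1>Chapter 1</h1>
  <p class="chapter">First sentence.</p>
  <p class="chapter">Second sentence.</p>
  <p class="chapter">Third sentence.</p>
  <p class="body">Fourth sentence.</p>
</body>
"""

root = ET.fromstring(SAMPLE)

# Only the final clause of the expression: any element with class="chapter".
by_class = root.findall(".//*[@class='chapter']")

# The heading-only alternative: h1 or h2 elements.
headings = [el for el in root.iter() if el.tag in ("h1", "h2")]

print("matched by @class='chapter':", len(by_class))
print("matched as h1/h2 headings:", len(headings))
```

In this fragment the class clause alone marks three paragraphs, against a single real heading. If each match is treated as a chapter start, that would account for files holding one sentence each, and for why blanking the expression restored one file per chapter.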
02-02-2014, 07:24 PM | #7 | |
null operator (he/him)
Posts: 20,565
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
BR |
|
|