MobileRead Forums

snarkophilus 10-18-2020 02:26 AM

Chapter detection for azw3 to <anything>
 
1 Attachment(s)
Hi folks,

I've recently got a number of books from Amazon in which each chapter heading uses two heading tags. With the default calibre conversion, the first part of the chapter title is split onto its own page, and the second part lands on the next page with the rest of the chapter text. Here is an example of the HTML:

Code:

    <div class="heading heading-with-title heading-without-image" id="chapter-1-heading" aid="3Q282">
      <div class="heading-contents" aid="3Q283">
        <div class="title-subtitle-block title-block-with-element-number" aid="3Q284">
          <div class="element-number-block" aid="3Q285">
            <h2 class="element-number" aid="3Q286">1</h2>
          </div>
          <div class="title-block" aid="3Q287">
            <h1 class="title" aid="3Q288">Descent</h1>
          </div>
        </div>
      </div>
    </div>

I know I can tinker with the "Detect chapters" XPath in the structure detection options of the conversion dialog, but I was wondering whether the default XPath could be made smarter so that it picks up these cases. I don't know XPath well enough to tell whether this is possible.

The attached zip file contains an example from "Colony Mars One", extracted with mobiunpack and reduced to a simple case that shows the problem when run through ebook-convert.exe.
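
For anyone who wants to try the per-book tinkering route, here is a rough sketch of why the markup above splits, and what a narrower rule might look like. It uses only Python's standard library and compares a rule that matches every h1 and h2 with one that matches only the outer wrapper div. The "every h1 and h2" rule is an assumption about how the split arises, and the XPath string at the end is an untested guess for the calibre XPath field, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the heading markup from the post (aid attributes dropped),
# wrapped in a body element so it parses as one tree.
SNIPPET = """<body>
<div class="heading heading-with-title heading-without-image" id="chapter-1-heading">
  <div class="heading-contents">
    <div class="title-subtitle-block title-block-with-element-number">
      <div class="element-number-block"><h2 class="element-number">1</h2></div>
      <div class="title-block"><h1 class="title">Descent</h1></div>
    </div>
  </div>
</div>
<p>Chapter text.</p>
</body>"""

root = ET.fromstring(SNIPPET)

# Assumed rule: break at every h1 or h2. One chapter heading gives two hits,
# so the number and the title end up on different pages.
per_heading_tag = [el for el in root.iter() if el.tag in ("h1", "h2")]

# Narrower rule: break only at the wrapper div carrying the
# "heading-with-title" class. One chapter heading gives one hit.
per_wrapper = [
    el for el in root.iter("div")
    if "heading-with-title" in el.get("class", "").split()
]

print(len(per_heading_tag), len(per_wrapper))

# Hypothetical equivalent for calibre's XPath field (untested assumption):
CANDIDATE_XPATH = "//h:div[contains(@class, 'heading-with-title')]"
```

This only helps for books that share this particular class name, so it is a per-book workaround rather than a smarter default.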

kovidgoyal 10-18-2020 04:05 AM

No, not really. This would require heuristics and examining the source tree, which XPath does not support.

snarkophilus 10-18-2020 06:06 AM

No worries, not the world's biggest problem! Thanks for answering.

