04-24-2015, 04:48 PM   #930
Rev. Bob
Quote:
Originally Posted by PandathePanda
I tested a PD book from https://www.mobileread.com/forums/sho...d.php?t=259583 and got the <div class="mbp_pagebreak" ...> inserted into the epub after converting the mobi to epub. Yet when unpacking the azw3 file, the resulting epub does not have it.

So my guess is that it's formatting inserted by calibre during the conversion, and that it can safely be removed.
Quote:
Originally Posted by DiapDealer
The problem with assuming that it's calibre-added stuff that can be safely removed is that an ebook could have been edited by someone AFTER the calibre conversion that added the mbppagebreak div stuff. Files could have been split and merged in any number of unforeseen ways, or code copied and pasted to somewhere where the pagebreak IS performing a wanted function in the middle of a file.
Okay, now that I've had a chance to look through some of my Kindle -> EPUB conversions...

The second book I opened, copyright 2014 and converted in January 2015, repeatedly uses <div class="mbp_pagebreak"/> in the middle of its two text documents to separate chapters. (The first document is frontmatter and a serial story, and the second is an unrelated story with backmatter.) There is an additional instance at the top of the second document.

I am strongly tempted to label this a calibre issue: an artifact of the conversion process that should be fixed in the conversion feature itself. That wouldn't do anything about existing conversions, though, so I haven't completely (ahem) closed the book on it yet.

If I do include processing for this, it'll definitely be tied to "is a BODY tag adjacent?" and will handle cases - such as this one - where there's no "calibre_pb_\d+" ID attribute present. That does make things more complicated, though, and further feedback is welcome.
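
To make the "is a BODY tag adjacent?" idea concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not code from any existing plugin: the function name and both regexes are hypothetical, and it assumes the div is empty (self-closing or an empty open/close pair) and carries the class exactly as class="mbp_pagebreak". It removes such a div only when it sits directly inside the opening or closing BODY tag, whether or not a "calibre_pb_\d+" ID is present, and leaves mid-document instances alone.

```python
import re

# Hypothetical pattern: an EMPTY mbp_pagebreak div, either self-closing or an
# empty open/close pair. Other attributes (such as id="calibre_pb_3") are
# allowed but not required.
PB_DIV = r'<div\b[^>]*\bclass="mbp_pagebreak"[^>]*?(?:/>|>\s*</div>)'

# The div directly after the opening BODY tag, with only whitespace between.
AFTER_BODY_OPEN = re.compile(r'(<body\b[^>]*>)\s*' + PB_DIV, re.IGNORECASE)

# The div directly before the closing BODY tag, with only whitespace between.
BEFORE_BODY_CLOSE = re.compile(PB_DIV + r'\s*(</body>)', re.IGNORECASE)


def strip_edge_pagebreaks(xhtml):
    """Drop a pagebreak div adjacent to <body> or </body>; keep all others."""
    xhtml = AFTER_BODY_OPEN.sub(r'\1', xhtml)
    xhtml = BEFORE_BODY_CLOSE.sub(r'\1', xhtml)
    return xhtml
```

A pagebreak div in the middle of a file, like the chapter separators described above, never matches either pattern, so it survives. A regex pass like this is also blind to comments or other markup between the BODY tag and the div, which is part of why it gets complicated.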

Meanwhile, I've received a copy of the page-count plugin and information on the checkbox tweaks, so I can look into that. If they're as minor as they sound, I have no qualms about porting them over. Optional feature, doesn't break anything - sounds like a win.