06-15-2018, 12:10 AM   #1
stumped
Conversion challenge for badly formatted Kindle book

I recently borrowed a book via Kindle Unlimited and wanted to engineer a temporary EPUB version to read with working chapters (which I would delete once read).

When I got into the code converted from MOBI, the whole book appeared as two enormous paragraphs (one per XHTML file), using <br... for the line breaks throughout,
and also <tt ... spans for all the body text, with an outer wrapper of <code </code. Not sure what the code tags did except make it harder to edit...

A nightmare to fix with regex, as the <br tags don't have matching closing tags.

So I forced the book through various format conversions instead, e.g. into and back from DOCX and AZW, hoping that the breaks would get automatically transformed into paragraphs. No joy.
Is there a way to do that which I missed? I.e. to automate changing all <br line breaks into separate paragraphs, using calibre's conversion tools?

Eventually I laboriously fixed it using multiple manual regex passes, and then located and restyled the chapters.

Arguably not worth it, as the story was not that great anyway, but I'm just wondering about the <br thing in case I encounter it again.
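
For reference, the break-to-paragraph step could probably be scripted in one pass instead of several manual regex ones. Below is a minimal sketch in standalone Python (not a calibre conversion option). It assumes the text handed to it is the inner content of one of those giant paragraphs, that the tt and code wrappers carry no formatting worth keeping, and that every break should start a new paragraph. The function name `br_to_paragraphs` and the sample string are made up for illustration.

```python
import re

# Any <br> variant: <br>, <br/>, <br />, <br class="...">
BR = re.compile(r"<br\b[^>]*>", re.IGNORECASE)
# Opening or closing <tt ...> / <code ...> wrappers (assumed safe to drop)
WRAPPERS = re.compile(r"</?(?:tt|code)\b[^>]*>", re.IGNORECASE)


def br_to_paragraphs(fragment: str) -> str:
    """Turn a <br>-separated blob into one <p> element per line."""
    text = WRAPPERS.sub("", fragment)            # strip the tt/code wrappers
    pieces = (piece.strip() for piece in BR.split(text))
    # Consecutive breaks produce empty pieces; skip them
    return "\n".join(f"<p>{piece}</p>" for piece in pieces if piece)


# Hypothetical sample resembling the structure described above
sample = '<code><tt class="x">Chapter 1<br/>It was dark.<br /><br>It rained.</tt></code>'
print(br_to_paragraphs(sample))
```

Because the breaks are used as split points rather than matched as open/close pairs, the lack of closing tags stops being a problem. The outer paragraph tags of each XHTML file would still need removing separately, since nested paragraphs are not valid.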

[The book in question, in case anyone wants to view the sample code, is Gardener Summer by Nova, part of the American Apocalypse series. Other books in that series are formatted sensibly.]
https://www.amazon.co.uk/Gardener-Su...sap_bc?ie=UTF8