Hello.
A friend has shared with me a small library of ebooks in the LRF format. I am not absolutely clear on their history, although I believe she originally purchased them in a DRMed format, and then converted them to non-DRMed LRF format for viewing on a Sony reader.
At any rate, I would like to read these books on my iPhone, using Stanza. Thus, I have converted a couple of these books to ePUB format, using default settings on Calibre. (I am running Calibre 0.6.49 on 64-bit Windows 7, which works great, BTW.)
The books I have thus far converted contain a couple of annoying elements, which I would like to eliminate. I suspect that doing so is possible using Calibre's rather extensive editing/formatting capabilities, but I am not experienced in using them. I am hopeful that a technically well-versed and well-meaning soul can find it within him/herself to provide me some guidance.
Specifically, each of the ebooks I have converted contains repeating text, preceded by a page break. When I look at the resulting ePUB files in Sigil, the books are broken into numerous XHTML blocks, so I am guessing that during the conversion from LRF to ePUB, Calibre is interpreting the repeating text as following a "Chapter" break, or something similar. (In Stanza or the Calibre ebook reader, the text simply shows as repeating throughout the book.)
The repeating text looks like this:
Book Title (the book title as appearing in the ebook metadata)
Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html
I suspect that the first line of repeating text (the book title) is being generated automatically: during conversion, the ebook seems to be interpreted as having a chapter break just before each point where this information is repeated. I imagine that this chapter data (which appears every few pages or so) is being misinterpreted, but since the problem is systematic, the situation can presumably be remedied by appropriate coding of Calibre's ebook processing engine.
The second line of repeating text seems clearly to have been added to the ebook prior to Calibre's conversion from LRF to ePUB format. It may very well appear due to the prior use of ABC Amber's ebook conversion software in the creation of the LRF-formatted ebooks themselves. I suspect that ABC Amber inserts this "advertisement" text wherever it decides that a new chapter is beginning in the ebook, which, of course, raises the question of how this chapter information made its way into the ebook in the first place. At any rate, I imagine that to remove this line of repeating text, I need to invoke something like a "Search & Replace" function. Unlike the book title line, which would be matched by reference to metadata, this extraneous text would have to be referenced exactly as it appears, so that the Calibre conversion engine can remove it throughout each of my ebooks during conversion.
So, if I understand the challenge correctly, I need to invoke Calibre's "intelligence" in two ways: first, with regard to removing "wild card" text referenced to metadata, and second, with regard to removal of specific text in the manner of a traditional "search & replace." On a related note, I need some way of better interpreting, and correctly processing, chapter breaks (if that is the nature of the page breaks which precede each instance of the repeating lines of text) although I am not clear as to the theory of how that task would be accomplished.
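Purely as an illustration of those two kinds of removal, the idea could be sketched in Python roughly as follows. This is not Calibre's API. The function name strip_repeats and the sample title are made up for the example, and the sketch assumes the two repeating lines appear as plain consecutive lines of text, which may not match the actual XHTML markup inside the ePUB.

```python
import re

# The literal "advertisement" line, exactly as it appears in the converted books.
AD_LINE = "Generated by ABC Amber LIT Converter, http://www.processtext.com/abclit.html"


def strip_repeats(text, title):
    """Remove every 'title line + ad line' pair from text.

    title is the book title taken from the metadata (the "wild card" part);
    AD_LINE is matched literally (the traditional search & replace part).
    Assumption: the pair sits on two consecutive plain-text lines.
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(title) + r"[ \t]*\r?\n"      # the title line
        r"[ \t]*" + re.escape(AD_LINE) + r"[ \t]*(?:\r?\n|$)",  # the ad line
        re.MULTILINE,
    )
    return pattern.sub("", text)


if __name__ == "__main__":
    sample = (
        "End of one passage.\n"
        "My Book\n"
        + AD_LINE + "\n"
        "Start of the next passage.\n"
    )
    print(strip_repeats(sample, "My Book"))
```

Because the pattern requires the ad line to follow the title immediately, a title that appears on its own (for example on a title page) is left alone. re.escape is used so that the dots and slashes in the URL, and any punctuation in a title, are matched literally rather than treated as regular-expression syntax.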
If anyone is able to provide me with guidance here (ideally, in the form of a "This is a step-by-step procedure for you to follow, dummy" walkthrough), that would be much appreciated.
I look forward to the courtesy of a reply. Many thanks.
Mark Shneour
mark@dotmom.com