Sorry for the late answer.
This is turning into a nice project.
First, the line issue:
The PDF conversion extracts only the text, so there is no line information in the XHTML. However, we can fix that. To do so, I modify the first S&R: I add a second placeholder for the chapter line, called chapter-line, to the replacement text. Because the line has to sit on its own line below the chapter heading, I also need to add a <br> (\3).
Replace with:
\1<chapter-new>\2\3<chapter-line>\3
See picture Aufzeichnen6 S&R line 1.
Now you can run the conversion for the first time to look at the result, because you need it to define the regex that inserts the line. The result is this:
Line 1: <p class="calibre1">
Line 2: <chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new></p>
Line 3: <p class="calibre1"><chapter-line/></p>
The last line is the one to look at first. It needs to be replaced by <hr>. For this I add a second S&R:
Search for: <p class="calibre1"><chapter-line/></p>
Replace with: <hr>
or, if you like a bit of styling:
Replace with: <hr noshade size=1 width=70% align=center>
See picture Aufzeichnen6 S&R line 3. This one and the next S&R will be used later with an EPUB to EPUB conversion.
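If you want to check this second S&R outside of Calibre first, here is a minimal sketch in Python, assuming that Calibre's conversion S&R behaves like Python's re module for this simple literal search. The sample text is the three-line result shown above; the variable names are only for illustration.

```python
import re

# Sample output of the first conversion, copied from the three lines above.
xhtml = (
    '<p class="calibre1">\n'
    '<chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new></p>\n'
    '<p class="calibre1"><chapter-line/></p>\n'
)

# The search text is meant literally, so escape it before using it as a regex.
search = re.escape('<p class="calibre1"><chapter-line/></p>')

# Variant 1: plain horizontal rule.
plain = re.sub(search, '<hr>', xhtml)

# Variant 2: the styled rule from the post.
styled = re.sub(search, '<hr noshade size=1 width=70% align=center>', xhtml)

print(plain)
print(styled)
```

Only the placeholder paragraph is touched; the chapter heading lines stay as they are until the next S&R runs.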
Second, the formatting issue with <chapter-new>:
This is only a helper construct, and we need to get rid of it in a second conversion. I do not know another way if you want to do it more or less automatically. The alternative is to do it in the Calibre editor, as it is a simple S&R.
For now I will use the conversion. As you have already seen, the chapter definition overlaps: there is calibre1, calibre3 and, in addition, the first placeholder <chapter-new>. Therefore I split it into three parts (the first and the last part I want to get rid of, the middle part I need to keep).
Attention, this is a little tricky, because you need to select across two lines to capture the hidden line break. Copy and paste from the wizard (see picture Aufzeichnen7.jpeg) and select everything from:
class="calibre1">
<chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new>
(Only the opening <p and the closing </p> are not in the selection, because they are what will stay.) Then replace the part
id="calibre_toc_1" class="calibre3">Jetsam
with (.*) and put the rest before and after it in parentheses (). Check it with Test. You should get 19 occurrences. If this is OK, take it over and complete the S&R:
Search for:
(class="calibre1">
<chapter-new)(.*)(</chapter-new>)
Replace with:
\2
Please move this S&R to position 2. (See picture Aufzeichnen6.jpg S&R line 2)
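To see what this S&R does to the markup, here is a small Python sketch of the same search and replacement. It assumes the hidden line break is a plain \n and that the regex behaves as it does in Python's re module. The second heading ("Flotsam") is made up for the example, so the count here is 2 and not the 19 from the real book.

```python
import re

# Two chapter headings in the shape produced by the first conversion.
# "Flotsam" is an invented second chapter, only for this sketch.
xhtml = (
    '<p class="calibre1">\n'
    '<chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new></p>\n'
    '<p class="calibre1">\n'
    '<chapter-new id="calibre_toc_2" class="calibre3">Flotsam </chapter-new></p>\n'
)

# Group 1: the part to drop in front (including the line break).
# Group 2: the part to keep (id, class and the chapter title).
# Group 3: the closing placeholder tag to drop.
pattern = re.compile(r'(class="calibre1">\n<chapter-new)(.*)(</chapter-new>)')

# Same idea as the Test button in the wizard: count the occurrences.
matches = pattern.findall(xhtml)
print(len(matches))

# Keep only the middle group; the surrounding <p and </p> are untouched.
cleaned = pattern.sub(r'\2', xhtml)
print(cleaned)
```

Because . does not match a line break by default, each (.*) stays within its own heading line, so every chapter is matched separately. What remains is a normal <p> that carries the id and the calibre3 class.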
Finally, I made an adjustment to the CSS in Look & Feel:
chapter-new{text-align:center;font-weight:bold;}
and in Structure Detection: Enable Remove first image
Here we are. Everything is prepared. Delete every format except the PDF. Then first run the PDF to EPUB conversion (take care of the Line Un-Wrapping Factor) and afterwards run an EPUB to EPUB conversion.
If it looks like this, then everything went fine and you can do your personal fine-tuning:
Last edited by Divingduck; 04-04-2014 at 04:44 PM.