Old 04-04-2014, 03:19 PM   #6
Divingduck
Sorry for my late answer.

This is turning into a nice project.

First one, the line issue:
The PDF conversion extracts only the text, so there is no line info in the XHTML. However, we can fix that. To make this happen, I modify the first S&R: I add a second placeholder for the chapter line, called chapter-line, to the replacement text. Because it has to sit on a separate line below, I also need to add a <br> (\3).

Replace with:
\1<chapter-new>\2\3<chapter-line>\3

See picture Aufzeichnen6 S&R line 1.

Now you can run this conversion a first time to look at the result (you need it in order to define the regex that implements the line afterwards). The result is this:
Line 1: <p class="calibre1">
Line 2: <chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new></p>
Line 3: <p class="calibre1"><chapter-line/></p>

The last line is the one to look at first. It needs to be replaced by <hr>, so I add a second S&R:
Search for: <p class="calibre1"><chapter-line/></p>
Replace with: <hr>
or, if you like a bit of styling:
Replace with: <hr noshade size=1 width=70% align=center>
See picture Aufzeichnen6 S&R line 3. This one and the next S&R will be used later with an EPUB to EPUB conversion.
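
The effect of this second S&R can be checked outside Calibre. This is only a minimal sketch in Python, assuming the S&R behaves like Python's re module for this pattern. The sample string is copied from Line 3 above, and the variable names are made up for illustration:

```python
import re

# Line 3 of the conversion result shown above.
sample = '<p class="calibre1"><chapter-line/></p>'

# The search text is meant literally, so escape it before using it as a regex.
search = re.escape('<p class="calibre1"><chapter-line/></p>')

# Plain rule, and the styled variant from the post.
plain = re.sub(search, '<hr>', sample)
styled = re.sub(search, '<hr noshade size=1 width=70% align=center>', sample)

print(plain)
print(styled)
```

Both variants replace the whole placeholder paragraph, so no empty <p> is left behind.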


Second one, the formatting issue <chapter-new>:
Well, this is only a helper construct, and we need to get rid of it in a second conversion. I do not know another way if you want to do it more or less automatically. The alternative is to do it in the Calibre editor, since it is a simple S&R.

For now I will use the conversion. As you have already seen, there is an overlapping definition for chapters with calibre1 and calibre3, plus the first placeholder <chapter-new>. Therefore I split it into three parts (the first and the last part I want to get rid of, the middle part I need to keep).
Attention: this is a little tricky, because you need to select across two lines to catch the hidden line break. Copy and paste from the wizard (see picture Aufzeichnen7.JPG) and select everything from:

class="calibre1">
<chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new>

(Only the opening <p and the closing </p> are left out of the selection, because they are what will stay.) Then replace the part

id="calibre_toc_1" class="calibre3">Jetsam

with (.*) and put the parts before and after it in parentheses (). Check it with Test; you should get 19 occurrences. If that is OK, take it over and complete the S&R:

Search for:
(class="calibre1">
<chapter-new)(.*)(</chapter-new>)
Replace with:
\2

Please move this S&R to position 2. (See picture Aufzeichnen6.JPG S&R line 2)
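
To see what this S&R keeps and what it drops, here is a small Python sketch run on Lines 1 and 2 of the result above. It assumes the hidden line break is a plain \n and that the S&R behaves like Python's re module; the variable names are only for illustration:

```python
import re

# Lines 1 and 2 of the conversion result, joined by the hidden line break.
sample = (
    '<p class="calibre1">\n'
    '<chapter-new id="calibre_toc_1" class="calibre3">Jetsam </chapter-new></p>'
)

# Group 1 (first part) and group 3 (last part) are thrown away,
# group 2 (the middle part) is kept via \2.
pattern = re.compile(r'(class="calibre1">\n<chapter-new)(.*)(</chapter-new>)')

# Counting matches mirrors the occurrence check with the Test button.
match_count = len(pattern.findall(sample))
result = pattern.sub(r'\2', sample)

print(match_count)
print(result)
```

On this single sample there is one match, and what remains is the <p with the id and the calibre3 class directly on it, followed by the chapter title and the closing </p>.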

Finally, I made an adjustment for the CSS in Look&Feel:
chapter-new{text-align:center;font-weight:bold;}
and in Structure Detection I enabled "Remove first image".

Here we are. Everything is prepared. Delete every format except the PDF. Then do the PDF to EPUB conversion first (take care of the Line un-wrapping factor), and after that an EPUB to EPUB conversion.
If it looks like this, then everything went fine and you can do your personal fine-tuning:
Attached Thumbnails: Aufzeichnen6.JPG, Aufzeichnen7.JPG, Aufzeichnen8.JPG
Attached Files
File Type: epub Anderanged, Kazee - Vol 1 eng,.epub (247.5 KB, 252 views)

Last edited by Divingduck; 04-04-2014 at 04:44 PM.