Quote:
Originally Posted by Manichean
Convert to ePub and use Sigil.
I don't agree.
In my view, calibre needs more flexible search and replace (S&R).
There are situations where the pdf -> html -> epub conversion loses formatting information.
Example:
I have text like this (as converted by calibre's pdftohtml engine):
Code:
a bit 'of sticky stuff. I spent the index and I approached him on the nose.<br>
<hr>
<A name=39></a>tomato sauce.<br>
After calibre's EPUB conversion:
Code:
<p class="calibre2">a bit 'of sticky stuff. I spent the index and I approached him on the nose.</p>
<p class="calibre2"/>
<p class="calibre2">tomato sauce.</p>
In this case, when I load the EPUB's HTML in Sigil, I can't tell whether the empty paragraph is an intended break or just a wrong interpretation of the <hr>.
With S&R I can write a regex like this:
<br>\s<hr>\s<A name=\d+></a>
and replace it with nothing.
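To make the idea concrete, here is a minimal sketch of that replacement in Python (calibre's regex flavour is based on Python's, as far as I know). The sample string and the names `PAGE_BREAK` and `cleaned` are only for illustration. The pattern is a slightly looser variant of the regex above (`\s*` instead of `\s`, and case-insensitive), which is my assumption about how much whitespace pdftohtml leaves between the tags.

```python
import re

# Sample pdftohtml output from the post (hypothetical whitespace between tags).
html = (
    "a bit 'of sticky stuff. I spent the index and I approached him on the nose.<br>\n"
    "<hr>\n"
    "<A name=39></a>tomato sauce.<br>\n"
)

# Page-break residue: <br>, then <hr>, then the empty page anchor,
# with optional whitespace (including newlines) between them.
PAGE_BREAK = re.compile(r"<br>\s*<hr>\s*<A name=\d+>\s*</a>", re.IGNORECASE)

# Replace the whole residue with nothing, as described above.
cleaned = PAGE_BREAK.sub("", html)
print(cleaned)
```

Replacing with an empty string joins the two text runs directly. If the words should not run together, a single space could be used as the replacement instead.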
Another example is un-wrapping:
Code:
The hottest summer of the century.<br>
Four homes lost in the corn. The major are plug-<br>
ged into the house. Six children on their bicycles<br>
After conversion to EPUB:
Code:
<p class="calibre2">The hottest summer of the century.
Four homes lost in the corn. The major are plug-ged into the house. Six children on their bicycles</p>
I can't simply remove the "-" character in Sigil, because it is used legitimately in other places (e.g. mercedes-benz).
With S&R I can write a regex:
([^\s]\-<br>)|([^\s]\-<br>\s*)
and replace it with an empty string.
Is that wrong?
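A quick way to check the un-wrapping regex is to run it over the sample lines in Python. This is only a sketch: the names `AS_POSTED` and `DEHYPHENATE` are made up for illustration, and the second pattern is my own variant, not something calibre provides. As far as I can tell, the regex as posted also matches the character before the hyphen, so that character gets deleted too. A lookbehind avoids this.

```python
import re

# Sample pdftohtml output from the post.
html = (
    "The hottest summer of the century.<br>\n"
    "Four homes lost in the corn. The major are plug-<br>\n"
    "ged into the house. Six children on their bicycles<br>\n"
)

# Regex as posted: [^\s] is part of the match, so the letter before
# the hyphen is removed along with "-<br>".
AS_POSTED = re.compile(r"([^\s]\-<br>)|([^\s]\-<br>\s*)")

# Variant (assumption): the lookbehind only checks that a non-space
# character precedes the hyphen, and \s* also swallows the line break.
DEHYPHENATE = re.compile(r"(?<=\S)-<br>\s*")

as_posted = AS_POSTED.sub("", html)
fixed = DEHYPHENATE.sub("", html)

print(as_posted)
print(fixed)
```

Because both patterns require `<br>` right after the hyphen, a hyphen inside a line such as mercedes-benz is left alone.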