I recently converted a book available only in PDF format to MS Word and then, using Calibre, to ePub. Much of the conversion went quite well. However, the formatting is occasionally "off." Usually, this is a matter of missing indentation on the first line of a new paragraph. I've traced this to the presence of this tag:
Code:
<p class="block_16">
, whereas properly indented first lines of paragraphs usually (but not invariably) use
Code:
<p class="block_15">
. In correcting this, I've also found numerous other variants of the
Code:
<p class="block_**">
. Some seem to work just like
Code:
<p class="block_15">
(e.g.,
Code:
<p class="block_38">
), but others are quite different in their effects. I have not enumerated all of the variants, but there are probably at least 10.
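A rough way to take stock of the variants is to count them with a short script. The Python sketch below is only an illustration: it assumes the XHTML of the book has already been read into a string, and the names `BLOCK_P`, `count_block_classes`, and `sample` are made up for this example.

```python
import re
from collections import Counter

# Matches opening <p> tags whose class attribute is exactly "block_NN".
# Assumption: the class attribute holds a single block_NN name, as in the
# tags quoted in this post.
BLOCK_P = re.compile(r'<p\b[^>]*\bclass="(block_\d+)"')


def count_block_classes(xhtml: str) -> Counter:
    """Return how many <p> tags use each block_NN class."""
    return Counter(BLOCK_P.findall(xhtml))


# Hypothetical sample text standing in for one chapter of the book.
sample = (
    '<p class="block_15">First paragraph.</p>'
    '<p class="block_16">Second paragraph.</p>'
    '<p class="block_15">Third paragraph.</p>'
    '<p class="block_38">Fourth paragraph.</p>'
)

counts = count_block_classes(sample)
for name, n in sorted(counts.items()):
    print(name, n)
```

Since an ePub is a ZIP container, the same function could in principle be run over each XHTML file pulled out with Python's `zipfile` module, which would give a full list of the variants and how often each one occurs.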
FWIW, the
Code:
<p class="block_16">
occurs both at the beginning of a new paragraph and in mid-sentence on the last line of a paragraph. In the latter case, it forces a new, unindented paragraph consisting of the last fragment of that sentence.
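The mid-sentence case might be repairable mechanically. The Python sketch below is a heuristic, not a tested fix for this book: it assumes a break is spurious whenever the paragraph before a `block_16` tag does not end in sentence-final punctuation, and the names `SPLIT` and `merge_mid_sentence_breaks` are hypothetical.

```python
import re

# A closing </p> followed by an opening <p class="block_16">.
# Group 1 is the last visible character before the break.
SPLIT = re.compile(r'([^\s<>])\s*</p>\s*<p class="block_16">\s*')

# Characters taken to mean the previous paragraph really ended (assumption).
SENTENCE_END = '.!?:"\u201d\u2019)'


def merge_mid_sentence_breaks(xhtml: str) -> str:
    """Join a block_16 paragraph onto the previous one when the previous
    paragraph does not end in sentence-final punctuation."""
    def join(match: re.Match) -> str:
        last_char = match.group(1)
        if last_char in SENTENCE_END:
            return match.group(0)  # looks like a genuine paragraph end
        return last_char + " "     # drop the break, keep one space
    return SPLIT.sub(join, xhtml)


# Hypothetical sample: the first break is mid-sentence, the second is real.
sample = (
    '<p class="block_15">The quick brown fox jumped over</p>\n'
    '<p class="block_16">the lazy dog.</p>\n'
    '<p class="block_16">A new paragraph.</p>'
)
fixed = merge_mid_sentence_breaks(sample)
print(fixed)
```

Paragraphs that end in an inline tag such as `</i>` are left untouched by this pattern, and headings or captions without final punctuation would be merged by mistake, so the output would still need checking by eye.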
1) Where is the best source for understanding the function of the many
Code:
<p class="block_**">
variants?
2) Other than a search/replace of known offending variants, is there any way to fix these formatting glitches?
3) Is there any way to reduce the likelihood of their being introduced in the first place?
Many thanks.