Quote:
Originally Posted by LadyKate
One of the first steps after clearing up the excess spans and font settings in an html file is to check for paragraph markings.
A book that has <br> or <br/> as a means of separating paragraphs of text is not going to allow you to use calibre to indent the first line of paragraphs.
I use a regex search in EditPad, with case sensitivity turned on, to find line breaks that are followed by a lowercase letter. A break like that usually means it was put in to make the text look right in a word processor.
e.g. the search string would be
<br>
([a-z])
replacement string would be
\1
|
I'm confused.
I'm not much good with HTML and usually don't fix books by messing with HTML tags, but I would've thought I'd want most of those <br> tags replaced with </p><p>.
I thought <br> didn't have a closing tag? Or is <br/> an alternate form of <br>?
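To make the two ideas concrete, here is a rough Python sketch: first the quoted search-and-replace that joins a line broken mid-sentence, then swapping the remaining breaks for </p><p>. The sample text and the function names `join_broken_lines` and `breaks_to_paragraphs` are made up for illustration. The pattern assumes each break sits directly before a newline and that a space already precedes a mid-sentence break, and it treats <br>, <br/> and <br /> alike.

```python
import re

# Hypothetical sample: one break mid-sentence, one between real paragraphs.
sample = (
    "<p>It was a dark and stormy <br>\n"
    "night when the rain began.<br/>\n"
    "The next paragraph starts here.</p>"
)

# <br/> and <br /> are the self-closing (XHTML) spellings of <br>.
BR_AT_LINE_END = r"<br\s*/?>\n"


def join_broken_lines(html: str) -> str:
    """Remove a break that is followed by a lowercase letter.

    Mirrors the quoted search/replace: keep only the captured letter.
    The match is case-sensitive on purpose (no re.IGNORECASE).
    """
    return re.sub(BR_AT_LINE_END + r"([a-z])", r"\1", html)


def breaks_to_paragraphs(html: str) -> str:
    """Turn every remaining line-end break into a paragraph boundary."""
    return re.sub(BR_AT_LINE_END, "</p>\n<p>", html)


joined = join_broken_lines(sample)
result = breaks_to_paragraphs(joined)
print(result)
```

The order matters in this sketch: if the paragraph step ran first, the mid-sentence break would be turned into a paragraph boundary as well.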
Quote:
Originally Posted by LadyKate
...Usually for the worst offenders, pdf files, I start off with either a save from my rather outdated Acrobat 7 or the old Mobipocket Creator....
|
I save time by not converting or fixing PDFs at all. Nonfiction PDF: if it's unavailable in a better format, I keep the PDF without any conversion or format-fixing, and easily read it in PDF format on any device. Fiction PDF: if it's unavailable in a better format, I simply don't acquire it, so there is never a fiction PDF to convert or fix. I'd rather not have it than mess with it at all.