MobileRead Forums - View Single Post

kiwidude · 01-13-2011, 07:24 PM

Quote:

Originally Posted by cybmole

so I am now getting good results with this
find
([Ia-z,])</p>\s*<p>
replace with \1 plus a single space

which bypasses the calibre tags issue.

I could expand the range to test for digits / capitalized words but have not yet needed to.

I think your post of the regex got rather mangled? Searching for a bare <p> at the end of your regex will match nothing in any document converted with Calibre, because the pattern does not allow for the class attribute on the <p> tag. And there is no replace expression displayed.
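To illustrate the class problem, here is a minimal Python sketch. The sample markup and the class name "calibre1" are assumptions made up for the example, and the tolerant pattern using <p[^>]*> is just one way of allowing for attributes, not necessarily what anyone in this thread used.

```python
import re

# Hypothetical sample in the style of Calibre-converted markup;
# the class name "calibre1" is an assumption for illustration.
html = ('<p class="calibre1">The rain fell on the</p>\n'
        '<p class="calibre1">old roof all night.</p>')

# The pattern as quoted: expects a bare <p> with no attributes.
strict = re.compile(r'([Ia-z,])</p>\s*<p>')

# A tolerant variant: [^>]* allows any attributes inside the opening tag.
tolerant = re.compile(r'([Ia-z,])</p>\s*<p[^>]*>')

strict_hits = strict.findall(html)    # nothing matches the bare <p>
joined = tolerant.sub(r'\1 ', html)   # joins the two paragraphs with one space

print(strict_hits)
print(joined)
```

The replacement r'\1 ' puts back the captured character followed by a single space, which is what the quoted post describes in words.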

The theory of what you describe is indeed what the OP of this thread was doing in their first post. However, as has been mentioned before, there are other "line endings" you would need to test for, such as punctuation characters (colons, semi-colons, hyphens), numeric amounts etc. Your regex also wouldn't match uppercase words, foreign language characters and so on.
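A rough sketch of the gap, assuming Python 3 regex behaviour (where \w on text matches Unicode letters, digits and underscore). The sample sentences and the class name are invented, and the broader class is only an illustration, not a recommended final expression. Hyphens are left out on purpose because joining after a hyphen with a space is usually wrong.

```python
import re

# The quoted character class, with attributes on <p> allowed.
narrow = re.compile(r'([Ia-z,])</p>\s*<p[^>]*>')

# A broader class (an assumption for illustration): any Unicode word
# character plus comma, semi-colon and colon.
broad = re.compile(r'([\w,;:])</p>\s*<p[^>]*>')

# Hypothetical broken paragraphs; "calibre1" is an assumed class name.
samples = [
    '<p class="calibre1">She ordered a caf\u00e9</p>\n<p class="calibre1">au lait.</p>',
    '<p class="calibre1">It cost 40</p>\n<p class="calibre1">dollars.</p>',
    '<p class="calibre1">He brought three things:</p>\n<p class="calibre1">a map and a rope.</p>',
    '<p class="calibre1">They flew to NASA</p>\n<p class="calibre1">headquarters.</p>',
]

narrow_hits = [bool(narrow.search(s)) for s in samples]
broad_hits = [bool(broad.search(s)) for s in samples]

print(narrow_hits)
print(broad_hits)
```

The narrow class misses all four breaks (accented letter, digit, colon, uppercase word), while the broader one finds them. The broader one will of course also find more false positives.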

Also, unless you step through each match one at a time, any poems laid out in your book will get trashed.

Expressions earlier in this thread and in similar threads can improve the readability of most paragraphs. However, imho people do need to be reminded that the expressions in this thread will not catch "every" situation, nor should they just blindly do "Replace All" because they saw a regex in a thread that someone said worked for them.
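As a sketch of stepping through matches instead of replacing everything at once, something like the following lists each candidate with some context so it can be reviewed first. The helper name candidates, the pattern and the poem markup are all assumptions for illustration.

```python
import re

# Illustrative pattern only; "calibre1" below is an assumed class name.
JOIN = re.compile(r'([\w,;:])</p>\s*<p[^>]*>')

def candidates(html, context=30):
    """Yield (position, snippet) for every place the pattern would join."""
    for m in JOIN.finditer(html):
        start = max(0, m.start() - context)
        end = min(len(html), m.end() + context)
        yield m.start(), html[start:end]

# A short poem where each line is deliberately its own paragraph.
poem = ('<p class="calibre1">Roses are red</p>\n'
        '<p class="calibre1">Violets are blue</p>\n'
        '<p class="calibre1">Sugar is sweet.</p>')

found = list(candidates(poem))
for pos, snippet in found:
    print(pos, repr(snippet))

# What a blind "Replace All" would do to the poem:
trashed = JOIN.sub(r'\1 ', poem)
print(trashed)
```

Both line breaks in the poem show up as candidates, and the blind replace collapses the three lines into one paragraph, which is exactly the case you want to catch by eye.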