Finding strings contained only in <p>...</p>
Some ebooks capitalize for emphasis and some capitalize all proper names.
The following expression easily finds all-caps words in a file: (\w{Lu}+\w).
The problem is that it finds every all-caps word, including those in headers and other places where caps are wanted.
I have been trying for some time, with no success, to build a regex that limits itself to the all-caps words between <p> tags.
Is there a way to do this?
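
One possible approach, sketched in Python rather than in an editor's search box, is to do it in two passes: first pull out each <p>...</p> block, then look for all-caps words only inside those blocks. This is a minimal sketch under stated assumptions: the sample markup, the names P_BLOCK, CAPS_WORD and caps_in_paragraphs are made up for illustration, paragraphs are assumed not to be nested, and "all caps" is taken to mean two or more ASCII capitals (Python's built-in re module has no \p{Lu}).

```python
import re

# Hypothetical sample markup; a real ebook file will look different.
html = """<h1>CHAPTER ONE</h1>
<p>He was REALLY angry at JOHN.</p>
<div>NOT A PARAGRAPH</div>
<p class="x">Nothing here, but NASA counts.</p>"""

# Pass 1: capture the contents of each <p ...>...</p> block (assumes no nested <p>).
P_BLOCK = re.compile(r"<p\b[^>]*>(.*?)</p>", re.DOTALL | re.IGNORECASE)
# Pass 2: words made of two or more ASCII capitals (assumption: ASCII-only text).
CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")

def caps_in_paragraphs(text):
    """Return all-caps words that occur inside <p> blocks only."""
    found = []
    for block in P_BLOCK.finditer(text):
        # Drop any inline tags so tag names and attributes are not searched.
        inner = re.sub(r"<[^>]+>", " ", block.group(1))
        found.extend(CAPS_WORD.findall(inner))
    return found

print(caps_in_paragraphs(html))  # ['REALLY', 'JOHN', 'NASA']
```

The heading and the <div> text are skipped because they are never handed to the second pattern. Doing the same thing in a single search-and-replace expression depends on what the editor's regex engine supports, so this only shows the idea.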