10-21-2011, 01:16 AM   #5
cybmole
Quote:
Originally Posted by Serpentine
Assuming you want to catch all h1/h2 etc.:

find : <(h\d)[^<>]*>(.+)</\1>
replace : <\1>\2</\1>

If you are cleaning up books or something, it's often best to work from an HTMLZ export. In the format options you can select how CSS is handled; the 'tag' option will not add any extra stuff, though I can't remember if it preserves bold/italic etc. formatting, i.e. placing <i> tags.
Impressive. I wimped out and did multiple simpler reductions.

But I think your code misses the fact that there's both a chapter number and a chapter title within the dross, both of which should ideally be salvaged.
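
For anyone wanting to try this outside the editor, here is a rough Python sketch of the quoted find/replace, plus one possible way to keep the chapter number and title as plain text. The actual markup isn't shown in this thread, so the sample heading (nested spans with made-up class names) and the helper names `strip_heading_attrs` and `flatten_heading` are assumptions for illustration only, not anyone's actual method.

```python
import re

# Pattern from the quoted post: group 1 is the heading tag name (h1, h2, ...),
# [^<>]* skips any attributes, group 2 is everything up to the matching close tag.
HEADING = re.compile(r"<(h\d)[^<>]*>(.+)</\1>")

# Hypothetical helper pattern: any single tag inside the heading.
INNER_TAG = re.compile(r"<[^<>]+>")


def strip_heading_attrs(html: str) -> str:
    """Apply the quoted find/replace: drop attributes on the heading tag only."""
    return HEADING.sub(r"<\1>\2</\1>", html)


def flatten_heading(html: str) -> str:
    """Assumed follow-up step: also drop tags inside the heading, keeping their text."""

    def repl(match: re.Match) -> str:
        tag, inner = match.group(1), match.group(2)
        # Replace each inner tag with a space, then collapse runs of whitespace.
        text = " ".join(INNER_TAG.sub(" ", inner).split())
        return f"<{tag}>{text}</{tag}>"

    return HEADING.sub(repl, html)


# Made-up sample: chapter number and title wrapped in separate spans.
sample = (
    '<h2 class="calibre5" id="c3">'
    '<span class="num">3</span><span class="ttl">The Storm</span>'
    "</h2>"
)

print(strip_heading_attrs(sample))
print(flatten_heading(sample))
```

With this assumed layout, the first call keeps the inner spans untouched and the second reduces the heading to the number and title as plain text. Note that `.+` is greedy and does not cross line breaks by default, so a line holding two headings of the same level, or a heading split over several lines, would need a different pattern.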