09-28-2008, 01:26 PM   #13
Sparrow
Device: Palm TX, CyBook Gen3
Quote:
Originally Posted by Richard Herley
In Word's find-and-replace dialog, ^p will find the end of a paragraph and ^# will find any digit.

Assuming that combination doesn't occur in the text you want, and assuming the page numbering does not exceed 999:

find ^p^#^#^#
replace ^p

find ^p^#^#
replace ^p

find ^p^#
replace ^p


should strip 'em out!
Wouldn't you need to end each Find string with another ^p (e.g. ^p^#^#^#^p), to avoid catching chapter headings that start with a number followed by text?
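
To show the difference outside Word, here is a rough Python sketch of both approaches. It is only an approximation: it assumes the text has been saved as plain text with one "\n" per paragraph end, and the function names and sample text are made up for illustration.

```python
import re


def strip_numbers_unanchored(text: str) -> str:
    """Mimic the three quoted passes: ^p^#^#^#, then ^p^#^#, then ^p^#."""
    for n in (3, 2, 1):
        # A paragraph end followed by n digits; whatever follows is kept.
        text = re.sub(r"\n[0-9]{%d}" % n, "\n", text)
    return text


def strip_numbers_anchored(text: str) -> str:
    """Same passes, but the digits must be a whole paragraph on their own."""
    for n in (3, 2, 1):
        # The lookahead requires a second paragraph end without consuming it,
        # so two page numbers on consecutive lines are both caught.
        text = re.sub(r"\n[0-9]{%d}(?=\n)" % n, "", text)
    return text


# Hypothetical sample: a page number on its own line, then a heading
# that happens to start with a digit.
sample = (
    "end of page one.\n"
    "12\n"
    "start of page two.\n"
    "3 Musketeers Arrive\n"
    "more text.\n"
)

print(strip_numbers_unanchored(sample))
print(strip_numbers_anchored(sample))
```

On this sample the unanchored version eats the "3" from the heading and leaves an empty paragraph where the page number was, while the anchored version removes only the stand-alone "12" line.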

The 'PDF to HTML' program seemed to do a reasonable job.
There were quite a few blank rectangles scattered through the text, but these were easily removed in Word by putting ^g (any graphic) in the Find box of Find & Replace and replacing with nothing.
(There were 482 image files produced by 'PDF to HTML' - all small, none appeared significant.)