Thread: Invisible text
Old 05-04-2009, 02:44 PM   #10
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
When I retrieve a Project Gutenberg ebook in HTML form, I usually leave the page-number anchors (the targets that hrefs point to) in place, but strip the visible page numbers with a regex, as in this Perl example:
Code:
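# Remove the visible page numbers, keeping the tag(s) at the start of each pagenum span.
# The .* is greedy, so this assumes no other </span> follows on the same line.
$html =~ s#<span class='pagenum'><(.*[^>])>.*</span>#<$1>#gi;
$html =~ s#<span class="pagenum"><(.*[^>])>.*</span>#<$1>#gi;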
The whole span is replaced by just the <a name/id> anchor, i.e. <$1>.
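To make the effect concrete, here is a small sketch that runs the single-quote variant on one sample line. The sample markup is hypothetical (an empty anchor followed by the page number inside the span); real Project Gutenberg files vary, so treat this as an illustration rather than a guarantee for every book.

```perl
use strict;
use warnings;

# Hypothetical sample line: an empty named anchor followed by the
# visible page number, both wrapped in a pagenum span.
my $html = q{<p>Some text<span class='pagenum'><a name="Page_12" id="Page_12"></a>12</span> continues here.</p>};

# Same substitution as in the post: keep what sits between the first "<"
# and the last ">" before </span>, and drop the text after it.
$html =~ s#<span class='pagenum'><(.*[^>])>.*</span>#<$1>#gi;

print "$html\n";
# prints: <p>Some text<a name="Page_12" id="Page_12"></a> continues here.</p>
```

Note that if the page-number text sits inside the anchor instead (for example `<a name="Page_12">[Pg 12]</a>` directly before `</span>`), this pattern captures the text as part of `$1` and leaves it in place, so it is worth checking which form a given book uses.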