05-30-2008, 03:08 AM   #51
alexxxm
Quote:
Originally Posted by kovidgoyal
For links just one level away -r 1 should do the trick. You can easily ask the scraper to follow only links of a certain type using the --match-regexp option
I'm still asking here even though I just discovered the other thread on "content". I'll move there once this is cleared up:

I tried what you suggested and set "Max recursions=1" in the bookit options, but I'm having trouble with regexps
for the site http://www.cityguide.travel-guides.c...rope/Lyon.html

I want to follow all the internal links that have "72" in the address.

I tried each of the following in Meta-data > Additional parameters:
"--match-regexp 72", "--match-regexp=72", "--match-regexp *72*", "--match-regexp=*72*"
None of them worked: it just saves the original page and nothing else.
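
To narrow things down, here is a minimal sketch of how those patterns behave under Python's standard re module. It assumes the scraper hands the pattern to a Python-style regex engine, which I haven't confirmed, and the URL is a made-up placeholder, not a real link from the site. The check() helper is just for illustration.

```python
import re

# Hypothetical URL, made up for illustration (not taken from the real site).
URL = "http://example.com/guide/72/overview.html"

def check(pattern, url=URL):
    """Classify a pattern as 'invalid', 'search+match', 'search only' or 'no match'."""
    try:
        rx = re.compile(pattern)
    except re.error:
        # "*72*" ends up here: in a regex, "*" needs something before it to repeat.
        return "invalid"
    found = rx.search(url) is not None    # pattern may occur anywhere in the URL
    anchored = rx.match(url) is not None  # pattern must match from the first character
    if found and anchored:
        return "search+match"
    if found:
        return "search only"
    return "no match"

for p in ["72", "*72*", ".*72.*"]:
    print(repr(p), "->", check(p))
```

So "*72*" is shell-glob syntax rather than a regexp, and a bare "72" only matches if the engine searches inside the URL instead of anchoring at the start. If the scraper anchors, ".*72.*" would be the form to try, but that is a guess about its behaviour.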

Any hints?

thanks...

alessandro