Quote:
Originally Posted by kovidgoyal
You should be able to use recursion and --match-regexps with web2lrf to follow the links and convert the entire thread.
Hi, Kovid
First of all, thanks for your excellent program. It makes using Sony Reader better than I ever expected.
I tried this method, but I could not get a satisfactory result.
For example, starting from this link:
https://www.mobileread.com/forums/pri...?t=19142&pp=40
I would like to include only the following links, in addition to the starting page itself:
<a class="smallfont" href="printthread.php?t=19142&pp=40&page=2 " title="Show results 41 to 80 of 193">2</a>
<a class="smallfont" href="printthread.php?t=19142&pp=40&page=3 " title="Show results 81 to 120 of 193">3</a>
<a class="smallfont" href="printthread.php?t=19142&pp=40&page=4 " title="Show results 121 to 160 of 193">4</a>
<a class="smallfont" href="printthread.php?t=19142&pp=40&page=5 " title="Show results 161 to 193 of 193">5</a>
However, the following self-referencing link always appears on the printable page, so the page it points to (the starting page) ends up included twice in the resulting LRF:
<a href="printthread.php?t=19142&pp=40">Show 40 post(s) from this thread on one page</a>
Is there a way to include this page only once in the LRF?
I tried this:
Code:
web2lrf -u "https://www.mobileread.com/forums/printthread.php?t=19142&pp=40" default -r 1 -t "Reading" -a "Mobileread" --link-levels=1 --ignore-tables --match-regexp="printthread"
and this:
Code:
web2lrf -u "https://www.mobileread.com/forums/printthread.php?t=19142&pp=40" default -r 1 -t "Reading" -a "Mobileread" --link-levels=1 --ignore-tables --match-regexp="printthread" --link-exclude="printthread.php?t=19142&pp=40$"
However, both commands give me identical results. I would appreciate any pointers on how to improve this.
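A note on the exclude pattern in the second command: if web2lrf treats the value as a Python regular expression and searches each URL with it (an assumption; I have not checked its source), then the unescaped `.` and `?` act as metacharacters rather than literal characters. In that case `php?t` means "ph, an optional p, then t", which never matches the literal `php?t` in the URL, and that could explain why the option has no effect. A minimal sketch to check the patterns outside web2lrf; the `matches` helper and the variable names are only for illustration:

```python
import re

# URLs taken from the post above
base = "https://www.mobileread.com/forums/printthread.php?t=19142&pp=40"
page2 = base + "&page=2"

# Pattern exactly as passed on the command line: '.' and '?' are regex metacharacters
unescaped = r"printthread.php?t=19142&pp=40$"
# Same pattern with '.' and '?' escaped so they match literally
escaped = r"printthread\.php\?t=19142&pp=40$"
# Hypothetical pattern that matches only the numbered page links
pages_only = r"printthread\.php\?t=19142&pp=40&page=\d+$"


def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL (assumed re.search semantics)."""
    return re.search(pattern, url) is not None


for name, pattern in [("unescaped", unescaped), ("escaped", escaped), ("pages_only", pages_only)]:
    print(name, "| base:", matches(pattern, base), "| page 2:", matches(pattern, page2))
```

If the assumption holds, the escaped form should match only the self-referencing link, and the `pages_only` form might work as a match pattern that keeps just the numbered pages. I have not verified either against web2lrf itself.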