Kovid,
I noticed that web2lrf completely drops any words that are hyperlinked. This makes some articles a little hard to understand, since key words are sometimes left out.
As an example, in the following article the names "David Beckham," "Adidas," and "Pepsi" are all dropped when it is converted to an LRF.
http://www.nytimes.com/2007/11/17/bu...gewanted=print
The same thing happens when I download the HTML file and run it through html2lrf. I've attached the LRF I generated as an example.
Is there something about linked text that makes it difficult to parse? Or is this simply a bug that needs to be eliminated?
Thanks a lot for your help.
BTW, still trying to get some profiles made. Not knowing Python is proving to be a rather large stumbling block, however.