MobileRead Forums

JTravers · 11-21-2007, 04:45 AM

Kovid,
I noticed that web2lrf drops hyperlinked words entirely instead of keeping the link text. This makes some articles a little hard to understand, since key words are sometimes left out.

As an example, in the following article the names "David Beckham," "Adidas," and "Pepsi" are all missing once the article is converted to an lrf.
http://www.nytimes.com/2007/11/17/bu...gewanted=print

The same thing happens when I download the html file and run it through html2lrf. I've attached the lrf I generated as an example.

Is there something about linked text that makes it difficult to parse? Or is this simply a bug that needs to be eliminated?

Thanks a lot for your help.

BTW, still trying to get some profiles made. Not knowing Python is proving to be a rather large stumbling block, however.