MobileRead Forums - View Single Post

adinb · 03-16-2007, 03:20 AM

Has anyone else had problems getting the current version of web2book (v23, I believe) to "Apply extractor to linked content instead of link text"?

Here's the deets:

Code:

     URL: http://www.abqtrib.com/feeds/headlines/
     Link Element: link
     (apply extractor to linked content)
     Link Reformatter: {0}?printer=1/

So, I'm just appending "?printer=1/" to the original link found in the link element to try to make it go to the printer-friendly page. Even though the log shows the link reformatter coming up with the correct printer-friendly links, the PDF output contains the original linked page instead. (Example: the content of http://abqtrib.com/news/2007/mar/15/...th-dwi-charge/ is ending up in the PDF instead of http://abqtrib.com/news/2007/mar/15/...ge/?printer=1/)
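To spell out what I expect to happen, here's a minimal Python sketch. It is not web2book's code: `reformat_link` and the example.com URL are made up for illustration, and I'm assuming `{0}` is simply replaced by the extracted link.

```python
def reformat_link(template: str, link: str) -> str:
    """Substitute the extracted link for the {0} placeholder (assumed behaviour)."""
    return template.replace("{0}", link)


# Hypothetical article URL, standing in for the truncated ones in this post.
original = "http://example.com/news/2007/mar/15/some-story/"
printer = reformat_link("{0}?printer=1/", original)

# The page I expect to be fetched is `printer`, not `original`.
print(printer)
```

The log suggests this substitution step works; what seems to go wrong is which of the two URLs actually gets fetched afterwards.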

This is all using the test function, so I haven't *absolutely* verified what will be put on my reader. But this really looks like it's not following the reformatted link. If there's a different preferred way of doing this (maybe something with the link extractor pattern?) I'd love to hear it. (I can probably extract text using the content reformatter, but then I miss small graphics accompanying the stories in print mode)

Log output (abbreviated):

Quote:

Processing Albuquerque Tribune Today
Got link from RSS: http://abqtrib.com/news/2007/mar/15/...th-dwi-charge/
Thu, 15 Mar 2007 22:05:00 -0000 is in range

Done link extraction
{0} = http://abqtrib.com/news/2007/mar/15/...th-dwi-charge/
Reformatted link is http://abqtrib.com/news/2007/mar/15/...ge/?printer=1/

HTML of the "normal" page follows (instead of the printer-friendly page)

EDIT: I've reproduced the same problem multiple times on other feeds, like The Reg.

Also, there seems to be some problem using the "link" element on RSS 0.91 and Atom feeds.
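One possible cause for the Atom part (a guess, not checked against web2book's source): Atom puts the URL in the `href` attribute of `<link>`, while RSS puts it in the element text, so anything that reads the element text gets nothing from an Atom entry. This wouldn't explain RSS 0.91, though. A small Python sketch with made-up feed snippets:

```python
import xml.etree.ElementTree as ET

# Made-up minimal snippets; real feeds carry many more elements.
rss_item = ET.fromstring("<item><link>http://example.com/a/</link></item>")
atom_entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<link href="http://example.com/a/"/></entry>'
)

rss_link = rss_item.find("link")
atom_link = atom_entry.find("{http://www.w3.org/2005/Atom}link")

print(rss_link.text)          # RSS: the URL is the element text
print(atom_link.text)         # Atom: None, the element is empty
print(atom_link.get("href"))  # Atom: the URL lives in the href attribute
```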

EDIT2: There also seems to be something funky going on when evaluating regexps with logical ORs of the form ( this | that).
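One thing worth ruling out on the regexp side: in most regex engines, whitespace inside a group is literal, so `( this | that)` is not the same pattern as `(this|that)`. I don't know which engine web2book uses, so this quick Python check only illustrates the general behaviour:

```python
import re

tight = re.compile(r"(this|that)")
spaced = re.compile(r"( this | that)")  # the spaces are part of each alternative

print(bool(tight.search("that")))       # matches the bare word
print(bool(spaced.search("that")))      # no match: needs " this " or " that"
print(bool(spaced.search("say that")))  # matches because of the leading space
```

If the pattern was typed with spaces around the `|`, that alone could make the OR look broken.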