Old 02-02-2012, 09:28 AM   #1
cornfieldcraig
Member
cornfieldcraig began at the beginning.
 
Posts: 12
Karma: 10
Join Date: Sep 2011
Location: Chicago, Illinois, USA
Device: Nook Simple Touch
Chicago Tribune Recipe appears broken

It looks like the Chicago Tribune has added a step before the initial page loads when you follow a link from a feed. An interim page displays the message "click here to continue to article" in multiple languages. If you click that link, it proceeds normally. This seems to happen once per browser, so I'm guessing it sets a cookie. If you don't click, the page proceeds automatically after a wait of perhaps 30 seconds. Unfortunately, the current Chicago Tribune recipe isn't equipped to handle this new speed bump. As a result, every article in the converted text is simply the link, with no article content.
Old 02-02-2012, 12:02 PM   #2
cornfieldcraig
I now see that when you follow some links in a Chicago Tribune RSS feed, there are a few pages of ads to wait or click through. At the moment I'm not seeing the "click here to continue to article" page, but I'm on a different PC with a different browser, so it's hard to say for sure whether the content or just the environment has changed since earlier today.
Old 02-02-2012, 09:51 PM   #3
cornfieldcraig
Here's the source of the interim page that's causing the problem:


<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

<html>

<head>

<title>Advertisement</title>

<style>

A {

color: gray;

font-family: Arial;

font-size: 10pt;

font-weight: bold;

}

</style>

</head>

<body onload="setTimeout( 'location.href = \'http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss\'',18000);" ><div align="right"><p style="width: 250px; align:left; text-align:left; color: gray;"><a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">click here to continue to article</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">cliquez ici pour lire l'article</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">weiter zum Artikel</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">clicca qui per visualizzare l'articolo</a>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">weiter zum Artikel</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">ir a la noticia</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">klik hier om door te gaan naar het artikel</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">Yazıya devam etmek i&#xE7;in tıklayın</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">Перейти к статье</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">继续阅读文章,请点击这里</a><br>

<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss">Tovább a cikkre</a>

</p></div>

<div align="center"><div align="center">
<SCRIPT language='JavaScript1.1' SRC="http://ad.doubleclick.net/adj/N3867.289335.MEDIAFED.COM/B6175432.2;sz=300x250;pc=[TPAS_ID];click=http://da.feedsportal.com/c/34253/f/622809/s/1c5c323e/l/0L0Schicagotribune0N0Cnews0Cchi0Erose0Evoted0Eallstar0Estarter0E20A120A20A20H0A0H7880A50A10Bstory0Dtrack0Frss/iac.htm?cp_lnk=7824_;ord=1472651?">
</SCRIPT>
</div><script language="javascript">
document.write( "<img src=\"http://da.feedsportal.com/c/34253/f/622809/camp/7824/iad.gif\" />" );
</script></div>

</body>

</html>
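
For anyone wanting to check how long the automatic redirect actually waits, here is a rough sketch (not part of the recipe) that pulls the target URL and the delay out of the `onload` attribute shown above. The names `REDIRECT_RE` and `find_timed_redirect` are made up for illustration, and the pattern assumes the page keeps the exact `setTimeout( 'location.href = ...', N)` shape seen in this one sample.

```python
import re

# The <body> tag copied from the interim page source above.
BODY_TAG = r"""<body onload="setTimeout( 'location.href = \'http://www.chicagotribune.com/news/chi-rose-voted-allstar-starter-20120202,0,7880501.story?track=rss\'',18000);" >"""

# Assumed pattern: location.href = \'URL\'' , DELAY_IN_MS
REDIRECT_RE = re.compile(
    r"""location\.href\s*=\s*\\?'(?P<url>[^'\\]+)\\?'\s*'\s*,\s*(?P<delay_ms>\d+)"""
)


def find_timed_redirect(html):
    """Return (url, delay_seconds) for a setTimeout redirect, or None."""
    m = REDIRECT_RE.search(html)
    if m is None:
        return None
    return m.group("url"), int(m.group("delay_ms")) / 1000.0


if __name__ == "__main__":
    print(find_timed_redirect(BODY_TAG))
```

On this sample the timeout argument is 18000 milliseconds, so the page waits 18 seconds before moving on by itself.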

Last edited by cornfieldcraig; 02-02-2012 at 10:01 PM.
Old 02-02-2012, 10:01 PM   #4
kovidgoyal
creator of calibre
Posts: 45,196
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Try adding this to the recipe:
Code:
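    def skip_ad_pages(self, soup):
        # calibre calls this with the soup of each downloaded page, so
        # look for the English "continue" link that marks the ad page.
        text = soup.find(text='click here to continue to article')
        if text:
            a = text.parent  # the <a> element wrapping the link text
            url = a.get('href')
            if url:
                # Return the raw HTML of the real article instead.
                return self.index_to_soup(url, raw=True)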
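
If you want to try the same detection idea outside calibre, the following is a minimal sketch using only the Python standard library. It is not the recipe code: `ContinueLinkFinder` and `real_article_url` are hypothetical names, and it only finds the link target; it does not download the article the way `index_to_soup` does in the method above.

```python
from html.parser import HTMLParser

CONTINUE_TEXT = 'click here to continue to article'


class ContinueLinkFinder(HTMLParser):
    """Record the href of the first <a> whose text equals CONTINUE_TEXT."""

    def __init__(self):
        super().__init__()
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text pieces collected inside that <a>
        self.url = None    # result: target of the "continue" link

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a':
            link_text = ''.join(self._text).strip()
            if self.url is None and self._href and link_text == CONTINUE_TEXT:
                self.url = self._href
            self._href = None


def real_article_url(html):
    """Return the article URL if html is the ad page, else None."""
    finder = ContinueLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.url


if __name__ == "__main__":
    sample = (
        '<body><p>'
        '<a href="http://www.chicagotribune.com/news/chi-rose-voted-allstar-'
        'starter-20120202,0,7880501.story?track=rss">'
        'click here to continue to article</a><br>'
        '</p></body>'
    )
    print(real_article_url(sample))
```

Like the recipe method, this matches on the exact English link text, so it would stop working if the site changed that wording.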
Old 02-02-2012, 10:43 PM   #5
cornfieldcraig
Thanks Kovid. Worked like a charm.