05-27-2011, 01:03 PM | #1 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
get print-url and sometimes non-print-url
Hi,
I think the code is correct, but sometimes the same article also shows up in its non-print version. In the fetch details I can see that only the print URL is downloaded. Any suggestions? Code:
title = u'Focus.de - online1'
__author__ = 'schuster'
oldest_article = 5
max_articles_per_feed = 5
no_stylesheets = True
use_embedded_content = False
language = 'de'
remove_javascript = True
# recursion = 0

def get_article_url(self, article):
    raw = article.get('guid', None)
    final = raw + '?drucken=1'
    return final

feeds = [(u'Focus online', u'http://rss2.focus.de/c/32191/f/443315/index.rss')]
|
05-27-2011, 02:02 PM | #2 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Your question isn't clear to me. Your code takes the article URL (retrieved by article.get('guid', None)) and appends '?drucken=1' to the end. I assume that pulls the print version. It will always try to fetch the URL with '?drucken=1' appended. Do you want it to do something else? If so, when?
|
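For reference, the URL handling described above can be sketched outside the recipe. This is only an illustration, not the poster's code: build_print_url is a hypothetical helper name, the example URLs are made up, and it assumes the feed's guid is the plain article URL. Unlike the simple string concatenation, it returns None when the guid is missing and uses '&' if the URL already carries a query string.

```python
from urllib.parse import urlsplit, urlunsplit


def build_print_url(guid):
    """Return guid with the 'drucken=1' parameter added, or None if guid is empty.

    Hypothetical helper for illustration; not part of calibre.
    """
    if not guid:
        # The feed entry had no guid, so there is nothing to build a URL from.
        return None
    parts = urlsplit(guid)
    # Append with '&' when a query string already exists, otherwise start one.
    query = parts.query + '&drucken=1' if parts.query else 'drucken=1'
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


if __name__ == '__main__':
    # Made-up example URL, only to show the shape of the result.
    print(build_print_url('http://www.focus.de/beispiel_aid_1.html'))
```

Inside a recipe, get_article_url could call such a helper with article.get('guid', None); how a missing guid should be handled there is left open.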
05-27-2011, 02:07 PM | #3 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
Hi starson,
No, what I mean is that the right articles were fetched (print version), but the same articles were also fetched in their non-print version, and I don't know why. The return value of get_article_url IS the print version, but the MOBI file contains both print and non-print versions. |
05-27-2011, 02:14 PM | #4 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Have you gone to that URL (the URL that has '?drucken=1' at the end, but shows up as both the print and non-print version) to see what appears at that URL? If it looks like just the print version in your browser, look at the code for the page to see if there's something tricky happening.
|
05-28-2011, 03:01 AM | #5 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
OK, I have checked everything and...
A few articles appear twice in the feed. They have the same name (URL) and the same source code. Is it possible that the fetch engine gets confused by this? After download, one of these articles is in the print version and the other is in web style, with all the links and ads. As I said, there are no errors in the log, and the fetched URL is always the print version. |
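If the repeated feed entries do turn out to be the cause (that is not confirmed in this thread), one option is to drop the repeats before anything is downloaded. The sketch below is a minimal illustration under stated assumptions: dedupe_by_url is a hypothetical helper, and feed entries are modelled as plain dicts with a 'guid' key, which is not how calibre represents articles internally. Wiring this into a recipe, for example by overriding parse_feeds, is not shown.

```python
def dedupe_by_url(entries, key='guid'):
    """Keep the first entry for each URL and drop later repeats.

    Hypothetical helper for illustration; entries are assumed to be dicts.
    Entries without a URL are kept, since they cannot be compared.
    """
    seen = set()
    unique = []
    for entry in entries:
        url = entry.get(key)
        if url is not None and url in seen:
            # Same URL already seen earlier in the feed: skip the repeat.
            continue
        if url is not None:
            seen.add(url)
        unique.append(entry)
    return unique


if __name__ == '__main__':
    # Made-up entries: the first and third share a URL.
    feed = [
        {'title': 'A', 'guid': 'http://www.focus.de/a.html'},
        {'title': 'B', 'guid': 'http://www.focus.de/b.html'},
        {'title': 'A', 'guid': 'http://www.focus.de/a.html'},
    ]
    print([e['title'] for e in dedupe_by_url(feed)])
```

Newer calibre versions also provide a recipe attribute, ignore_duplicate_articles, which may cover this case without custom code; whether it applies to repeats within a single feed is worth checking in the calibre documentation.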