10-07-2012, 08:22 AM | #1 |
Addict
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
|
Articles repeated in different feed sections
Like others, I'm aware some sites have links to the same article in more than one feed.
Here is an attempt at sorting this, based on the ideas in the re-usable section. Please bear in mind I am no programmer and have to google Python examples to make this, so it's clunky and crude. The basic idea is:
repeat
  Open a txt file.
  Get the feed URL.
  Is the article title in the txt file?
    No - it's unique: download it and append the article title to the txt file.
    Yes - it must be in a previous section: don't download it, and don't append it to the file (it's already in there).
until no more articles.
Here it is implemented in the BBC Nature recipe (which always has repeats). I've also tried it in Countryfile - this also seems to work. Spoiler:
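The steps above can be sketched roughly as follows. This is not the recipe code from the spoiler, only a minimal standalone illustration of the same idea; the function name filter_new_titles, the file name seen_titles.txt and the sample titles are all made up for the example.

```python
import os
import tempfile


def filter_new_titles(titles, seen_path):
    """Return the titles not yet recorded in seen_path and record them.

    filter_new_titles and seen_path are hypothetical names used only
    for this sketch; they are not part of calibre.
    """
    # Load the titles recorded so far (the file may not exist yet).
    seen = set()
    if os.path.exists(seen_path):
        with open(seen_path, encoding='utf-8') as f:
            seen = {line.rstrip('\n') for line in f}

    unique = []
    with open(seen_path, 'a', encoding='utf-8') as f:
        for title in titles:
            if title in seen:
                # Already listed, so it appeared in a previous section.
                continue
            seen.add(title)
            f.write(title + '\n')
            unique.append(title)
    return unique


# Two sections of one download sharing a single "seen" file.
seen_file = os.path.join(tempfile.mkdtemp(), 'seen_titles.txt')
section_one = filter_new_titles(['Otters return', 'Bee decline'], seen_file)
section_two = filter_new_titles(['Bee decline', 'Owl census'], seen_file)
```

The file has to start empty for each download, otherwise titles from an earlier run would be skipped as well.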
|
10-07-2012, 08:30 AM | #2 |
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
IIRC there's a technique for doing this in the "recipes - reusable code" sticky thread at the top of this forum.
https://www.mobileread.com/forums/sho...5&postcount=10 |
10-07-2012, 08:48 AM | #3 | |
Addict
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
|
Quote:
Hi Kovid, I tried it. It checks whether an article in a section has been downloaded in the past, and it creates a file for each section. So if article x is downloaded in section 1, it will never be downloaded into section 1 again. However, if article x is in section 1 in this month's mag, it can still appear in section 2 in the same month: the same article in 2 sections. My way means no article is repeated in any section within a single download (at least that's what happened for me). I hope that's the case, because that took me all morning (I'm not a programmer). Last edited by scissors; 10-07-2012 at 09:40 AM. |
|
10-07-2012, 12:44 PM | #4 |
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Do you want duplicates to be removed across downloads or only across feeds in a single download? If the latter, then you don't need to use a file; just store the titles in memory.
And generally speaking, you should use URLs as the key to check for duplicates, not the title. Less likely to have false duplicates that way. |
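A minimal sketch of that suggestion, assuming the parsed feeds are available as a list of (section title, list of article dicts) pairs. The function name drop_duplicates_across_feeds, the data layout and the example.com URLs are assumptions made for illustration, not calibre's internal structures.

```python
def drop_duplicates_across_feeds(feeds):
    """Keep only the first occurrence of each article URL across all feeds.

    feeds is assumed to be a list of (section_title, articles) pairs,
    where each article is a dict with 'title' and 'url' keys.
    """
    seen_urls = set()  # lives in memory for the duration of one download
    result = []
    for section, articles in feeds:
        kept = []
        for article in articles:
            url = article['url']
            if url in seen_urls:
                # Same URL already kept in an earlier section: skip it.
                continue
            seen_urls.add(url)
            kept.append(article)
        result.append((section, kept))
    return result


feeds = [
    ('Section 1', [
        {'title': 'Otters return', 'url': 'http://example.com/a1'},
        {'title': 'Bee decline', 'url': 'http://example.com/a2'},
    ]),
    ('Section 2', [
        {'title': 'Bee decline (update)', 'url': 'http://example.com/a2'},
        {'title': 'Owl census', 'url': 'http://example.com/a3'},
    ]),
]
deduped = drop_duplicates_across_feeds(feeds)
```

Keying on the URL catches the second entry even though its title differs, and two unrelated articles that happen to share a title would both be kept.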
10-07-2012, 01:06 PM | #5 | |
Addict
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
|
Quote:
Yes, I realise there is a risk of false duplicates and that URLs get used. I used a file because it was a way I could achieve what I wanted. The recipes I knock together are the whole of my Python knowledge and experience. I am not a programmer - feel free (anyone) to make them better. |
|
10-07-2012, 01:21 PM | #6 |
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I added an API to the BasicNewsRecipe class to do this. See http://bazaar.launchpad.net/~kovid/c...revision/13440
I haven't tested it, you can do that yourself after the next calibre release. All you need to do is add ignore_duplicate_articles = {'title'} to your recipe. |
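For reference, a recipe using this attribute might look something like the sketch below. The class name, title and feed URLs are placeholders invented for the example, and the fallback stub only exists so the snippet can be read and run outside calibre; inside calibre the real BasicNewsRecipe is used.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so this sketch runs outside calibre; not the real class.
    class BasicNewsRecipe(object):
        pass


class DedupedExampleRecipe(BasicNewsRecipe):
    # Placeholder title and feeds, made up for illustration.
    title = 'Example feed (duplicates ignored)'
    feeds = [
        ('Section 1', 'http://example.com/section1/rss.xml'),
        ('Section 2', 'http://example.com/section2/rss.xml'),
    ]

    # Skip an article if one with the same title was already seen
    # in another section of the same download.
    ignore_duplicate_articles = {'title'}
```

With this there is no need for the txt file at all, since the attribute is set once on the recipe class.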
10-17-2012, 01:54 PM | #7 |
Member
Posts: 24
Karma: 142
Join Date: Sep 2010
Device: K3, KPW
|
Hi Kovid, Scissors,
I added
Code:
ignore_duplicate_articles = {'title', 'url'}
Kovid, thanks for giving us this useful addition; and of course THANKS for Calibre. Last edited by _reader; 10-17-2012 at 03:24 PM. Reason: edit |
10-19-2012, 11:02 AM | #8 | |
Addict
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
|
Quote:
oh well, that's progress |
|
10-19-2012, 11:27 AM | #9 |
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Writing your own code is the best way to learn. That's how I learned to program.
|