#16 |
Addict
Posts: 219
Karma: 404
Join Date: Nov 2010
Device: Kindle 3G, Samsung SIII
|
@Starson17: You could just look at the date of the last run and exclude every article dated earlier than that. At least, I assume that is what was meant by 'date comparison'.
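A minimal sketch of that date comparison, assuming each article is a plain dict with a `date` key holding a `datetime`. The function name `articles_since` and the dict layout are made up for illustration and are not calibre's own API.

```python
from datetime import datetime


def articles_since(articles, last_run):
    """Return the articles dated after last_run.

    On a first run there is no stored date (last_run is None),
    so every article is kept.
    """
    if last_run is None:
        return list(articles)
    return [a for a in articles if a["date"] > last_run]


# Hypothetical data: one article older than the last run, one newer.
last_run = datetime(2010, 11, 16, 4, 0)
articles = [
    {"url": "http://example.com/old", "date": datetime(2010, 11, 15, 9, 0)},
    {"url": "http://example.com/new", "date": datetime(2010, 11, 16, 9, 0)},
]
fresh = articles_since(articles, last_run)
```

The filter itself is trivial; the open question is where `last_run` comes from between runs.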
|
#17 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Yes, but to do that you need to know the date of the last run, and that requires storing that date somewhere during the previous run and fetching it during the current run to do the comparison. That's exactly what my comment was about - passing data from one recipe run to the next one.
|
#18 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
I went back and looked at my original post. I suspect he was originally commenting on the statement that in my tests I stored article URLs for comparison. The reason I did it that way and not only by date comparison (I actually did it both ways) was to also solve the problem of duplicate articles seen in many recipes where the same article is often listed in several feeds. The feed on Energy and the feed on Politics might have the same article listed about the new Energy Bill. I was thinking about how to solve both problems simultaneously.
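A minimal sketch of the URL-history idea, assuming feeds arrive as `(feed_title, articles)` pairs and each article is a dict with a `url` key. The function name `filter_new_articles` and the data layout are hypothetical. One set of seen URLs handles both cases: articles already fetched on an earlier run and articles repeated across feeds in the same run.

```python
def filter_new_articles(feeds, seen_urls):
    """Drop articles whose URL is already in seen_urls.

    feeds: list of (feed_title, [article dicts with a 'url' key]).
    seen_urls: set of URLs from earlier runs; updated in place, so a
    URL listed in two feeds is kept only in the first one.
    """
    result = []
    for title, articles in feeds:
        kept = []
        for article in articles:
            url = article["url"]
            if url in seen_urls:
                continue
            seen_urls.add(url)
            kept.append(article)
        result.append((title, kept))
    return result


# Hypothetical example: the same Energy Bill article in two feeds.
seen = set()
feeds = [
    ("Energy", [{"url": "http://example.com/energy-bill"}]),
    ("Politics", [{"url": "http://example.com/energy-bill"},
                  {"url": "http://example.com/election"}]),
]
first_run = filter_new_articles(feeds, seen)
second_run = filter_new_articles(feeds, seen)
```

The set would have to be saved between runs for the history part to work.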
|
#19 | |
Zealot
Posts: 115
Karma: 20
Join Date: Jul 2010
Device: Kindle3 3G, Kindle Paperwhite 2
|
Quote:
Of course, the duplicate article problem cannot be solved that way.
Last edited by oecherprinte; 11-16-2010 at 04:22 AM. |
|
#20 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Regardless of all that, I eventually decided I didn't like stripping articles, particularly since it had to be done at the individual recipe level. You'd either have two of every recipe, or some recipes would have this and some wouldn't. A feature like this should probably be implemented at a higher level. I was interested in how it could be done, but when I looked at it closely, it wasn't a feature I actually wanted. |
|
#21 | ||
Zealot
Posts: 115
Karma: 20
Join Date: Jul 2010
Device: Kindle3 3G, Kindle Paperwhite 2
|
Quote:
Quote:
|
||
#22 |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: kindle
|
I have calibre running on my server to get news every day and it works great.
As others have pointed out, it would be great to receive only feeds with new content. Could this be solved simply by hashing the downloaded RSS, comparing that hash to a stored one, and downloading and sending the feed to the Kindle only if a change is detected? This sounds easy and would have a great impact. |
#23 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
No, you can't do it this way. Recipes vary, but most feeds include a date or some other field that changes on every fetch, so the hash would change even when there are no new articles. There are other ways to do it, such as date comparison and article URL history. The problem is mostly finding a developer who wants this enough to do the work.
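To illustrate the hashing point, here is a minimal sketch that assumes a plain RSS 2.0 feed with a `lastBuildDate` element. The feed text and both function names are made up for illustration. Hashing the whole feed gives a different result when only the build date changes, while hashing just the item links does not. The second function is one possible variation and has not been tested against calibre or real feeds.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical feed; only the build date varies between fetches.
FEED_TEMPLATE = """<rss version="2.0"><channel>
<title>Example</title>
<lastBuildDate>{built}</lastBuildDate>
<item><title>Energy Bill</title><link>http://example.com/energy-bill</link></item>
</channel></rss>"""


def whole_feed_hash(xml_text):
    """Hash every byte of the feed, including dates and other volatile fields."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def item_links_hash(xml_text):
    """Hash only the sorted item links, ignoring everything else in the feed."""
    root = ET.fromstring(xml_text)
    links = sorted((el.text or "") for el in root.findall("./channel/item/link"))
    return hashlib.sha256("\n".join(links).encode("utf-8")).hexdigest()


monday = FEED_TEMPLATE.format(built="Mon, 15 Nov 2010 08:00:00 GMT")
tuesday = FEED_TEMPLATE.format(built="Tue, 16 Nov 2010 08:00:00 GMT")

whole_changed = whole_feed_hash(monday) != whole_feed_hash(tuesday)
links_changed = item_links_hash(monday) != item_links_hash(tuesday)
```

Even the link-only hash answers only "did anything change?", so it cannot strip individual old articles the way date comparison or URL history can.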
|
#24 |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: kindle
|
Ah well... I'm a Java / C# developer. Out of curiosity I downloaded the sources and it seems Python is... well... different 8~(
Anyway, I see fetching is done in web/fetcher. Could you point me in the direction where this feature should be implemented? I'm not promising anything, but I am a dev who really wants this feature :P |
#25 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Perhaps calibre/web/feeds/feedparser.py as well. One of your first questions will be where to store info for whatever kind of comparison you want to do. This is how I stored the last time a recipe of "recipe_name" ran and the URL of an article retrieved on that run:
Code:
    import datetime

    # Store the time of this run and the URL of an article it retrieved.
    url = last_downloaded_article_url
    now = datetime.datetime.now()
    dynamic['recipe_name']['last_time'] = now
    dynamic['recipe_name']['last_url'] = url
Code:
    # On the next run, read the stored time back for comparison.
    last_time_this_recipe_ran = dynamic['recipe_name']['last_time']
Edit: You may need to import pickle and open/load dynamic.pickle, which is where this sort of recipe-related history seems to be kept.
Last edited by Starson17; 11-17-2010 at 04:09 PM. |
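For readers who want to try the storage idea outside calibre, here is a minimal standalone sketch using `pickle` directly. It does not touch calibre's `dynamic` object or its dynamic.pickle file. The file name and the helper names `load_history`, `save_history` and `record_run` are hypothetical.

```python
import os
import pickle
from datetime import datetime


def load_history(path):
    """Return the stored history dict, or an empty dict on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as f:
        return pickle.load(f)


def save_history(path, history):
    """Write the whole history dict back to disk."""
    with open(path, "wb") as f:
        pickle.dump(history, f)


def record_run(history, recipe_name, last_url, now=None):
    """Store the run time and last article URL under the recipe's name."""
    # setdefault creates the per-recipe entry the first time a recipe runs.
    entry = history.setdefault(recipe_name, {})
    entry["last_time"] = now or datetime.now()
    entry["last_url"] = last_url
    return entry
```

A run would call `load_history`, read `history.get('recipe_name')` for the comparison, then call `record_run` and `save_history` once the download succeeds. Only unpickle files the program wrote itself, since `pickle.load` can execute arbitrary code from untrusted data.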
|
#26 |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: kindle
|
Ok, this doesn't seem to be so hard, at least to some degree.
I added unique_id to Article class:
    def unique_id(self):
        md5 = hashlib.md5()
        md5.update(self.id)
        md5.update(self.title)
        md5.update(self.url)
        return md5.hexdigest()
Then, in news.py after line 920 I added:
    if a == 0:
        last_article = dynamic['recipe_'+self.title+'last_article']
        if last_article is not None:
            #print last_article
            if last_article == article.unique_id():
                print " Nothing to do"
                raise ValueError('No articles found, aborting')
        dynamic['recipe_'+self.title+'last_article'] = article.unique_id()
The good: yeey, if the last article is the last article we downloaded, processing is aborted.
The bad: I see no other (easy) option than throwing an exception, which of course propagates to UI. Any idea where / how to silently handle it? |
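One general pattern for the exception question is a dedicated exception class that the calling code catches and treats as a normal "nothing new" result, instead of a generic `ValueError`. The sketch below is standalone. `NothingNewError`, `check_first_article` and `run_recipe` are hypothetical names, and it makes no claim about where calibre itself would catch such an exception.

```python
class NothingNewError(Exception):
    """Raised when the newest article matches the one stored from the last run."""


def check_first_article(article_id, last_article_id):
    """Raise NothingNewError if the first article was already seen."""
    if last_article_id is not None and article_id == last_article_id:
        raise NothingNewError("no new articles since last run")


def run_recipe(article_ids, last_article_id):
    """Return (status, id_to_store); status is 'skipped' or 'downloaded'."""
    if not article_ids:
        return "skipped", last_article_id
    try:
        check_first_article(article_ids[0], last_article_id)
    except NothingNewError:
        # Expected condition: report it quietly instead of surfacing an error.
        return "skipped", last_article_id
    return "downloaded", article_ids[0]
```

Because the class is specific, the caller can swallow it without also hiding real failures, which a blanket `except ValueError` would risk.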
||||
Thread | Thread Starter | Forum | Replies | Last Post |
How to transfer only previously unretrieved RSS posts? | lymeswold | Recipes | 10 | 10-14-2010 07:37 PM |
Is there a good way to convert partial rss to full rss feeds. | Zorz | Other formats | 5 | 05-29-2010 12:17 PM |
RSS feeds | peejay | PocketBook | 2 | 04-26-2010 05:16 AM |
RSS feeds | ichor | iRex | 1 | 03-01-2008 11:30 PM |