hi alex,
still loving this a lot.
Quote:
Originally Posted by Alexander
The refresh rate is the only reason for having message gaps since only the latest 40 posts are returned each time you query the updated feed.
i meant more that it is of secondary importance to receive the message soon after it is posted, while it's of primary importance to receive all messages eventually.
i wonder if there's a way we can think of to architect a solution. i only track 5 or so boards, but regularly miss messages from xmsr, a very popular board (used to be even more so, with 3 posts a minute regularly a while back).
i assume you cache only the headlines that you've scraped (for lack of a better word) for a particular board, and that you only cache boards that have at least one subscriber. could there not be a way to scrape only the headlines after the last message # retrieved, instead of the last 40?
and if there is no 'protocol' way to request messages from #X on, can you not then page back until you've gotten a page containing the next # after your most recent one?
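to make the paging-back idea concrete, here's a rough python sketch. i'm guessing at everything about your setup: fetch_page is a made-up stand-in for whatever you use to pull one page of headlines, i'm assuming page 0 is the newest and that each page comes back as (message #, headline) pairs, and max_pages is just a safety cap i invented.

```python
from typing import Callable, Dict, List, Tuple

Headline = Tuple[int, str]  # (message number, headline text)


def fetch_new_headlines(
    fetch_page: Callable[[int], List[Headline]],  # hypothetical page fetcher
    last_seen: int,
    max_pages: int = 50,  # assumed safety cap so a bad feed can't loop forever
) -> List[Headline]:
    """Walk back page by page (page 0 = newest) until the gap is closed."""
    collected: Dict[int, str] = {}
    for page in range(max_pages):
        batch = fetch_page(page)
        if not batch:
            break  # ran off the end of the board
        for number, title in batch:
            if number > last_seen:
                collected[number] = title  # dict keys drop duplicates
        # once a page reaches back to last_seen + 1 (or older), nothing is missing
        if min(number for number, _ in batch) <= last_seen + 1:
            break
    return sorted(collected.items())  # oldest first
```

on a quiet board this stops after the first page, so it costs the same as today; only a busy board like xmsr would trigger the extra page fetches.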
your yahoo server hitting frequency would not have to change, and in fact could increase; net efficiency would actually be much greater, because right now you're skewed toward duplicating retrieval requests for infrequent boards. you would then merely have to store the headlines (potentially a lot more of them, but it's not a large amount of data per headline) for, let's say, a week, and then roll them off.
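the store-for-a-week-then-roll-off part could look something like this. again, just a sketch under my own assumptions: HeadlineStore and its method names are made up, it keeps everything in memory where you'd presumably use a database, and it tracks the highest message # per board separately so that rolling off old headlines doesn't make it forget where it left off.

```python
import time
from typing import Dict, List, Optional, Tuple

WEEK_SECONDS = 7 * 24 * 60 * 60


class HeadlineStore:
    """In-memory headline cache with age-based roll-off (hypothetical design)."""

    def __init__(self, max_age: float = WEEK_SECONDS) -> None:
        self.max_age = max_age
        # (board, message number) -> (time stored, headline text)
        self._items: Dict[Tuple[str, int], Tuple[float, str]] = {}
        # board -> highest message number ever stored; survives roll-off
        self._high_water: Dict[str, int] = {}

    def add(self, board: str, number: int, title: str,
            now: Optional[float] = None) -> None:
        stamp = time.time() if now is None else now
        self._items.setdefault((board, number), (stamp, title))
        self._high_water[board] = max(self._high_water.get(board, 0), number)

    def last_seen(self, board: str) -> int:
        return self._high_water.get(board, 0)

    def headlines(self, board: str) -> List[Tuple[int, str]]:
        found = [(n, t) for (b, n), (_, t) in self._items.items() if b == board]
        return sorted(found)

    def roll_off(self, now: Optional[float] = None) -> int:
        """Drop headlines older than max_age; return how many were dropped."""
        cutoff = (time.time() if now is None else now) - self.max_age
        stale = [key for key, (stamp, _) in self._items.items() if stamp < cutoff]
        for key in stale:
            del self._items[key]
        return len(stale)
```

last_seen(board) is the number you'd hand to the paging-back step, and roll_off() would run on whatever schedule suits you, say once a day.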
apologies if i've assumed incorrectly how you've implemented it or you have thought about all this already and it can't be done!
(in another time and place and life i used to do nothing but worry about this very issue.)