07-09-2013, 11:42 AM   #11
dkfurrow
Member
Posts: 13
Karma: 10
Join Date: Jun 2013
Device: LG G-Pad 8.3
Latest Chronicle Recipe attached.

So, I think the probable answer to my last post is "can't be done", i.e. if you want to exclude an article, you have to make sure that it doesn't get returned by parse_index.

The latest Houston Chronicle recipe is attached. I'm comfortable with this as a submission for the next build. It's somewhat slow (>4 mins on my machine) because it parses every article page (with lxml) in parse_index in order to populate metadata and remove old articles. It does seem strange to me that the date argument for the Article constructor doesn't appear to populate the finished date in the ebook; I had to set Article.date again in populate_article_metadata.
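For anyone following along, here is a rough sketch of the filtering idea, not the code from the attached recipe. It assumes each article has already been fetched and its publication date parsed into a datetime; the names keep_recent and build_index and the two-day cutoff are made up for illustration. The result has the shape parse_index returns: a list of (section title, list of article dicts) tuples.

```python
from datetime import datetime, timedelta


def keep_recent(articles, now, max_age_days=2):
    """Return only the article dicts whose 'date' is within max_age_days of now.

    Hypothetical helper: 'date' is assumed to be a datetime parsed from the
    article page beforehand.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [a for a in articles if a["date"] >= cutoff]


def build_index(sections, now, max_age_days=2):
    """Build a parse_index-style list: [(section_title, [article_dict, ...]), ...].

    Old articles are dropped here, since anything returned in the index
    ends up in the ebook.
    """
    index = []
    for title, articles in sections:
        recent = keep_recent(articles, now, max_age_days)
        if not recent:
            continue  # skip sections that end up empty
        # Article dicts carry the date as a string, so format it here.
        formatted = [dict(a, date=a["date"].strftime("%Y-%m-%d")) for a in recent]
        index.append((title, formatted))
    return index
```

In a real recipe, parse_index would build the sections list from the site's section pages and then return something like build_index(sections, datetime.now()).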

I see that the API allows saving content to a temporary file, and there's an example in LeMonde. If I have time I'll see if I can figure out how to apply that here. It might speed things up a bit, but it's unclear to me how embedded pictures will be handled.
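As I understand the API docs, the idea is to write the already-downloaded HTML to a temporary file and hand back a file:/// URL in the article dict in place of the web URL. A minimal sketch using only the standard library follows; the function name save_article_html is hypothetical, and a real recipe would more likely use calibre's own temp file helper (I haven't checked how LeMonde does it). It says nothing about how embedded pictures get resolved.

```python
import tempfile
from pathlib import Path


def save_article_html(html, suffix=".html"):
    """Write html to a temp file that persists after closing; return a file:// URL.

    Hypothetical helper: the returned URL would go in the 'url' key of the
    article dict so the page is not downloaded a second time.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=suffix, delete=False, encoding="utf-8"
    ) as f:
        f.write(html)
        path = Path(f.name)
    # as_uri() needs an absolute path; resolve() guarantees that.
    return path.resolve().as_uri()
```

Usage would be something like article['url'] = save_article_html(page_source) inside parse_index, once the page has been parsed for its metadata.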

Would be happy to take any suggestions for improvement.
Thanks,
Dale
Attached Files
File Type: zip houston_chronicle.zip (2.8 KB, 214 views)