#1
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Do recipes use a cache?
I'm working on a recipe that uses parse_index and soup to read a page at a URL that never changes. That first page links to a second page, the second page links to a third, and so on.

These pages contain both the content (articles) I want and the links I need for the articles. I grab the first page, then use its data to create the article link for that page and the article link for page 2. Then I read page 2 into BeautifulSoup, find the link for page 3, add that to my index, and so on. At this point everything works: I have my parsed index, and if I let it run, I get the content I want from it, just as if it had been read from an RSS feed.

The trouble starts when I try to modify the pages with preprocess_html or preprocess_regexps. It looks as though the pages I have already downloaded to build my article/feed index are being pulled from a cache, instead of being modified by preprocess_html before they are grabbed.

Has anyone seen this interaction, or have suggestions for dealing with it? Thanks.
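
For anyone following along, the chained index building described above can be sketched roughly like this. It is only a minimal, standalone illustration and not the actual recipe: the `FAKE_PAGES` dictionary, the `class="next"` link markup, the `<h1>` titles, and the `build_index` helper are all assumptions made up for the example. The only part taken from calibre is the shape of the return value, a list of `(feed title, list of article dicts)` tuples, which is what parse_index is expected to return.

```python
import re

# Hypothetical stand-in for the site: URL -> HTML. A real recipe would fetch
# these pages over the network instead of reading them from a dict.
FAKE_PAGES = {
    "http://example.com/page1":
        '<h1>First</h1><a class="next" href="http://example.com/page2">next</a>',
    "http://example.com/page2":
        '<h1>Second</h1><a class="next" href="http://example.com/page3">next</a>',
    "http://example.com/page3":
        '<h1>Third</h1>',
}

# Assumed markup: each page links to its successor with class="next".
NEXT_LINK = re.compile(r'<a class="next" href="([^"]+)"')
TITLE = re.compile(r'<h1>(.*?)</h1>')


def build_index(start_url, fetch, max_pages=50):
    """Follow the chain of 'next' links and collect one article per page.

    Returns data in the parse_index shape: [(feed_title, [article_dict, ...])].
    """
    articles = []
    seen = set()  # guards against a page linking back to an earlier one
    url = start_url
    while url and url not in seen and len(articles) < max_pages:
        seen.add(url)
        html = fetch(url)
        title_match = TITLE.search(html)
        articles.append({
            "title": title_match.group(1) if title_match else url,
            "url": url,
            "description": "",
            "date": "",
        })
        next_match = NEXT_LINK.search(html)
        url = next_match.group(1) if next_match else None
    return [("Chained pages", articles)]


index = build_index("http://example.com/page1", FAKE_PAGES.__getitem__)
for feed_title, feed_articles in index:
    print(feed_title, [a["url"] for a in feed_articles])
```

Note that the index built this way holds only titles and URLs, not the page bodies read while following the chain.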
#2
creator of calibre
Posts: 45,617
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
No, they don't.
#3
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T