10-17-2011, 09:30 AM   #2
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by DrArleigh
I am migrating from using Plucker for a Palm TX to Calibre for Aldiko on an Archos 43. I add articles from the web I want to read to my Google Bookmarks and then periodically export from Google to an HTML file.

That is what I used to feed into Plucker and it worked pretty well. I just had to make some hacks in the Plucker Python for some odd markup that caused certain sites to fail.

I am trying to do something similar with Calibre. I am working on a recipe. Getting the bookmarks file to be parsed and the links downloaded is working. But I will need to add some custom processing logic for certain sites. As a start I am trying to add logic from the english.aljazeera.net recipe to make those pages work.

Is this something others have done? I didn't see anything in this forum but it is a tricky topic to search for.

Yes. Sites are organized so differently that it's hard to find one set of remove_tags and keep_only_tags options that works for all of them. You can try that approach first. If it doesn't work, use preprocess_html or postprocess_html together with BeautifulSoup's extract() to clean up each page. Alternatively, try auto_cleanup (read the "Using recipes" sticky at the top of this forum).
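
Here is a rough sketch of the per-site idea, in Python since recipes are Python. It only shows one way to pick cleanup rules by hostname and then remove the matching tags with extract(). Everything named here is an assumption for illustration: SITE_RULES, rules_for and strip_unwanted are hypothetical helpers, and the tag names and ids are placeholders, not the real markup of english.aljazeera.net or anything copied from an existing recipe. How the page URL reaches this code depends on your recipe, so that part is left out.

```python
from urllib.parse import urlparse

# Hypothetical per-site rules. The tag names and attributes are placeholders
# and must be replaced with what each site's HTML actually uses.
SITE_RULES = {
    "english.aljazeera.net": {
        "keep_only_tags": [{"name": "div", "attrs": {"id": "article-body"}}],
        "remove_tags": [{"name": "div", "attrs": {"class": "share-tools"}}],
    },
    "example.com": {
        "keep_only_tags": [],
        "remove_tags": [{"name": "div", "attrs": {"id": "sidebar"}}],
    },
}

# Used for any site without its own entry: nothing is removed.
DEFAULT_RULES = {"keep_only_tags": [], "remove_tags": []}


def rules_for(url):
    """Return the cleanup rules for the site that serves `url`."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com like example.com
    return SITE_RULES.get(host, DEFAULT_RULES)


def strip_unwanted(soup, url):
    """Remove every element matched by the site's remove_tags rules.

    `soup` is assumed to be a BeautifulSoup document, such as the one a
    recipe receives in preprocess_html. extract() detaches a tag (and its
    children) from the tree.
    """
    for spec in rules_for(url)["remove_tags"]:
        for tag in soup.findAll(spec["name"], attrs=spec["attrs"]):
            tag.extract()
    return soup
```

The rules are kept in plain dictionaries so that adding another site means adding one entry rather than another branch of if/else logic. The keep_only_tags entries are not used by strip_unwanted; they are only there to show where such rules could live.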