MobileRead Forums > E-Book Software > Calibre > Recipes

10-17-2011, 06:08 AM   #1
DrArleigh
Junior Member
Posts: 2
Karma: 10
Join Date: Oct 2011
Location: Toronto, Canada
Device: Archos 43
Recipe for many sites at once

I am migrating from Plucker on a Palm TX to Calibre with Aldiko on an Archos 43. I add web articles I want to read to my Google Bookmarks and then periodically export them from Google to an HTML file.

That is what I used to feed into Plucker and it worked pretty well. I just had to make some hacks in the Plucker Python for some odd markup that caused certain sites to fail.

I am trying to do something similar with Calibre, and I am working on a recipe. Parsing the bookmarks file and downloading the links already works, but I will need to add custom processing logic for certain sites. As a start, I am trying to add logic from the english.aljazeera.net recipe to make those pages work.
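For readers following along, here is a minimal sketch of the first step described above: pulling the links out of an exported bookmarks HTML file and grouping them by site, so that per-site logic can be attached later. It uses only the Python standard library. The names `BookmarkLinkParser` and `links_by_site` are hypothetical and are not taken from the poster's recipe, which is not shown in the thread.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class BookmarkLinkParser(HTMLParser):
    """Collect (title, url) pairs from the <a href=...> tags of a bookmarks export."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (title, url) pairs
        self._href = None    # URL of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> also matches.
        if tag == 'a':
            href = dict(attrs).get('href')
            if href and href.startswith(('http://', 'https://')):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            # Fall back to the URL when the link has no visible text.
            title = ''.join(self._text).strip() or self._href
            self.links.append((title, self._href))
            self._href = None


def links_by_site(html_text):
    """Return {hostname: [(title, url), ...]} for every http(s) link in html_text."""
    parser = BookmarkLinkParser()
    parser.feed(html_text)
    parser.close()
    grouped = defaultdict(list)
    for title, url in parser.links:
        grouped[urlparse(url).hostname].append((title, url))
    return dict(grouped)
```

Grouping by hostname makes it easy to see which sites appear in the bookmarks and therefore which ones might need their own cleanup rules.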

Is this something others have done? I didn't see anything in this forum but it is a tricky topic to search for.
10-17-2011, 09:30 AM   #2
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Yes. Because different sites are organized differently, it's hard to find remove_tags and keep_only_tags settings that work for all sites. You can try that approach. If that doesn't work, use preprocess_html or postprocess_html and BeautifulSoup's extract() to process each page. Or you can try auto_cleanup (read the "Using News Recipes: Start Here" sticky at the top of this forum).
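As a rough illustration of the preprocess_html plus extract() approach, here is a sketch of a recipe that strips unwanted tags for several sites at once. The `SITE_RULES` table, its selectors, and the class name are all hypothetical placeholders, not the real markup of any site. The try/except around the import is only there so the sketch can be loaded outside calibre. Note that preprocess_html receives only the parsed page, not its URL, so this sketch simply tries every site's rules on every page, which should be harmless as long as the selectors are specific.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: running outside calibre, so use a stand-in base class.
    BasicNewsRecipe = object

# Hypothetical per-site cleanup rules. Each rule is a set of keyword
# arguments for soup.findAll(); the selectors are placeholders.
SITE_RULES = {
    'example.com': [dict(name='div', attrs={'class': 'sidebar'})],
    'example.org': [dict(name='div', attrs={'id': 'comments'})],
}


class ManySitesRecipe(BasicNewsRecipe):
    title = 'Bookmarked articles'

    # Applied by calibre to every downloaded page, whatever the site.
    remove_tags = [dict(name=['script', 'style'])]

    def preprocess_html(self, soup):
        # Try every site's rules on the page and pull matching tags
        # out of the tree with BeautifulSoup's extract().
        for rules in SITE_RULES.values():
            for rule in rules:
                for tag in soup.findAll(**rule):
                    tag.extract()
        return soup
```

Keeping the rules in one table means that supporting another site is a matter of adding an entry rather than writing more code.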
10-17-2011, 03:36 PM   #3
julio:map
Member
Posts: 23
Karma: 12
Join Date: Jul 2011
Device: Cool-er
You could try the readitlater or instapaper recipes.

I personally use readitlater, which lets me read on my e-book reader or my iPad.

In the last few days there has been an active thread, "Instapaper - updated recipe", about using "readability". You can think of it as a feature built into the recipe system that lets your recipe process most web pages (as Plucker did).

I have used the same approach with readitlater (a single line of code telling the system to use readability) and it works.
10-17-2011, 04:12 PM   #4
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
"auto_cleanup," which I suggested, is a relatively new feature based on code from the Readability open source project. There's a link at the end of the "Using News Recipes: Start Here" sticky to a tutorial on using that feature.
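For anyone who wants to see what that looks like, here is a minimal sketch of a recipe that turns on auto_cleanup and feeds calibre a list of article links through parse_index. The class name and the hard-coded `bookmarks` list are hypothetical. In a real recipe the list would come from the exported bookmarks file. The try/except around the import is only there so the sketch can be loaded outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: running outside calibre, so use a stand-in base class.
    BasicNewsRecipe = object


class AutoCleanupBookmarks(BasicNewsRecipe):
    title = 'Bookmarked articles'

    # The single line that asks calibre to extract the article text itself.
    auto_cleanup = True

    # Hypothetical (title, url) pairs standing in for the parsed bookmarks.
    bookmarks = [
        ('Example article', 'http://example.com/article.html'),
    ]

    def parse_index(self):
        # calibre expects a list of (section title, list of article dicts).
        articles = [
            {'title': title, 'url': url, 'description': '', 'date': ''}
            for title, url in self.bookmarks
        ]
        return [('Bookmarks', articles)]
```

If auto_cleanup gives poor results for a particular site, the manual remove_tags and preprocess_html approach is still available as a fallback.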