Old 11-29-2015, 05:39 PM   #1
thorgan
Junior Member
Posts: 2
Karma: 10
Join Date: Nov 2015
Device: Kindle
Download a URL and its links via a recipe to make a readability version

Dear Group,

Thanks so much for a) existing, b) reading this post at all, and c) having patience with me.

I hope I'm not duplicating this request. I did try a good few searches but in the end decided to join the community and ask.

I'd like to make an ebook from a URL, where the recipe grabs the page and also follows the links on it (e.g. http://markforster.squarespace.com/b...e-systems.html or http://www.psychowith6.com/can-a-dai....Z8UQS2kE.dpbs)

I know I can do this with ebook-convert, but I'd like to do it with a recipe so that I can use the readability features and have the ebook contain only the 'body' of each page.

I know a little Python and next to nothing about HTML, but I'm keen to try (for the achievement if nothing else). I've read through these links once: https://www.mobileread.com/forums/sho...d.php?t=121439, http://blog.calibre-ebook.com/2011/1...-fetching.html, http://manual.calibre-ebook.com/news...asicNewsRecipe, http://manual.calibre-ebook.com/news...-fetch-process.

I think the key API pieces are: extract_readable_article(html, url); is_link_wanted(url, tag) or the regexp options for tags; parse_index(); auto_cleanup (maybe? I think that's just for feeds?); and recursions = X so that it follows links.

I've made a basic start that doesn't throw errors but does little else (index.html is downloaded), and I'm lost after that. For example, if I use extract_readable_article, can I assume the html and url arguments are somehow already known, or is it up to me to supply them?

Any help or pointers appreciated.

Kind regards,
Tim
Old 11-29-2015, 10:23 PM   #2
kovidgoyal
creator of calibre
Posts: 43,839
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You don't need readable_article.

First make sure auto_cleanup = False

Then set recursions = 1

Then implement a dummy is_link_wanted that always returns True.

Once you have the links being picked up, you can look into the cleanup tools the recipe system offers.
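
The three steps above could be sketched as a recipe roughly like the one below. This is only an illustrative sketch, not a tested recipe from this thread: the class name, title, feed name, and the start_url attribute (with its example.com address) are hypothetical placeholders, and the try/except stub exists only so the file can be imported outside calibre. It assumes parse_index() is used to hand the recipe a single starting page.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Assumption: running outside calibre. This stub only lets the file
    # import; it does none of the real downloading work.
    class BasicNewsRecipe(object):
        pass


class LinkedPages(BasicNewsRecipe):
    title = 'Linked pages'   # hypothetical title
    auto_cleanup = False     # step 1: turn automatic cleanup off
    recursions = 1           # step 2: follow links one level deep

    # Hypothetical helper attribute (not part of the recipe API):
    # the page to start from. Replace with a real address.
    start_url = 'http://example.com/start.html'

    def parse_index(self):
        # One "feed" holding one "article": the starting page.
        articles = [{'title': 'Start page', 'url': self.start_url}]
        return [('Pages', articles)]

    def is_link_wanted(self, url, tag):
        # Step 3: dummy implementation that accepts every link.
        return True
```

Saved as something like myrecipe.recipe, a sketch of this kind can be tried from the command line with ebook-convert myrecipe.recipe .epub --test -vv. Once the linked pages show up in the output, is_link_wanted can be narrowed to return True only for the links you want.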
Old 12-08-2015, 08:00 AM   #3
thorgan
Thank you Kovid,

Sorry, I meant to reply sooner. I have spent a few evenings trying to work out what I'm doing based on your pointers. I'm getting somewhere (though in the end it's not actually that many lines I've written; most of the time went on understanding which functions I needed), but I have more to go.

Whether I get somewhere or get stuck, I'll post what I've got later so others can see it, help, and perhaps find it helpful themselves.

Thanks,
Tim