Old 09-03-2012, 07:27 AM   #1
nicolash
Junior Member
nicolash began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
Recipe without RSS feed?

Hi!

There is a webpage (URL) that gets updated daily. There is no RSS feed.

Basically, I would like to write a recipe that grabs the contents of that single URL and formats it into an epub for me once a day.

My first approach, simply adding the URL to the feeds variable, gives me an epub containing the raw contents of the URL, HTML tags included, with no formatting at all.

I flipped through the API documentation but only found the 'use_embedded_content' variable, which might be the right direction? However, I feel trapped by the URL content being interpreted as RSS rather than as "news content".

Any clue how to process webpages without the help of RSS feeds?
Thank you!
Old 09-04-2012, 09:50 AM   #2
Steven630
Zealot
Steven630 began at the beginning.
 
Posts: 142
Karma: 10
Join Date: May 2012
Device: Kindle Paperwhite2
So, you mean you want to download ONLY the starting page (as opposed to using the page as a TOC)?
Old 09-04-2012, 09:26 PM   #3
kiklop74
Guru
kiklop74 can program the VCR without an owner's manual.
Posts: 780
Karma: 194642
Join Date: Dec 2007
Location: Argentina
Device: Kindle PaperWhite, Motorola Xoom
You should be able to do this with Instapaper. Just create an account and add that URL to your main feed. The existing recipe should do the rest.
Old 09-05-2012, 02:12 AM   #4
nicolash
Junior Member
nicolash began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
Yes, that is the idea.
The single webpage contains all the contents (but no TOC and no RSS data).
Old 09-05-2012, 02:15 AM   #5
nicolash
Junior Member
nicolash began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
Thank you kiklop74,

Using Instapaper works and brings the page as an epub to my reader.

However, now that I am starting to learn what calibre recipes look like, I would like to go on and figure out how to process the page myself. That way I might be able to insert special formatting or sectioning, maybe even a table of contents.
Old 09-05-2012, 02:29 AM   #6
yosivd
Junior Member
yosivd ought to be getting tired of karma fortunes by now.
 
Posts: 3
Karma: 497132
Join Date: Sep 2012
Device: ipad
Quote:
Originally Posted by kiklop74
You should be able to do this with Instapaper. Just create an account and add that URL to your main feed. The existing recipe should do the rest.
Had the same problem.
Thanks
Old 09-05-2012, 01:07 PM   #7
scissors
Addict
scissors ought to be getting tired of karma fortunes by now.
 
Posts: 204
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
feed43 (feed for free) is pretty good.
Old 09-07-2012, 12:38 AM   #8
nicolash
Junior Member
nicolash began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
Idea for a direct method of accessing a static URL?

Okay, thanks. The idea of creating an RSS feed through a service, or even locally on my computer, might work.

However, there must be a way to do it directly with the recipes?
I have not had much time to dig into this forum. But maybe BeautifulSoup might be something? Or is it only used for handling a content page after an RSS feed has led to it?
Old 09-07-2012, 12:45 AM   #9
nicolash
Junior Member
nicolash began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
The main issue I have with Instapaper is that I have to visit the page manually each time I want the current copy. Without this step, every download gives me the same content. So I guess feed43 is the better workaround for my problem. I will give feedback after I put it to work.
Old 09-07-2012, 06:27 AM   #10
eroche
Junior Member
eroche began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Sep 2012
Device: sony ereader
Nicolash, I was trying to do the same thing the other day. You can do what you're asking using a BasicNewsRecipe class and the parse_index function. The following should get you started:
Spoiler:
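from calibre.web.feeds.news import BasicNewsRecipe


class BBC(BasicNewsRecipe):
    title = 'Latest Headlines, Business and Sport from RTE (Ireland)'

    def parse_index(self):
        # No RSS involved: hand calibre the page URL directly.
        url = 'http://www.bbc.com/'
        feeds = []
        # Each article is a dict; description and date may be left empty.
        articles = [{'title': 'BBC Web', 'url': url, 'description': '', 'date': ''}]
        # Each feed entry is a (section title, list of articles) tuple.
        feeds.append(('BBC Website', articles))
        return feeds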


Basically, all you need to do is subclass BasicNewsRecipe and have parse_index return a feeds list with the correct parameters; calibre does the rest.
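
If you later want more than one section (a crude table of contents, say), the same structure just grows by one tuple per section. Here is a minimal sketch of that shape in plain Python, so it can be tried outside calibre. Note that build_index is a made-up helper name and the example.com URLs are placeholders; neither comes from calibre or from the recipe above.

```python
def build_index(sections):
    """Turn {section title: [(article title, url), ...]} into the
    list of (section title, articles) tuples that parse_index returns."""
    feeds = []
    for section_title, pages in sections.items():
        # One dict per article; description and date are left empty here.
        articles = [
            {'title': title, 'url': url, 'description': '', 'date': ''}
            for title, url in pages
        ]
        feeds.append((section_title, articles))
    return feeds


# Placeholder URLs, for illustration only.
index = build_index({
    'Daily page': [('Today', 'http://example.com/daily.html')],
    'Archive': [('Yesterday', 'http://example.com/archive/1.html')],
})
print(index[0][0])
```

Inside a recipe, parse_index could then simply return build_index({...}) with your own titles and URLs. For trimming navigation bars and similar clutter from the downloaded page, the keep_only_tags and remove_tags attributes of BasicNewsRecipe are probably the place to look next.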
Old 09-09-2012, 06:35 AM   #11
nicolash
Junior Member
nicolash began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2012
Device: Sony PRS-T2
Thanks, eroche!

Your code snippet really gets me started.

Almost all I had to do to get the page as an epub with proper formatting was swap in my URL!

Will start playing around with this some more.