Old 08-05-2010, 11:18 AM   #2386
cisaak
Member
Posts: 17
Karma: 10
Join Date: Aug 2010
Device: Kindle DX
Quote:
Originally Posted by Elevatorguy View Post
It would be great if someone could make a recipe for the Show-Me city's St. Louis Post-Dispatch and STLtoday.com. For now I use Instapaper to load individual articles. Thanks.
I second this request.
Old 08-05-2010, 03:27 PM   #2387
AGB
Headbutting stupidity
Posts: 1,703
Karma: 2526196
Join Date: Aug 2010
Location: Greater Cph
Device: PRS650
I'd so like a proper recipe for dr.dk/nyheder (Danish Broadcast Corporation). Preferably without pictures.

When I try to do it myself (the simple version), I find there's a lot of scrolling before I get to the contents.

I looked through this thread and searched it, but I couldn't find one. I hope you'll be able to help out.
Old 08-05-2010, 03:57 PM   #2388
kiklop74
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
New recipe for Danish news portal dr.dk:
Attached Files
File Type: zip dr.dk.zip (1.1 KB, 239 views)
Old 08-05-2010, 04:06 PM   #2389
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
New Recipe: Snopes

New recipe for snopes.com.
Attached Files
File Type: zip Snopes.zip (786 Bytes, 239 views)
Old 08-05-2010, 07:32 PM   #2390
chris2x
Member
Posts: 16
Karma: 10
Join Date: Aug 2010
Location: Philippines
Device: iPod Touch + Kindle software, Kindle Wifi+3G
How about a recipe for www.abs-cbnnews.com?

I've been poring over the site and found that some of the feeds aren't immediately visible on the page, and that all of the major sections of the site are feed-powered. The feeds I found are listed below (I may have missed a few, but they would be easy to add once the base recipe is in place). I've also sorted them in the order I think they should appear in a "daily".

Metro
http://www.abs-cbnnews.com/nation/metro-manila/feed
Nation
http://www.abs-cbnnews.com/nation/feed
World News
http://www.abs-cbnnews.com/world/feed
Insights
http://www.abs-cbnnews.com/views-and-analysis
Region
http://www.abs-cbnnews.com/nation/region/feed
Business
http://www.abs-cbnnews.com/business/feed
Tech-Biz
http://www.abs-cbnnews.com/business/tech-biz/feed
MoneySense
http://www.abs-cbnnews.com/business/moneysense/feed
Entertainment
http://www.abs-cbnnews.com/entertainment/feed
Lifestyle
http://www.abs-cbnnews.com/lifestyle/feed
Youth
http://www.abs-cbnnews.com/lifestyle/youth/feed
Technology
http://www.abs-cbnnews.com/technology/feed
Sports
http://www.abs-cbnnews.com/sports/feed
Classified Odd
http://www.abs-cbnnews.com/classified-odd/feed

There's a link at the bottom of every article pointing to the printer-friendly format. However, it's keyed by a numeric ID, so I'm not sure whether it can be used as the actual source for the article. I hope someone can help. Thanks!
Old 08-05-2010, 07:45 PM   #2391
mayer006
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2010
Device: nook
Can you please help me with this one? I can build a basic recipe, but it is not very clean. http://www.debka.com
Old 08-05-2010, 07:47 PM   #2392
mayer006
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2010
Device: nook
Can you please help me with this one? I can build a basic recipe, but it is not very clean. http://www.debka.com . The RSS feed is http://www.debka.com/feeds/latest/
Old 08-05-2010, 08:24 PM   #2393
AGB
Headbutting stupidity
Posts: 1,703
Karma: 2526196
Join Date: Aug 2010
Location: Greater Cph
Device: PRS650
Quote:
Originally Posted by kiklop74 View Post
New recipe for Danish news portal dr.dk:
Oh, damn that was fast. Thank you so much!
Old 08-06-2010, 01:41 AM   #2394
TonytheBookworm
Addict
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Atlanta Journal Constitution

Would any of the recipe gurus out there be kind enough to apply their skills to www.ajc.com?
I would hope to have a table of contents with sections like local, metro, and so on. However, I will gladly take whatever you wish to make.
Thanks for your time. I really wish I understood recipe making; I tried it and couldn't follow it, so I figured I'd leave it to those of you who know what you're doing.
Old 08-06-2010, 05:27 AM   #2395
eric11210
Addict
Posts: 295
Karma: 400001
Join Date: Aug 2010
Device: Sony Reader PRS-600
Hi folks, I know there is a place somewhere on the Calibre site to post custom recipes that people have created, but I haven't found it. I thought the following two might be useful to some. They're not fancy: anyone could make them with a few minutes of effort, since they don't use any Python scripting. Still, they may save someone the bother of tracking down the RSS feeds:

Jewish Daily Forward:

from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1281086122(BasicNewsRecipe):
    title = u'Jewish Daily Forward'
    oldest_article = 7  # days
    max_articles_per_feed = 100

    # (section title, feed URL) pairs
    feeds = [
        (u'News', u'http://www.forward.com/rss/news/'),
        (u'Editorial', u'http://www.forward.com/rss/editorial/'),
        (u'Letters', u'http://www.forward.com/rss/letters/'),
        (u'The Blogs', u'http://blogs.forward.com/rss/'),
        (u'Arts and Culture', u'http://www.forward.com/rss/arts-and-culture/'),
        (u'Books', u'http://www.forward.com/rss/books/'),
        (u'Looking Back', u'http://www.forward.com/rss/looking-back/'),
    ]

LGBT News Updates (aggregates from several different free sources):

from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1281086687(BasicNewsRecipe):
    title = u'LGBT News Updates'
    oldest_article = 7  # days
    max_articles_per_feed = 100

    # (section title, feed URL) pairs
    feeds = [
        (u'The Advocate', u'http://www.advocate.com/rssFeeds.aspx?fid=7'),
        (u'Washington Blade', u'http://www.washingtonblade.com/feed/'),
        (u'Lambda Literary Book Reviews', u'http://feeds.feedburner.com/lambdaliterary_org'),
        (u'Out There Travel News', u'http://feeds.feedburner.com/OutThere-GaycitiesTravelBlog?format=xml'),
    ]

If anyone has more programming experience than I do (I've dabbled, and I suppose I could figure out Python, but I've never used it and expect it would take me a while): in the LGBT news recipe, the Lambda Literary and Out There feeds download with extra, unneeded content. Lambda Literary comes with additional formatting, and Out There appends the blog's comment section, which is pretty useless. If someone wants to try to clean them up, I'd be very grateful.

In any event, hope these are helpful to some of you.

Eric
Old 08-06-2010, 12:51 PM   #2396
Flexicat
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2010
Device: Kobo
Hello all,

Can I have some help in refining my recipe?

I've created this one with help from this forum, but I do not think that the print version switch is working.

I have been able to trim down the content I do not want to see to manageable levels, but my script is only collecting the first page of each article.

The article address format is like this:

Code:
http://www.tthfanfic.org/Story-22821/Snag+Strangely+Literal.htm
with the print-version address format like this:

Code:
http://www.tthfanfic.org/wholestory.php?no=22821&format=print
I am pretty sure the code is correct, but it does not seem to be switching to the print version. At the end of each article in my epub, Calibre has a line that says "article downloaded from" and the article address instead of the print version address.

I have tried turning on and off Javascript, with exactly the same results.

Here is the code:
Spoiler:

Code:
class AdvancedUserRecipe1280965027(BasicNewsRecipe):
    title          = u'TTH'
    oldest_article = 7
    max_articles_per_feed = 30
    no_stylesheets        = True
    encoding              = 'UTF-8'
    remove_javascript     = True
    use_embedded_content  = False

    keep_only_tags = [
                    dict(attrs={'class':'storysummary formbody defaultcolors'})
                   ,dict(attrs={'class':'storybody defaultcolors'})
                  ]

    feeds          = [(u'Latest Stories', u'http://www.tthfanfic.org/rss.php')]


def print_version(self, url):
    split1 = url.split("/")
    xxx = split1[3]
    split2 = xxx.split("-")
    artid =  split2[1]
    print 'artid is: ', artid
    return 'http://www.tthfanfic.org/wholestory.php?no=' + artid + '&format=print'


Thank you
Old 08-07-2010, 09:21 AM   #2397
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Flexicat View Post
Can I have some help in refining my recipe?
Sure

Quote:
I've created this one with help from this forum, but I do not think that the print version switch is working.
It works great when it runs. It isn't running.

Quote:
I have been able to trim down the content I do not want to see to manageable levels, but my script is only collecting the first page of each article.
If you run the print version, you get different content than if you don't run it. The page problem and removal of excess junk are problems that aren't related to the print version.

Quote:
Code:
http://www.tthfanfic.org/Story-22821/Snag+Strangely+Literal.htm
with the print-version address format like this:

Code:
http://www.tthfanfic.org/wholestory.php?no=22821&format=print
I am pretty sure the code is correct, but it does not seem to be switching to the print version.
Here is the code:
Spoiler:

Code:
class AdvancedUserRecipe1280965027(BasicNewsRecipe):
    title          = u'TTH'
    oldest_article = 7
    max_articles_per_feed = 30
    no_stylesheets        = True
    encoding              = 'UTF-8'
    remove_javascript     = True
    use_embedded_content  = False

    keep_only_tags = [
                    dict(attrs={'class':'storysummary formbody defaultcolors'})
                   ,dict(attrs={'class':'storybody defaultcolors'})
                  ]

    feeds          = [(u'Latest Stories', u'http://www.tthfanfic.org/rss.php')]


def print_version(self, url):
    split1 = url.split("/")
    xxx = split1[3]
    split2 = xxx.split("-")
    artid =  split2[1]
    print 'artid is: ', artid
    return 'http://www.tthfanfic.org/wholestory.php?no=' + artid + '&format=print'
Here are your problems:
1) You need to indent the print_version definition and each line under it by 4 more spaces. Until you do that, it's not part of the class, so Calibre never calls it.
2) Once you indent it, it will run and you'll get no results. Your recipe has keep_only_tags that don't exist on the print page. Since you've said to keep only things that aren't there, you get nothing. Remove the keep_only_tags.
3) You can remove the print 'artid is: ' line after testing. (It worked for me.)
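
For anyone following along, here is a rough sketch of what the recipe might look like with those three points applied. It is not a tested recipe: the class name TTHSketch and the stand-in base class (used only when the code is run outside Calibre) are my own additions for illustration; everything else comes from the recipe quoted above.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # ASSUMPTION: stand-in so the sketch can be imported outside Calibre.
    class BasicNewsRecipe(object):
        pass


class TTHSketch(BasicNewsRecipe):  # hypothetical name, for illustration only
    title = u'TTH'
    oldest_article = 7
    max_articles_per_feed = 30
    no_stylesheets = True
    encoding = 'UTF-8'
    remove_javascript = True
    use_embedded_content = False

    # keep_only_tags removed: those classes do not exist on the print page.

    feeds = [(u'Latest Stories', u'http://www.tthfanfic.org/rss.php')]

    # Indented by 4 spaces, so it is now a method of the class.
    def print_version(self, url):
        # url looks like http://www.tthfanfic.org/Story-22821/Some+Title.htm
        story_part = url.split('/')[3]       # 'Story-22821'
        story_id = story_part.split('-')[1]  # '22821'
        return ('http://www.tthfanfic.org/wholestory.php?no='
                + story_id + '&format=print')
```

If the print page still carries clutter, that would be handled separately with remove_tags or a keep_only_tags written against the print page's own markup.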

Last edited by Starson17; 08-07-2010 at 09:24 AM.
Old 08-07-2010, 10:25 AM   #2398
js4c
Member
Posts: 11
Karma: 10
Join Date: Jul 2010
Device: Nook & Novel
Hello. Can I get some help with a recipe for the Rochester Democrat and Chronicle? The problem with the recipe below is that the article title does not appear in the article itself, only in the calibre-generated link. There is a print version of the feed, but it excludes the pictures, which I am also trying to capture. I started with the basic template, which formats everything fine but takes an incredibly long time to process the style sheets. This version runs in about a tenth of the time but does not have the same quality of formatting.

from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1280953017(BasicNewsRecipe):
    title = u'Democrat & Chronicle'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True
    # Keep only the article body and the picture link; everything else is dropped.
    keep_only_tags = [dict(id=['article-bodytext', 'pictopialink'])]

    feeds = [
        (u'Breaking News', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=RSS&mime=xml'),
        (u'News', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=RSS01&mime=xml'),
        (u'Business', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=BUSINESS&template=rss_dc&mime=xml'),
        (u'Sports', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=SPORTS&template=rss_dc&mime=xml'),
        (u'Living', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=LIVING&template=rss_dc&mime=xml'),
        (u'Health', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=HEALTH&template=rss_dc&mime=xml'),
    ]
Old 08-07-2010, 12:47 PM   #2399
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
New Recipe: The Skeptic

Development of this recipe involved two interesting features. The first is that the blog where some of the feeds originate runs "Bad Behavior", which identified Calibre's recipe scraping as a bad bot and returned a 403 error. Playing around with Tamper Data (and comparing the headers sent by Firefox to those sent by Calibre) showed that Calibre needed to send at least a simple Accept: header to avoid being seen as a spambot. This recipe adds the needed header to the initial GET request.
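
The attached recipe is not reproduced in the post, so the following is only a guess at how such a header could be added, not the code from the zip. It assumes the usual pattern of overriding get_browser and appending to the browser's addheaders list; the class name, the Accept value, and the stand-in classes used outside Calibre are all made up for illustration.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # ASSUMPTION: minimal stand-ins so the sketch runs outside Calibre.
    class _StubBrowser(object):
        def __init__(self):
            self.addheaders = [('User-agent', 'stub')]

    class BasicNewsRecipe(object):
        def get_browser(self, *args, **kwargs):
            return _StubBrowser()


class AcceptHeaderSketch(BasicNewsRecipe):  # hypothetical name
    title = u'Accept header sketch'

    def get_browser(self, *args, **kwargs):
        # Start from the browser Calibre would normally build.
        br = BasicNewsRecipe.get_browser(self, *args, **kwargs)
        # Append a simple Accept header to the headers sent with each request.
        # ASSUMPTION: 'text/html' is an illustrative value, not the one
        # used in the attached recipe.
        br.addheaders = list(br.addheaders) + [('Accept', 'text/html')]
        return br
```

Whether a given site needs more than the Accept header is something a header-comparison tool would have to show; this sketch only covers the case described above.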

The second interesting thing in this recipe is that I wanted to remove all tags that started with "follow," such as "followX" or "followY." This recipe uses a regex in the remove_tags.
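
As a rough illustration of the regex idea (again, not the code from the attachment): a compiled pattern can stand in for a literal attribute value in a remove_tags entry. This sketch assumes the "follow..." names are class values; if they are ids, 'class' would become 'id'. The variable names are my own.

```python
import re

# Matches any value that begins with "follow": followX, followY, ...
follow_pattern = re.compile(r'^follow')

# ASSUMPTION: the "follow..." names are class attribute values on the page.
# In a recipe, this list would be assigned to remove_tags in the class body.
remove_tags = [dict(attrs={'class': follow_pattern})]

# Quick check of what the pattern does and does not match.
matches = [name for name in ('followX', 'followY', 'unfollow', 'content')
           if follow_pattern.match(name)]
```

The leading ^ is what keeps a value such as "unfollow" from being caught.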
Attached Files
File Type: zip Skeptic.zip (947 Bytes, 249 views)
Old 08-07-2010, 02:16 PM   #2400
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by js4c View Post
The problem I am having with the recipe below is the title of the article does not appear in the article itself, only in the calibre generated link.
If the title appears on the article page on the site, then it will appear in the article of the ebook provided you haven't removed it. You have only two tags being retained, so if the title isn't in one of those tags it will be removed.


Quote:
This version runs in about 1/10th the time but does not have the same quality formatting.
Add extra CSS as needed for formatting.
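
A sketch of how both suggestions could be combined, under loud assumptions: I have not checked the site's markup, so the h1 entry for the headline is a placeholder to be replaced with whatever tag actually holds the title, and the CSS rules are purely illustrative. The class name and the stand-in base class (used only outside Calibre) are also made up.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # ASSUMPTION: stand-in so the sketch can be imported outside Calibre.
    class BasicNewsRecipe(object):
        pass


class DemocratChronicleSketch(BasicNewsRecipe):  # hypothetical name
    title = u'Democrat & Chronicle'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True

    keep_only_tags = [
        # ASSUMPTION: the headline lives in an <h1>; check the page source.
        dict(name='h1'),
        dict(id=['article-bodytext', 'pictopialink']),
    ]

    # Replacement styling, since the site's own stylesheets are skipped.
    # ASSUMPTION: illustrative rules only.
    extra_css = '''
        h1  { font-size: x-large; font-weight: bold; }
        p   { margin: 0.5em 0; }
        img { display: block; margin: 0.5em auto; }
    '''

    feeds = [
        (u'News', u'http://www.democratandchronicle.com/apps/pbcs.dll/section?category=RSS01&mime=xml'),
    ]
```

The point of the design is that no_stylesheets keeps the download fast, while extra_css puts back only the handful of rules the ebook actually needs.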

Tip: When posting a recipe here, you should use Code tags to preserve formatting needed for all Python code - and preferably use Spoiler tags, too. Without the formatting it may be impossible to find an error in your code, and it's always harder for anyone to help you, as they have to manually put back in any formatting.