06-19-2010, 12:58 PM   #2161
rty
Zealot
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Quote:
Originally Posted by capidamonte
Sadly, the Honolulu Advertiser was bought by the Honolulu Star-Bulletin. The new paper is called the Star Advertiser at www.staradvertiser.com.

Any chance of a new recipe? And the old ones are no longer valid...
Here it is: recipe for the Star Advertiser.
Attached Files
File Type: zip StarAdvertiser.zip (586 Bytes, 214 views)
06-19-2010, 05:42 PM   #2162
capidamonte
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
Quote:
Originally Posted by rty
Here it is: recipe for the Star Advertiser.
Hey! Now there are two!

The most recent Calibre had a new one -- I'll have to try yours as well.

Thanks!

cap
06-19-2010, 08:16 PM   #2163
kidtwisted
Member
Posts: 16
Karma: 10
Join Date: May 2010
Location: Southern California
Device: JetBook-Lite
Hey Starson17, help!

Quote:
Originally Posted by Starson17
Aren't recipes fun!

It's likely because of the order in which the various stages of the recipe are processed. I've certainly seen this. Once you get to the point where you are building your own pages from the soup (and that's what the multipage does) you don't get the expected behavior.

I believe keep_only_tags throws away the tags during the initial page pull, but doesn't apply to the extra pages you are getting with the soup2 = self.index_to_soup(nexturl) step.

I've certainly seen this before. There are lots of solutions; in fact, your recipe already uses one, extract(), to remove a tag. Just find the tags and extract them.

I usually do this at the postprocess_html stage with something like this:
Code:
        for tag in soup.findAll('form', dict(attrs={'name':["comments_form"]})):
            tag.extract()
        for tag in soup.findAll('font', dict(attrs={'id':["cr-other-headlines"]})):
            tag.extract()
extract() removes the tag entirely from the original soup, leaving you with two independent soups. In your recipe, you want the extracted tag, but it also works to remove it from the original soup, just like remove_tags.
I've been having trouble making this work; adding this to the end of the recipe just breaks it. Can I get a more detailed example? I did read something about first_fetch but I'm not sure how to use it. Is there another recipe I could look at as an example?

Code:
    def postprocess_html(self, soup):
        for tag in soup.findAll('dic', dict(attrs={'class':["article-info clearfix"]})):
            tag.extract()
        return soup
06-19-2010, 09:17 PM   #2164
bhandarisaurabh
Enthusiast
Posts: 49
Karma: 10
Join Date: Aug 2009
Device: none
Quote:
Originally Posted by rty
Here it is: the recipe for Forbes India.
The recipe you made does not cover the whole magazine; it fetches only a few of the feeds. Please have a look at this link:
http://business.in.com/magazine/magazinearchive/1/1
The recipe should download the latest issue from this link; in this case it should have fetched the 18 June issue.
06-20-2010, 01:01 AM   #2165
rty
Quote:
Originally Posted by bhandarisaurabh
The recipe you made does not cover the whole magazine; it fetches only a few of the feeds. Please have a look at this link:
http://business.in.com/magazine/magazinearchive/1/1
The recipe should download the latest issue from this link; in this case it should have fetched the 18 June issue.
The recipe picks up articles from the feed called "Forbes India All Feed"
(http://business.in.com/rssfeed/rss_all.xml), which supposedly contains all articles of the latest issue. Why the webmaster of Forbes India doesn't update the RSS feed to link all articles (as its name suggests) is beyond my comprehension and control, but the recipe does point to the latest issue (whatever the date is). Yes, I went to the link you gave above. Please take another look at the top right-hand side of that webpage, find the column that says "Latest Issue", and let me know what you see.

Spoiler:

Click on the Show button above to see what I saw. You said 18 June is the latest issue, but look at what the website said: it's 02 July.

Last edited by rty; 06-20-2010 at 01:48 AM.
06-20-2010, 07:10 AM   #2166
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kidtwisted
Hey Starson17, help!
I'll take a shot at it

Quote:
I've been having trouble making this work; adding this to the end of the recipe just breaks it.
Does "this" refer to the code below? If so, try this:

Code:
    def postprocess_html(self, soup):
        for tag in soup.findAll('dic', dict(attrs={'class':["article-info clearfix"]})):
            #tag.extract()
            print 'The tag to be extracted is: ', tag
        return soup
If it's breaking because you're extracting something, then you probably shouldn't be extracting it - see what you're extracting with the print code above.
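
A dry run like the one above also shows whether the selector matches anything at all. Here is a minimal sketch of that idea using only Python's standard library, so it is a stand-in for the BeautifulSoup call in the recipe and not calibre's own code. The sample HTML and the TagCounter helper are made up for illustration, and whether the real page uses a div with this class is an assumption.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Hypothetical helper: count start tags with a given name and exact class value."""

    def __init__(self, name, css_class):
        super().__init__()
        self.name = name
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs for this start tag.
        if tag == self.name and dict(attrs).get("class") == self.css_class:
            self.count += 1


def count_matches(html, name, css_class):
    """Return how many tags in html the (name, class) selector would hit."""
    counter = TagCounter(name, css_class)
    counter.feed(html)
    return counter.count


# Made-up page fragment, not taken from any real site.
sample = '<div class="article-info clearfix">By A. Writer</div><p>Body</p>'

print(count_matches(sample, "dic", "article-info clearfix"))  # tag name as posted
print(count_matches(sample, "div", "article-info clearfix"))  # assumed intended name
```

If the count is zero, nothing would be extracted, so the breakage would have to come from somewhere else in the recipe (indentation is a common suspect when a method is pasted at the end of a class).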

Quote:
Can I get a more detailed example? I did read something about first_fetch but I'm not sure how to use it. Is there another recipe I could look at as an example?
The entirety of relevant code is in your example. You find the tag in the soup and extract it. I'm not sure what else to point you to.
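
A tentative side note on the findAll() calls quoted in this thread: it may be worth checking what the second argument actually evaluates to. The sketch below is plain Python and only inspects the dictionaries. The find_all_stub function is a hypothetical stand-in that copies just the first two parameters of BeautifulSoup's findAll(name, attrs, ...), so it shows where each value lands, not how BeautifulSoup then matches it.

```python
def find_all_stub(name=None, attrs=None, **kwargs):
    """Hypothetical stand-in: report which value each leading parameter receives."""
    return name, attrs


# The second positional argument as written in the thread's snippets.
positional_filter = dict(attrs={'class': ["article-info clearfix"]})
print(positional_filter)  # a dict whose only key is 'attrs'

# The remove_tags-style spec, unpacked into keyword arguments instead.
spec = dict(name='div', attrs={'class': ["article-info clearfix"]})

print(find_all_stub('div', positional_filter))  # attrs gets {'attrs': {...}}
print(find_all_stub(**spec))                    # attrs gets {'class': [...]}
```

As far as I can tell, the first form asks for an attribute literally named attrs, which would match nothing, while passing attrs={...} as a keyword (or unpacking the spec with **) filters on class as intended. Treat that reading as an assumption and confirm it with the print test above.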
06-20-2010, 09:28 AM   #2167
rty
Quote:
Originally Posted by Starson17
If it's breaking because you're extracting something, then you probably shouldn't be extracting it - see what you're extracting with the print code above.
Sorry to ask this noob question Starson. Where would you find the output of that print command?

I found out the hard way when I first used tag.extract() that "extract" in this context actually means "discard", "dispose", or "get rid of".
06-20-2010, 09:52 AM   #2168
robandcurtis
Junior Member
Posts: 5
Karma: 12
Join Date: Jun 2010
Device: Kobo
London Free Press

Can I get a custom recipe for the London Free Press at
http://www.lfpress.com/

Thanks!
06-20-2010, 10:16 AM   #2169
Starson17
Quote:
Originally Posted by rty
Sorry to ask this noob question Starson. Where would you find the output of that print command?
I always use ebook-convert recipename.recipe foldername --test -vv > recipename.txt during testing, and the print output ends up in the .txt file. Alternatively, start the GUI with calibre-debug -g and it will appear there.
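
The verbose log can get long, so here is a small sketch for pulling out just the debug lines once the run has finished. It is plain Python with an invented log excerpt; the debug_lines helper is hypothetical, and the marker string assumes the print statement from the earlier post.

```python
def debug_lines(log_text, marker="The tag to be extracted is:"):
    """Return the lines of a captured log that contain the debug marker."""
    return [line for line in log_text.splitlines() if marker in line]


# Made-up log excerpt standing in for the contents of recipename.txt.
sample_log = "\n".join([
    "Fetching feeds...",
    'The tag to be extracted is:  <div class="article-info clearfix">...</div>',
    "Downloaded article 1",
])

for line in debug_lines(sample_log):
    print(line)
```

On a real run you would read the file first, for example with open("recipename.txt").read(), and pass that text in.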

Quote:
I found out the hard way when I first used tag.extract() that "extract" in this context actually means "discard", "dispose", or "get rid of".
Not really. It means split it away from and make it independent of the soup. You end up with two pieces of soup. If you then don't use it, it's gone, but you are free to put it back someplace else. It makes sure that if you extract something, all traces of it are gone from the original soup, and all traces of the original soup are gone from it.

You were trying to do cleaning - extract is another way to do that.
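
The "two pieces" idea can be sketched with Python's standard library. Note the swap: this uses xml.etree.ElementTree as an analogy, not BeautifulSoup, and remove() here plays the role that tag.extract() plays in a recipe. The markup and the aside element are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up page fragment; ElementTree stands in for the recipe's soup.
page = ET.fromstring(
    "<div><p>Article text</p><form name='comments_form'>Comments</form></div>"
)

# Detach the form from its parent (the analogue of tag.extract()).
form = page.find("form")
page.remove(form)

# The original tree no longer contains the form...
remaining = ET.tostring(page, encoding="unicode")
print(remaining)

# ...but the detached piece is still a complete tree of its own,
# so it can be attached somewhere else instead of being thrown away.
sidebar = ET.Element("aside")
sidebar.append(form)
moved = ET.tostring(sidebar, encoding="unicode")
print(moved)
```

If the detached piece is never used again, the effect is the same as deleting it, which is why extract() works for cleaning.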

Last edited by Starson17; 06-20-2010 at 10:20 AM.
06-20-2010, 11:12 AM   #2170
mlstein
Enthusiast
Posts: 49
Karma: 2062
Join Date: May 2010
Device: iPad (one)
London Review of Books

Can one of you generous and talented people revise the LRB recipe? I'm a subscriber and get all the web content, but the recipe gives me only a letter or two. Thanks!

Michael Steinberg

PS. calibre is da bomb!
06-20-2010, 01:44 PM   #2171
rty
Quote:
Originally Posted by robandcurtis
Can I get a custom recipe for the London Free Press at
http://www.lfpress.com/

Thanks!
I'll see what I can do this coming weekend. No promise though.
06-20-2010, 03:59 PM   #2172
robandcurtis
Quote:
Originally Posted by rty
I'll see what I can do this coming weekend. No promise though.
Thank you! No rush
06-20-2010, 07:32 PM   #2173
bhandarisaurabh
Quote:
Originally Posted by rty
The recipe picks up articles from the feed called "Forbes India All Feed"
(http://business.in.com/rssfeed/rss_all.xml), which supposedly contains all articles of the latest issue. Why the webmaster of Forbes India doesn't update the RSS feed to link all articles (as its name suggests) is beyond my comprehension and control, but the recipe does point to the latest issue (whatever the date is). Yes, I went to the link you gave above. Please take another look at the top right-hand side of that webpage, find the column that says "Latest Issue", and let me know what you see.

Spoiler:

Click on the Show button above to see what I saw. You said 18 June is the latest issue, but look at what the website said: it's 02 July.

But if you go to the latest issue, most of the links are not activated; if you click them, the page says that the article will be available on 1st of July. Actually, they are just giving you a fair idea beforehand of what is going to be in that issue, but the latest issue is that of 18 June.
06-21-2010, 01:09 AM   #2174
rty
Quote:
Originally Posted by bhandarisaurabh
But if you go to the latest issue, most of the links are not activated; if you click them, the page says that the article will be available on 1st of July. Actually, they are just giving you a fair idea beforehand of what is going to be in that issue, but the latest issue is that of 18 June.
Look at the RSS page provided by Forbes India: http://business.in.com/rss/

As I mentioned, the recipe picks up articles from the feed called "Complete Business.in.com" http://business.in.com/rssfeed/rss_all.xml

If Forbes India doesn't include something in this particular feed, there's nothing I can do about it. Maybe you can write to Forbes India and ask them to include all the articles of the latest issue in the RSS feed, and see if they care.
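
To illustrate why a feed-based recipe can only fetch what the feed lists, here is a rough sketch in plain Python. It is not calibre's code, and the sample feed is invented; the feed_links helper is a hypothetical name. It simply lists the items an RSS 2.0 document exposes, and anything missing from that list is invisible to a recipe built on the feed.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document; the real feed's contents are not reproduced here.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>Article A</title><link>http://example.com/a</link></item>
    <item><title>Article B</title><link>http://example.com/b</link></item>
  </channel>
</rss>"""


def feed_links(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in feed_links(SAMPLE_FEED):
    print(title, link)
```

Getting a full issue that the feed does not list would mean building the article list from the magazine's archive page instead, which is a different kind of recipe.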

Last edited by rty; 06-21-2010 at 01:12 AM.
06-21-2010, 10:00 AM   #2175
rty
Quote:
Originally Posted by robandcurtis
Can I get a custom recipe for the London Free Press at
http://www.lfpress.com/

Thanks!
Here it is. Recipe for London Free Press (Canada).
Attached Files
File Type: zip LondonFreePress(Canada).zip (693 Bytes, 226 views)
All times are GMT -4.