#376
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage

Quote:

You should use this to extract the article from that site: Code:
    keep_only_tags = [dict(name='div', attrs={'class':'article'})]
    remove_tags_after = dict(name='div', attrs={'class':'articletext'})

#377
Enthusiast
Posts: 27
Karma: 10
Join Date: Mar 2009
Device: PRS-505

That's more or less what I used. The article is extracted fine; the problem is the picture within the article. It's a normal JPG picture, but it still fails to be included. I tried Bookit to get the whole page, but it also fails to include the article's picture.

e.g.: http://diepresse.com/home/panorama/r...ex.do?from=rss There is a picture of the pope in there; nevertheless, no picture is included in the final ebook. Code:
    remove_tags_before = dict(id='content')
    remove_tags_after = dict(id='content')

#378
Guru

You discovered a bug in calibre. For some reason calibre does not fetch the image inside the article; it is simply ignored. Please open a bug report in the calibre trac and attach your recipe to it so that Kovid can fix this.

#379
Member
Posts: 18
Karma: 10
Join Date: Feb 2009
Device: none

Quote:

Typically, background images are used as "fluff" on a page, should be assumed to be irrelevant furniture, and are therefore (rightly, IMHO) ignored. If the image conveys meaning, the site should have used a normal <img... tag with an appropriate alt attribute for accessibility. IMHO it's the fault of diepresse, not Calibre. Rufus.
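
If a recipe author wants to work around this, one option is to turn inline background images into ordinary image tags before conversion. The following is a minimal sketch, assuming the picture is set through an inline `style` attribute (not confirmed for diepresse.com); the helper names and the sample markup are hypothetical. In a calibre recipe, something along these lines could be called from `preprocess_html`.

```python
import re

# Finds url(...) inside an inline "background" or "background-image" declaration.
# Assumption: the image is declared in a style attribute, not an external stylesheet.
BG_URL = re.compile(
    r"background(?:-image)?\s*:[^;]*?url\(\s*['\"]?([^'\")\s]+)['\"]?\s*\)",
    re.IGNORECASE,
)


def background_image_urls(html):
    """Return every image URL used as an inline CSS background in html."""
    return BG_URL.findall(html)


def as_img_tag(url, alt=''):
    """Build a plain <img> tag that a converter will treat as content."""
    return '<img src="%s" alt="%s"/>' % (url, alt)


# Hypothetical markup, only for demonstration.
sample = """<div class="pic" style="background-image: url('http://example.com/pope.jpg'); width: 300px"></div>"""

for url in background_image_urls(sample):
    print(as_img_tag(url))
```

A picture that is assigned from an external stylesheet would not be found this way; the stylesheet itself would have to be inspected.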

#380
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2009
Device: none

Recipe for NZZ - Neue Zuercher Zeitung

Hi,

I just made my first recipe, for www.nzz.ch. It does everything I want, but unfortunately it is very slow (56 min to produce a 0.3 MB ebook). I started from the BBC recipe, and I don't see why the NZZ version should be so slow. Here's the recipe: Code:
#!/usr/bin/env python
'''
nzz.ch
'''
from calibre.web.feeds.news import BasicNewsRecipe

class NewNzz(BasicNewsRecipe):
    title          = u'Neue Zuercher Zeitung'
    __author__     = 'NZZ'
    description    = 'Neue Zuercher Zeitung'
    no_stylesheets = True
    language = _('German')
    # Keep the article container and drop everything before and after id='article'.
    keep_only_tags = [dict(name='div', attrs={'class':'article'})]
    remove_tags_before = dict(id='article')
    remove_tags_after  = dict(id='article')
    # Strip navigation, tool boxes, teasers and scripts from what is left.
    remove_tags     = [dict(attrs={'class':['more', 'nowrap', 'footer', 'teaser', 'articleTools', 'post-tools', 'side_tool', 'nextArticleLink clearfix']}),
                       dict(id=['formSendArticle', 'footer', 'toolsRight', 'articleInline', 'navigation', 'archive', 'side_search', 'blog_sidebar', 'side_tool', 'side_index']),
                       dict(name=['script', 'noscript', 'style'])]
    feeds          = [
                      ('Top Themen', 'http://www.nzz.ch/nachrichten/startseite?rss=true'),
                      ('International', 'http://www.nzz.ch/nachrichten/international?rss=true'),
                      ('Schweiz', 'http://www.nzz.ch/nachrichten/schweiz?rss=true'),
                      ('Wirtschaft', 'http://www.nzz.ch/nachrichten/wirtschaft/aktuell?rss=true'),
                      ('Zuerich', 'http://www.nzz.ch/nachrichten/zuerich?rss=true'),
                      ('Sport', 'http://www.nzz.ch/nachrichten/sport?rss=true'),
                      ('Panorama', 'http://www.nzz.ch/nachrichten/panorama?rss=true'),
                    ]

    # Fetch the print view of each article.
    def print_version(self, url):
        return url+'?printview=true'

Best regards keckx
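
One thing worth checking in `print_version`: appending `'?printview=true'` by string concatenation only yields a valid URL when the article link has no query string of its own. Whether NZZ's article links carry one is not known from this thread (the feed URLs themselves do use `?rss=true`), so the following is only a sketch of a safer way to add the parameter, written for Python 3. The helper name `add_query_param` and the sample article path are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_param(url, key, value):
    """Return url with key=value in its query string, keeping existing parameters."""
    parts = urlsplit(url)
    # Drop any previous value for the same key so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical article URLs, with and without an existing query string.
plain = 'http://www.nzz.ch/nachrichten/panorama/example_article.html'
with_query = plain + '?rss=true'

print(add_query_param(plain, 'printview', 'true'))
print(add_query_param(with_query, 'printview', 'true'))
```

Inside the recipe, `print_version` could then simply return `add_query_param(url, 'printview', 'true')`.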

#381
Guru

The NZZ online server is quite slow. There is nothing you can do about that.

Just some notes about the recipe: This: Code:
    remove_tags_before = dict(id='article')
    remove_tags_after  = dict(id='article')
Code:
    keep_only_tags = [dict(name='div', attrs={'id':'article'})]
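
The server speed itself cannot be changed from a recipe, but the number of pages fetched can. The NZZ recipe sets neither `oldest_article` nor `max_articles_per_feed`, so calibre's defaults apply (7 days and 100 articles per feed, as far as I know; treat those values as assumptions for the version in use). A rough upper bound on the number of article downloads, with an illustrative cap:

```python
# Feed names copied from the NZZ recipe in this thread.
FEEDS = ['Top Themen', 'International', 'Schweiz', 'Wirtschaft',
         'Zuerich', 'Sport', 'Panorama']

DEFAULT_MAX_ARTICLES = 100  # assumed calibre default for max_articles_per_feed


def worst_case_downloads(n_feeds, max_articles_per_feed):
    """Upper bound on article pages fetched in one run."""
    return n_feeds * max_articles_per_feed


print(worst_case_downloads(len(FEEDS), DEFAULT_MAX_ARTICLES))  # 700
print(worst_case_downloads(len(FEEDS), 15))                    # 105
```

Adding, for example, `oldest_article = 1` and `max_articles_per_feed = 15` to the recipe class (values chosen only for illustration) lowers that bound accordingly.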

#382
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2009
Device: PRS 505

Hello everybody...

I have a problem with some German calibre recipes and the EPUB output. The recipes for Spiegel Online and FAZ NET are not working correctly, and I have no idea why. Spiegel Online produces only about eight pages containing the overview of the articles, and the FAZ NET ebook freezes my Sony PRS-505.

Does anybody have an idea why this happens, or does anybody have the same problems? I stopped using the LRF output, where those recipes worked well, because of the bug that causes the reader to reset. The workaround described in the FAQ (downloading the RSS feed with calibre and transferring it via the Sony software) is not acceptable, because I lose the comfort of just uploading some news to my reader in the sleepy morning.

Perhaps somebody knows a solution for my problem? Thanks! AngeloT

#383
creator of calibre
Posts: 45,609
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various

Spiegel will be fixed in the next release. As for FAZ, I don't see anything obviously wrong with the EPUB file, so bug SONY/Adobe to fix their software; the EPUB file certainly works correctly on the desktop.

#384
Junior Member
Posts: 4
Karma: 10
Join Date: Nov 2007
Device: sony reader

I have been using Calibre for some time and have made some recipes of my own. For a long time I used Calibre 0.4.67, and those recipes worked on that version. I have tried a few later Calibre versions, but the custom recipes did not work on them.

Now I have Calibre 0.5.2 installed, and the recipes do not work there either. The following Conversion Error shows up: Quote:

The code of each recipe is a little different, but it generally looks like this: Code:
from libprs500.ebooks.lrf.web.profiles import DefaultProfile
import re
class DiePresseWirtschaft(DefaultProfile):
    title = 'DiePresseWirtschaft'
    timefmt = ' [%d %b %Y]'
    summary_length = 1000
    oldest_article = 1
    max_articles_per_feed = 100
    max_recursions = 2
    html_description = True
    no_stylesheets = True
    def get_feeds(self):
        return [ ('Die Presse Wirtschaft', 'http://www.diepresse.com/rss/Wirtschaft') ]
    def print_version(self,url):
        return url.replace('index.do?from=rss', 'print.do')
    preprocess_regexps = [
        (re.compile(r'<script>.*?</script>', re.IGNORECASE | re.DOTALL), lambda match : ''),
        (re.compile(r'<H4>.*?</H4>', re.IGNORECASE | re.DOTALL), lambda match : ''),
        ]
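
`libprs500` is calibre's former name, and the other recipes in this thread subclass `BasicNewsRecipe` from `calibre.web.feeds.news` instead of `DefaultProfile`. Assuming the conversion error comes from the old import (the error text is not shown here), a port might look roughly like the sketch below. The profile-only options `max_recursions` and `html_description` are left out, the `try`/`except` fallback exists only so the snippet can be run outside calibre, and `apply_regexps` and the sample URL are hypothetical additions for checking the result.

```python
import re

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be exercised outside calibre (assumption).
    BasicNewsRecipe = object


class DiePresseWirtschaft(BasicNewsRecipe):
    title = 'DiePresseWirtschaft'
    timefmt = ' [%d %b %Y]'
    summary_length = 1000
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets = True

    # A feeds attribute takes the place of the old get_feeds() method.
    feeds = [('Die Presse Wirtschaft', 'http://www.diepresse.com/rss/Wirtschaft')]

    # Same two patterns as the original profile.
    preprocess_regexps = [
        (re.compile(r'<script>.*?</script>', re.IGNORECASE | re.DOTALL), lambda match: ''),
        (re.compile(r'<H4>.*?</H4>', re.IGNORECASE | re.DOTALL), lambda match: ''),
    ]

    def print_version(self, url):
        return url.replace('index.do?from=rss', 'print.do')


def apply_regexps(html, rules=DiePresseWirtschaft.preprocess_regexps):
    """Hypothetical helper: run the recipe's regex rules over an HTML string."""
    for pattern, replacement in rules:
        html = pattern.sub(replacement, html)
    return html


# Hypothetical article URL, only for demonstration.
sample_url = 'http://diepresse.com/home/wirtschaft/example/index.do?from=rss'
print(DiePresseWirtschaft.print_version(None, sample_url))
print(apply_regexps('<script>x()</script><h4>Ad</h4><p>Text</p>'))
```

Note that the first pattern only matches a bare `<script>` tag; script tags that carry attributes would pass through unchanged.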

#385
creator of calibre

#386
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2008
Device: Sony PRS-505

Hi

I'm using the Google Reader recipe, but kovidgoyal (who is usually right!) thinks it only downloads starred messages. Is there a way to get it to load all unread messages, or do I have to manually create a recipe file for each of my 40 feeds? TIA Shaun

#387
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2009
Device: PRS 505

Quote:

I don't know what's wrong with the FAZ EPUB. It looks fine on the desktop, but it has very large pages with small letters and is also very wide. I think this is why it freezes my Sony reader, which is perhaps unable to format these pages correctly.

#388
Guru

The FAZ recipe was not cleaning all of the styles, and that made the content hard to read. Here is a vastly updated recipe that produces a correct EPUB.

@Kovid Please include this in your upcoming release. FAZ updated recipe:

#389
creator of calibre

Thanks, updated.

#390
Hyperreader
Posts: 130
Karma: 28678
Join Date: Feb 2009
Device: Current: Boox Leaf2 (broken) Past: H2O, Kindle PW1, DXG;Pocketbook 360

FanFiction.net

I tried to make a recipe for FanFiction.net. As long as the story does not have multiple chapters, it is easy enough.

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class FanFiction(BasicNewsRecipe):
    title          = u'FanFiction'
    oldest_article = 7
    max_articles_per_feed = 10
    use_embedded_content  = False
    remove_javascript     = True
    # Keep only the story text.
    keep_only_tags     = [dict(name='div', attrs={'id':'storytext'})]

    feeds          = [(u'Just In', u'http://www.fanfiction.net/atom/j/0/0/0/')]