| 
			
			 | 
		#1 | 
| 
			
			
			
			 Connoisseur 
			
			![]() Posts: 51 
				Karma: 10 
				Join Date: Oct 2018 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
			
			 
			
			Hi, guys 
		
	
		
		
		
		
		
		
		
		
		
		
	
	I've written a recipe (inherited from BasicNewsRecipe) to fetch some articles online, but when I converted my recipe to ebooks, I only got titles and links and no contents at all. After searching for a while, it seems that I should define user_agent in "get_browser". This has partly solved the problem. But still, some articles are still empty. Any ideas? Thank you!  
		 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | 
| 
			
			
			
			 creator of calibre 
			
			![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			are you using auto_cleanup? If so try turning it off.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		
 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#3 | 
| 
			
			
			
			 Connoisseur 
			
			![]() Posts: 51 
				Karma: 10 
				Join Date: Oct 2018 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			No, I didn't use auto_cleanup. Here is my testing custome recipe in the attachment.  
		
	
		
		
			You'll be asked to input an article link. Please use this article link: http://www.theworldin.com/edition/20...endulum-swings. And you may get an empty article. But it seems links from other sites can do.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#4 | 
| 
			
			
			
			 creator of calibre 
			
			![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			The server you are contacting is failing, probably ecause it needs some cookies set or something similar. Add this to your recipe to check: 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Code: 
	    def preprocess_raw_html(self, html, url):
        with open('/t/raw.html', 'wb') as f:
            f.write(html.encode('utf-8'))
        return html
 | 
| 
		
 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#5 | |
| 
			
			
			
			 Connoisseur 
			
			![]() Posts: 51 
				Karma: 10 
				Join Date: Oct 2018 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
			
			 
			
			I got this raw html: 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#6 | 
| 
			
			
			
			 creator of calibre 
			
			![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			only the person running the server can tell you that.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		
 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#7 | 
| 
			
			
			
			 creator of calibre 
			
			![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			most liekly it is using javascript to load content
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		
 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#8 | 
| 
			
			
			
			 Connoisseur 
			
			![]() Posts: 51 
				Karma: 10 
				Join Date: Oct 2018 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			If so, is there no way to work around this?
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#9 | 
| 
			
			
			
			 creator of calibre 
			
			![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			no easy way. you would basically need to figure out what requests the javascript is making to load the actual content and make those requests manually in the recipe.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		
 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#10 | 
| 
			
			
			
			 Connoisseur 
			
			![]() Posts: 51 
				Karma: 10 
				Join Date: Oct 2018 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Can calibre support Selenium to fetch web pages so that I can work around js?
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#11 | 
| 
			
			
			
			 Connoisseur 
			
			![]() Posts: 51 
				Karma: 10 
				Join Date: Oct 2018 
				
				
				
				Device: kindle 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I think it would be great if get_browser supports selenium. Is this possible?
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#12 | 
| 
			
			
			
			 creator of calibre 
			
			![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			no, I'm afraid not.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		
 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
![]()  | 
            
        
    
| Tags | 
| empty page, recipe | 
| Thread Tools | Search this Thread | 
            
  | 
    
			 
			Similar Threads
		 | 
	||||
| Thread | Thread Starter | Forum | Replies | Last Post | 
| All pages empty after converting epub in Calibre | Apostrophe | Conversion | 1 | 01-29-2015 11:08 AM | 
| Previously downloaded articles & empty editions | paipa | Recipes | 2 | 11-03-2013 02:20 PM | 
| Financial Times recipe downloading slowly, empty pages | mapex | Recipes | 34 | 06-06-2013 07:27 AM | 
| InDesign to Epub (empty pages) | PauloCoe | EPUBReader | 1 | 06-22-2011 09:56 AM | 
| Reversing articles order in a custom news recipe? | retired_anon_25 | Calibre | 5 | 12-12-2009 06:24 PM |