Old 10-19-2018, 10:21 PM   #1
xiatian
Connoisseur
xiatian began at the beginning.
 
Posts: 50
Karma: 10
Join Date: Oct 2018
Device: kindle
Blank pages (empty articles) in custom recipe

Hi, guys,
I've written a recipe (inheriting from BasicNewsRecipe) to fetch some articles online, but when I converted it to an e-book, I got only titles and links, with no content at all. After searching for a while, I found that I should set user_agent in get_browser. This partly solved the problem, but some articles are still empty. Any ideas?
Thank you!
Old 10-20-2018, 12:06 AM   #2
kovidgoyal
creator of calibre
Posts: 43,851
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Are you using auto_cleanup? If so, try turning it off.
Old 10-20-2018, 10:41 AM   #3
xiatian
No, I'm not using auto_cleanup. My test recipe is attached.
You'll be asked to enter an article link. Please use this one: http://www.theworldin.com/edition/20...endulum-swings. You will probably get an empty article, although links from other sites seem to work.
Attached Files
File Type: recipe custom.recipe (714 Bytes, 193 views)
Old 10-20-2018, 09:36 PM   #4
kovidgoyal
The server you are contacting is failing, probably because it needs some cookies set or something similar. Add this to your recipe to check:

Code:
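    def preprocess_raw_html(self, html, url):
        # Debugging aid: save the HTML exactly as the server sent it.
        # Note: every article overwrites the same file.
        with open('/t/raw.html', 'wb') as f:
            f.write(html.encode('utf-8'))
        # Return the HTML unchanged so the download proceeds as usual.
        return html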
Change the '/t/raw.html' above to some path on your computer, then open the resulting raw.html after the download to see what HTML the server is actually sending.
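Editor's note: the snippet above writes every article to the same file, so only the last one survives a multi-article download. A minimal sketch of one way around that, assuming Python 3, is a helper that names each dump after a hash of the article URL. The name dump_raw_html and the out_dir parameter are hypothetical and are not part of calibre.

```python
import hashlib
import os
import tempfile


def dump_raw_html(html, url, out_dir=None):
    """Write html to a file named after a hash of url and return its path."""
    # Fall back to the system temporary directory if no directory is given.
    out_dir = out_dir or tempfile.gettempdir()
    # A short hash of the URL keeps file names unique and filesystem-safe.
    name = hashlib.sha1(url.encode('utf-8')).hexdigest()[:12] + '.html'
    path = os.path.join(out_dir, name)
    # Accept either text or bytes; text is stored as UTF-8.
    data = html.encode('utf-8') if isinstance(html, str) else html
    with open(path, 'wb') as f:
        f.write(data)
    return path
```

Inside the recipe, preprocess_raw_html could call dump_raw_html(html, url) and then return html unchanged, which leaves one file per article to inspect.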
Old 10-21-2018, 01:30 AM   #5
xiatian
I got this raw html:
Quote:
<html>
<head>
<META NAME="robots" CONTENT="noindex,nofollow">
<script src="/_Incapsula_Resource?SWJIYLWA=5074a744e2e3d891814e9a2dace20bd4,719d34d31c8e3a6e6fffd425f7e032f3">
</script>
<body>
</body></html>
What happened?
Old 10-21-2018, 01:32 AM   #6
kovidgoyal
Only the person running the server can tell you that.
Old 10-21-2018, 01:33 AM   #7
kovidgoyal
Most likely it is using JavaScript to load the content.
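Editor's note: the raw HTML quoted in post #5 has an empty body and a single script that points at /_Incapsula_Resource. A rough sketch of how such a response could be recognised automatically is shown below. The function name looks_like_incapsula_stub is hypothetical, and the check is only a heuristic based on that one sample, not a documented signature of the service.

```python
import re


def looks_like_incapsula_stub(html):
    """Heuristic: True if html references _Incapsula_Resource and has no body text."""
    if '_Incapsula_Resource' not in html:
        return False
    match = re.search(r'<body[^>]*>(.*?)</body>', html,
                      re.IGNORECASE | re.DOTALL)
    # A missing <body>, or one holding only whitespace, means no article text.
    return match is None or not match.group(1).strip()
```

A recipe could call this from preprocess_raw_html to flag the affected URLs instead of silently producing blank articles.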
Old 10-21-2018, 01:37 AM   #8
xiatian
If so, is there no way to work around this?
Old 10-21-2018, 01:37 AM   #9
kovidgoyal
No easy way. You would basically need to figure out what requests the JavaScript is making to load the actual content and make those requests manually in the recipe.
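Editor's note: a possible first step in that investigation is to list the external scripts the page loads, so they can be fetched and read. The sketch below uses only the Python 3 standard library; the names ScriptSrcCollector and script_urls are hypothetical. It does not find the content requests by itself, which still have to be worked out by reading the scripts or watching the network traffic in a browser.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so 'script' also matches <SCRIPT>.
        if tag == 'script':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)


def script_urls(html, page_url):
    """Return absolute URLs of the external scripts referenced by html."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    # Resolve relative paths against the URL of the page itself.
    return [urljoin(page_url, src) for src in collector.sources]
```

Calling script_urls(html, url) on the saved raw HTML returns the script addresses resolved against the article URL; inline scripts without a src attribute are skipped.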
Old 10-21-2018, 01:48 AM   #10
xiatian
Can calibre use Selenium to fetch web pages, so that I can work around the JavaScript?
Old 10-21-2018, 02:22 AM   #11
xiatian
I think it would be great if get_browser supported Selenium. Is this possible?
Old 10-21-2018, 02:22 AM   #12
kovidgoyal
No, I'm afraid not.

Tags
empty page, recipe

