04-01-2012, 01:26 PM   #1
watou
Junior Member
Posts: 4
Karma: 10
Join Date: Apr 2012
Device: Kindle
Article not added when specifying content string

Hello,
I am writing a new recipe and the newspaper site has the full content of ten articles on one HTML page. I iterate through the page and append each article with an empty url but with full content, but these articles are silently skipped, leaving an empty section in the e-book. Here is the code:

Code:
        for post in ts.findAll('h1'):
            title = self.tag_to_string(post)
            self.log(title)
            url = ''
            date = ''
            content = self.tag_to_string(post.findNextSibling('p'))
            desc = content
            articles.append({'title':title, 'url':url, 'date':date, 'description':desc,
                'content':content})
The documentation for parse_index() says of the content dictionary entry: "The full article (can be an empty string). This is used by FullContentProfile" but I cannot find any documentation on FullContentProfile, or any clue why the content isn't being used. The recipe will be complete after I can fix this issue. Thanks in advance for any insight!
04-01-2012, 01:31 PM   #2
kovidgoyal
creator of calibre
Posts: 44,015
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
content no longer works (it refers to an obsolete API). Instead, save your HTML into temporary files and pass a file:///path/to/temp/file.html URL as the url.
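A minimal sketch of that approach, using only the Python standard library. The helper name `article_from_html` is hypothetical, and it assumes `body_html` is already valid HTML markup for the article body; calibre also ships its own temporary-file helpers, which may be a better fit inside a real recipe.

```python
import html
import tempfile
from pathlib import Path


def article_from_html(title, body_html, description=''):
    """Write one article's HTML to a temp file and return an article dict.

    The returned dict has no 'content' key; the article body is reached
    through the file:// URL instead. (Hypothetical helper, not calibre API.)
    """
    page = (
        '<html><head><meta charset="utf-8"><title>{0}</title></head>'
        '<body><h1>{0}</h1>{1}</body></html>'
    ).format(html.escape(title), body_html)
    # delete=False keeps the file on disk after the handle is closed,
    # so it can still be fetched later through its URL.
    with tempfile.NamedTemporaryFile(
            'w', suffix='.html', delete=False, encoding='utf-8') as f:
        f.write(page)
    return {
        'title': title,
        'url': Path(f.name).as_uri(),  # absolute path turned into a file:/// URL
        'date': '',
        'description': description,
    }
```

Inside the loop from the first post, the `articles.append({...})` call would then become something like `articles.append(article_from_html(title, content, desc))`.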
04-01-2012, 01:32 PM   #3
watou
Thank you! Will do exactly that.
04-01-2012, 03:01 PM   #4
watou
That worked perfectly. Now the final barrier to sharing my new recipe is that one image, whose name has a space in it, is not being retrieved and shows up as a broken image in the resulting e-book. Here is some debug output:

Code:
Processing images...
Fetching http://www.southernstar.ie/scripts/imgsize.php?w=300&img=../images/news/1312c41.jpg
Processing images...
Fetching http://www.southernstar.ie/scripts/imgsize.php?w=300&img=../images/news/Rachel MCCarthya.jpg
Traceback (most recent call last):
  File "site-packages/calibre/web/fetch/simple.py", line 369, in process_images
  File "site-packages/PIL/Image.py", line 1980, in open
IOError: cannot identify image file
Other images load perfectly. The only code that touches images in my recipe is a verbatim copy of atlantic.recipe's postprocess_html().

Is there anything I can do about this?
04-01-2012, 11:27 PM   #5
kovidgoyal
URL-escape the space.
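One way to do that, sketched with Python 3's `urllib.parse.quote` (the thread predates Python 3 in calibre, where the equivalent was `urllib.quote`). The helper name `escape_url` and the choice of `safe` characters are assumptions; where it gets applied in a recipe, for example to each image's `src` before the images are fetched, depends on the recipe.

```python
from urllib.parse import quote


def escape_url(url):
    """Percent-encode unsafe characters such as spaces in a URL.

    URL delimiters and existing %-escapes are listed as safe, so the
    scheme, path separators and query string are left untouched.
    (Hypothetical helper, not part of calibre.)
    """
    return quote(url, safe="%/:=&?~#+!$,;'@()*[]")


fixed = escape_url(
    'http://www.southernstar.ie/scripts/imgsize.php'
    '?w=300&img=../images/news/Rachel MCCarthya.jpg'
)
# The space in the file name becomes %20; the rest of the URL is unchanged.
```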
04-02-2012, 10:38 AM   #6
watou
Recipe for The Southern Star

Quote:
Originally Posted by kovidgoyal View Post
URL-escape the space.
Thanks; done; works. Attached please find a completed and working recipe for The Southern Star, a regional weekly newspaper since 1889 from County Cork, Ireland. Tested on Mac and Windows Vista.
Attached Files
File Type: zip southernstar.recipe.zip (2.2 KB, 120 views)