06-28-2010, 05:01 PM   #2211
nook.life
Join Date: May 2010
Device: Nook
Quote:
Originally Posted by Starson17
I was probably in a grumpy mood that day

Whatever I posted, it wasn't the final recipe, as what you were working with still had lots of junk in it. This is closer to the final I came up with, but my earlier version had some text that identified the comic. You want it rotated, so I removed the text above to give more room for the comic.

Try this:
Spoiler:
Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
import re
from ctypes import byref  # needed by the MagickGetException call below
import calibre.utils.PythonMagickWand as pw

class Explosm(BasicNewsRecipe):
    title               = 'Explosm'
    __author__          = 'Starson17'
    description         = 'Explosm'
    language            = 'en'
    use_embedded_content= False
    no_stylesheets      = True
    oldest_article      = 24
    remove_javascript   = True
    remove_empty_feeds    = True
    max_articles_per_feed = 10

    feeds = [
             (u'Explosm Feed', u'http://feeds.feedburner.com/Explosm')
             ]

    keep_only_tags     = [dict(name='div', attrs={'align':'center'})]
    remove_tags = [dict(name='span'),
                   dict(name='table')]

    def postprocess_html(self, soup, first):
        # Process every image; assumes the downloaded HTML already points at the local image paths
        for tag in soup.findAll(lambda tag: tag.name.lower()=='img' and tag.has_key('src')):
            iurl = tag['src']
            print('resizing image ' + iurl)
            with pw.ImageMagick():
                img = pw.NewMagickWand()
                p = pw.NewPixelWand()
                if img < 0:
                    raise RuntimeError('Out of memory')
                if not pw.MagickReadImage(img, iurl):
                    severity = pw.ExceptionType(0)
                    msg = pw.MagickGetException(img, byref(severity))
                    raise IOError('Failed to read image from: %s: %s'
                        %(iurl, msg))
                width = pw.MagickGetImageWidth(img)
                height = pw.MagickGetImageHeight(img)
                if width > height:
                    # Landscape comic: rotate 90 degrees so it fills a portrait screen
                    print('Rotate image')
                    pw.MagickRotateImage(img, p, 90)
                if not pw.MagickWriteImage(img, iurl):
                    raise RuntimeError('Failed to save image to %s'%iurl)
                pw.DestroyMagickWand(img)
        return soup

    extra_css = '''
                    h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
                    h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
                    p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
                    body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
                '''
Hey Starson, thanks so much for posting the code. It definitely looks a lot cleaner without the text above, and the comic is more readable. Unfortunately, some of the comics are still clipped at the top. I looked at the HTML test output and it still seems to contain some sort of table (you can see the outline), even though your recipe removes it. I'm starting to wonder if my output settings in calibre are messed up. I uninstalled and reinstalled it, but it seems to keep the old settings (my custom recipes and scheduled downloads were still there after the new install).

Here is what it looks like:
http://picturepush.com/public/3708883