|  11-25-2010, 11:43 AM | #1 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | 
				
				New Recipe:Arcamax - Comics
			 
			
			This is another comics recipe. As in the gocomics.com and comics.com recipes, you can set the number of days to retrieve, and you should customize the list to choose the strips you want. The only interesting thing in this recipe is that I wanted to set 100% max/min width on the main comic img, but I didn't want it to apply to the other img tags. I used preprocess_html to set an id only on the main comic img tag and extra_css to control it. It's ready to add to built-ins. It's a family-friendly site (which might make it easier to find family-friendly comics) and has some comics not found on other sites.
Code:
#!/usr/bin/env python
__license__   = 'GPL v3'
__copyright__ = 'Copyright 2010 Starson17'
'''
www.arcamax.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
#from calibre.ebooks.BeautifulSoup import BeautifulSoup
import mechanize, re
class Arcamax(BasicNewsRecipe):
    title               = 'Arcamax'
    __author__          = 'Starson17'
    __version__         = '1.03'
    __date__            = '25 November 2010'
    description         = u'Family Friendly Comics - Customize for more days/comics: Defaults to 7 days, 25 comics - 20 general, 5 editorial.'
    category            = 'news, comics'
    language            = 'en'
    use_embedded_content= False
    no_stylesheets      = True
    remove_javascript   = True
    cover_url           = 'http://www.arcamax.com/images/pub/amuse/leftcol/zits.jpg'
    ####### USER PREFERENCES - SET COMICS AND NUMBER OF COMICS TO RETRIEVE ########
    num_comics_to_get = 7
    # CHOOSE COMIC STRIPS BELOW - REMOVE COMMENT '# ' FROM IN FRONT OF DESIRED STRIPS
    conversion_options = {'linearize_tables'  : True
                        , 'comment'           : description
                        , 'tags'              : category
                        , 'language'          : language
                        }
    keep_only_tags     = [dict(name='div', attrs={'class':['toon']}),
                          ]
   
    def parse_index(self):
        feeds = []
        for title, url in [
                            ######## COMICS - GENERAL ######## 
                            #(u"9 Chickweed Lane", u"http://www.arcamax.com/ninechickweedlane"),
                            #(u"Agnes", u"http://www.arcamax.com/agnes"),
                            #(u"Andy Capp", u"http://www.arcamax.com/andycapp"),
                            (u"BC", u"http://www.arcamax.com/bc"),
                            #(u"Baby Blues", u"http://www.arcamax.com/babyblues"),
                            #(u"Beetle Bailey", u"http://www.arcamax.com/beetlebailey"),
                            (u"Blondie", u"http://www.arcamax.com/blondie"),
                            #(u"Boondocks", u"http://www.arcamax.com/boondocks"),
                            #(u"Cathy", u"http://www.arcamax.com/cathy"),
                            #(u"Daddys Home", u"http://www.arcamax.com/daddyshome"),
                            (u"Dilbert", u"http://www.arcamax.com/dilbert"),
                            #(u"Dinette Set", u"http://www.arcamax.com/thedinetteset"),
                            (u"Dog Eat Doug", u"http://www.arcamax.com/dogeatdoug"),
                            (u"Doonesbury", u"http://www.arcamax.com/doonesbury"),
                            #(u"Dustin", u"http://www.arcamax.com/dustin"),
                            (u"Family Circus", u"http://www.arcamax.com/familycircus"),
                            (u"Garfield", u"http://www.arcamax.com/garfield"),
                            #(u"Get Fuzzy", u"http://www.arcamax.com/getfuzzy"),
                            #(u"Girls and Sports", u"http://www.arcamax.com/girlsandsports"),
                            #(u"Hagar the Horrible", u"http://www.arcamax.com/hagarthehorrible"),
                            #(u"Heathcliff", u"http://www.arcamax.com/heathcliff"),
                            #(u"Jerry King Cartoons", u"http://www.arcamax.com/humorcartoon"),
                            #(u"Luann", u"http://www.arcamax.com/luann"),
                            #(u"Momma", u"http://www.arcamax.com/momma"),
                            #(u"Mother Goose and Grimm", u"http://www.arcamax.com/mothergooseandgrimm"),
                            (u"Mutts", u"http://www.arcamax.com/mutts"),
                            #(u"Non Sequitur", u"http://www.arcamax.com/nonsequitur"),
                            #(u"Pearls Before Swine", u"http://www.arcamax.com/pearlsbeforeswine"),
                            #(u"Pickles", u"http://www.arcamax.com/pickles"),
                            #(u"Red and Rover", u"http://www.arcamax.com/redandrover"),
                            #(u"Rubes", u"http://www.arcamax.com/rubes"),
                            #(u"Rugrats", u"http://www.arcamax.com/rugrats"),
                            (u"Speed Bump", u"http://www.arcamax.com/speedbump"),
                            (u"Wizard of Id", u"http://www.arcamax.com/wizardofid"),
                            (u"Zits", u"http://www.arcamax.com/zits"),
                             ]:
            articles = self.make_links(url)
            if articles:
                feeds.append((title, articles))
        return feeds
    def make_links(self, url):
        # Start at the newest strip and follow each page's 'Previous' link
        # until num_comics_to_get strips have been collected.
        title = 'Temp'
        current_articles = []
        pages = range(1, self.num_comics_to_get+1)
        for page in pages:
            page_soup = self.index_to_soup(url)
            if not page_soup:
                # Without a page there is no title or 'Previous' link to use.
                break
            title = page_soup.find(name='div', attrs={'class':'toon'}).p.img['alt']
            page_url = url
            prev_page_url = 'http://www.arcamax.com' + page_soup.find('a', attrs={'class':'next'}, text='Previous').parent['href']
            current_articles.append({'title': title, 'url': page_url, 'description':'', 'date':''})
            url = prev_page_url
        # Strips were collected newest first; present them oldest first.
        current_articles.reverse()
        return current_articles
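    def preprocess_html(self, soup):
        # Give only the main comic img an id so extra_css can size it
        # without affecting the other img tags on the page.
        main_comic = soup.find('p',attrs={'class':'m0'})
        if main_comic is not None and main_comic.a is not None and main_comic.a.get('target') == '_blank':
            main_comic.a.img['id'] = 'main_comic'
        return soup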
    extra_css = '''
                    h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
                    h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
                    img#main_comic {max-width:100%; min-width:100%;}
                    p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
                    body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
		''' | 
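The id-plus-CSS trick described in this post can be tried outside calibre. Below is a minimal sketch, assuming a simplified, well-formed sample page. The SAMPLE markup and the tag_main_comic helper are made up for illustration, and the standard library's xml.etree.ElementTree stands in for the BeautifulSoup object that calibre passes to preprocess_html.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page: one main comic and one unrelated image.
SAMPLE = '''<div class="toon">
  <p class="m0"><a href="/zits/big" target="_blank"><img src="zits.gif" alt="Zits"/></a></p>
  <p><img src="ad.gif" alt="ad"/></p>
</div>'''

def tag_main_comic(markup):
    """Set id='main_comic' on the img inside <p class="m0"><a target="_blank">."""
    root = ET.fromstring(markup)
    for p in root.iter('p'):
        if p.get('class') != 'm0':
            continue
        link = p.find('a')
        if link is None or link.get('target') != '_blank':
            continue
        img = link.find('img')
        if img is not None:
            img.set('id', 'main_comic')
            break  # only one image should carry the id
    return root

root = tag_main_comic(SAMPLE)
ids = [img.get('id') for img in root.iter('img')]
print(ids)  # ['main_comic', None]
```

With only one img carrying the id, a rule such as img#main_comic {max-width:100%; min-width:100%;} stretches the strip and leaves every other image at its natural size.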
|   |   | 
|  11-26-2010, 11:55 PM | #2 | 
| Enthusiast  Posts: 28 Karma: 10 Join Date: Nov 2010 Device: Samsung Android using FBreader | 
			
			That works great! Thank you so much, you rule! BJ | 
|   |   | 
|  04-17-2011, 10:32 PM | #3 | 
| Member  Posts: 19 Karma: 10 Join Date: Feb 2011 Device: kindle 3 | 
			
			I'm working on updating the recipe... it seems to have started failing on 4/13/2011. If I get it working, I'll post it... if someone else gets to it before me, please post the changes... thanks, -tim | 
|   |   | 
|  04-17-2011, 11:52 PM | #4 | 
| Member  Posts: 19 Karma: 10 Join Date: Feb 2011 Device: kindle 3 | 
				
				Need help
			 
			
			no such luck... here's output:
1% Converting input to HTML...
InputFormatPlugin: Recipe Input running
1% Fetching feeds...
Python function terminated unexpectedly
  'NoneType' object has no attribute 'decode' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\ebooks\conversion\cli.py", line 282, in main
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 915, in run
  File "site-packages\calibre\customize\conversion.py", line 204, in __call__
  File "site-packages\calibre\web\feeds\input.py", line 105, in convert
  File "site-packages\calibre\web\feeds\news.py", line 735, in download
  File "site-packages\calibre\web\feeds\news.py", line 874, in build_index
  File "site-packages\calibre\web\feeds\__init__.py", line 338, in feeds_from_index
  File "site-packages\calibre\web\feeds\__init__.py", line 165, in populate_from_preparsed_feed
  File "site-packages\calibre\web\feeds\__init__.py", line 30, in __init__
AttributeError: 'NoneType' object has no attribute 'decode' | 
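One plausible reading of this traceback (an assumption, not confirmed anywhere in the thread) is that one of the article fields built in parse_index, most likely the title, came back as None after the site changed, and calibre then tried to decode it. A minimal sketch of a guard follows. The make_article helper and its fallback_title parameter are hypothetical and not part of calibre.

```python
def make_article(title, url, fallback_title='Untitled comic'):
    """Build a parse_index-style article dict without ever passing None as the title."""
    if title is None:
        # Hypothetical fallback so that later string handling has text to work with.
        title = fallback_title
    return {'title': title, 'url': url, 'description': '', 'date': ''}

article = make_article(None, 'http://www.arcamax.com/dilbert')
print(article['title'])
```

A guard like this only hides the symptom. If the title lookup returns None, the page markup has probably changed and the lookups in the recipe need updating.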
|   |   | 
|  04-18-2011, 10:30 AM | #5 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | 
			
			Kovid: The site has changed significantly. Here's a completely rewritten Arcamax recipe: Spoiler: 
 | 
|   |   | 
|  04-18-2011, 10:46 AM | #6 | 
| creator of calibre            Posts: 45,600 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			updated
		 | 
|   |   | 
|  04-18-2011, 10:53 AM | #7 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | |
|   |   | 
|  04-18-2011, 10:54 AM | #8 | 
| creator of calibre            Posts: 45,600 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			fixed.
		 | 
|   |   | 
|  04-19-2011, 11:38 AM | #9 | 
| Member  Posts: 19 Karma: 10 Join Date: Feb 2011 Device: kindle 3 | 
			
			my dilbert addiction thanks you...
		 | 
|   |   | 
|  04-19-2011, 11:49 AM | #10 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | |
|   |   | 
|  04-25-2011, 10:24 AM | #11 | 
| Enthusiast  Posts: 28 Karma: 10 Join Date: Nov 2010 Device: Samsung Android using FBreader | 
			
			Thank you!
		 | 
|   |   | 
|  04-25-2011, 03:12 PM | #12 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | 
			
			You're welcome. A bit of a funny: I was getting errors reported on this when it ran overnight. I thought I must have made a mistake when I rewrote it. Each time I went to manually fix/check it, however, it ran correctly. It turned out I had an earlier custom recipe with the same name - Arcamax - that was running overnight. I was trying to "fix" the builtin copy when it was the old custom recipe that was broken. | 
|   |   | 
|  05-10-2011, 06:27 PM | #13 | 
| Member  Posts: 19 Karma: 10 Join Date: Feb 2011 Device: kindle 3 | 
			
			Is it just me or did they change the site _again_?  looks like it stopped working on 5/5/11...
		 | 
|   |   | 
|  05-11-2011, 07:56 AM | #14 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | |
|   |   | 
|  05-11-2011, 09:52 PM | #15 | 
| Wizard            Posts: 4,004 Karma: 177841 Join Date: Dec 2009 Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T | |
|   |   | 