![]() |
#1576 | ||
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
|
Quote:
Quote:
![]() |
||
![]() |
![]() |
#1577 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
![]() |
|
![]() |
#1578 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2010
Device: Kindle 2
|
I'd love to see a recipe for GoComics (specifically, all Calvin and Hobbes strips), if possible. Thanks in advance!
|
![]() |
![]() |
#1579 |
creator of calibre
Posts: 45,407
Karma: 27757236
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|
![]() |
![]() |
#1580 |
Member
Posts: 15
Karma: 10
Join Date: Mar 2010
Device: PW2, K3gb(x2), K3w, K4, k5(x3) PRS-505s, Stanza for ipod
|
I'm trying to make my own recipe for thesun.co.uk, but I'm having a few problems.
I'm using just one feed at the moment to speed up downloading while testing, and I'm trying to fetch just the main article. The custom recipe I've come up with is: Code:
class AdvancedUserRecipe1268409464(BasicNewsRecipe):
    title = u'The Sun'
    oldest_article = 3
    max_articles_per_feed = 100
    no_stylesheets = True
    extra_css = '.headline {font-size: x-large;} \n .fact { padding-top: 10pt }'
    keep_only_tags = [
        dict(name='div', attrs={'class':'medium-centered'})
        ,dict(name='div', attrs={'class':'article'})
    ]
    remove_tags = [dict(name='div', attrs={'class':'slideshow'})]
    feeds = [(u'News', u'http://www.thesun.co.uk/sol/homepage/feeds/rss/article312900.ece')]

    def print_version(self, url):
        return url.replace('?OTC-RSS&ATTR=News', '?print=yes')

    def print_version(self, url):
        return url.replace('?OTC-RSS&ATTR=Royals', '?print=yes')

    def print_version(self, url):
        return url.replace('?OTC-RSS&ATTR=Gizmo', '?print=yes')
But this is not fetching anything. Can anyone give me some pointers, please? |
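One thing worth checking in the recipe above: a Python class keeps only the last definition of a method, so of the three `print_version` definitions only the `ATTR=Gizmo` one would take effect, and URLs from the News feed would pass through unchanged. A minimal sketch of a single function that handles any `ATTR` value follows. `to_print_url` is a hypothetical helper name, the article URL is made up for illustration, and the `?print=yes` suffix is taken from the post rather than verified against the site.

```python
import re

# Matches the RSS tracking suffix quoted in the post for any section name
# (News, Royals, Gizmo, ...). The pattern is an assumption based on those
# three examples only.
_RSS_SUFFIX = re.compile(r'\?OTC-RSS&ATTR=\w+$')


def to_print_url(url):
    """Swap the RSS tracking suffix for the print-view query string."""
    return _RSS_SUFFIX.sub('?print=yes', url)


# Hypothetical article URL, used only to exercise the function.
example = 'http://www.thesun.co.uk/sol/homepage/news/article0000000.ece?OTC-RSS&ATTR=News'
print(to_print_url(example))
```

Inside the recipe, a single `print_version(self, url)` method could perform the same substitution instead of three competing definitions.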
![]() |
|
![]() |
#1581 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
However, the recipe still pulls images that have a small GoComics logo on the border. That logo does not appear when the strips are viewed directly. I've figured out why, but I haven't quite fixed the recipe to get the clean images. I'll release it (when I'm done tweaking) as a companion to my earlier Comics.com recipe. |
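A guess at one possible cause, not confirmed by the post: some sites serve a differently branded image depending on the Referer header sent with the request. The GoComics recipe posted later in this thread sets that header through mechanize; the sketch below shows the same idea with the standard library's `urllib.request` instead. `build_image_request` is a hypothetical helper name, and the request is only built here, never sent.

```python
import urllib.request


def build_image_request(image_url, referer='http://www.gocomics.com/'):
    """Build (but do not send) a request that carries a Referer header."""
    return urllib.request.Request(image_url, headers={'Referer': referer})


req = build_image_request('http://www.gocomics.com/calvinandhobbes')
print(req.get_header('Referer'))
```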
|
![]() |
![]() |
#1582 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2010
Device: Kindle 2
|
Starson, thank you, that would be great!
|
![]() |
![]() |
#1583 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Peter Schiff's Economic Commentary:
|
![]() |
![]() |
#1584 |
Zealot
Posts: 135
Karma: 488
Join Date: Mar 2010
Location: Tulsa, OK, USA
Device: Kindle 2, Sony PRS 900
|
|
![]() |
![]() |
#1586 |
Enthusiast
Posts: 38
Karma: 10
Join Date: Nov 2009
Location: Poland
Device: kindle 1st gen, kindle dxg, kindle paperwhite2
|
http://github.com/t3d/kalibrator/raw/master/runa.recipe
http://www.runa.pl/favicon.ico Maybe it would be easier to have a field for icons in the BasicNewsRecipe class? |
![]() |
![]() |
#1587 |
Enthusiast
Posts: 38
Karma: 10
Join Date: Nov 2009
Location: Poland
Device: kindle 1st gen, kindle dxg, kindle paperwhite2
|
dual posting ;/
Last edited by t3d; 03-13-2010 at 11:44 AM. |
![]() |
![]() |
#1588 |
Member
Posts: 15
Karma: 10
Join Date: Mar 2010
Device: PW2, K3gb(x2), K3w, K4, k5(x3) PRS-505s, Stanza for ipod
|
I'm still trying to stumble my way through a custom script to fetch the news from the Sun website. I've sorted it so that it changes the web page into the print page and leaves out the slideshows, but it fetches all sorts of rubbish after the story, including all the 'connect to us' stuff. How would I get just the headline and the main body of the article? Here's the source code from a basic print page from thesun.co.uk: Code:
<script language="JavaScript" type="text/javascript"> <!-- var s_account="newsintthesunprod,newsintsunnetworkprod,newsintniglobalprod"; //--> </script> <script type="text/javascript" src="/js/s_codeFULLSOL.js"></script> <script type="text/javascript"> var _hbEC=0,_hbE=new Array;function _hbEvent(a,b){b=_hbE[_hbEC++]=new Object();b._N=a;b._C=0;return b;} var hbx=_hbEvent("pv");hbx.vpc="HBX0100u";hbx.gn="ngd.thesun.co.uk"; // set vars to be used below var urlReturnCid = ""; var urlReturnAttr = ""; // First, we load the URL into a variable var url = window.location.href; // Next, split the url by the # var qparts = url.split("#"); // Check that there is a querystring if (qparts.length == 2) { // Set the second half of string to var var query = qparts[1]; // Next, split that string by the & var varQ = query.split("&"); if (varQ.length == 2) { // Lastly split by = and assign to vars var retQ1 = varQ[0].split("="); var retQ2 = varQ[1].split("="); urlReturnCid = retQ1[1]; urlReturnAttr = retQ2[1]; } } //BEGIN EDITABLE SECTION //CONFIGURATION VARIABLES hbx.pn="Yobs+on+film+tauntingbrtragic+neighbour+PRTF-2891313"; s.events=""; s.pageName="SOL_PRTF_2891313 /News"; s.channel="/Home/News"; s.prop1="Home"; s.prop2="/Home/News"; s.prop3="/Home/News"; s.prop4="SOL"; s.prop5="PRTF"; s.prop6="Yobs on film tauntingbrtragic neighbour_PRTF"; s.prop15=""; s.prop16="2891313"; s.prop19=""; s.prop20=""; s.prop25=""; s.campaign=""; s.hier2="/Home/News"; s.eVar15=""; hbx.mlc = "/Home/News"; hbx.acct = "DM5403272HDE;DM5406146PDA"; hbx.pndef="title";//DEFAULT PAGE NAME hbx.ctdef="full";//DEFAULT CONTENT CATEGORY //OPTIONAL PAGE VARIABLES //ACTION SETTINGS hbx.fv="";//FORM VALIDATION MINIMUM ELEMENTS OR SUBMIT FUNCTION NAME hbx.lt="manual";//LINK TRACKING hbx.dlf="n";//DOWNLOAD FILTER hbx.dft="n";//DOWNLOAD FILE NAMING hbx.elf="n";//EXIT LINK FILTER //SEGMENTS AND FUNNELS hbx.seg="";//VISITOR SEGMENTATION hbx.fnl="";//FUNNELS //CAMPAIGNS hbx.cmp=urlReturnCid;//CAMPAIGN ID 
hbx.cmpn="";//CAMPAIGN ID IN QUERY hbx.dcmp="";//DYNAMIC CAMPAIGN ID hbx.dcmpn="";//DYNAMIC CAMPAIGN ID IN QUERY hbx.dcmpe="";//DYNAMIC CAMPAIGN EXPIRATION hbx.dcmpre="";//DYNAMIC CAMPAIGN RESPONSE EXPIRATION hbx.hra=urlReturnAttr;//RESPONSE ATTRIBUTE hbx.hqsr="";//RESPONSE ATTRIBUTE IN REFERRAL QUERY hbx.hqsp="";//RESPONSE ATTRIBUTE IN QUERY hbx.hlt="";//LEAD TRACKING hbx.hla="";//LEAD ATTRIBUTE hbx.gp="";//CAMPAIGN GOAL hbx.gpn="";//CAMPAIGN GOAL IN QUERY hbx.hcn="";//CONVERSION ATTRIBUTE hbx.hcv="";//CONVERSION VALUE hbx.cp="null";//LEGACY CAMPAIGN hbx.cpd="";//CAMPAIGN DOMAIN //CUSTOM VARIABLES hbx.ci="";//CUSTOMER ID hbx.hc1="";//CUSTOM 1 hbx.hc2="";//CUSTOM 2 hbx.hc3="";//CUSTOM 3 hbx.hc4="";//CUSTOM 4 hbx.hrf="";//CUSTOM REFERRER hbx.pec="";//ERROR CODES //INSERT CUSTOM EVENTS //END EDITABLE SECTION </script> <script language="JavaScript" type="text/javascript"><!-- /************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/ var s_code=s.t();if(s_code)document.write(s_code)//--></script> <script language="JavaScript" type="text/javascript"><!-- if(navigator.appVersion.indexOf('MSIE')>=0)document.write(unescape('%3C')+'\!-'+'-') //--></script><!--/DO NOT REMOVE/--> <!-- End SiteCatalyst code version: H.20.3. 
--> </script><script src="/js/hbx.js" type="text/javascript"></script> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7"/> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1" /> <title>Print Friendly Page</title> <link rel="shortcut icon" href="favicon.ico" type="image/x-icon" /> <script src="/js/jquery.js" type="text/javascript"></script> <script src="/js/cufon/cufon-yui.js" type="text/javascript"></script> <script src="/js/cufon/cufon-font-thc.js" type="text/javascript"></script> <script src="/js/sol.js" type="text/javascript"></script> <style type="text/css" media="screen"> @import "/css/sol.css"; html { overflow-x:hidden; overflow-y:scroll; } </style> <style type="text/css" media="print"> @import "/css/sol-print.css"; </style> </head> <body class="print-friendly"> <div id="content-print"> <div id="column-print" class="bg-fff"> <BEAN:define id="publication" name="publication" type="neo.xredsys.api.Publication" /> <div class="clear width-625 bg-fff padding-top-10"> <div class="print-friendly-hidden text-center padding-bottom-5"> <a href="#" onclick="window.print();return false;" name="&lid=PrintImage&lpos=Print"><img src="/img/buttons/btn-print.gif" alt="Print" /></a></div> <h2 class="bg-c00 text-fff margin-bottom-5"><img src="/img/text/print-logo-the-sun.gif" alt="Print" width="87" height="32" /></h2> <div class="clear bg-fff margin-bottom-10"> <div class="padding-bottom-2 black-solid-line"></div> <h2 class="text-uppercase padding-left-2">News</h2> <div class="padding-bottom-5 black-solid-line"></div> </div> </div> <div class="width-682"> <div class="padding-bottom-5"> <div class="clear padding-top-10"></div> <h1 class="medium-centered"> Yobs on film taunting<br/>tragic neighbour </h1> </div> <div class="text-center"> <div id="ltbx100" 
class="ltbx-slideshow"> <div class="ltbx-loader" style="width:682px;"> <div class="ltbx-img"> <img src="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_1__682_1003992a.jpg" style="width:682px;height:400px;" alt="Loading"/> </div> <div class="ltbx-load-layer" style="margin-top:-400px;width:682px;height:400px;"> <img style="margin-top:184.0px;" src="/img/lightbox/loading.gif" alt="Loading Animation"/> </div> <div class="ltbx-label" style="width:682px;"> Torment ... David Askew confronted by yobs in his garden </div> </div> <div id="k100r1c1t5w682h400" class="ltbx-gallery"> <p class="ltbx-var ltbx-hbxpn">ltbx2890110</p> <p class="ltbx-var ltbx-gap-height">40</p> <p class="ltbx-var ltbx-nav-loop">1</p> <p class="ltbx-var ltbx-bk-pad">0</p> <p class="ltbx-var ltbx-url">/sol/</p> <p class="ltbx-var ltbx-logo">1</p> <div class="ltbx-container"> <div class="ltbx-scroller"> <div class="ltbx-group"> <div class="ltbx-block"> <a title="Torment ... David Askew confronted by yobs in his garden" href="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_1__682_1003992a.jpg" class="ltbx-img" style="width:682px;height:400px;"> <img src="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_1__682_1003992a.jpg" alt="Torment ... David Askew confronted by yobs in his garden"/> <div class="ltbx-msg"> <div class="ltbx-tab"></div><div class="icon-slideshow ltbx-icon">Slideshows </div> </div> </a> <div class="ltbx-label" style="width:682px;"> Torment ... David Askew confronted by yobs in his garden </div> </div> <div class="ltbx-block"> <a title="Protection ... CCTV had been installed out the back of Askew's house" href="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_2__1003990a.jpg" class="ltbx-img" style="width:682px;height:400px;"> <img src="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_2__1003990a.jpg" alt="Protection ... 
CCTV had been installed out the back of Askew's house"/> <div class="ltbx-msg"> <div class="ltbx-tab"></div><div class="icon-slideshow ltbx-icon">Slideshows </div> </div> </a> <div class="ltbx-label" style="width:682px;"> Protection ... CCTV had been installed out the back of Askew's house </div> </div> <div class="ltbx-block"> <a title="Tragic ... shots show Askew clearly being taunted by yob" href="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_3__1003991a.jpg" class="ltbx-img" style="width:682px;height:400px;"> <img src="http://img.thesun.co.uk/multimedia/archive/01003/David-Askew_3__1003991a.jpg" alt="Tragic ... shots show Askew clearly being abused by yobTragic ... shots show Askew clearly being abused by yob"/> <div class="ltbx-msg"> <div class="ltbx-tab"></div><div class="icon-slideshow ltbx-icon">Slideshows </div> </div> </a> <div class="ltbx-label" style="width:682px;"> Tragic ... shots show Askew clearly being taunted by yob </div> </div> <div class="clr"></div> </div> </div> </div> <div class="ltbx-nav"> <div class="ltbx-lft"></div> <div class="ltbx-pager"></div> <div class="ltbx-rgt"></div> </div> </div> <img class="ltbx-load-init" src="/img/global/spacer.gif" alt="spacer" onload="jCsl.init({id:100,on:true});" /> </div> </div> <div class="clear-left"> <div> <div class="clear-left"> <p class="display-byline"> By STEWART WHITTINGHAM </p> <div class="padding-top-10 padding-bottom-10 clear-left"> <div class="center-div width-280"> <p class="display-byline">Published: Today</p> </div> </div> <div class="clear-left"></div> </div> <div class="clear-left padding-bottom-7"></div> </div> <h2 class="padding-bottom-7" style="font-size: 1.05em; line-height: 1.05em;"> SHOCKING footage of the bullying that drove a man to his death emerged yesterday. 
</h2><p class="article"></p><p class="article"><div style="width:180px" class="margin-top-5 margin-right-10 padding-bottom-5 float-left"><img src="http://img.thesun.co.uk/multimedia/archive/01003/SNN1211_2__1003560a.jpg" border="0" alt="david askew" title="david askew" /><div class="img-cap">Hounded ... David the 'gent'</div></div><p class="article"> It captured tragic David Askew crying in anguish as yobs tormented him. </p><p class="article"></p><p class="article"> Their victim, who had learning difficulties, collapsed and died outside his home this week when his gate was smashed after 20 years' intimidation by the Mad Dogs Gang - some aged just six. </p><p class="article"></p><p class="article"> He was pelted with bricks and hounded for cash and cigs by the thugs on his estate in Hattersley, Greater Manchester. </p><p class="article"> Neighbour Lynne Barker, 47, who filmed his ordeal seven years ago, said: "David suffered hell because of these kids. </p><p class="article"></p><p class="article"> "They threw stones, abused him and threatened him as he had a mental age of eight and was an easy target." </p><p class="article"> Her film also shows David, 64, who had CCTV installed, biting his hand in frustration then trying to escape the gang. </p><p class="article"> One hoodie said yesterday: "We all did it. But I know now he must have been scared." </p><p class="article"> Kial Cottingham, 18, is due in court today after being charged with harassment. 
</p><p class="article"> A lad of 18 arrested on suspicion of manslaughter on Thursday has been bailed until June 7 </p><p class="article"></p><p class="article"></p><p class="article"></p><p class="article"> <object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/06ElR5Ydetk&hl=en_GB&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/06ElR5Ydetk&hl=en_GB&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></embed></object> </p><p class="article"></p><p class="article"></p><p class="article" align="right"><a href="mailto: s.whittingham@the-sun.co.uk"target="_self" title="s.whittingham@the-sun.co.uk">s.whittingham@the-sun.co.uk</a></p> <div class="float-left width-300 padding-right-10 padding-bottom-10 padding-top-10 "> <!-- null --> </div> <!-- Article End --> <div class="float-left" id="chicklets-panel"> <style> #column2 {overflow: hidden;} </style> <img src="http://img.thesun.co.uk/multimedia/archive/00429/spacer_429055a.gif" width="389" height="1" /> <script type="text/javascript" language="JavaScript"> <!-- var showurl; function urlencode() { var newURL; var tempstr; var URL = location.href; var len = URL.length; for (j=0;j<len;j++) { tempstr = URL.charCodeAt(j); if (j == 0) newURL = escape(String.fromCharCode(tempstr)); else newURL = newURL + escape(String.fromCharCode(tempstr)); } return newURL; } function digg() { showurl = urlencode(); window.open("http://digg.com/submit?phase=2&url=" + showurl, "digg", "width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1"); } function delicious() { showurl = urlencode(); window.open("http://del.icio.us/post?url=" + showurl, "digg", "width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1"); } function reddit() { showurl = urlencode(); 
window.open("http://reddit.com/submit?url=" + showurl, "digg" ,"width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1") } function newsvine() { showurl = urlencode(); window.open("http://www.newsvine.com/_tools/seed&save?u=" + showurl, "digg", "width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1"); } function nowpublic() { showurl = urlencode(); window.open("http://view.nowpublic.com/?src=" + showurl, "digg", "width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1"); } function facebook() { showurl = urlencode(); window.open("http://www.facebook.com/share.php?u=" + showurl, "facebook", "width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1"); } function fark() { showurl = urlencode(); window.open("http://cgi.fark.com/cgi/fark/farkit.pl?h=" + document.title + "&u=" + showurl, "fark", "width=470,height=452,status=1,toolbar=1,location=1,scrollbars=1,menubar=1,resizable=1"); } function myspace() { showurl = urlencode(); window.open("http://www.myspace.com/Modules/PostTo/Pages/?u=" + showurl + "&t=" + document.title + "&c=" + document.title + "&l=3"); } --> </script> <div style="width:100%; background:white; text-align:left; border:0 solid black;overflow: hidden;"> <div style="border-bottom:silver 0 solid; width:100%; clear:both; margin-bottom:5px; padding:0; font-size:0.8em; line-height:1.7em; float: left;"><b>Share this article </b><a onclick="fPopUp(500,500,'http://extras.thesun.co.uk/share_this_article/copy.htm'); return false;" href="javascript:">What is this?</a> </div> <div style="height:30px; float:left; margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:digg();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/Digg__372738a.gif" alt="DIGG" hspace="3" align="left" valign="middle" style="padding-right:2px;" />Digg it!</a></div> <div style="height:30px; float:left; 
margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:delicious();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/del_ic_ious_372739a.gif" alt="DEL.ICIO.US" hspace="3" align="left" valign="middle" style="padding-right:2px;" />del.icio.us</a></div> <div style="height:30px; float:left; margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:myspace();" style="font-size:0.8em; line-height:1.7em;" title="MySpace"> <img src="http://x.myspace.com/images/myspace_logo_16.gif" alt="MYSPACE" title="MySpace" hspace="3" align="left" valign="middle" style="padding-right:2px;" />MySpace</a></div> <div style="height:30px; float:left; margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:facebook();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/Facebook_372737a.gif" alt="FACEBOOK" hspace="3" align="left" valign="middle" style="padding-right:2px;" />Facebook</a></div> <div style="height:30px; float:left; margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:fark();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/Fark_372736a.gif" alt="FARK" hspace="3" border="0" align="left" valign="middle" style="padding-right:2px;" />Fark</a></div> <div style="height:30px; float:left; margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:reddit();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/Readit_372723a.gif" alt="REDDIT" hspace="3" align="left" valign="middle" style="padding-right:2px;" />Reddit</a></div> <!-- <div style="height:30px; float:left; margin-right:16px;"><a href="javascript:void(0);" onclick="javascript:newsvine();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/Newsvine_372735a.gif" alt="NEWSVINE" hspace="3" 
align="left" valign="middle" style="padding-right:2px;" />Newsvine</a></div> --> <preform> <div style="height:30px; float:left; margin-right:14px; font-size: 12px; margin-top: 3px;"> <script> document.write("<scr"+"ipt type=\"text/javasc"+"ript\" src=\"http://d.yimg.com/ds/badge2.js\" badgetype=\"text\">"+location.href+"</scr"+"ipt>"); </script> </div> </preform> <div style="height:30px; float:left; margin-right:14px;"><a href="javascript:void(0);" onclick="javascript:nowpublic();" style="font-size:0.8em; line-height:1.7em;"> <img src="http://www.thesun.co.uk/multimedia/archive/00372/Nowpublic_372724a.gif" alt="NOWPUBLIC" hspace="3" align="left" valign="middle" style="padding-right:2px;" />NowPublic</a></div> <div style="clear: both;"></div> </div> </div> <div class="clear"></div> </div> <div class="padding-top-10"></div> <div class="clear text-center small padding-left-right-5 text-999 padding-top-5 padding-bottom-10 grey-solid-line"> <em> <p> © 2009 News Group Newspapers Ltd. "The Sun", "Sun", "Sun Online" are registered trademarks or trade names of News Group Newspapers Limited. This service is provided on News Group Newspapers' <a target="_parent" href="/sol/homepage/hygiene/terms_conditions/article254101.ece">Standard Terms and Conditions</a> in accordance with our <a target="_blank" href="http://www.nidp.com/">Privacy Policy</a> . To inquire about a licence to reproduce material, visit our <a target="_blank" href="http://www.thesun.co.uk/sol/homepage/article2636543.ece">Syndication site</a> . View our online Press Pack. For other inquiries, <a target="_parent" href="/sol/homepage/hygiene/contact_us/article251760.ece">Contact Us</a> . To see all content on The Sun, please use the <a target="_parent" href="/sol/homepage/hygiene/site_map/">Site Map</a>. 
</p> <p style="text-align: left;"><a href='http://the-acap.org/acap-enabled.php' border='0' target='new'><img src='http://img.thesun.co.uk/multimedia/archive/00607/acap_enabled_small__607912a.gif' border="0" /></a></p> <script type="text/javascript"> var nTopSearchTimeDelay = 0; </script> <style> input#mast-head-search-text {max-height: 18px;} #masthead-search {margin-top:1px; *margin-top:0px;} </style> </em> </div> </div> </div> </body> </html> Thanks. |
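Going by the page source above, the headline sits in `<h1 class="medium-centered">` and the body text in `<p class="article">` elements, while the sharing widgets live in `<div id="chicklets-panel">`. In a recipe, that would suggest pointing `keep_only_tags` at the `h1` and `p` elements rather than at `div`s, though this is untested against the live site. The standalone sketch below only illustrates the selection logic on a trimmed copy of that markup, using Python's standard `html.parser`; `KEEP`, `ArticleText`, and `extract` are hypothetical names, and calibre itself is not involved.

```python
from html.parser import HTMLParser

# Same dict shape as calibre's keep_only_tags entries, used here as plain data.
KEEP = [
    {'name': 'h1', 'attrs': {'class': 'medium-centered'}},
    {'name': 'p', 'attrs': {'class': 'article'}},
]


class ArticleText(HTMLParser):
    """Collect the text of elements matching KEEP; ignore everything else."""

    def __init__(self):
        super().__init__()
        self.kept_tag = None   # tag name of the kept element we are inside, if any
        self.chunks = []       # one text chunk per kept element

    def handle_starttag(self, tag, attrs):
        if self.kept_tag is not None:
            if tag == 'br':
                self.chunks[-1] += ' '   # keep words on either side of <br/> apart
            return
        classes = (dict(attrs).get('class') or '').split()
        for spec in KEEP:
            if tag == spec['name'] and spec['attrs']['class'] in classes:
                self.kept_tag = tag
                self.chunks.append('')
                return

    def handle_endtag(self, tag):
        if tag == self.kept_tag:
            self.kept_tag = None

    def handle_data(self, data):
        if self.kept_tag is not None:
            self.chunks[-1] += data


def extract(html):
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    # Normalise whitespace and drop the empty <p class="article"></p> fillers.
    return [' '.join(c.split()) for c in parser.chunks if c.strip()]


# Trimmed copy of the print page quoted in the post.
SAMPLE = '''
<h1 class="medium-centered"> Yobs on film taunting<br/>tragic neighbour </h1>
<p class="article"></p>
<p class="article"> It captured tragic David Askew crying in anguish as yobs tormented him. </p>
<div id="chicklets-panel"><b>Share this article </b></div>
'''

for line in extract(SAMPLE):
    print(line)
```

The "Share this article" text is dropped because its container matches nothing in `KEEP`, which is the effect the question is after.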
![]() |
![]() |
#1589 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
I'm posting a new recipe for gocomics.com. It has an image size adjustment, so you can try it and reduce the image size to get it to fit your reader. It has some of the same comics as comics.com. |
|
![]() |
![]() |
#1590 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
New recipe: GoComics.com
Here is the gocomics.com recipe I promised. It has 200+ comics, about 3/4 general and 1/4 political/editorial.
There are 3 options that need to be set, so this recipe is best used as a custom recipe. They are: the comic strips to get, the number of past issues/days of each strip to get, and the comic image size. The default gets 7 days of 15 comics (10 general and 5 editorial) at a moderately large size.
I pulled the last 99 images of Calvin and Hobbes as a test, and it retrieved them all. The main C&H page has a link to the first strip back in the 1980s, but I have no idea if all strips are available.
Please be conservative when setting the options so you don't overload their servers, and only get comics you will actually read. Anyone setting a daily retrieve for all 200+ comics with 1000 back issues at maximum image size will be severely frowned upon by Those-Who-Frown (and their computer will lock up for the next few years).
Enjoy! Code:
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = 'Copyright 2010 Starson17'
'''
www.gocomics.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, Tag, NavigableString
import urllib, re, mechanize

class GoComics(BasicNewsRecipe):
    title = 'GoComics'
    __author__ = 'Starson17'
    __version__ = '1.01'
    __date__ = '13 March 2010'
    description = '200+ Comics - Customize for more days/comics: Defaults to 7 days, 15 comics - 10 general, 5 editorial.'
    language = 'en'
    use_embedded_content= False
    no_stylesheets = True
    remove_javascript = True
    cover_url = 'http://paulbuckley14059.files.wordpress.com/2008/06/calvin-and-hobbes.jpg'

    ####### USER PREFERENCES - COMICS, IMAGE SIZE AND NUMBER OF COMICS TO RETRIEVE ########
    # num_comics_to_get - I've tried up to 99 on Calvin&Hobbes
    num_comics_to_get = 7
    # comic_size 300 is small, 600 is medium, 900 is large, 1500 is extra-large
    comic_size = 1200
    # CHOOSE COMIC STRIPS BELOW - REMOVE COMMENT '# ' FROM IN FRONT OF DESIRED STRIPS
    # Please do not overload their servers by selecting all comics and 1000 strips from each!

    keep_only_tags = [dict(name='div', attrs={'class':['feature','banner']}),
                      ]
    remove_tags = [dict(name='a', attrs={'class':['beginning','prev','cal','next','newest']}),
                   dict(name='div', attrs={'class':['tag-wrapper']}),
                   dict(name='ul', attrs={'class':['share-nav','feature-nav']}),
                   ]

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        orig_open_novisit = br.open_novisit
        def my_open_no_visit(url, **kwargs):
            req = mechanize.Request( url, headers = { 'Referer':'http://www.gocomics.com/', })
            return orig_open_novisit(req)
        br.open_novisit = my_open_no_visit
        return br

    def parse_index(self):
        feeds = []
        for title, url in [
            ######## COMICS - GENERAL ########
            # (u"2 Cows and a Chicken", u"http://www.gocomics.com/2cowsandachicken"),
            # (u"9 to 5", u"http://www.gocomics.com/9to5"),
            # (u"The Academia Waltz", u"http://www.gocomics.com/academiawaltz"),
            # (u"Adam@Home", u"http://www.gocomics.com/adamathome"),
            # (u"Agnes", u"http://www.gocomics.com/agnes"),
            # (u"Andy Capp", u"http://www.gocomics.com/andycapp"),
            # (u"Animal Crackers", u"http://www.gocomics.com/animalcrackers"),
            # (u"Annie", u"http://www.gocomics.com/annie"),
            # (u"The Argyle Sweater", u"http://www.gocomics.com/theargylesweater"),
            # (u"Ask Shagg", u"http://www.gocomics.com/askshagg"),
            (u"B.C.", u"http://www.gocomics.com/bc"),
            # (u"Back in the Day", u"http://www.gocomics.com/backintheday"),
            # (u"Bad Reporter", u"http://www.gocomics.com/badreporter"),
            # (u"Baldo", u"http://www.gocomics.com/baldo"),
            # (u"Ballard Street", u"http://www.gocomics.com/ballardstreet"),
            # (u"Barkeater Lake", u"http://www.gocomics.com/barkeaterlake"),
            # (u"The Barn", u"http://www.gocomics.com/thebarn"),
            # (u"Basic Instructions", u"http://www.gocomics.com/basicinstructions"),
            # (u"Bewley", u"http://www.gocomics.com/bewley"),
            # (u"Big Top", u"http://www.gocomics.com/bigtop"),
            # (u"Biographic", u"http://www.gocomics.com/biographic"),
            # (u"Birdbrains", u"http://www.gocomics.com/birdbrains"),
            # (u"Bleeker: The Rechargeable Dog", u"http://www.gocomics.com/bleeker"),
            # (u"Bliss", u"http://www.gocomics.com/bliss"),
            (u"Bloom County", u"http://www.gocomics.com/bloomcounty"),
            # (u"Bo Nanas", u"http://www.gocomics.com/bonanas"),
            # (u"Bob the Squirrel", u"http://www.gocomics.com/bobthesquirrel"),
            # (u"The Boiling Point", u"http://www.gocomics.com/theboilingpoint"),
            # (u"Boomerangs", u"http://www.gocomics.com/boomerangs"),
            # (u"The Boondocks", u"http://www.gocomics.com/boondocks"),
            # (u"Bottomliners", u"http://www.gocomics.com/bottomliners"),
            # (u"Bound and Gagged", u"http://www.gocomics.com/boundandgagged"),
            # (u"Brainwaves", u"http://www.gocomics.com/brainwaves"),
            # (u"Brenda Starr", u"http://www.gocomics.com/brendastarr"),
            # (u"Brewster Rockit", u"http://www.gocomics.com/brewsterrockit"),
            # (u"Broom Hilda", u"http://www.gocomics.com/broomhilda"),
            (u"Calvin and Hobbes", u"http://www.gocomics.com/calvinandhobbes"),
            # (u"Candorville", u"http://www.gocomics.com/candorville"),
            # (u"Cathy", u"http://www.gocomics.com/cathy"),
            # (u"C'est la Vie", u"http://www.gocomics.com/cestlavie"),
            # (u"Chuckle Bros", u"http://www.gocomics.com/chucklebros"),
            # (u"Citizen Dog", u"http://www.gocomics.com/citizendog"),
            # (u"The City", u"http://www.gocomics.com/thecity"),
            # (u"Cleats", u"http://www.gocomics.com/cleats"),
            # (u"Close to Home", u"http://www.gocomics.com/closetohome"),
            # (u"Compu-toon", u"http://www.gocomics.com/compu-toon"),
            # (u"Cornered", u"http://www.gocomics.com/cornered"),
            # (u"Cul de Sac", u"http://www.gocomics.com/culdesac"),
            # (u"Daddy's Home", u"http://www.gocomics.com/daddyshome"),
            # (u"Deep Cover", u"http://www.gocomics.com/deepcover"),
            # (u"Dick Tracy", u"http://www.gocomics.com/dicktracy"),
            # (u"The Dinette Set", u"http://www.gocomics.com/dinetteset"),
            # (u"Dog Eat Doug", u"http://www.gocomics.com/dogeatdoug"),
            # (u"Domestic Abuse", u"http://www.gocomics.com/domesticabuse"),
            # (u"Doodles", u"http://www.gocomics.com/doodles"),
            # (u"Doonesbury", u"http://www.gocomics.com/doonesbury"),
            # (u"The Doozies", u"http://www.gocomics.com/thedoozies"),
            # (u"The Duplex", u"http://www.gocomics.com/duplex"),
            # (u"Eek!", u"http://www.gocomics.com/eek"),
            # (u"The Elderberries", u"http://www.gocomics.com/theelderberries"),
            # (u"Flight Deck", u"http://www.gocomics.com/flightdeck"),
            # (u"Flo and Friends", u"http://www.gocomics.com/floandfriends"),
            # (u"The Flying McCoys", u"http://www.gocomics.com/theflyingmccoys"),
            (u"For Better or For Worse", u"http://www.gocomics.com/forbetterorforworse"),
            # (u"For Heaven's Sake", u"http://www.gocomics.com/forheavenssake"),
            # (u"Fort Knox", u"http://www.gocomics.com/fortknox"),
            # (u"FoxTrot", u"http://www.gocomics.com/foxtrot"),
            (u"FoxTrot Classics", u"http://www.gocomics.com/foxtrotclassics"),
            # (u"Frank & Ernest", u"http://www.gocomics.com/frankandernest"),
            # (u"Fred Basset", u"http://www.gocomics.com/fredbasset"),
            # (u"Free Range", u"http://www.gocomics.com/freerange"),
            # (u"Frog Applause", u"http://www.gocomics.com/frogapplause"),
            # (u"The Fusco Brothers", u"http://www.gocomics.com/thefuscobrothers"),
            (u"Garfield", u"http://www.gocomics.com/garfield"),
            # (u"Garfield Minus Garfield", u"http://www.gocomics.com/garfieldminusgarfield"),
            # (u"Gasoline Alley", u"http://www.gocomics.com/gasolinealley"),
            # (u"Gil Thorp", u"http://www.gocomics.com/gilthorp"),
            # (u"Ginger Meggs", u"http://www.gocomics.com/gingermeggs"),
            # (u"Girls & Sports", u"http://www.gocomics.com/girlsandsports"),
            # (u"Haiku Ewe", u"http://www.gocomics.com/haikuewe"),
            # (u"Heart of the City", u"http://www.gocomics.com/heartofthecity"),
            # (u"Heathcliff", u"http://www.gocomics.com/heathcliff"),
            # (u"Herb and Jamaal", u"http://www.gocomics.com/herbandjamaal"),
            # (u"Home and Away", u"http://www.gocomics.com/homeandaway"),
            # (u"Housebroken", u"http://www.gocomics.com/housebroken"),
            # (u"Hubert and Abby", u"http://www.gocomics.com/hubertandabby"),
            # (u"Imagine This", u"http://www.gocomics.com/imaginethis"),
            # (u"In the Bleachers", u"http://www.gocomics.com/inthebleachers"),
            # (u"In the Sticks", u"http://www.gocomics.com/inthesticks"),
            # (u"Ink Pen", u"http://www.gocomics.com/inkpen"),
            # (u"It's All About You", u"http://www.gocomics.com/itsallaboutyou"),
            # (u"Joe Vanilla", u"http://www.gocomics.com/joevanilla"),
            # (u"La Cucaracha", u"http://www.gocomics.com/lacucaracha"),
            # (u"Last Kiss", u"http://www.gocomics.com/lastkiss"),
            # (u"Legend of Bill", u"http://www.gocomics.com/legendofbill"),
            # (u"Liberty Meadows", u"http://www.gocomics.com/libertymeadows"),
            # (u"Lio", u"http://www.gocomics.com/lio"),
            # (u"Little Dog Lost", u"http://www.gocomics.com/littledoglost"),
            # (u"Little Otto", u"http://www.gocomics.com/littleotto"),
            # (u"Loose Parts", u"http://www.gocomics.com/looseparts"),
            # (u"Love Is...", u"http://www.gocomics.com/loveis"),
            # (u"Maintaining", u"http://www.gocomics.com/maintaining"),
            # (u"The Meaning of Lila", u"http://www.gocomics.com/meaningoflila"),
            # (u"Middle-Aged White Guy", u"http://www.gocomics.com/middleagedwhiteguy"),
            # (u"The Middletons", u"http://www.gocomics.com/themiddletons"),
            # (u"Momma", u"http://www.gocomics.com/momma"),
            # (u"Mutt & Jeff", u"http://www.gocomics.com/muttandjeff"),
            # (u"Mythtickle", u"http://www.gocomics.com/mythtickle"),
            # (u"Nest Heads", u"http://www.gocomics.com/nestheads"),
            # (u"NEUROTICA", u"http://www.gocomics.com/neurotica"),
            # (u"New Adventures of Queen Victoria", u"http://www.gocomics.com/thenewadventuresofqueenvictoria"),
            (u"Non Sequitur", u"http://www.gocomics.com/nonsequitur"),
            # (u"The Norm", u"http://www.gocomics.com/thenorm"),
            # (u"On A Claire Day", u"http://www.gocomics.com/onaclaireday"),
            # (u"One Big Happy", u"http://www.gocomics.com/onebighappy"),
            # (u"The Other Coast", u"http://www.gocomics.com/theothercoast"),
            # (u"Out of the Gene Pool Re-Runs", u"http://www.gocomics.com/outofthegenepool"),
            # (u"Overboard", u"http://www.gocomics.com/overboard"),
            # (u"Pibgorn", u"http://www.gocomics.com/pibgorn"),
            # (u"Pibgorn Sketches", u"http://www.gocomics.com/pibgornsketches"),
            (u"Pickles", u"http://www.gocomics.com/pickles"),
            # (u"Pinkerton", u"http://www.gocomics.com/pinkerton"),
            # (u"Pluggers", u"http://www.gocomics.com/pluggers"),
            # (u"Pooch Cafe", u"http://www.gocomics.com/poochcafe"),
            # (u"PreTeena", u"http://www.gocomics.com/preteena"),
            # (u"The Quigmans", u"http://www.gocomics.com/thequigmans"),
            # (u"Rabbits Against Magic", u"http://www.gocomics.com/rabbitsagainstmagic"),
            # (u"Real Life Adventures", u"http://www.gocomics.com/reallifeadventures"),
            # (u"Red and Rover", u"http://www.gocomics.com/redandrover"),
            # (u"Red Meat", u"http://www.gocomics.com/redmeat"),
            # (u"Reynolds Unwrapped", u"http://www.gocomics.com/reynoldsunwrapped"),
            # (u"Ronaldinho Gaucho", u"http://www.gocomics.com/ronaldinhogaucho"),
            # (u"Rubes", u"http://www.gocomics.com/rubes"),
            # (u"Scary Gary", u"http://www.gocomics.com/scarygary"),
            (u"Shoe", u"http://www.gocomics.com/shoe"),
            # (u"Shoecabbage", u"http://www.gocomics.com/shoecabbage"),
            # (u"Skin Horse", u"http://www.gocomics.com/skinhorse"),
            # (u"Slowpoke", u"http://www.gocomics.com/slowpoke"),
            # (u"Speed Bump", u"http://www.gocomics.com/speedbump"),
            # (u"State of the Union", u"http://www.gocomics.com/stateoftheunion"),
            # (u"Stone Soup", u"http://www.gocomics.com/stonesoup"),
            # (u"Strange Brew", u"http://www.gocomics.com/strangebrew"),
            # (u"Sylvia", u"http://www.gocomics.com/sylvia"),
            # (u"Tank McNamara", u"http://www.gocomics.com/tankmcnamara"),
            # (u"Tiny Sepuku", u"http://www.gocomics.com/tinysepuku"),
            # (u"TOBY", u"http://www.gocomics.com/toby"),
            # (u"Tom the Dancing Bug", u"http://www.gocomics.com/tomthedancingbug"),
            # (u"Too Much Coffee Man", u"http://www.gocomics.com/toomuchcoffeeman"),
            # (u"W.T. Duck", u"http://www.gocomics.com/wtduck"),
            # (u"Watch Your Head", u"http://www.gocomics.com/watchyourhead"),
            # (u"Wee Pals", u"http://www.gocomics.com/weepals"),
            # (u"Winnie the Pooh", u"http://www.gocomics.com/winniethepooh"),
            (u"Wizard of Id", u"http://www.gocomics.com/wizardofid"),
            # (u"Working It Out", u"http://www.gocomics.com/workingitout"),
            # (u"Yenny", u"http://www.gocomics.com/yenny"),
            # (u"Zack Hill", u"http://www.gocomics.com/zackhill"),
            # (u"Ziggy", u"http://www.gocomics.com/ziggy"),
            ######## COMICS - EDITORIAL ########
            # ("Lalo Alcaraz","http://www.gocomics.com/laloalcaraz"),
            # ("Nick Anderson","http://www.gocomics.com/nickanderson"),
            # ("Chuck Asay","http://www.gocomics.com/chuckasay"),
            # ("Tony Auth","http://www.gocomics.com/tonyauth"),
            # ("Donna Barstow","http://www.gocomics.com/donnabarstow"),
            # ("Bruce Beattie","http://www.gocomics.com/brucebeattie"),
            # ("Clay Bennett","http://www.gocomics.com/claybennett"),
            # ("Lisa Benson","http://www.gocomics.com/lisabenson"),
            # ("Steve Benson","http://www.gocomics.com/stevebenson"),
            # ("Chip Bok","http://www.gocomics.com/chipbok"),
            # ("Steve Breen","http://www.gocomics.com/stevebreen"),
            # ("Chris Britt","http://www.gocomics.com/chrisbritt"),
            # ("Stuart Carlson","http://www.gocomics.com/stuartcarlson"),
            # ("Ken Catalino","http://www.gocomics.com/kencatalino"),
            # ("Paul Conrad","http://www.gocomics.com/paulconrad"),
            # ("Jeff Danziger","http://www.gocomics.com/jeffdanziger"),
            # ("Matt Davies","http://www.gocomics.com/mattdavies"),
            # ("John Deering","http://www.gocomics.com/johndeering"),
            # ("Bob Gorrell","http://www.gocomics.com/bobgorrell"),
            # ("Walt Handelsman","http://www.gocomics.com/walthandelsman"),
            # ("Clay Jones","http://www.gocomics.com/clayjones"),
            # ("Kevin Kallaugher","http://www.gocomics.com/kevinkallaugher"),
            # ("Steve Kelley","http://www.gocomics.com/stevekelley"),
            # ("Dick Locher","http://www.gocomics.com/dicklocher"),
            # ("Chan Lowe","http://www.gocomics.com/chanlowe"),
            ("Mike
Luckovich","http://www.gocomics.com/mikeluckovich"), # ("Gary Markstein","http://www.gocomics.com/garymarkstein"), # ("Glenn McCoy","http://www.gocomics.com/glennmccoy"), # ("Jim Morin","http://www.gocomics.com/jimmorin"), # ("Jack Ohman","http://www.gocomics.com/jackohman"), ("Pat Oliphant","http://www.gocomics.com/patoliphant"), # ("Joel Pett","http://www.gocomics.com/joelpett"), ("Ted Rall","http://www.gocomics.com/tedrall"), # ("Michael Ramirez","http://www.gocomics.com/michaelramirez"), # ("Marshall Ramsey","http://www.gocomics.com/marshallramsey"), # ("Steve Sack","http://www.gocomics.com/stevesack"), # ("Ben Sargent","http://www.gocomics.com/bensargent"), # ("Drew Sheneman","http://www.gocomics.com/drewsheneman"), # ("John Sherffius","http://www.gocomics.com/johnsherffius"), ("Small World","http://www.gocomics.com/smallworld"), # ("Scott Stantis","http://www.gocomics.com/scottstantis"), # ("Wayne Stayskal","http://www.gocomics.com/waynestayskal"), # ("Dana Summers","http://www.gocomics.com/danasummers"), # ("Paul Szep","http://www.gocomics.com/paulszep"), # ("Mike Thompson","http://www.gocomics.com/mikethompson"), ("Tom Toles","http://www.gocomics.com/tomtoles"), # ("Gary Varvel","http://www.gocomics.com/garyvarvel"), # ("ViewsAfrica","http://www.gocomics.com/viewsafrica"), # ("ViewsAmerica","http://www.gocomics.com/viewsamerica"), # ("ViewsAsia","http://www.gocomics.com/viewsasia"), # ("ViewsBusiness","http://www.gocomics.com/viewsbusiness"), ("ViewsEurope","http://www.gocomics.com/viewseurope"), # ("ViewsLatinAmerica","http://www.gocomics.com/viewslatinamerica"), # ("ViewsMidEast","http://www.gocomics.com/viewsmideast"), # ("Views of the World","http://www.gocomics.com/viewsoftheworld"), # ("Kerry Waghorn","http://www.gocomics.com/facesinthenews"), # ("Dan Wasserman","http://www.gocomics.com/danwasserman"), # ("Signe Wilkinson","http://www.gocomics.com/signewilkinson"), # ("Wit of the World","http://www.gocomics.com/witoftheworld"), # ("Don 
Wright","http://www.gocomics.com/donwright"),
                    ]:
            articles = self.make_links(url)
            if articles:
                feeds.append((title, articles))
        return feeds

    def make_links(self, url):
        title = 'Temp'
        description = ''
        date = ''
        current_articles = []
        pages = range(1, self.num_comics_to_get+1)
        for page in pages:
            page_soup = self.index_to_soup(url)
            if page_soup:
                strip_title = page_soup.h1.a.string
                date_title = page_soup.find('ul', attrs={'class': 'feature-nav'}).li.string
                title = strip_title + ' - ' + date_title
                strip_url_date = page_soup.h1.a['href']
                prev_strip_url_date = page_soup.find('a', attrs={'class': 'prev'})['href']
                page_url = 'http://www.gocomics.com' + strip_url_date
                prev_page_url = 'http://www.gocomics.com' + prev_strip_url_date
                current_articles.append({'title': title, 'url': page_url, 'description':'', 'date':''})
                url = prev_page_url
        return current_articles

    def preprocess_html(self, soup):
        if soup.title:
            title_string = soup.title.string.strip()
            _cd = title_string.split(',',1)[1]
            comic_date = ' '.join(_cd.split(' ', 4)[0:-1])
        if soup.h1.span:
            artist = soup.h1.span.string
            soup.h1.span.string.replaceWith(comic_date + artist)
        feature_item = soup.find('p',attrs={'class':'feature_item'})
        if feature_item.a:
            a_tag = feature_item.a
            a_href = a_tag["href"]
            img_tag = a_tag.img
            img_tag["src"] = a_href
            img_tag["width"] = self.comic_size
            img_tag["height"] = None
        return soup

    extra_css = '''
        h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
        h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
        p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
        body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
        '''
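For anyone adapting the recipe, the date handling in preprocess_html can be tried out on its own. This is only a sketch: the helper name extract_comic_date and the sample title are made up for illustration, and the assumption that the page title looks like "Name Comic Strip, Month Day, Year on GoComics.com" is a guess, not something confirmed by the site.

```python
def extract_comic_date(title_string):
    """Mirror the date-slicing steps from preprocess_html on a plain string."""
    # Keep everything after the first comma,
    # e.g. " March 12, 2010 on GoComics.com"
    after_comma = title_string.strip().split(',', 1)[1]
    # Split into at most five pieces and drop the last one
    # (the trailing site text), then rejoin with spaces.
    return ' '.join(after_comma.split(' ', 4)[0:-1])


# Hypothetical page title; the real GoComics title format may differ.
sample_title = "Garfield Comic Strip, March 12, 2010 on GoComics.com"
print(repr(extract_comic_date(sample_title)))
```

Note that the result keeps the leading space that followed the comma, and that a title without a comma raises an IndexError, so a changed title format on the site would break this step.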