#1126 |
Connoisseur
Posts: 50
Karma: 160
Join Date: Jan 2008
Location: Dewitt, MI
Device: Kindle Paperwhite 2021 / PC / iPad
|
Here is my first attempt at a custom recipe. It is for the German language course feeds at DW-World.de. I will reuse this same recipe to access the DW-World news feeds, but this is the one I completed first.
I do have one small problem. At the top and bottom of every article is a set of (unwanted) links. The HTML source is:
Code:
<p class="actionFooter"><a href="/dw/article/0,,4529629,00.html">DW-WORLD.DE</a><span>*|*</span><a href="javascript:window.print()">Drucken</a> </p>
Tips on the best way to eliminate this would be much appreciated. I tried both "remove_tags" and "preprocess_regexps", but in both cases I managed to eliminate not only the offending code but the entire content of the page. Oops.
Thanks much,
Paul
Code:
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2009, Less Paul <LessPaul at gmail.com>'
'''
dw-world.de
'''

from calibre.web.feeds.news import BasicNewsRecipe

class DW_World_courses(BasicNewsRecipe):
    title = 'DW-World - German Courses'
    __author__ = 'LessPaul'
    description = "German language courses and lesson feeds from the multi-language German news site DW-World.de"
    publisher = 'Deutsche Welle'
    category = 'German, Language, Education'
    oldest_article = 30
    max_articles_per_feed = 100
    language = 'de'
    lang = 'de-DE'
    no_stylesheets = True
    use_embedded_content = False
    remove_javascript = True

    conversion_options = {
        'tags': category,
        'publisher': publisher,
        'language': lang
    }

    feeds = [
        (u'Deutsch als Fremdsprache', u'http://rss.dw-world.de/rdf/DKfeed_dkmix_de'),
        (u'Deutsch im Fokus', u'http://rss.dw-world.de/rdf/DKfeed_dif_de'),
        (u'Alltagsdeutsch', u'http://rss.dw-world.de/rdf/DKfeed_alltagsdeutsch_de'),
        (u'Wort der Woche', u'http://rss.dw-world.de/rdf/DKfeed_wortderwoche_de'),
        (u'Sprachbar', u'http://rss.dw-world.de/rdf/DKfeed_sprachbar_de'),
        (u'Stichwort', u'http://rss.dw-world.de/rdf/DKfeed_stichwort_de'),
        (u'Top-Thema mit Vokabeln', u'http://rss.dw-world.de/rdf/DKfeed_topthemamitvokabeln_de'),
        (u'Langsam gesprochene Nachrichten', u'http://rss.dw-world.de/rdf/DKfeed_lgn_de')
    ]

    def print_version(self, url):
        target = url.rpartition('/')[2]
        print_url = 'http://www.dw-world.de/popups/popup_printcontent/' + target
        return print_url
|
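The URL rewrite done by print_version can be tried outside calibre. This is only a minimal sketch: the standalone function name `to_print_url`, the constant `PRINT_BASE`, and the full article URL (assembled from the relative href in the HTML snippet quoted in this post) are assumptions for illustration, not part of the recipe.

```python
# Standalone copy of the recipe's print_version logic (no calibre needed).
PRINT_BASE = 'http://www.dw-world.de/popups/popup_printcontent/'


def to_print_url(url):
    # rpartition('/') splits at the last slash; index 2 is everything after it.
    target = url.rpartition('/')[2]
    return PRINT_BASE + target


# Hypothetical article URL, built from the relative href shown in the post.
article = 'http://www.dw-world.de/dw/article/0,,4529629,00.html'
print(to_print_url(article))
```

If a link contains no slash at all, rpartition returns the whole string as the last element, so the full link text is appended to the print base unchanged.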
#1127 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2009
Device: nook
|
WSJ resolved
Quote:
|
|
#1128 |
Member
Posts: 15
Karma: 10
Join Date: Jan 2010
Device: kindle2
|
Could I put in a help request for the Christian Science Monitor paper? Perhaps the website has changed.
Thank you |
#1129 |
Member
Posts: 23
Karma: 12
Join Date: Jan 2010
Location: Edinburgh, UK
Device: SONY PRS600, Apple iPhone 3G
|
hi Paul,
you can use one of the include/remove functions. as the tag is the same for both top and bottom, just add:
Code:
remove_tags = [dict(name='p', attrs={'class':'actionFooter'})]
if you want more control over the look of the output, add your own CSS. more useful stuff here: http://calibre-ebook.com/user_manual...ownloaded-html
lorenzo
|
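For anyone who prefers the preprocess_regexps route, here is a minimal sketch of how a regex can swallow a whole page, runnable with only the standard library. It is a guess at what might have gone wrong in Paul's attempt, not a diagnosis: the sample page is made up around the footer markup he posted, and the names `footer`, `page`, `greedy` and `lazy` are hypothetical. With re.DOTALL, a greedy `.*` runs from the first footer to the last `</p>` on the page and takes the article with it, while the non-greedy `.*?` stops at the first `</p>`.

```python
import re

# Made-up page: the posted footer at top and bottom, article text in between.
footer = ('<p class="actionFooter"><a href="/dw/article/0,,4529629,00.html">'
          'DW-WORLD.DE</a><span>*|*</span>'
          '<a href="javascript:window.print()">Drucken</a> </p>')
page = footer + '<p>Artikeltext</p>' + footer

# Greedy: matches from the first footer through the LAST </p> on the page.
greedy = re.compile(r'<p class="actionFooter">.*</p>', re.DOTALL)
# Non-greedy: each match ends at the first </p> after the opening tag.
lazy = re.compile(r'<p class="actionFooter">.*?</p>', re.DOTALL)

print(repr(greedy.sub('', page)))  # the article text is removed as well
print(repr(lazy.sub('', page)))    # only the two footers are removed

# In a recipe, each compiled pattern is paired with a callable that
# receives the match object and returns the replacement string.
preprocess_regexps = [(lazy, lambda match: '')]
```

The remove_tags line suggested in this post is still likely the simpler choice, since it works on the parsed document rather than on raw text.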
#1130 |
Member
Posts: 23
Karma: 12
Join Date: Jan 2010
Location: Edinburgh, UK
Device: SONY PRS600, Apple iPhone 3G
|
chr_mon.recipe fix
i haven't seen this one working before, so the fix in the attached recipe might give you slightly different results; at least the pages are not blank!
Kovid, i couldn't find a ticket in the bug tracker, but hopefully this is one more thing off your list |
#1131 |
Enthusiast
Posts: 33
Karma: 10
Join Date: Dec 2009
Device: iphone
|
AJKD request
Anyone want to try to make a recipe for the American Journal of Kidney Diseases (www.ajkd.org)? I can PM account details to whoever takes it on...
|
#1132 | |
Vox calibre
Posts: 412
Karma: 1175230
Join Date: Jan 2009
Device: Sony reader prs700, kobo
|
Quote:
|
|
#1133 |
Enthusiast
Posts: 33
Karma: 10
Join Date: Dec 2009
Device: iphone
|
|
#1134 |
Groupie
Posts: 165
Karma: 206
Join Date: Dec 2007
Location: Kansas City
Device: Kindle1, Kindle DX, Kindle DXG
|
PC Magazine recipe
I haven't had much luck trying to create a recipe for PC Magazine from scratch - I just end up with headlines and graphic boxes. Any help would be most appreciated.
|
#1135 |
Member
Posts: 23
Karma: 12
Join Date: Jan 2010
Location: Edinburgh, UK
Device: SONY PRS600, Apple iPhone 3G
|
PC Mag recipe
have a look at this one; the product review feed is not included, and it is a first go with nothing fancy under the hood...
i noticed a few things already which can be improved (e.g. some articles spanning more than 1 page, some pics which can be removed), but it should get you started!
lorenzo
|
#1136 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jan 2010
Device: Sony Daily Edition
|
The News & Observer
Could someone please help me set up a recipe for the News & Observer from Raleigh, NC? I have tried for 3 days to get this to work, but I keep ending up with unwanted headings in the table of contents, and the text of the news stories is shadowed, with only the words "tool name" visible. The feeds that I am trying to retrieve are:
feeds = [
    ('Cover', 'http://www.newsobserver.com/100/index.rss'),
    ('News', 'http://www.newsobserver.com/102/index.rss'),
    ('Politics', 'http://www.newsobserver.com/105/index.rss'),
    ('Business', 'http://www.newsobserver.com/104/index.rss'),
    ('Sports', 'http://www.newsobserver.com/103/index.rss'),
    ('College Sports', 'http://www.newsobserver.com/119/index.rss'),
    ('Lifestyles', 'http://www.newsobserver.com/106/index.rss'),
    ('Editorials', 'http://www.newsobserver.com/158/index.rss')
]
Any help would be appreciated. |
#1137 |
Groupie
Posts: 165
Karma: 206
Join Date: Dec 2007
Location: Kansas City
Device: Kindle1, Kindle DX, Kindle DXG
|
|
#1138 | |
Vox calibre
Posts: 412
Karma: 1175230
Join Date: Jan 2009
Device: Sony reader prs700, kobo
|
Quote:
Last edited by Krittika Goyal; 01-13-2010 at 01:04 PM. |
|
#1139 |
Member
Posts: 10
Karma: 10
Join Date: Dec 2009
Location: Halifax, Nova Scotia
Device: Sony PRS-300
|
Hi,
Wondering if someone could please take a look at The Atlantic recipe. It doesn't seem to download anything but the menu. I see someone has already mentioned the same problem with the Christian Science Monitor.
Thanks,
Brian
Sony Pocket Edition
|
#1140 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jan 2010
Device: Sony Daily Edition
|
Thank you Krittika, the recipe for the News & Observer worked perfectly.
|