I am having trouble with my recipe:
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1283848012(BasicNewsRecipe):
    description = 'TheMarker'
    cover_url = 'http://static.ispot.co.il/wp-content/upload/2009/09/themarker.jpg'
    title = u'The Marker1'
    language = 'he'
    simultaneous_downloads = 1
    delay = 6  # seconds to wait between consecutive downloads
    remove_javascript = True
    timefmt = '[%a, %d %b, %Y]'
    oldest_article = 1  # days
    remove_tags = [dict(name='tr', attrs={'bgcolor': ['#738A94']})]
    max_articles_per_feed = 1000
    # Hebrew is written right to left
    extra_css = 'body{direction: rtl;} .article_description{direction: rtl; } a.article{direction: rtl; } .calibre_feed_description{direction: rtl; }'
    feeds = [
        (u'Head Lines', u'http://www.themarker.com/tmc/content/xml/rss/hpfeed.xml'),
        (u'TA Market', u'http://www.themarker.com/tmc/content/xml/rss/sections/marketfeed.xml'),
        (u'Real Estate', u'http://www.themarker.com/tmc/content/xml/rss/sections/realEstaterfeed.xml'),
        (u'Wall Street & Global', u'http://www.themarker.com/tmc/content/xml/rss/sections/wallsfeed.xml'),
        (u'Law', u'http://www.themarker.com/tmc/content/xml/rss/sections/lawfeed.xml'),
        (u'Media', u'http://www.themarker.com/tmc/content/xml/rss/sections/mediafeed.xml'),
        (u'Consumer', u'http://www.themarker.com/tmc/content/xml/rss/sections/consumerfeed.xml'),
        (u'Career', u'http://www.themarker.com/tmc/content/xml/rss/sections/careerfeed.xml'),
        (u'Car', u'http://www.themarker.com/tmc/content/xml/rss/sections/carfeed.xml'),
        (u'High Tech', u'http://www.themarker.com/tmc/content/xml/rss/sections/hightechfeed.xml'),
        (u'Investor Guide', u'http://www.themarker.com/tmc/content/xml/rss/sections/investorGuidefeed.xml'),
    ]

    def print_version(self, url):
        # Rewrite the article URL into the printer-friendly URL
        baseURL = url.replace('tmc/article.jhtml?ElementId=', 'ibo/misc/printFriendly.jhtml?ElementId=%2Fibo%2Frepositories%2Fstories%2Fm1_2000%2F')
        return baseURL + '.xml'
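To make it clearer what print_version is supposed to do, here is a standalone sketch of the same string rewrite that runs outside calibre. The ElementId value `example123` is made up for illustration; I have not checked what real feed links look like.

```python
# Standalone sketch of the URL rewrite done in print_version.
# 'example123' below is a hypothetical ElementId, not a real article.
OLD = 'tmc/article.jhtml?ElementId='
NEW = ('ibo/misc/printFriendly.jhtml?ElementId='
       '%2Fibo%2Frepositories%2Fstories%2Fm1_2000%2F')

def print_version(url):
    # Swap the article path for the printer-friendly path, then append '.xml'
    return url.replace(OLD, NEW) + '.xml'

sample = 'http://www.themarker.com/tmc/article.jhtml?ElementId=example123'
printable = print_version(sample)
print(printable)

# A URL that does not contain the expected path is left unchanged
# except for the '.xml' suffix.
unmatched = print_version('http://www.themarker.com/some/other/page')
print(unmatched)
```

One thing this shows: if a feed hands back links in a different form, the replace does nothing and the recipe would still request the original URL with `.xml` appended. I don't know whether that is what happens here.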
I ran ebook-convert and I think this is the relevant output:
Parsing file 'feed_0/index.html' as HTML
Forcing feed_0/index.html into XHTML namespace
Parsing feed_1/article_0/index.html ...
Forcing feed_1/article_0/index.html into XHTML namespace
Parsing feed_1/article_1/index.html ...
Forcing feed_1/article_1/index.html into XHTML namespace
Parsing index.html ...
Forcing index.html into XHTML namespace
Parsing feed_1/index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 816, in first_pass
File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48270)
File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:71812)
File "parser.pxi", line 1417, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:70608)
File "parser.pxi", line 898, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:67148)
File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63824)
File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64745)
File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64088)
XMLSyntaxError: Opening and ending tag mismatch: hr line 29 and div, line 30, column 7
Parsing file 'feed_1/index.html' as HTML
Forcing feed_1/index.html into XHTML namespace
Referenced file u'/tmc/i/newsmap/ajax_indicator.gif' not found
Referenced file u'/tmc/i/newsmap/pixel_off.gif' not found
Referenced file u'feed_0/article_0/stylesheets/i/msn/hp4_channels_top_bg.gif' not found
Referenced file u'/tmc/i/newsmap/right_off.gif' not found
Referenced file u'/tmc/i/newsmap/tp_left.gif' not found
Referenced file u'/tmc/i/tags/close_off.gif' not found
Referenced file u'/tmc/i/marketing/hakrishim/bgr_text_main.gif' not found
Referenced file u'/tmc/i/newsmap/tp_pixel.gif' not found
Referenced file u'feed_0/article_0/stylesheets/i/msn/hp4_channels_bottom_bg.gif' not found
Referenced file u'/tmc/i/tags/pixel_off.gif' not found
Referenced file u'/tmc/i/newsmap/left_on.gif' not found
Referenced file u'/tmc/i/c/greyDot.gif' not found
Referenced file u'/tmc/i/tags/bg2.gif' not found
Referenced file u'/tmc/i/article/indicator_medium.gif' not found
Referenced file u'feed_0/article_1/stylesheets/i/msn/hp4_channels_bottom_bg.gif' not found
Referenced file u'/tmc/i/dollar/back.jpg' not found
Referenced file u'/tmc/i/tags/right_off.gif' not found
Referenced file u'/tmc/i/newsmap/left_off.gif' not found
Referenced file u'/tmc/i/marketing/hakrishim/bgr_text_Krishim.gif' not found
Referenced file u'/tmc/i/newsmap/tp_right.gif' not found
Referenced file u'/tmc/i/newsmap/pixel_on.gif' not found
Referenced file u'/tmc/i/tags/pixel_on.gif' not found
Referenced file u'feed_0/article_1/stylesheets/i/msn/hp4_top_search_bg.gif' not found
Referenced file u'/tmc/i/tags/close_on.gif' not found
Referenced file u'/tmc/i/tags/right_on.gif' not found
Referenced file u'/tmc/i/tags/left_on.gif' not found
Referenced file u'/tmc/i/newsmap/right_on.gif' not found
Referenced file u'/tmc/i/marketing/forecast/text_box.gif' not found
Referenced file u'feed_0/article_0/stylesheets/i/msn/hp4_top_search_bg.gif' not found
Referenced file 'feed_2/index.html' not found
Referenced file u'/tmc/i/tags/left_off.gif' not found
Referenced file u'feed_0/article_1/stylesheets/i/msn/hp4_channels_top_bg.gif' not found
Referenced file u'/tmc/i/tags/bg_footer1.gif' not found
Reading TOC from NCX...
34% Running transforms on ebook...
Merging user specified metadata...
Detecting structure...
If not, please tell me where to look.
And thank you, starson, for the help so far. I think this message was posted in an orderly fashion.
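P.S. My reading of the "Opening and ending tag mismatch: hr ... and div" message is that a strict XML parser hit an `<hr>` that was never closed, after which the file was parsed again as HTML. Below is a minimal sketch of that kind of failure. It uses Python's standard xml.etree module, not the lxml parser shown in the traceback, and the two markup strings are invented for illustration.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(markup):
    # Return True if the markup is well-formed XML, False otherwise
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Hypothetical snippets: HTML-style <hr> versus self-closed <hr/>
unclosed_hr = '<div><hr></div>'
closed_hr = '<div><hr/></div>'

print(parses_as_xml(unclosed_hr))
print(parses_as_xml(closed_hr))
```

If that reading is right, the parse failure on feed_1/index.html may be harmless by itself, since the log shows the HTML fallback running right after it.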