Old 06-19-2011, 03:12 PM   #16
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Set remove_attributes = ['style'] and add the style tag to remove_tags.
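
For anyone following along, here is a minimal sketch of how those two settings might look inside a recipe. The class name and title are made up for illustration, and the try/except stub is only an assumption so the snippet can be run outside calibre; inside calibre the real BasicNewsRecipe is imported.

```python
try:
    # Available when the recipe runs inside calibre.
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in (assumption) so the sketch also runs outside calibre.
    class BasicNewsRecipe(object):
        pass


class StyleStrippedRecipe(BasicNewsRecipe):  # hypothetical recipe name
    title = u'Style stripped example'        # hypothetical title

    # Ignore stylesheets linked from the page.
    no_stylesheets = True

    # Drop inline style="..." attributes from every tag.
    remove_attributes = ['style']

    # Drop <style>...</style> elements entirely.
    remove_tags = [dict(name='style')]
```

remove_attributes takes plain attribute names, while remove_tags takes a list of dictionaries describing the tags to drop, which is why the two lines look different.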
Old 06-19-2011, 03:23 PM   #17
scissors
Addict
Posts: 241
Karma: 1001369
Join Date: Sep 2010
Device: prs300, kindle keyboard 3g
Quote:
Originally Posted by kovidgoyal View Post
remove_attributes=['style']
and add style to remove_tags
I honestly thought that was it, as the navbar moved left in calibre, but the Sony still crashes... sigh
Old 06-19-2011, 04:09 PM   #18
kovidgoyal
creator of calibre
What's left in stylesheet.css after you've removed all the style info from the download?
Old 06-19-2011, 04:28 PM   #19
scissors
Addict
Contents of stylesheet.css:

Spoiler:

@namespace h "http://www.w3.org/1999/xhtml";
.Author {
display: block;
font-size: 1.29167em;
font-weight: bold;
line-height: 1.2;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em
}
.Content {
display: table-cell;
padding-bottom: 1px;
padding-left: 1px;
padding-right: 1px;
padding-top: 1px;
text-align: inherit;
vertical-align: inherit
}
.article {
color: blue;
cursor: pointer;
font-size: 1.2em;
font-weight: bold;
line-height: 1.2;
text-align: left;
text-decoration: underline
}
.articledate {
color: gray;
font-family: monospace
}
.articledescription {
display: block;
font-size: 0.7em;
text-indent: 0
}
.calibre {
display: block;
font-size: 1em;
margin-bottom: 0;
margin-left: 5pt;
margin-right: 5pt;
margin-top: 0;
padding-left: 0;
padding-right: 0
}
.calibre1 {
display: block;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em;
text-align: center
}
.calibre10 {
display: block;
font-size: 1.66667em;
font-weight: bold;
line-height: 1.2;
margin-bottom: 0.83em;
margin-left: 0;
margin-right: 0;
margin-top: 0.83em
}
.calibre11 {
font-family: monospace;
line-height: 1.2
}
.calibre12 {
font-family: monospace
}
.calibre13 {
background-color: #FFF;
display: inline-block;
height: 170px;
width: 170px
}
.calibre14 {
background-color: #FFF;
height: 113px;
padding-bottom: 28.5px;
padding-top: 28.5px;
width: 170px
}
.calibre15 {
display: block;
font-weight: bold;
margin-bottom: 1.33em;
margin-left: 0;
margin-right: 0;
margin-top: 1.33em
}
.calibre16 {
display: block;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em;
max-width: 100%;
overflow: hidden;
text-align: left
}
.calibre17 {
background-color: #FFF;
height: 170px;
padding-left: 29px;
padding-right: 29px;
width: 112px
}
.calibre18 {
background-color: #FFF;
height: 127px;
padding-bottom: 21.5px;
padding-top: 21.5px;
width: 170px
}
.calibre19 {
background-color: #FFF;
height: 170px;
padding-left: 4.5px;
padding-right: 4.5px;
width: 161px
}
.calibre2 {
height: auto;
width: auto
}
.calibre20 {
background-color: #FFF;
height: 170px;
padding-left: 28.5px;
padding-right: 28.5px;
width: 113px
}
.calibre21 {
background-color: #FFF;
height: 170px;
padding-left: 17.5px;
padding-right: 17.5px;
width: 135px
}
.calibre22 {
background-color: #FFF;
height: 112px;
padding-bottom: 29px;
padding-top: 29px;
width: 170px
}
.calibre23 {
font-size: 1em
}
.calibre3 {
display: block;
margin-bottom: 1em;
margin-left: 0;
margin-right: 0;
margin-top: 1em;
text-align: right
}
.calibre4 {
display: list-item
}
.calibre5 {
display: block
}
.calibre6 {
color: blue;
cursor: pointer;
text-decoration: underline
}
.calibre7 {
border: 1px inset;
color: gray;
display: block;
height: 2px;
margin-bottom: 0.5em;
margin-left: auto;
margin-right: auto;
margin-top: 0.5em
}
.calibre8 {
font-weight: bolder
}
.calibre9 {
font-style: italic
}
.calibrefeeddescription {
display: block;
font-size: 0.8em
}
.calibrefeedlist {
display: block;
list-style-type: disc;
margin-bottom: 1em;
margin-right: 0;
margin-top: 1em
}
.calibrefeedtitle {
display: block;
font-size: 1.6em;
font-weight: bold;
line-height: 1.2;
margin-bottom: 0.83em;
margin-left: 0;
margin-right: 0;
margin-top: 0.83em
}
.calibrenavbar {
display: block;
font-family: monospace;
font-size: 0.7em;
text-align: center
}
.calibrerescale {
display: block;
font-size: 1em
}
.calibrerescale1 {
display: list-item;
font-size: 1em;
padding-bottom: 0.5em
}
.feed {
color: blue;
cursor: pointer;
font-size: 1.2em;
font-weight: bold;
line-height: 1.2;
text-decoration: underline
}


Last edited by scissors; 06-19-2011 at 04:32 PM.
Old 06-19-2011, 04:50 PM   #20
scissors
Addict
Quote:
Originally Posted by kovidgoyal View Post
Add

preprocess_regexps = [(re.compile('r<head.*?</head>', re.DOTALL), lambda m:'')]
Hi Kovid, I noticed you put the ' before the r. I edited it to r', hoping the misplaced quote was the reason everything between and including the <head> tags wasn't getting deleted.

It still crashes. :-(
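
To make the quoting difference concrete, here is a small sketch that runs outside calibre. The sample HTML string is made up; the two patterns are the one quoted above and the version with the r moved outside the quotes.

```python
import re

# Hypothetical page source, just for illustration.
html = '<html><head>\n<style>p {color: red}</style>\n</head><body><p>Hi</p></body></html>'

# As posted: the r is inside the quotes, so the pattern looks for a
# literal "r" immediately before "<head" and never matches here.
typo = re.compile('r<head.*?</head>', re.DOTALL)

# Corrected: r'' marks a raw string, and the pattern starts at "<head".
fixed = re.compile(r'<head.*?</head>', re.DOTALL)

print(typo.sub('', html) == html)   # True: nothing was removed
print(fixed.sub('', html))          # <html><body><p>Hi</p></body></html>
```

So the corrected pattern does strip the whole head block from this sample, which suggests the remaining style information is coming from somewhere other than the head.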
Old 06-19-2011, 05:35 PM   #21
kovidgoyal
creator of calibre
That's way too much style information; it's still getting style info from somewhere. Look at the HTML in the input subdirectory of the debug directory to see where it is coming from.
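
A rough sketch of one way to do that check with a script instead of by eye. This is not part of calibre; the function name and the 'debug/input' path are placeholders, and the three patterns are only an assumption about where CSS usually comes from (inline style attributes, style elements, linked stylesheets).

```python
import os
import re

# Patterns for the three usual sources of CSS in an HTML file.
STYLE_SOURCES = {
    'inline style attribute': re.compile(r'<[^>]+\sstyle\s*=', re.IGNORECASE),
    '<style> element': re.compile(r'<style\b', re.IGNORECASE),
    'linked stylesheet': re.compile(r'<link\b[^>]*stylesheet', re.IGNORECASE),
}


def find_style_sources(input_dir):
    """Return a dict mapping each HTML file path to the style sources found in it."""
    found = {}
    for root, _dirs, files in os.walk(input_dir):
        for name in files:
            if not name.lower().endswith(('.html', '.xhtml', '.htm')):
                continue
            path = os.path.join(root, name)
            with open(path, 'rb') as f:
                text = f.read().decode('utf-8', 'replace')
            hits = [label for label, pat in STYLE_SOURCES.items() if pat.search(text)]
            if hits:
                found[path] = hits
    return found


if __name__ == '__main__':
    # 'debug/input' is a placeholder; point it at your own debug directory.
    for path, hits in sorted(find_style_sources('debug/input').items()):
        print(path, '->', ', '.join(hits))
```

Any file it lists still carries style information after the recipe's cleanup, which narrows down where to look.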
Old 06-25-2011, 03:26 AM   #22
scissors
Addict
Hi all,

I had an hour to take another look at this. After trying many string patterns, I decided to start again.

Basically, what seems to be happening is that the feed lists article URLs such as

http//.....News/General/Southport-festival-welcomes-campers/_ch3_nw1427
and the equivalent print versions are
http//.....News/General/Southport-festival-welcomes-campers/Print-_ch3_nw1427

Using Starson's example, I use
myurl = url.replace('/_', '/Print-_')

and I use a print statement to check in the job log, while the recipe runs, that the complete URL is correct. It seems to be.
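
As a quick check of that replacement outside calibre, here is a small sketch. The example.com host is made up; the path follows the pattern of the URLs above.

```python
def print_version(url):
    # Insert "Print-" before the trailing "_ch..." segment of the article URL.
    return url.replace('/_', '/Print-_')


# Hypothetical host; the path follows the pattern quoted above.
article = 'http://example.com/News/General/Southport-festival-welcomes-campers/_ch3_nw1427'
print(print_version(article))
# http://example.com/News/General/Southport-festival-welcomes-campers/Print-_ch3_nw1427
```

Note that str.replace rewrites every '/_' in the URL, so this relies on each article URL containing that sequence only once.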

The problem is that the resulting downloaded HTML is (I'm pretty sure) the non-print one. I think the print version gets thrown away.

I've included the recipe in case someone can help. Although it doesn't raise an error, I'm pretty sure it must be wrong.

The recipe:
Spoiler:

import time, re
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1306061239(BasicNewsRecipe):
    title = u'Out and about live 2'
    description = 'Camping and Caravan - News and Reviews'

    author = 'Dave Asbury'

    #cover_url= 'http://www.outandaboutlive.co.uk/img/template/footer/illustration_3.jpg'
    #masthead_url = 'http://www.outandaboutlive.co.uk/img/template/cloud_logo.gif'

    oldest_article = 56
    max_articles_per_feed = 3
    remove_empty_feeds = True
    remove_javascript = True
    no_stylesheets = True

    def print_version(self, url):
        # Rewrite each article URL to its print-friendly equivalent.
        myurl = url.replace('/_', '/Print-_')
        # Echo the rewritten URL so it shows up in the job log.
        print('New URL = ' + myurl)
        return myurl

    feeds = [(u'Camping News', u'http://feeds.feedburner.com/OAL/News/Camping')]





The Job log file

Spoiler:


Fetch news from Out and about live 2
Resolved conversion options
calibre version: 0.8.7
{'asciiize': False,
'author_sort': None,
'authors': None,
'base_font_size': 0,
'book_producer': None,
'change_justification': 'original',
'chapter': None,
'chapter_mark': 'pagebreak',
'comments': None,
'cover': None,
'debug_pipeline': None,
'dehyphenate': True,
'delete_blank_paragraphs': True,
'disable_font_rescaling': False,
'dont_download_recipe': False,
'dont_split_on_page_breaks': True,
'enable_heuristics': False,
'epub_flatten': False,
'extra_css': None,
'extract_to': None,
'fix_indents': True,
'flow_size': 260,
'font_size_mapping': None,
'format_scene_breaks': True,
'html_unwrap_factor': 0.4,
'input_encoding': None,
'input_profile': <calibre.customize.profiles.InputProfile object at 0x0566A210>,
'insert_blank_line': False,
'insert_metadata': False,
'isbn': None,
'italicize_common_cases': True,
'keep_ligatures': False,
'language': None,
'level1_toc': None,
'level2_toc': None,
'level3_toc': None,
'line_height': 0,
'linearize_tables': False,
'lrf': False,
'margin_bottom': 5.0,
'margin_left': 5.0,
'margin_right': 5.0,
'margin_top': 5.0,
'markup_chapter_headings': True,
'max_toc_links': 50,
'minimum_line_height': 120.0,
'no_chapters_in_toc': False,
'no_default_epub_cover': False,
'no_inline_navbars': False,
'no_svg_cover': False,
'output_profile': <calibre.customize.profiles.SonyReader300Output object at 0x0566A5F0>,
'page_breaks_before': None,
'password': None,
'prefer_metadata_cover': False,
'preserve_cover_aspect_ratio': False,
'pretty_print': True,
'pubdate': None,
'publisher': None,
'rating': None,
'read_metadata_from_opf': None,
'remove_fake_margins': True,
'remove_first_image': False,
'remove_paragraph_spacing': False,
'remove_paragraph_spacing_indent_size': 1.5,
'renumber_headings': True,
'replace_scene_breaks': '',
'series': None,
'series_index': None,
'smarten_punctuation': False,
'sr1_replace': '',
'sr1_search': '',
'sr2_replace': '',
'sr2_search': '',
'sr3_replace': '',
'sr3_search': '',
'tags': None,
'test': False,
'timestamp': None,
'title': None,
'title_sort': None,
'toc_filter': None,
'toc_threshold': 6,
'unwrap_lines': True,
'use_auto_toc': False,
'username': None,
'verbose': 2}
InputFormatPlugin: Recipe Input running
Synthesizing mastheadImage
New URL = http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1445
New URL = http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1436
New URL = http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1427
Downloading
Fetching http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1445
Downloading
Fetching http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1436
Downloading
Fetching http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1427
Processing images...
Processing images...
Fetching http://www.sptag2.com/rwtag.gif?rw.p...t=nojavascript
Processing images...
Fetching http://www.sptag2.com/rwtag.gif?rw.p...t=nojavascript
Fetching http://www.sptag2.com/rwtag.gif?rw.p...t=nojavascript
Fetching http://www.outandaboutlive.co.uk/img...cloud_logo.gif
Fetching http://www.outandaboutlive.co.uk/img...cloud_logo.gif
Fetching http://www.outandaboutlive.co.uk/img...cloud_logo.gif
Fetching http://www.outandaboutlive.co.uk/use...rist_board.gif
Fetching http://www.outandaboutlive.co.uk/use...rist_board.gif
Fetching http://www.outandaboutlive.co.uk/use...rist_board.gif
Fetching http://www.outandaboutlive.co.uk/use.../news/festival goers at main stage_wychwood 2011_408639_419669.jpg
Fetching http://www.outandaboutlive.co.uk/img/dot.gif
Fetching http://www.outandaboutlive.co.uk/img/dot.gif
Fetching http://www.outandaboutlive.co.uk/use.../news/festival goers at main stage_wychwood 2011_408639_419668.jpg
Fetching http://www.outandaboutlive.co.uk/use...eCS_405393.gif
Fetching http://www.outandaboutlive.co.uk/use...116_414328.jpg
Fetching http://www.outandaboutlive.co.uk/use...ue3_404861.gif
Fetching http://www.outandaboutlive.co.uk/use...116_414328.jpg
Fetching http://www.outandaboutlive.co.uk/userfiles/news/PR Pictures 268_409089.jpg
Fetching http://www.outandaboutlive.co.uk/use...ogo_404408.jpg
Fetching http://www.outandaboutlive.co.uk/userfiles/news/PR Pictures 268_409089.jpg
Fetching http://www.outandaboutlive.co.uk/use...val_408116.jpg
Fetching http://www.outandaboutlive.co.uk/userfiles/news/George Archer Safari Tents Teversal 280_404183.jpg
Fetching http://www.outandaboutlive.co.uk/use...val_408116.jpg
Fetching http://www.outandaboutlive.co.uk/use...ogo_388205.jpg
Fetching http://www.outandaboutlive.co.uk/use...ogo_388205.jpg
Fetching http://www.outandaboutlive.co.uk/use...ogo_388205.jpg
Fetching http://www.outandaboutlive.co.uk/use...ogo_388210.gif
Fetching http://www.outandaboutlive.co.uk/use...ogo_388210.gif
Fetching http://www.outandaboutlive.co.uk/use...ogo_388210.gif
Fetching http://www.outandaboutlive.co.uk/use.../magazines/cam July11 web_407773.jpg
Fetching http://www.outandaboutlive.co.uk/use.../magazines/cam July11 web_407773.jpg
Fetching http://www.outandaboutlive.co.uk/use.../magazines/cam July11 web_407773.jpg
Fetching http://www.outandaboutlive.co.uk/img...mall_right.png
Fetching http://www.outandaboutlive.co.uk/img...mall_right.png
Fetching http://www.outandaboutlive.co.uk/img...mall_right.png
Fetching http://www.outandaboutlive.co.uk/use...ue3_406561.gif
Fetching http://www.outandaboutlive.co.uk/use...ue3_406561.gif
Fetching http://www.outandaboutlive.co.uk/use...ue3_406561.gif
Fetching http://www.outandaboutlive.co.uk/use...skyscraper.gif
Fetching http://www.outandaboutlive.co.uk/use...skyscraper.gif
Fetching http://www.outandaboutlive.co.uk/use...skyscraper.gif
Fetching http://www.outandaboutlive.co.uk/use..._160x160px.gif
Fetching http://www.outandaboutlive.co.uk/use..._160x160px.gif
Fetching http://www.outandaboutlive.co.uk/use..._160x160px.gif
Fetching http://www.outandaboutlive.co.uk/use...revised(1).gif
Fetching http://www.outandaboutlive.co.uk/use...revised(1).gif
Fetching http://www.outandaboutlive.co.uk/use...revised(1).gif
Fetching http://www.outandaboutlive.co.uk/img...r_top_link.gif
Fetching http://www.outandaboutlive.co.uk/img...r_top_link.gif
Fetching http://www.outandaboutlive.co.uk/img...r_top_link.gif
Fetching http://www.outandaboutlive.co.uk/img...ooter_made.gif
Fetching http://www.outandaboutlive.co.uk/img...ooter_made.gif
Fetching http://www.outandaboutlive.co.uk/img...ooter_made.gif
Recursion limit reached. Skipping links in http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1436
Recursion limit reached. Skipping links in http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1427
Recursion limit reached. Skipping links in http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1445
http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1436 saved to d:\temp\calibre_0.8.7_tmp_4ysm4z\calibre_0.8.7_7lh grt_plumber\feed_0\article_1\index.xhtml
http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1427 saved to d:\temp\calibre_0.8.7_tmp_4ysm4z\calibre_0.8.7_7lh grt_plumber\feed_0\article_2\index.xhtml
http://www.outandaboutlive.co.uk/Cam...nt-_ch3_nw1445 saved to d:\temp\calibre_0.8.7_tmp_4ysm4z\calibre_0.8.7_7lh grt_plumber\feed_0\article_0\index.xhtml
Downloaded article: Female festival goers turn to glamping option from http://www.outandaboutlive.co.uk/Cam...on/_ch3_nw1436
Downloaded article: Southport festival welcomes campers from http://www.outandaboutlive.co.uk/Cam...rs/_ch3_nw1427
Downloaded article: Family tent buying guide launched from http://www.outandaboutlive.co.uk/Cam...ed/_ch3_nw1445
Parsing all content...
Parsing feed_0/article_2/index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
File "lxml.etree.pyx", line 2743, in lxml.etree.fromstring (src/lxml/lxml.etree.c:52665)
File "parser.pxi", line 1573, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:79932)
File "parser.pxi", line 1445, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:78709)
File "parser.pxi", line 920, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:75083)
File "parser.pxi", line 564, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:71739)
File "parser.pxi", line 645, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:72614)
File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: xmlParseEntityRef: no name, line 186, column 19

Parsing file 'feed_0/article_2/index.html' as HTML
Forcing feed_0/article_2/index.html into XHTML namespace
Parsing feed_0/index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
File "lxml.etree.pyx", line 2743, in lxml.etree.fromstring (src/lxml/lxml.etree.c:52665)
File "parser.pxi", line 1573, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:79932)
File "parser.pxi", line 1445, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:78709)
File "parser.pxi", line 920, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:75083)
File "parser.pxi", line 564, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:71739)
File "parser.pxi", line 645, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:72614)
File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: Opening and ending tag mismatch: br line 32 and div, line 33, column 7

Parsing file 'feed_0/index.html' as HTML
Forcing feed_0/index.html into XHTML namespace
Parsing index.html ...
Forcing index.html into XHTML namespace
Parsing feed_0/article_1/index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
File "lxml.etree.pyx", line 2743, in lxml.etree.fromstring (src/lxml/lxml.etree.c:52665)
File "parser.pxi", line 1573, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:79932)
File "parser.pxi", line 1445, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:78709)
File "parser.pxi", line 920, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:75083)
File "parser.pxi", line 564, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:71739)
File "parser.pxi", line 645, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:72614)
File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: xmlParseEntityRef: no name, line 209, column 19

Parsing file 'feed_0/article_1/index.html' as HTML
Forcing feed_0/article_1/index.html into XHTML namespace
Parsing feed_0/article_0/index.html ...
Initial parse failed:
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
File "lxml.etree.pyx", line 2743, in lxml.etree.fromstring (src/lxml/lxml.etree.c:52665)
File "parser.pxi", line 1573, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:79932)
File "parser.pxi", line 1445, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:78709)
File "parser.pxi", line 920, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:75083)
File "parser.pxi", line 564, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:71739)
File "parser.pxi", line 645, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:72614)
File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: xmlParseEntityRef: no name, line 188, column 19

Parsing file 'feed_0/article_0/index.html' as HTML
Forcing feed_0/article_0/index.html into XHTML namespace
Referenced file '/user/register.asp' not found
Referenced file '/user/default.asp%3fredirecturl%3dhttp%3a/www.outandaboutlive.co.uk/channel/newsItem.asp%3fc%3d3%26cate%3d__1445%26print%3d1' not found
Referenced file '/user/default.asp%3fredirecturl%3dhttp%3a/www.outandaboutlive.co.uk/channel/newsItem.asp%3fc%3d3%26cate%3d__1427%26print%3d1' not found
Referenced file '/img/template/footer_search_but.gif' not found
Referenced file '/img/template/primenu_search_but.gif' not found
Referenced file '/' not found
Referenced file '/user/default.asp%3fredirecturl%3dhttp%3a/www.outandaboutlive.co.uk/channel/newsItem.asp%3fc%3d3%26cate%3d__1436%26print%3d1' not found
Referenced file '/user/e-newsletter.asp' not found
Referenced file 'feed_1/index.html' not found
Referenced file '/search/results.asp' not found
Reading TOC from NCX...
Merging user specified metadata...
Detecting structure...
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Parsing stylesheet.css ...
Found 60 items of level: div_9
Found 36 items of level: div_8
Found 9 items of level: div_1
Found 2 items of level: div_2
Found 72 items of level: div_5
Found 75 items of level: div_4
Found 21 items of level: div_7
Found 6 items of level: div_6
Found 19 items of level: div_10
Found 15 items of level: p_10
Found 5 items of level: p_2
Ignoring level p_10
Ignoring level div_10
Ignoring level p_2
Ignoring level div_7
Ignoring level div_6
div_9 left margin stats: Counter({u'': 60})
div_9 right margin stats: Counter({u'': 60})
div_8 left margin stats: Counter({u'': 36})
div_8 right margin stats: Counter({u'': 36})
div_1 left margin stats: Counter({u'': 1})
div_1 right margin stats: Counter({u'': 1})
div_2 left margin stats: Counter()
div_2 right margin stats: Counter()
div_5 left margin stats: Counter({u'': 72})
div_5 right margin stats: Counter({u'': 72})
div_4 left margin stats: Counter({u'': 75})
div_4 right margin stats: Counter({u'': 75})
Cleaning up manifest...
Trimming unused files from manifest...
Trimming 'feed_0/article_1/images/img1.jpg' from manifest
Trimming 'feed_0/article_2/images/img1.jpg' from manifest
Trimming 'feed_0/article_0/images/img1.jpg' from manifest
Creating EPUB Output...
Found non-unique filenames, renaming to support broken EPUB readers like FBReader, Aldiko and Stanza...
{'feed_0/article_0/images/img10.jpg': 'feed_0/article_0/images/img10_u2.jpg',
'feed_0/article_0/images/img12.jpg': 'feed_0/article_0/images/img12_u2.jpg',
'feed_0/article_0/images/img15.jpg': 'feed_0/article_0/images/img15_u1.jpg',
'feed_0/article_0/images/img2.jpg': 'feed_0/article_0/images/img2_u1.jpg',
'feed_0/article_0/images/img4.jpg': 'feed_0/article_0/images/img4_u1.jpg',
'feed_0/article_0/images/img5.jpg': 'feed_0/article_0/images/img5_u2.jpg',
'feed_0/article_0/index.html': 'feed_0/article_0/index_u4.html',
'feed_0/article_1/images/img11.jpg': 'feed_0/article_1/images/img11_u2.jpg',
'feed_0/article_1/images/img12.jpg': 'feed_0/article_1/images/img12_u1.jpg',
'feed_0/article_1/images/img13.jpg': 'feed_0/article_1/images/img13_u2.jpg',
'feed_0/article_1/images/img14.jpg': 'feed_0/article_1/images/img14_u2.jpg',
'feed_0/article_1/images/img15.jpg': 'feed_0/article_1/images/img15_u2.jpg',
'feed_0/article_1/images/img16.jpg': 'feed_0/article_1/images/img16_u1.jpg',
'feed_0/article_1/images/img17.jpg': 'feed_0/article_1/images/img17_u2.jpg',
'feed_0/article_1/images/img2.jpg': 'feed_0/article_1/images/img2_u2.jpg',
'feed_0/article_1/images/img3.jpg': 'feed_0/article_1/images/img3_u1.jpg',
'feed_0/article_1/images/img4.jpg': 'feed_0/article_1/images/img4_u2.jpg',
'feed_0/article_1/images/img5.jpg': 'feed_0/article_1/images/img5_u1.jpg',
'feed_0/article_1/images/img6.jpg': 'feed_0/article_1/images/img6_u2.jpg',
'feed_0/article_1/images/img7.jpg': 'feed_0/article_1/images/img7_u1.jpg',
'feed_0/article_1/images/img8.jpg': 'feed_0/article_1/images/img8_u1.jpg',
'feed_0/article_1/images/img9.jpg': 'feed_0/article_1/images/img9_u1.jpg',
'feed_0/article_1/index.html': 'feed_0/article_1/index_u3.html',
'feed_0/article_2/images/img10.jpg': 'feed_0/article_2/images/img10_u1.jpg',
'feed_0/article_2/images/img11.jpg': 'feed_0/article_2/images/img11_u1.jpg',
'feed_0/article_2/images/img13.jpg': 'feed_0/article_2/images/img13_u1.jpg',
'feed_0/article_2/images/img14.jpg': 'feed_0/article_2/images/img14_u1.jpg',
'feed_0/article_2/images/img16.jpg': 'feed_0/article_2/images/img16_u2.jpg',
'feed_0/article_2/images/img17.jpg': 'feed_0/article_2/images/img17_u1.jpg',
'feed_0/article_2/images/img18.jpg': 'feed_0/article_2/images/img18_u1.jpg',
'feed_0/article_2/images/img3.jpg': 'feed_0/article_2/images/img3_u2.jpg',
'feed_0/article_2/images/img6.jpg': 'feed_0/article_2/images/img6_u1.jpg',
'feed_0/article_2/images/img7.jpg': 'feed_0/article_2/images/img7_u2.jpg',
'feed_0/article_2/images/img8.jpg': 'feed_0/article_2/images/img8_u2.jpg',
'feed_0/article_2/images/img9.jpg': 'feed_0/article_2/images/img9_u2.jpg',
'feed_0/index.html': 'feed_0/index_u1.html',
'index.html': 'index_u2.html'}
Rescaling image from 728x90 to 562x69 feed_0/article_1/images/img3_u1.jpg
Rescaling image from 728x90 to 562x69 feed_0/article_2/images/img3_u2.jpg
Rescaling image from 728x90 to 562x69 feed_0/article_0/images/img3.jpg
Rescaling image from 600x60 to 562x56 mastheadImage.jpg
Rescaling image from 590x750 to 562x715 cover.jpg
Looking for large trees in feed_0/article_1/index_u3.html...
No large trees found
Looking for large trees in feed_0/article_2/index.html...
No large trees found
Looking for large trees in feed_0/article_0/index_u4.html...
No large trees found
Looking for large trees in feed_0/index_u1.html...
No large trees found
Looking for large trees in index_u2.html...
No large trees found
The cover image has an id != "cover". Renaming to work around bug in Nook Color
EPUB output written to d:\temp\calibre_0.8.7_tmp_4ysm4z\calibre_0.8.7_bti ugl_recipe_out.epub


Last edited by scissors; 06-25-2011 at 06:53 AM.