Stash123 · 01-10-2011, 06:16 PM

Quote:

Originally Posted by jackie_w

I don't often do RTF to EPUB conversions, but I just tried one. I can confirm an error if the Convert - Structure Detection - Preprocess input box is checked. When I uncheck it, the conversion runs OK.

These are the error messages from the Job Details box:

Code:

ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (Porterhouse Blue)

Convert book 1 of 1 (Porterhouse Blue)
Resolved conversion options
calibre version: 0.7.38
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 11.0,
 'book_producer': None,
 'change_justification': u'justify',
 'chapter': u"//*[name()='h2' or name()='h3']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': 'c:\\docume~1\\jackies\\locals~1\\temp\\calibre_0.7.38_tmp_qdhqih\\calibre_0.7.38_i4ld9j.jpeg',
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_split_on_page_breaks': False,
 'epub_flatten': False,
 'extra_css': u'p, div, li {margin-top:0.25em; margin-bottom:0;}',
 'extract_to': None,
 'flow_size': 260,
 'font_size_mapping': u'8, 9, 10, 11, 14, 16, 20, 33',
 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x048AF210>,
 'insert_blank_line': False,
 'insert_metadata': True,
 'isbn': None,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': u'//h:h2',
 'level2_toc': u'//h:h3',
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 10.0,
 'max_toc_links': 0,
 'minimum_line_height': 0.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.SonyReaderOutput object at 0x048AF610>,
 'page_breaks_before': u'/',
 'prefer_metadata_cover': False,
 'preprocess_html': True,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'c:\\docume~1\\jackies\\locals~1\\temp\\calibre_0.7.38_tmp_qdhqih\\calibre_0.7.38_lkgnz7.opf',
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'series': None,
 'series_index': None,
 'smarten_punctuation': True,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: RTF Input running
on C:\Calibre\Novels\Tom Sharpe\Porterhouse Blue (2749)\Porterhouse Blue - Tom Sharpe.rtf
Converting RTF to XML...
	Preprocessing to convert unicode characters
Parsing XML...
Converting XML to HTML...
*********  Preprocessing HTML  *********
Python function terminated unexpectedly
  'utf8' codec can't decode byte 0xc2 in position 0: unexpected end of data (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 107, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 853, in run
  File "site-packages\calibre\customize\conversion.py", line 216, in __call__
  File "site-packages\calibre\ebooks\rtf\input.py", line 299, in convert
  File "site-packages\calibre\ebooks\conversion\utils.py", line 195, in __call__
  File "site-packages\calibre\ebooks\conversion\utils.py", line 112, in get_word_count
  File "site-packages\calibre\utils\wordcount.py", line 85, in get_wordcount_obj
  File "site-packages\calibre\utils\wordcount.py", line 67, in get_wordcount
  File "site-packages\calibre\utils\wordcount.py", line 56, in nonj_len
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc2 in position 0: unexpected end of data

I tried your suggestion and unchecked the preprocess box, but it had no effect: same conversion error message, same list of details.

I then repeated it with the footer-detection box unchecked as well: same error.

Sorry...
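
For anyone puzzling over the last line of the traceback, here is a minimal Python 3 sketch of what that message means in general. It is not calibre's code (I have not looked at what `wordcount.py` actually does), and the helper name `try_decode` is made up for the illustration. It only shows that a lone 0xc2 byte is the first half of a two-byte UTF-8 character, so decoding it on its own fails with "unexpected end of data", while decoding the complete pair works.

```python
def try_decode(raw: bytes) -> str:
    """Return the decoded text, or a short description of the decode failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return "error: " + err.reason


# U+00A0 (non-breaking space) is encoded in UTF-8 as the two bytes C2 A0.
encoded = "\u00a0".encode("utf-8")

# Both bytes together form a complete sequence and decode cleanly.
whole = try_decode(encoded)

# The first byte alone is an incomplete sequence, so decoding fails.
first_only = try_decode(encoded[:1])

print(repr(whole))
print(first_only)
```

If the same thing is happening here, the likely trigger would be a non-ASCII character in the RTF (accented letters, curly quotes, non-breaking spaces) being handled one byte at a time somewhere in the word-count step, but that is a guess from the traceback, not something the log confirms.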