09-01-2010, 11:42 PM | #1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
Error on PRC > EPUB conversion
I'm trying to convert a PRC document to EPUB and I'm getting the following error. Anyone have any ideas what is causing this to happen? Is there any way to put some debugging in there to find out exactly which strings are causing the "All strings must be XML compatible" error? This document is fairly large so I'm not even sure where to start looking. Any help would be very much appreciated.
ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (2011ef)

Convert book 1 of 1 (2011ef)
Resolved conversion options
calibre version: 0.7.16
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': 'c:\\users\\cg\\appdata\\local\\temp\\calibre_0.7.16_tmp_gtmwgm\\calibre_0.7.16_bxxkp9.jpeg',
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_split_on_page_breaks': False,
 'extra_css': None,
 'extract_to': None,
 'flow_size': 260,
 'font_size_mapping': None,
 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x04F70C30>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'max_toc_links': 50,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.OutputProfile object at 0x04F70E10>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'prefer_metadata_cover': False,
 'preprocess_html': False,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'c:\\users\\cg\\appdata\\local\\temp\\calibre_0.7.16_tmp_gtmwgm\\calibre_0.7.16__labo5.opf',
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'series': None,
 'series_index': None,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: MOBI Input running on C:\ef\mps\2011ef.prc
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Malformed markup, parsing using BeautifulSoup
MOBI markup appears to contain random bytes. Stripping.
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Malformed markup, parsing using BeautifulSoup
MOBI markup appears to contain random bytes. Stripping.
Python function terminated unexpectedly
All strings must be XML compatible: Unicode or ASCII, no NULL bytes (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 99, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 815, in run
  File "site-packages\calibre\customize\conversion.py", line 211, in __call__
  File "site-packages\calibre\ebooks\mobi\input.py", line 27, in convert
  File "site-packages\calibre\ebooks\mobi\reader.py", line 333, in extract_content
  File "site-packages\lxml\html\soupparser.py", line 23, in fromstring
  File "site-packages\lxml\html\soupparser.py", line 67, in _parse
  File "site-packages\lxml\html\soupparser.py", line 77, in _convert_tree
  File "site-packages\lxml\html\soupparser.py", line 87, in _convert_children
  File "site-packages\lxml\html\soupparser.py", line 87, in _convert_children
  File "site-packages\lxml\html\soupparser.py", line 87, in _convert_children
  File "site-packages\lxml\html\soupparser.py", line 89, in _convert_children
  File "site-packages\lxml\html\soupparser.py", line 103, in _append_text
  File "lxml.etree.pyx", line 836, in lxml.etree._Element.tail.__set__ (src/lxml/lxml.etree.c:33020)
  File "apihelpers.pxi", line 667, in lxml.etree._setTailText (src/lxml/lxml.etree.c:15438)
  File "apihelpers.pxi", line 1242, in lxml.etree._utf8 (src/lxml/lxml.etree.c:19848)
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes |
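On the question of finding which strings trigger "All strings must be XML compatible": here is a minimal sketch, assuming the book's markup is already available as a text string by some other means. It is not part of calibre, and `find_xml_incompatible` is a hypothetical helper name. It flags every character outside the XML 1.0 character range (this covers NULL bytes and most control characters) and reports its offset with some surrounding context. Whether this matches lxml's internal check exactly is an assumption.

```python
import re

# Anything NOT allowed by the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_xml_incompatible(text, context=20):
    """Return (offset, repr of bad char, surrounding text) for each offender.

    Hypothetical helper for illustration; not a calibre or lxml function.
    """
    hits = []
    for match in _XML_INVALID.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        hits.append((match.start(), repr(match.group()), text[start:end]))
    return hits


if __name__ == "__main__":
    sample = "Chapter 1\x00 begins here"
    for offset, char, snippet in find_xml_incompatible(sample):
        print(offset, char, repr(snippet))
```

Running it over a large document gives a list of offsets to start looking at, which is usually enough to tell whether the damage is a few stray bytes or the whole file.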
09-02-2010, 01:08 AM | #2 |
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
A corrupted file.
|
09-02-2010, 01:11 AM | #3 | |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Update: Darn you Kovid... sneaking in with the correct answer. |
|
09-02-2010, 07:32 PM | #4 |
Junior Member
Posts: 5
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
Thanks for that information. The .prc that MobiPocket is creating doesn't open in any other reader either, so it is definitely corrupt.
Actually, the only reason I was trying to convert from a .prc was an issue I'm having with calibre itself. I have a compilation of .htm files that I'm trying to convert, and the images aren't appearing. If I convert a single .htm file with all of the images in the same directory, I have no problem. However, after building a table of contents with links to all of the other .htm files, everything works except that the images don't show. Is there something I might be doing wrong? In the debug output, the image properties show that it is trying to load each image from that same debug folder. Would that mean calibre is not including the images when it loads the table of contents? |
09-02-2010, 07:41 PM | #5 |
Grand Sorcerer
Posts: 6,208
Karma: 16534692
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Normally with multi-html books, all you have to do is drag the single toc html into Calibre and Calibre will look after the rest by zipping up all the referenced files (html, images, css).
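Before dragging the toc file in, it can help to confirm that every image and page it references actually resolves on disk. The following is a rough sketch, not how calibre itself collects files; `missing_local_refs` is a hypothetical helper, and it assumes the references are relative paths in `img src`, `a href` and `link href` attributes.

```python
import os
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit

# Which attribute carries the file reference for each tag we care about.
_REF_ATTRS = {"img": "src", "a": "href", "link": "href"}


class _RefCollector(HTMLParser):
    """Collects raw src/href values from img, a and link tags."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        wanted = _REF_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.refs.append(value)


def missing_local_refs(html_path):
    """Return the references in html_path that do not exist on disk."""
    with open(html_path, encoding="utf-8", errors="replace") as handle:
        collector = _RefCollector()
        collector.feed(handle.read())

    base = os.path.dirname(os.path.abspath(html_path))
    missing = []
    for ref in collector.refs:
        parts = urlsplit(ref)
        # Skip remote URLs and pure "#fragment" links.
        if parts.scheme or parts.netloc or not parts.path:
            continue
        target = os.path.normpath(os.path.join(base, unquote(parts.path)))
        if not os.path.exists(target):
            missing.append(ref)
    return missing
```

Calling `missing_local_refs` on the toc file and then on each linked .htm file should show whether the image paths are valid relative to the file that references them.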
|
09-02-2010, 07:45 PM | #6 |
Junior Member
Posts: 5
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
Yep, that's what I did and it indexes everything perfectly but the images just won't load. Is there a setting that I'm missing somewhere?
|
09-02-2010, 07:49 PM | #7 |
Grand Sorcerer
Posts: 6,208
Karma: 16534692
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
If you look inside the zip that Calibre has created in your library, are the images present?
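If you'd rather not open the archive by hand, a short sketch using Python's standard `zipfile` module can list the image entries. The helper name `images_in_zip` and the extension list are my own choices for illustration, and the path to the zip in your library is whatever calibre created for that book.

```python
import zipfile

# File extensions treated as images; extend as needed.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".svg", ".bmp")


def images_in_zip(zip_path):
    """Return the names of all image entries inside the zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(IMAGE_EXTS)
        ]
```

An empty list means the images never made it into the archive, which matches the symptom of pages rendering without them.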
|
09-02-2010, 07:58 PM | #8 |
Junior Member
Posts: 5
Karma: 10
Join Date: Sep 2010
Device: Kindle
|
I didn't even think to do that! They definitely weren't in there but after adding them in everything compiles perfectly. Thank you!
|
09-02-2010, 08:30 PM | #9 |
Grand Sorcerer
Posts: 6,208
Karma: 16534692
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Well I'm glad to have helped. However, you shouldn't have needed to add them manually. Odd.
|