01-24-2010, 08:40 PM | #1 |
Enthusiast
Posts: 35
Karma: 10
Join Date: Jan 2009
Location: USA
Device: EZ Reader, iPhone
|
Error converting from de-DRM mobi to epub
Here are the details of the error I'm receiving. I've followed the exact same procedure for several other books with no problems, but for some reason several books are giving me this error. Does it mean anything to any of you? Thanks for your help.
Convert book 1 of 1 (Ten Big Ones)
Resolved conversion options
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': 'c:\\docume~1\\hp_adm~1\\locals~1\\temp\\calibre_0.6.35_3g4bci.jpeg',
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_justify': False,
 'dont_split_on_page_breaks': False,
 'extra_css': None,
 'extract_to': None,
 'flow_size': 260,
 'font_size_mapping': None,
 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x02964F70>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'max_toc_links': 50,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'output_profile': <calibre.customize.profiles.OutputProfile object at 0x0296A150>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'prefer_metadata_cover': False,
 'preprocess_html': False,
 'pretty_print': True,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'c:\\docume~1\\hp_adm~1\\locals~1\\temp\\calibre_0.6.35_wm8hm6.opf',
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'series': None,
 'series_index': None,
 'tags': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: MOBI Input running on C:\Documents and Settings\HP_Administrator\Library\Janet Evanovich\Ten Big Ones (86)\Ten Big Ones - Janet Evanovich.prc
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Malformed markup, parsing using BeautifulSoup
MOBI markup appears to contain random bytes. Stripping.
Extracting text...
Adding anchors...
Extracting images...
Cleaning up HTML...
Parsing HTML...
Malformed markup, parsing using BeautifulSoup
MOBI markup appears to contain random bytes. Stripping.
Python function terminated unexpectedly (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 99, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 745, in run
  File "site-packages\calibre\customize\conversion.py", line 211, in __call__
  File "site-packages\calibre\ebooks\mobi\input.py", line 27, in convert
  File "site-packages\calibre\ebooks\mobi\reader.py", line 331, in extract_content
  File "site-packages\lxml-2.2.2-py2.6-win32.egg\lxml\html\soupparser.py", line 23, in fromstring
  File "site-packages\lxml-2.2.2-py2.6-win32.egg\lxml\html\soupparser.py", line 66, in _parse
  File "site-packages\beautifulsoup-3.1.0.1-py2.6.egg\BeautifulSoup.py", line 1499, in __init__
  File "site-packages\beautifulsoup-3.1.0.1-py2.6.egg\BeautifulSoup.py", line 1230, in __init__
  File "site-packages\beautifulsoup-3.1.0.1-py2.6.egg\BeautifulSoup.py", line 1263, in _feed
  File "HTMLParser.py", line 108, in feed
  File "HTMLParser.py", line 148, in goahead
  File "HTMLParser.py", line 226, in parse_starttag
  File "HTMLParser.py", line 301, in check_for_whole_start_tag
  File "HTMLParser.py", line 115, in error
HTMLParser.HTMLParseError: malformed start tag, at line 100, column 49
|
01-24-2010, 08:47 PM | #2 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It means that the file in question has malformed markup, probably because the DRM removal went wrong.
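To see what the parser actually tripped on, it can help to look at the exact spot the traceback reports (line 100, column 49). The helper below is only a sketch: `show_error_context` is a made-up name, and it assumes you have the extracted HTML as a string and that positions follow HTMLParser's convention of 1-based lines and 0-based columns.

```python
def show_error_context(markup, line, column, width=40):
    """Return the text around a reported parse error, with a caret under it.

    Assumes HTMLParser-style positions: `line` is 1-based, `column` is 0-based.
    `width` is how many characters to show on each side of the column.
    """
    # HTMLParser counts only "\n" as a line break, so split the same way.
    lines = markup.split("\n")
    if not 1 <= line <= len(lines):
        raise ValueError("line %d is outside the markup (%d lines)"
                         % (line, len(lines)))
    text = lines[line - 1]
    start = max(0, column - width)
    end = min(len(text), column + width)
    snippet = text[start:end]
    # Place the caret under the character at `column` within the snippet.
    caret = " " * (column - start) + "^"
    return snippet + "\n" + caret


# Hypothetical usage, with `html_text` holding the markup extracted from the book:
# print(show_error_context(html_text, 100, 49))
```

If the caret lands on a tag with a stray quote or a run of binary junk, that would fit the "random bytes" warnings in the log.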
|
01-24-2010, 09:07 PM | #3 |
Enthusiast
Posts: 35
Karma: 10
Join Date: Jan 2009
Location: USA
Device: EZ Reader, iPhone
|
I even re-downloaded the book and redid the DRM removal. Do you know what I might try to get it into EPUB? Several books are doing this to me: the DRM removal seems to go normally, but when I load the files into calibre, it's a bust. Thanks for your help!
|
01-24-2010, 09:21 PM | #4 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Try using some other tool to convert it to HTML, on the off chance that it tolerates whatever the problem is better than calibre does.
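If you end up scripting this, the "try something else" idea boils down to a fallback chain: run the strict tool first and move on to a more forgiving one when it fails. A rough sketch follows. Every name in it is hypothetical, and the two converters are toy stand-ins, not calibre's code or any real tool.

```python
def convert_with_fallback(source, converters):
    """Try each (name, converter) pair in order.

    Returns (name, result) from the first converter that does not raise.
    Raises RuntimeError listing every failure if none succeeds.
    """
    failures = []
    for name, convert in converters:
        try:
            return name, convert(source)
        except Exception as exc:  # a real script would catch narrower errors
            failures.append("%s: %s" % (name, exc))
    raise RuntimeError("every converter failed: " + "; ".join(failures))


def strict_converter(markup):
    # Toy stand-in for a parser that rejects malformed markup.
    if markup.count("<") != markup.count(">"):
        raise ValueError("malformed start tag")
    return markup


def lenient_converter(markup):
    # Toy stand-in for a more tolerant tool; it accepts anything unchanged.
    return markup


if __name__ == "__main__":
    chain = [("strict", strict_converter), ("lenient", lenient_converter)]
    print(convert_with_fallback("<p>text<b", chain)[0])
```

Collecting the failure messages rather than discarding them keeps the original error visible if every tool gives up.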
|
07-11-2010, 10:43 PM | #5 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jul 2010
Device: kindle
|
FYI, this is caused by calibre upgrading the BeautifulSoup library to 3.1, which switches from the SGML parser to Python's HTML parser, which is junk at coping with malformed markup. http://www.crummy.com/software/Beaut...-problems.html Downgrade to 3.0.8 and it should solve your problems. |
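The traceback in the first post names the bundled egg (`beautifulsoup-3.1.0.1-py2.6.egg`), so one quick way to check whether a given error log points at the 3.1 series is to pull the version out of those paths. This is just a sketch; the function names are invented for illustration, and it assumes the egg directory follows the `beautifulsoup-<version>-py<x.y>` naming seen in that traceback.

```python
import re

# Matches egg directory names such as "beautifulsoup-3.1.0.1-py2.6.egg".
_EGG_VERSION = re.compile(r"beautifulsoup-(\d+(?:\.\d+)*)-py", re.IGNORECASE)


def bundled_soup_version(traceback_text):
    """Return the BeautifulSoup version named in a traceback, or None."""
    match = _EGG_VERSION.search(traceback_text)
    return match.group(1) if match else None


def is_31_series(version):
    """True if a dotted version string belongs to the 3.1.x series."""
    parts = tuple(int(part) for part in version.split("."))
    return parts[:2] == (3, 1)


if __name__ == "__main__":
    line = r'File "site-packages\beautifulsoup-3.1.0.1-py2.6.egg\BeautifulSoup.py", line 1499, in __init__'
    version = bundled_soup_version(line)
    print(version, is_31_series(version))
```

A 3.0.x result from a log like this would suggest the parser switch is not the culprit and the file itself deserves a closer look.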