11-06-2012, 11:49 AM | #1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
'utf8' codec can't decode byte 0xb1 in position 18: invalid start byte
This is the error I am getting.
calibre, version 0.9.5 (win32, isfrozen: True)
Conversion Error: Failed: Convert book 1 of 1 (Means of Ascent)

Convert book 1 of 1 (Means of Ascent)
Resolved conversion options
calibre version: 0.9.5
{'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0.0, 'book_producer': None, 'change_justification': u'original', 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., '\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', 'i')) or @class = 'chapter']", 'chapter_mark': u'pagebreak', 'comments': None, 'cover': None, 'debug_pipeline': None, 'dehyphenate': True, 'delete_blank_paragraphs': True, 'disable_font_rescaling': False, 'dont_compress': False, 'duplicate_links_in_toc': False, 'embed_font_family': None, 'enable_heuristics': False, 'extra_css': None, 'extract_to': None, 'filter_css': u'', 'fix_indents': True, 'font_size_mapping': None, 'format_scene_breaks': True, 'html_unwrap_factor': 0.4, 'input_encoding': None, 'input_profile': <calibre.customize.profiles.InputProfile object at 0x0370C490>, 'insert_blank_line': False, 'insert_blank_line_size': 0.5, 'insert_metadata': False, 'isbn': None, 'italicize_common_cases': True, 'keep_ligatures': False, 'language': None, 'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0.0, 'linearize_tables': False, 'margin_bottom': 5.0, 'margin_left': 5.0, 'margin_right': 5.0, 'margin_top': 5.0, 'markup_chapter_headings': True, 'max_toc_links': 50, 'minimum_line_height': 120.0, 'mobi_file_type': u'old', 'mobi_ignore_margins': False, 'mobi_keep_original_images': False, 'mobi_toc_at_start': False, 'no_chapters_in_toc': False, 'no_inline_navbars': True, 'no_inline_toc': False, 'output_profile': <calibre.customize.profiles.KindleOutput object at 0x0370C7D0>, 'page_breaks_before': u'/', 'personal_doc': u'[PDOC]', 'prefer_author_sort': False, 'prefer_metadata_cover': False, 'pretty_print': False, 'pubdate': None, 'publisher': None, 'rating': None, 'read_metadata_from_opf': u'C:\\Users\\m147146\\AppData\\Local\\Temp\\calibre_0.9.5_tmp_of8z0m\\ltspdi.opf', 'remove_fake_margins': True, 'remove_first_image': False, 'remove_paragraph_spacing': False, 'remove_paragraph_spacing_indent_size': 1.5, 'renumber_headings': True, 'replace_scene_breaks': u'', 'search_replace': '[]', 'series': None, 'series_index': None, 'share_not_sync': False, 'smarten_punctuation': False, 'sr1_replace': None, 'sr1_search': None, 'sr2_replace': None, 'sr2_search': None, 'sr3_replace': None, 'sr3_search': None, 'start_reading_at': None, 'tags': None, 'timestamp': None, 'title': None, 'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'toc_title': None, 'unsmarten_punctuation': False, 'unwrap_lines': True, 'use_auto_toc': False, 'verbose': 2}
InputFormatPlugin: EPUB Input running
on C:\Users\m147146\AppData\Local\Temp\calibre_0.9.5_tmp_of8z0m\ty0jpc.epub
Python function terminated unexpectedly
'utf8' codec can't decode byte 0xb1 in position 18: invalid start byte (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 186, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1000, in run
  File "site-packages\calibre\customize\conversion.py", line 239, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\epub_input.py", line 153, in convert
  File "site-packages\calibre\utils\zipfile.py", line 751, in __init__
  File "site-packages\calibre\utils\zipfile.py", line 786, in _GetContents
  File "site-packages\calibre\utils\zipfile.py", line 847, in _RealGetContents
  File "site-packages\calibre\utils\zipfile.py", line 388, in _decodeFilename
  File "encodings\utf_8.py", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb1 in position 18: invalid start byte

Any information I can get on how to fix it would be appreciated.
11-06-2012, 12:19 PM | #2 |
Sigil Developer
Posts: 7,675
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
There are many broken epubs out there (especially from B&N)! These epubs do NOT follow the zip or epub specifications. Epubs are supposed to be zip files.

One form of breakage is to use garbage chars or full utf-16 unicode in the zip central directory filenames and then set the flag that indicates the names are utf-8 encoded. Another form of breakage is for the zip central directory filename not to match the local filename; most zip access programs trust the (broken) central directory name over the local name to prevent security attacks.

This completely breaks the python standard library for accessing zips (zipfile.py). The only way around this is to create your own zipfile.py that looks for and catches central directory filename decoding errors to work around this nonsense.

If you are desperate, we can post an ePubFixer program for you. It requires Python 2 installed with Tk widgets (see ActiveState ActivePython 2.7 if on Windows; Macs and Linux are all set to go). It will read in the broken epub and write out a fixed epub that should then work with calibre properly.

The long term solution is for calibre to implement its own zipfile.py code (if it does not do that already) and handle the special case of improper utf-8 flags being set on garbage central directory filenames. The better solution, if this is a B&N epub, is to send the ebook back and request an epub that actually meets the epub specification!

Hope this helps,
KevinH
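To make the "catch the decoding error" idea concrete, here is a minimal sketch in Python 3 (the tracebacks in this thread come from Python 2). The function name `decode_zip_filename` is hypothetical and is not part of calibre or the standard library. Falling back to cp437 is an assumption, chosen because it is the zip format's legacy filename encoding and can decode any byte.

```python
UTF8_FLAG = 0x800  # general purpose bit 11: "filename is UTF-8"


def decode_zip_filename(raw: bytes, flag_bits: int) -> str:
    """Decode a raw zip entry name, tolerating a wrongly set UTF-8 flag.

    Hypothetical helper for illustration only.
    """
    if flag_bits & UTF8_FLAG:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # The flag claims UTF-8 but the bytes are not valid UTF-8,
            # which is the failure shown in the traceback above.
            pass
    # cp437 maps every byte value, so this never raises.
    return raw.decode("cp437")
```

The name that comes out of the fallback may look odd, but it is stable and reversible, which is enough to extract the entry and rewrite the archive with a clean name.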
11-06-2012, 12:37 PM | #3 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
Google?
If it is a book I bought off of Google Play, can I send it back and ask that they provide a new one?
Is the only way to find out to try? And thank you for the quick reply. I'm a little peeved at all this, so unknowingly you've made my day much better by being Johnny-on-the-spot.

Also, just to tell you a little about the process I'm using to get these: I'm downloading the ACSM files from Google Play and then using Adobe Digital Editions to find the file path to the epub on my computer. Then I add those to the calibre library.

Last edited by paul.westland; 11-06-2012 at 12:43 PM. |
11-06-2012, 12:55 PM | #4 | |
Sigil Developer
Posts: 7,675
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
You are right: when given an ACSM file, you hand it to a properly registered Adobe Digital Editions program, which downloads the epub and adds the correct rights.xml file so that it can be read. To verify it works, please open the file in ADE and check that you can read it. If you can, there still might be a problem with the epub, but that would be hard to argue, since it can be read in the program it was designed to be read in. If it is not readable in Adobe Digital Editions, then you should send it back and ask for a working version.

It seems both ADE and B&N ebooks use their own zip access routines that simply walk the local file headers, extracting files with local filenames, and basically ignore the central directory of the zip (which is against all of the security rules but ...).

Hope this helps,
Kevin
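For readers wondering what "walking the local entries and ignoring the central directory" looks like, here is a rough Python 3 sketch. It is not the code ADE or B&N use (that is not public in this thread), and `iter_local_entries` is a hypothetical name. It assumes the archive starts with a local file header, that entries follow each other directly, and that sizes are stored in the local header (no data descriptor, no zip64).

```python
import struct

LOCAL_SIG = b"PK\x03\x04"
# signature, version, flags, method, mtime, mdate, crc32,
# compressed size, uncompressed size, name length, extra length
HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30 bytes


def iter_local_entries(data: bytes):
    """Yield (raw_name, flags, method, compressed_payload) per local header.

    Hypothetical helper: reads only local file headers and never looks
    at the central directory.
    """
    pos = 0
    while pos + HEADER.size <= len(data) and data[pos:pos + 4] == LOCAL_SIG:
        (_sig, _ver, flags, method, _mtime, _mdate, _crc,
         csize, _usize, name_len, extra_len) = HEADER.unpack_from(data, pos)
        if flags & 0x08:
            # Sizes live in a trailing data descriptor; this sketch
            # cannot locate the end of the entry in that case.
            raise ValueError("data descriptor entries are not supported")
        name_start = pos + HEADER.size
        raw_name = data[name_start:name_start + name_len]
        data_start = name_start + name_len + extra_len
        yield raw_name, flags, method, data[data_start:data_start + csize]
        pos = data_start + csize
```

The names come back as raw bytes on purpose, so that a broken name cannot abort the walk; decoding them is a separate step.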
|
11-06-2012, 03:55 PM | #5 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
I can open it in Adobe, but when I try to open the epub in calibre it tells me there is an invalid start byte. Odd?
|
11-06-2012, 04:12 PM | #6 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
If I just try to open the book in calibre, without trying to convert it, I get this error.
calibre, version 0.9.5
ERROR: Could not open ebook: invalid start byte

Traceback (most recent call last):
  File "site-packages\calibre\gui2\viewer\main.py", line 40, in run
  File "threading.py", line 504, in run
  File "site-packages\calibre\ebooks\oeb\iterator\book.py", line 99, in __enter__
  File "site-packages\calibre\customize\conversion.py", line 239, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\epub_input.py", line 153, in convert
  File "site-packages\calibre\utils\zipfile.py", line 751, in __init__
  File "site-packages\calibre\utils\zipfile.py", line 786, in _GetContents
  File "site-packages\calibre\utils\zipfile.py", line 847, in _RealGetContents
  File "site-packages\calibre\utils\zipfile.py", line 388, in _decodeFilename
  File "encodings\utf_8.py", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode byte 0x89 in position 19: invalid start byte
11-06-2012, 04:15 PM | #7 |
Wizard
Posts: 2,251
Karma: 3720310
Join Date: Jan 2009
Location: USA
Device: Kindle, iPad (not used much for reading)
|
If you're opening it in ADE, it has DRM, which means that it is encrypted. Calibre won't be able to open an encrypted book.
|
11-06-2012, 04:18 PM | #8 |
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
There are ways around that, I hear, and those ways may be utilized.
|
11-06-2012, 11:13 PM | #9 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
calibre does use its own modified version of zipfile.py. You are welcome to submit a patch against it for this issue, if you have epubs that have the issue. Note that to properly solve this, you will not only have to ignore the central directory but also correctly decode the local names to unicode. This is because, on Windows, calibre has to use unicode filenames to avoid encoding issues in the filesystem.
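Once the entries have unicode names, writing a clean archive is the easy part. The following is only a sketch in Python 3 using the standard `zipfile` module, not a patch against calibre's modified zipfile.py. `write_epub` is a hypothetical name, and the input is assumed to be already decoded names paired with uncompressed bytes. It places `mimetype` first and stores it uncompressed, as the epub container format expects.

```python
import zipfile


def write_epub(entries, out):
    """Write (unicode_name, uncompressed_bytes) pairs to a new zip.

    Hypothetical helper. `out` may be a path or a binary file object.
    """
    # Stable sort: "mimetype" moves to the front, the rest keep their order.
    ordered = sorted(entries, key=lambda e: e[0] != "mimetype")
    with zipfile.ZipFile(out, "w") as zf:
        for name, payload in ordered:
            method = (zipfile.ZIP_STORED if name == "mimetype"
                      else zipfile.ZIP_DEFLATED)
            # zipfile sets the UTF-8 flag itself for non-ASCII names.
            zf.writestr(name, payload, compress_type=method)
```

Because the names are real unicode strings at this point, the central directory and local headers written here agree with each other, which is the property the broken files lack.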
|
11-07-2012, 12:40 AM | #10 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
FYI: Using
zip -FF bad.epub --out fixed.epub
should fix the central directory issue.
11-07-2012, 06:53 AM | #11 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|
11-07-2012, 10:50 AM | #12 |
Groupie
Posts: 191
Karma: 574940
Join Date: Jul 2012
Device: Kobo Touch, Kobo Glo, Kobo Arc(32GB), Kobo Arc 7HD, Kobo Glo HD,NookHD
|
I just bought a Google Play book this week and I'm having the same exact problem with the downloaded epub - first time I EVER had a problem importing any epub into Calibre, I might add, and this particular book is about my 12th or 13th Google Play epub.
|
11-11-2012, 02:04 PM | #13 | |
Groupie
Posts: 191
Karma: 574940
Join Date: Jul 2012
Device: Kobo Touch, Kobo Glo, Kobo Arc(32GB), Kobo Arc 7HD, Kobo Glo HD,NookHD
|
Attached is a partial listing of the troublesome epub TOC, thanks to 7Zip. Interestingly, this book traveled as is without any tweaking from my ADE2.0 installation to my Adobe-authorized Kobo Touch without so much as a hiccup. A new speedbump from publishers, perhaps?? Last edited by oj829; 11-11-2012 at 02:20 PM. |
|
11-11-2012, 02:11 PM | #14 |
Groupie
Posts: 191
Karma: 574940
Join Date: Jul 2012
Device: Kobo Touch, Kobo Glo, Kobo Arc(32GB), Kobo Arc 7HD, Kobo Glo HD,NookHD
|
Incidentally, while 7Zip willingly deleted the file with the corrupt name from the archive, it wouldn't extract it (I thought I could fix the name and put it back).
7Zip bombed out with: 'Unsupported compression method for 'OEBPS\OEBPS\Images\Acit_9780767[etc.] |
11-11-2012, 02:16 PM | #15 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
If 0.9.6 did not work with that file, open a bug report and attach the file; I'd be interested in taking a look at it.
|