#1 |
Enthusiast
Posts: 29
Karma: 520356
Join Date: Jan 2011
Device: Kindle
|
Help! New to Calibre
Hi, I am new to Calibre and I found this forum - hoping someone can help! I am trying to convert an .htm file and I keep having problems. I did submit a ticket, but I thought I would come here.
Here is the error: Code:
ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (SuccessfulBlog) Convert book 1 of 1 (SuccessfulBlog) Resolved conversion options calibre version: 0.7.40 {'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0.0, 'book_producer': None, 'breadth_first': False, 'change_justification': u'original', 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']", 'chapter_mark': u'pagebreak', 'comments': None, 'cover': None, 'debug_pipeline': None, 'disable_font_rescaling': False, 'dont_package': False, 'dont_split_on_page_breaks': False, 'epub_flatten': False, 'extra_css': None, 'extract_to': None, 'flow_size': 260, 'font_size_mapping': None, 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)', 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)', 'html_unwrap_factor': 0.4, 'input_encoding': None, 'input_profile': <calibre.customize.profiles.InputProfile object at 0x1079fd950>, 'insert_blank_line': False, 'insert_metadata': False, 'isbn': None, 'keep_ligatures': False, 'language': None, 'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0.0, 'linearize_tables': False, 'margin_bottom': 5.0, 'margin_left': 5.0, 'margin_right': 5.0, 'margin_top': 5.0, 'max_levels': 5, 'max_toc_links': 50, 'minimum_line_height': 120.0, 'no_chapters_in_toc': False, 'no_default_epub_cover': False, 'no_inline_navbars': False, 'no_svg_cover': False, 'output_profile': <calibre.customize.profiles.KindleOutput object at 0x1079fdf10>, 'page_breaks_before': u"//*[name()='h1' or name()='h2']", 'prefer_metadata_cover': False, 'preprocess_html': False, 'preserve_cover_aspect_ratio': False, 'pretty_print': True, 'pubdate': None, 
'publisher': None, 'rating': None, 'read_metadata_from_opf': '/var/folders/7D/7Dkn3HjzHa8yKAMRN3BviU+++TI/-Tmp-/calibre_0.7.40_tmp_CPzBYN/calibre_0.7.40_DfX_yv.opf', 'remove_first_image': False, 'remove_footer': False, 'remove_header': False, 'remove_paragraph_spacing': False, 'remove_paragraph_spacing_indent_size': 1.5, 'series': None, 'series_index': None, 'smarten_punctuation': False, 'tags': None, 'timestamp': None, 'title': None, 'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'use_auto_toc': False, 'verbose': 2} WARNING Property: Unknown Property name. [6:2: panose-1] WARNING Property: Unknown Property name. [7:2: mso-font-charset] WARNING Property: Unknown Property name. [8:2: mso-generic-font-family] WARNING Property: Unknown Property name. [9:2: mso-font-pitch] WARNING Property: Unknown Property name. [10:2: mso-font-signature] WARNING Property: Unknown Property name. [13:3: mso-style-parent] WARNING Property: Unknown Property name. [16:2: mso-pagination] WARNING Property: Unknown Property name. [19:2: mso-fareast-font-family] WARNING Property: Unknown Property name. [20:2: mso-bidi-font-family] WARNING Property: Unknown Property name. [22:3: mso-style-link] WARNING Property: Unknown Property name. [25:2: mso-pagination] WARNING Property: Unknown Property name. [26:2: tab-stops] WARNING Property: Unknown Property name. [29:2: mso-fareast-font-family] WARNING Property: Unknown Property name. [30:2: mso-bidi-font-family] WARNING Property: Unknown Property name. [32:3: mso-style-link] WARNING Property: Unknown Property name. [35:2: mso-pagination] WARNING Property: Unknown Property name. [36:2: tab-stops] WARNING Property: Unknown Property name. [39:2: mso-fareast-font-family] WARNING Property: Unknown Property name. [40:2: mso-bidi-font-family] WARNING Property: Unknown Property name. [42:3: mso-style-name] WARNING Property: Unknown Property name. [43:2: mso-style-locked] WARNING Property: Unknown Property name. 
[44:2: mso-style-link] WARNING Property: Unknown Property name. [45:2: mso-ansi-font-size] WARNING Property: Unknown Property name. [46:2: mso-bidi-font-size] WARNING Property: Unknown Property name. [48:3: mso-style-name] WARNING Property: Unknown Property name. [49:2: mso-style-locked] WARNING Property: Unknown Property name. [50:2: mso-style-link] WARNING Property: Unknown Property name. [51:2: mso-ansi-font-size] WARNING Property: Unknown Property name. [52:2: mso-bidi-font-size] WARNING Property: Unknown Property name. [56:2: mso-header-margin] WARNING Property: Unknown Property name. [57:2: mso-footer-margin] WARNING Property: Unknown Property name. [58:2: mso-footer] WARNING Property: Unknown Property name. [59:2: mso-paper-source] WARNING CSSStylesheet: Unknown @rule found. [63:1: @list] WARNING CSSStylesheet: Unknown @rule found. [67:1: @list] WARNING CSSStylesheet: Unknown @rule found. [88:1: @list] WARNING CSSStylesheet: Unknown @rule found. [109:1: @list] WARNING CSSStylesheet: Unknown @rule found. [130:1: @list] WARNING CSSStylesheet: Unknown @rule found. [151:1: @list] WARNING CSSStylesheet: Unknown @rule found. [172:1: @list] WARNING CSSStylesheet: Unknown @rule found. [193:1: @list] WARNING CSSStylesheet: Unknown @rule found. [214:1: @list] WARNING CSSStylesheet: Unknown @rule found. [235:1: @list] Python function terminated unexpectedly: All strings must be XML compatible: Unicode or ASCII, no NULL bytes InputFormatPlugin: HTML Input running on /Users/Julie/Calibre Library/Unknown/SuccessfulBlog (11)/SuccessfulBlog - Unknown.htm Language not specified Creator not specified Building file list... Found files... HTMLFile:0:a:/Users/Julie/Calibre Library/Unknown/SuccessfulBlog (11)/SuccessfulBlog - Unknown.htm Normalizing filename cases Rewriting HTML links Parsing SuccessfulBlog%20-%20Unknown.htm ... 
Initial parse failed:
Traceback (most recent call last):
  File "site-packages/calibre/ebooks/oeb/base.py", line 857, in first_pass
  File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48634)
  File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:72245)
  File "parser.pxi", line 1417, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:71041)
  File "parser.pxi", line 898, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:67581)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:64257)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:65178)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
XMLSyntaxError: AttValue: " or ' expected, line 8, column 11
Parsing file 'SuccessfulBlog%20-%20Unknown.htm' as HTML
Forcing SuccessfulBlog%20-%20Unknown.htm into XHTML namespace
File 'SuccessfulBlog%20-%20Unknown.htm' missing <title/> element
Traceback (most recent call last):
  File "/Applications/calibre.app/Contents/Resources/Python/lib/python2.7/site.py", line 147, in main
    return run_entry_point()
  File "/Applications/calibre.app/Contents/Resources/Python/lib/python2.7/site.py", line 116, in run_entry_point
    return getattr(pmod, func)()
  File "site-packages/calibre/utils/ipc/worker.py", line 107, in main
  File "site-packages/calibre/gui2/convert/gui_conversion.py", line 24, in gui_convert
  File "site-packages/calibre/ebooks/conversion/plumber.py", line 855, in run
  File "site-packages/calibre/customize/conversion.py", line 216, in __call__
  File "site-packages/calibre/ebooks/html/input.py", line 295, in convert
  File "site-packages/calibre/ebooks/html/input.py", line 374, in create_oebbook
  File "site-packages/calibre/ebooks/oeb/base.py", line 224, in rewrite_links
  File "lxml.etree.pyx", line 821, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:33308)
  File "apihelpers.pxi", line 646, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:15287)
  File "apihelpers.pxi", line 1295, in lxml.etree._utf8 (src/lxml/lxml.etree.c:20212)
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes
Thanks! Julie |
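The final `ValueError` in the log comes from lxml, which refuses text containing NULL bytes or other characters that XML 1.0 does not allow. Here is a minimal sketch of cleaning the text before handing it to Calibre. It assumes the .htm file really does contain such stray characters, which the thread never confirms. The function name `strip_xml_incompatible` is hypothetical and not part of Calibre or lxml.

```python
import re

# C0 control characters other than tab, newline and carriage return,
# plus U+FFFE and U+FFFF. None of these are allowed in XML 1.0 text.
_XML_ILLEGAL = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_xml_incompatible(text: str) -> str:
    """Return text with NULL bytes and other XML-illegal characters removed."""
    return _XML_ILLEGAL.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample: a NULL byte and a vertical tab hidden in markup.
    dirty = "<p>Successful\x00Blog</p>\x0b"
    print(strip_xml_incompatible(dirty))
```

Tabs and line breaks are left alone, so the markup itself is not changed.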
#2 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
MS Office notoriously produces lots of cruft in HTML documents. Try saving it as a "filtered web page". Also, try enabling the preprocessing option in the structure detection part of the conversion settings.
|
#3 |
Enthusiast
Posts: 29
Karma: 520356
Join Date: Jan 2011
Device: Kindle
|
Okay, I am not sure how to do that - LOL
I did enable the preprocessing option though! I am using a Mac version of Office if that matters. I think you may be right about the problem being with .doc. What would you say is the best way to convert? I thought of text, but I have hyperlinks in my doc. Thanks!! Great forum BTW. I am also an avid Kindle reader - so I think in the future I can use Calibre for that as well! |
#4 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
.....
|
#5 |
Grand Sorcerer
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
|
On my Windows copy of Word (admittedly pretty old), it's under File - Save As - Save as Type - Webpage, filtered.
|
#6 | ||
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
Edited to add: You could also try RTF, Calibre supports that as well. I don't know if it allows for links, though. Quote:
#7 |
Enthusiast
Posts: 29
Karma: 520356
Join Date: Jan 2011
Device: Kindle
|
Thanks! I thought posting the error would be easier than trying to explain. I understand how the whole "I am having problems" with no explanation thing works.
Thanks for your help. I think I may have to think about this one. There is no option in Mac Word for saving that way - only .htm. I did manage to change a few things under options, and it converted it, but in Chinese letters! Strange! I will let you know if I figure it out, in case others have the same problems. UPDATE: Sorry to be a pain. I am getting closer - the error is smaller. I changed it to a .doc and checked off "Save only display information in HTML". Here is what I have now: Code:
ERROR: ERROR: Unhandled exception: <b>TypeError</b>:'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "site-packages/calibre/gui2/__init__.py", line 306, in dispatch
  File "site-packages/calibre/gui2/actions/convert.py", line 175, in book_converted
  File "site-packages/calibre/library/database2.py", line 945, in add_format
  File "site-packages/calibre/library/database2.py", line 383, in path
TypeError: 'NoneType' object is not subscriptable
Last edited by JulieMack; 01-21-2011 at 07:14 PM. |
#8 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
That sounds like you messed with the codepage settings. If you're just clicking things at (mostly) random, try having a look at the manual before you do.
|
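One plausible way for an English file to come out as Chinese characters is that its single-byte text gets decoded as a two-byte encoding such as UTF-16. This is an assumption for illustration; the thread does not say which setting was changed. The helper `misread_as_utf16` is hypothetical and only reproduces the effect:

```python
def misread_as_utf16(text: str) -> str:
    """Encode ASCII text as single bytes, then wrongly decode it as UTF-16.

    The text must have an even number of characters, because UTF-16
    consumes two bytes per code unit.
    """
    raw = text.encode("ascii")
    return raw.decode("utf-16-le")


if __name__ == "__main__":
    garbled = misread_as_utf16("Hi")
    # The two ASCII bytes 0x48 and 0x69 are fused into one code point.
    print(garbled, hex(ord(garbled)))
```

Two Latin letters collapse into a single code point in the CJK range, which is why the result looks Chinese rather than merely scrambled.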
#9 | |
Grand Sorcerer
Posts: 12,362
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
I recommend that you run Check Database Integrity, then Check Library. Both are under 'library maintenance' on the dropdown menu for the library toolbar button. By any chance do you use two different operating systems to access the same calibre library? You can get problems similar to the one you are seeing if both Windows/Mac and Linux are used on the same library, because the first two are case-insensitive but Linux is not. |
|
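The case-sensitivity point can be checked with a short script. This is a minimal sketch, assuming you already have a list of relative paths from the library folder; `find_case_collisions` is a hypothetical helper and not part of calibre:

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that differ only by letter case.

    On Linux such paths are distinct entries; on default Windows and Mac
    filesystems they refer to the same file or folder.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Keep only the groups where more than one distinct spelling exists.
    return [sorted(set(g)) for g in groups.values() if len(set(g)) > 1]


if __name__ == "__main__":
    # Hypothetical folder names, for illustration only.
    sample = ["Unknown/SuccessfulBlog (11)", "unknown/SuccessfulBlog (11)", "Other/Book (3)"]
    for group in find_case_collisions(sample):
        print("Collision:", group)
```

Any group it reports is a pair of names that a case-insensitive system would treat as the same entry.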
#10 | |
Enthusiast
Posts: 29
Karma: 520356
Join Date: Jan 2011
Device: Kindle
|
Quote:
Here is what I did to fix it (mostly). I finally found the best solution to get hyperlinks correctly moved over - Open Office. From Open Office, I can save as .pdf. THEN, I uploaded it to Calibre. It converted pretty well. The hyperlinks work, although the anchor text is a little off. But, for now, I think this may work best. Thanks everyone for your help. Now, I can move on to the rest of the forum! |
|
#11 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Try to avoid PDF as a source format for conversion. The native OpenOffice format should work as well. PDF is really only suitable for print prepress and such things.
|
#12 |
Enthusiast
Posts: 29
Karma: 520356
Join Date: Jan 2011
Device: Kindle
|
Thanks for all your help - I appreciate it since I am new to Calibre. In your opinion, what is the best way to convert a doc with hyperlinks that work if not PDF?
|
#13 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Like I said, try the native OpenOffice format (I believe it's called ODT), Calibre supports that as well. The problem with PDF is that it saves fixed formatting very well, like in a page layout for print, but it often seriously falters when getting converted into reflowable text, like ebooks.
|
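For anyone who prefers the command line to the GUI, the ODT route might look like the sketch below. It assumes calibre's `ebook-convert` tool is installed and on the PATH; the file names are hypothetical, and the wrapper functions are illustrative rather than part of calibre.

```python
import shutil
import subprocess


def build_convert_command(source, target, output_profile="kindle"):
    """Build the argument list for calibre's ebook-convert command-line tool."""
    return ["ebook-convert", source, target, "--output-profile", output_profile]


def convert(source, target):
    """Run the conversion, failing early if ebook-convert is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert was not found on PATH")
    subprocess.run(build_convert_command(source, target), check=True)


if __name__ == "__main__":
    # Hypothetical file names; the output format is chosen by the extension.
    print(" ".join(build_convert_command("SuccessfulBlog.odt", "SuccessfulBlog.mobi")))
```

Building the command separately from running it makes it easy to print and inspect before anything is converted.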
#14 | |
Enthusiast
Posts: 29
Karma: 520356
Join Date: Jan 2011
Device: Kindle
|
Quote:
Thanks again - I will write these steps down to share with others once it all works! Okay, I did the .odt, but when I convert it, it adds all the <Document Properties> at the front of the ebook. But, it does convert the hyperlinks well! Here's just part of it... <o:DocumentProperties> <o:Template>Normal.dotm</o:Template> <o:Revision>0</o:Revision> <o:TotalTime>0</o:TotalTime> <o:Pages>1</o:Pages> <o:Words>2311</o:Words> <o:Characters>11093</o:Characters> Man, so close!! |
|
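If the properties block is present as literal markup in the source file, it could be removed before conversion. This is a rough sketch under that assumption; the thread does not establish where the block comes from. `strip_document_properties` is a hypothetical helper, and a regular expression is only a stopgap for well-behaved input, not a full HTML parser.

```python
import re

# Matches one Word-style document properties block, including its contents.
_DOC_PROPS = re.compile(
    r"<o:DocumentProperties\b[^>]*>.*?</o:DocumentProperties>",
    re.IGNORECASE | re.DOTALL,
)


def strip_document_properties(html: str) -> str:
    """Remove <o:DocumentProperties>...</o:DocumentProperties> blocks from markup."""
    return _DOC_PROPS.sub("", html)


if __name__ == "__main__":
    sample = (
        "<p>Intro</p>"
        "<o:DocumentProperties><o:Template>Normal.dotm</o:Template>"
        "<o:Pages>1</o:Pages></o:DocumentProperties>"
        "<p>Body with <a href='http://example.com'>a link</a></p>"
    )
    print(strip_document_properties(sample))
```

The hyperlinks and body text pass through unchanged; only the metadata block is dropped.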
#15 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
That's kinda strange. Are those things at the front of the ODT, then?
By the way, if the PDF route works well for you, then by all means, use it. I was just trying to give the general hint that most of the time, PDF converts rather badly. |