#1 |
Information Acquirer
Posts: 436
Karma: 4265156
Join Date: Sep 2010
Location: Latvia, Rigas Rajons
Device: Kindle 3 International, Pocketbook Color
|
Another HTML conversion error
Hello,
I've read through a few (HTML) conversion threads, but can't find out why I get all these errors. I'm trying to convert a web document fetched here to .mobi. My Firefox stores the HTML as a .htm file with a folder containing all the "extra" files. I've imported the .htm and manually copied the "additional files" folder to where Calibre stores the imported .htm. I've tried both with and without heuristics; below is the error message without heuristics:

calibre, version 0.7.52
ERROR: Feil ved konverteringen: <b>Feilet</b>: Convert book 1 of 1 (Speaker Wire - A History)

Convert book 1 of 1 (Speaker Wire - A History)
Resolved conversion options
calibre version: 0.7.52
{'asciiize': True,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'breadth_first': False,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': True,
 'dont_compress': False,
 'dont_package': False,
 'enable_heuristics': False,
 'extra_css': None,
 'fix_indents': True,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': u'cp1252',
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x04FF6FB0>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_levels': 5,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'mobi_ignore_margins': False,
 'no_chapters_in_toc': False,
 'no_inline_navbars': True,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.KindleOutput object at 0x050272F0>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'personal_doc': u'[PDOC]',
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'c:\\users\\esben\\appdata\\local\\temp\\calibre_0.7.52_tmp_xae9af\\calibre_0.7.52_irqbce.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': u'',
 'rescale_images': False,
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: HTML Input running on
D:\Dokumenter\Calibre_Bibliotek\Russell_ Roger\Speaker Wire - A History (40)\Speaker Wire - A History - Russell_ Roger.htm
Language not specified
Building file list...
Found files...
HTMLFile:0:a
Normalizing filename cases
Rewriting HTML links
Parsing Speaker%20Wire%20-%20A%20History%20-%20Russell_%20Roger.htm ...
Initial parse failed:
Traceback (most recent call last):
  File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
  File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48634)
  File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:72245)
  File "parser.pxi", line 1417, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:71041)
  File "parser.pxi", line 898, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:67581)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:64257)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:65178)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
XMLSyntaxError: Opening and ending tag mismatch: meta line 57 and head, line 58, column 8
Parsing file 'Speaker%20Wire%20-%20A%20History%20-%20Russell_%20Roger.htm' as HTML
Forcing Speaker%20Wire%20-%20A%20History%20-%20Russell_%20Roger.htm into XHTML namespace
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\image19.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\colorbar.gif
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\wirebusters3.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\impedance.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\response6.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\hearing3.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\wire8.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\wire9.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\wire4.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\wire5.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\monsterb.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\monsterc.jpg
Added d:\dokumenter\calibre_bibliotek\russell_ roger\speaker wire - a history (40)\filer_for_wire\thin%20wire.jpg
Python function terminated unexpectedly
[Errno 2] No such file or directory: u'd:\\dokumenter\\calibre_bibliotek\\russell_ roger\\speaker wire - a history (40)\\filer_for_wire\\thin wire.jpg' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 110, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 915, in run
  File "site-packages\calibre\customize\conversion.py", line 204, in __call__
  File "site-packages\calibre\ebooks\html\input.py", line 294, in convert
  File "site-packages\calibre\ebooks\html\input.py", line 373, in create_oebbook
  File "site-packages\calibre\ebooks\oeb\base.py", line 185, in rewrite_links
  File "site-packages\calibre\ebooks\html\input.py", line 468, in resource_adder
  File "site-packages\calibre\ebooks\oeb\base.py", line 1148, in fget
  File "site-packages\calibre\ebooks\oeb\base.py", line 472, in read
IOError: [Errno 2] No such file or directory: u'd:\\dokumenter\\calibre_bibliotek\\russell_ roger\\speaker wire - a history (40)\\filer_for_wire\\thin wire.jpg'

The PDF (made from the browser's File -> Print menu) does, however, convert to .mobi, but the result is not 100%
#2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
It's failing because it says it can't find a file called 'thin wire.jpg'. If the file exists, perhaps it doesn't like the name having a space in it.
|
#3 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
Edit: Also, just dumping the associated files into the same directory won't necessarily work. They have to sit at the path their links describe. Since Firefox stores the associated content in an extra folder, it probably rewrote the links to point to the files inside that folder.
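One way to check this before importing is to list every local link in the saved page and see whether it resolves to a file on disk. The following is only a rough sketch using the Python standard library, not anything calibre itself runs; `LinkCollector`, `missing_resources`, and the example file names are made up for illustration.

```python
import os
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collect the values of src/href attributes from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.links.append(value)


def missing_resources(html_text, base_dir):
    """Return the local links in html_text that do not resolve to a file under base_dir."""
    collector = LinkCollector()
    collector.feed(html_text)
    missing = []
    for link in collector.links:
        parts = urlparse(link)
        # Skip remote links (http:, mailto:, ...) and pure in-page anchors.
        if parts.scheme or parts.netloc or not parts.path:
            continue
        # Links are percent-encoded; decode once to get the name expected on disk.
        local_path = os.path.join(base_dir, *unquote(parts.path).split("/"))
        if not os.path.exists(local_path):
            missing.append(link)
    return missing
```

Called with the text of the saved .htm and the directory it lives in, it returns the links whose targets are absent, which would point straight at a file like the one named in the error message.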
|
#4 |
Information Acquirer
Posts: 436
Karma: 4265156
Join Date: Sep 2010
Location: Latvia, Rigas Rajons
Device: Kindle 3 International, Pocketbook Color
|
@ldolse:
Thank you. The problem was that the stored .htm linked to "thin%2520wire.jpg", whereas the actual name inside the "stored files" folder was "thin%20wire.jpg", so the file couldn't be found. Renaming the file to "thin-wire.jpg" (and fixing the link to match) worked, and a beautifully converted .mobi file was created. My Kindle will be happy.
@Manichean:
Thanks for your advice. I'm aware of this; that's why I also manually copied the folder with the linked files over. I also tried to .zip it and import that, but got the same error message, so ldolse's reply was the cure this time.
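For anyone puzzled by the "%2520": it is what you get when a name that already contains "%20" is percent-encoded again, because the "%" itself becomes "%25". A small illustration with Python's standard `urllib.parse` follows. The thread does not confirm how calibre decodes links internally, so the last step is only a guess at how "thin wire.jpg" could end up in the error message.

```python
from urllib.parse import quote, unquote

# Literal file name inside the saved folder, as reported in the post.
name_on_disk = "thin%20wire.jpg"

# A link to that literal name has to escape the "%" itself as "%25".
link_in_html = quote(name_on_disk)
print(link_in_html)  # thin%2520wire.jpg

# Decoding the link once gives back the name on disk ...
decoded_once = unquote(link_in_html)
print(decoded_once)  # thin%20wire.jpg

# ... while decoding it a second time gives a name with a real space,
# which does not exist in the folder.
decoded_twice = unquote(decoded_once)
print(decoded_twice)  # thin wire.jpg
```

Renaming the file to something without spaces or percent signs, as done above, sidesteps the question of how many times the link gets decoded.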
#5 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
So, just out of curiosity, there was no ZIP file in the folder? Or did you dump the image inside the ZIP?
|
#6 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
I was wondering about the zip thing too; I wasn't certain whether the debug messages definitively showed a zip or not.
Manually copying a folder into the book's directory isn't the recommended way to go, though apparently it worked for the OP in this case. Generally, the recommended way to add an HTML book is to add just the root HTML file (which in turn has href links to everything else) to Calibre's main window. Calibre will grab everything and create an appropriate zip for you. (You also need the HTML to ZIP plugin enabled, though it is by default.) If you had created your own zip manually, it probably wouldn't have worked, as Calibre places an OPF and some other items in the zip.
#7 |
Information Acquirer
Posts: 436
Karma: 4265156
Join Date: Sep 2010
Location: Latvia, Rigas Rajons
Device: Kindle 3 International, Pocketbook Color
|
When I go to "File -> Save As" in the Firefox menu, I get to name the main file and choose the path.
The default extension is .htm, and any additional files are stored in a folder in the same path as the main .htm. See the attached screenshot of the saved file(s). I chose the name "Wire" in this case. When I imported the .htm, the accompanying folder did not tag along, so I manually put it into the same folder where Calibre stored the imported file. This worked like a charm, and since it did work, I guess I'll continue to import and convert .htm(l) files the same way. I can always delete the .htm(l) and its related folder after a successful conversion if I want.
#8 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
It sounds like you have the 'html to zip' plugin disabled in the plugin preferences. By default that is enabled, and when you import html Calibre will grab all the referenced objects and stick them in a single zip file. If it's disabled then Calibre will behave as you're describing.
|
#9 |
Information Acquirer
Posts: 436
Karma: 4265156
Join Date: Sep 2010
Location: Latvia, Rigas Rajons
Device: Kindle 3 International, Pocketbook Color
|
|
#10 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
I just tested, and Calibre zips up .htm files in my installation. Could you check if the HTML to ZIP plugin is indeed enabled? If it is, there seems to be something odd at work here.
|
#11 |
Grand Sorcerer
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
|
|
#12 |
Well trained by Cats
Posts: 30,993
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Could this be an OS case-sensitivity problem?
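If case were the culprit, a link and a file name would differ only in upper/lower case. A quick way to test that guess is sketched below; `find_case_mismatch` is a hypothetical helper written for this post, not part of calibre.

```python
import os


def find_case_mismatch(directory, wanted_name):
    """Return the on-disk name that matches wanted_name only when case is ignored, else None."""
    for actual in os.listdir(directory):
        if actual != wanted_name and actual.lower() == wanted_name.lower():
            return actual
    return None
```

It returns `None` both when the exact name is present and when nothing similar exists, so a non-`None` result means the file is there but spelled with different capitalisation.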
|
#13 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
|
#14 |
Information Acquirer
Posts: 436
Karma: 4265156
Join Date: Sep 2010
Location: Latvia, Rigas Rajons
Device: Kindle 3 International, Pocketbook Color
|
I tried another web grab (from Wikibooks). This time it imported and made a .zip file, as you expected.
It seems that the first import failed due to errors in the file paths of the files containing spaces in their names. So I'll consider the problem solved. Thank you for your help and suggestions.
Thread | Thread Starter | Forum | Replies | Last Post |
Conversion memory error (HTML->EPUB) | doremifaso | Calibre | 4 | 06-25-2010 10:56 PM |
conversion TO html | in_the_fade | Calibre | 4 | 04-29-2010 10:51 AM |
'utf8' codec can't decode bytes error (HTML to EPUB conversion) | gsz | Calibre | 10 | 10-26-2009 06:29 PM |
HTML Conversion Error | dedicated | Calibre | 12 | 12-18-2008 02:36 PM |
lrfviewer & reader error after html conversion | BrendenM | Calibre | 3 | 09-16-2008 11:40 AM |