MobileRead Forums - Conversion problems (pdf -> mobi)

PirateNL · 03-05-2011, 07:56 PM

Hi all,

I recently bought an Amazon Kindle. I want to read some PDFs on it, and as far as I know the easiest way is to convert them to MOBI files using Calibre; excuse me if I'm wrong.

So I installed the program, and the installation went just fine. I imported a test PDF. All fine.

But I'm having all kinds of problems with the conversion at the moment.

Here are some details of the error:

Code:

calibre, version 0.7.48
Convert book 1 of 1 (testpdf)
Resolved conversion options
calibre version: 0.7.48
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': 'c:\\users\\user\\appdata\\local\\temp\\calibre_0.7.48_tmp_9dt6ln\\calibre_0.7.48_nb5eiu.jpeg',
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'enable_heuristics': False,
 'extra_css': None,
 'fix_indents': True,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x05A71F70>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'new_pdf_engine': False,
 'no_chapters_in_toc': False,
 'no_images': False,
 'no_inline_navbars': False,
 'output_profile': <calibre.customize.profiles.KindleOutput object at 0x05A772B0>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'prefer_metadata_cover': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'c:\\users\\user\\appdata\\local\\temp\\calibre_0.7.48_tmp_9dt6ln\\calibre_0.7.48_2zr_ge.opf',
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': u'',
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'unwrap_factor': 0.45,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: PDF Input running
on C:\Users\user\Calibre Library\mytest.pdf
Converting file to html...
pdftohtml log:

Retrieving document metadata...
Generating manifest...
Rendering manifest...
Parsing all content...
Parsing index.html ...
Initial parse failed:
Traceback (most recent call last):
  File "site-packages\calibre\ebooks\oeb\base.py", line 881, in first_pass
  File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48634)
  File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:72245)
  File "parser.pxi", line 1417, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:71041)
  File "parser.pxi", line 898, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:67581)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:64257)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:65178)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64521)
XMLSyntaxError: Opening and ending tag mismatch: META line 6 and head, line 7, column 8

Parsing file 'index.html' as HTML
Forcing index.html into XHTML namespace
Python function terminated unexpectedly
  Invalid IPv6 URL (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 110, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 913, in run
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1044, in create_oebbook
  File "site-packages\calibre\ebooks\oeb\reader.py", line 72, in __call__
  File "site-packages\calibre\ebooks\oeb\reader.py", line 600, in _all_from_opf
  File "site-packages\calibre\ebooks\oeb\reader.py", line 250, in _manifest_from_opf
  File "site-packages\calibre\ebooks\oeb\reader.py", line 189, in _manifest_add_missing
  File "site-packages\calibre\ebooks\oeb\base.py", line 365, in urlnormalize
  File "urlparse.py", line 134, in urlparse
  File "urlparse.py", line 182, in urlsplit
ValueError: Invalid IPv6 URL

Hmm, I don't understand: what does IPv6 have to do with converting a PDF?
Hopefully someone can help me.

Thanks so much.
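
On the IPv6 question: the last frames of the traceback are in Python's URL parser (urlsplit), not in any networking code. That parser raises "Invalid IPv6 URL" whenever the host part of a URL contains a "[" without a matching "]" (or the reverse), because square brackets in a host are reserved for IPv6 addresses. So a plausible cause, and this is only a guess, is a link or file reference in the converted PDF that contains an unbalanced square bracket. The sketch below reproduces the message with a made-up URL; it uses Python 3's urllib.parse rather than the Python 2 urlparse module shown in the log, and the helper name try_split is hypothetical.

```python
from urllib.parse import urlsplit


def try_split(url):
    """Return the host part of url, or the error text if parsing fails."""
    try:
        return urlsplit(url).netloc
    except ValueError as exc:
        return "ValueError: %s" % exc


# An ordinary URL parses without complaint.
print(try_split("http://example.com/index.html"))

# Hypothetical URL with an opening "[" in the host and no closing "]".
# The parser treats "[" as the start of an IPv6 literal and gives up.
print(try_split("http://[unclosed/index.html"))
```

If that guess is right, the error has nothing to do with the network setup of the machine; the thing to look for would be a hyperlink in the PDF with a stray bracket in it.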