07-03-2011, 03:42 AM | #1 |
Member
Posts: 16
Karma: 10
Join Date: Jul 2011
Location: Singapore
Device: 2022 iPad Air, Kindle Paperwhite 2
|
Problems converting CHM to EPUB
Hello,
It's possible that the source file I'm using is borked, but since I'm terrible at regex I figured it would be best to ask whether there is some other problem here. When I try to convert a CHM file, I get a single-page EPUB with some junk characters. This is the output log: Code:
Opening CHM file
Extracting CHM to c:\users\avggeek\appdata\local\temp\calibre_0.8.5_tmp_zyu6tp\calibre_0.8.5_knfnqa_chm2oeb
Found 0 section nodes
Language not specified
Title not specified
Building file list...
Found files...
HTMLFile:0:a:c:\users\avggeek\appdata\local\temp\calibre_0.8.5_tmp_zyu6tp\calibre_0.8.5_knfnqa_chm2oeb\001.html
Normalizing filename cases
Rewriting HTML links
Parsing 001.html ...
Initial parse failed:
Traceback (most recent call last):
  File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
  File "lxml.etree.pyx", line 2743, in lxml.etree.fromstring (src/lxml/lxml.etree.c:52665)
  File "parser.pxi", line 1573, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:79932)
  File "parser.pxi", line 1445, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:78709)
  File "parser.pxi", line 920, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:75083)
  File "parser.pxi", line 564, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:71739)
  File "parser.pxi", line 645, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:72614)
  File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: input conversion failed due to input error, bytes 0xBB 0xDB 0x0E 0x00
Parsing file '001.html' as HTML
File '001.html' does not appear to be (X)HTML
File '001.html' appears to be a HTML fragment
Forcing 001.html into XHTML namespace
File '001.html' missing <head/> element
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Parsing stylesheet.css ...
Found 1 items of level: p_1
Ignoring level p_1
Cleaning up manifest...
Trimming unused files from manifest...
Creating EPUB Output...
Looking for large trees in 001.html...
No large trees found
This EPUB file has no Table of Contents. Creating a default TOC
EPUB output written to c:\users\avggeek\appdata\local\temp\calibre_0.8.5_tmp_zyu6tp\calibre_0.8.5_xvuuek.epub |
07-03-2011, 11:16 AM | #2 |
creator of calibre
Posts: 44,343
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
CHM files are often a total mess (this seems to be a theme with all Microsoft-derived formats). I would suggest you convert the CHM to HTML, edit it, and then convert the HTML using calibre.
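For anyone who wants to script that route, here is a rough Python sketch of the two steps. It is only an illustration under some assumptions: it uses the Windows HTML Help tool (`hh.exe -decompile`) to unpack the CHM and calibre's `ebook-convert` command-line tool for the second step, and it assumes both are on the PATH. The function names `build_commands` and `run`, and the entry page name `index.html`, are made up for this example; the real entry page of an unpacked CHM varies from file to file.

```python
import subprocess
from pathlib import Path


def build_commands(chm_path, work_dir, output_path, index_name="index.html"):
    """Return the two command lines: CHM -> loose HTML, then HTML -> e-book.

    index_name is an assumption; check the unpacked folder for the real
    entry page before converting.
    """
    work = Path(work_dir)
    # Step 1: unpack the CHM into a folder of HTML files.
    extract = ["hh.exe", "-decompile", str(work), str(Path(chm_path))]
    # Step 2: give the (hand-edited) entry page to calibre's converter.
    convert = ["ebook-convert", str(work / index_name), str(output_path)]
    return extract, convert


def run(chm_path, work_dir, output_path, index_name="index.html"):
    """Unpack, pause so the HTML can be cleaned up by hand, then convert."""
    extract, convert = build_commands(chm_path, work_dir, output_path, index_name)
    Path(work_dir).mkdir(parents=True, exist_ok=True)
    # The exit status of hh.exe is not relied on here; inspect the folder instead.
    subprocess.run(extract, check=False)
    input("Fix up the extracted HTML, then press Enter to continue...")
    # Raise an error if calibre reports a failed conversion.
    subprocess.run(convert, check=True)
```

A call such as `run("book.chm", "book_html", "book.epub")` would then do the whole round trip, with the manual editing step in the middle. On other platforms the unpacking command would need to be swapped for whatever CHM extractor is available there.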
|
11-22-2011, 09:47 AM | #3 |
Junior Member
Posts: 1
Karma: 10
Join Date: Nov 2011
Device: ipad 2
|
CHM to HTML
Hi Kovid,
I am a newbie with very little knowledge of CHM, HTML, etc. I was trying to convert a large CHM file (330 MB) to PDF but failed. I came across the above post and converted the file to HTML. Now I am left with quite a few folders with several files in them. These files have an MS Word picture on them, and clicking on them opens a web page. How can I regroup them in calibre as a single PDF file? Please help. |