Old 07-03-2011, 03:42 AM   #1
avggeek
Problems converting CHM to EPUB

Hello,

It's possible that the source file I'm using is borked, but since I'm terrible at regex I figured it would be best to ask if there is some other problem here.

When I try to convert a CHM file, I get a single page EPUB with some junk characters. This is the output log:

Code:
Opening CHM file
Extracting CHM to c:\users\avggeek\appdata\local\temp\calibre_0.8.5_tmp_zyu6tp\calibre_0.8.5_knfnqa_chm2oeb
Found 0 section nodes
Language not specified
Title not specified
Building file list...
	Found files...
		 HTMLFile:0:a:c:\users\avggeek\appdata\local\temp\calibre_0.8.5_tmp_zyu6tp\calibre_0.8.5_knfnqa_chm2oeb\001.html
Normalizing filename cases
Rewriting HTML links
Parsing 001.html ...
Initial parse failed:
Traceback (most recent call last):
  File "site-packages\calibre\ebooks\oeb\base.py", line 886, in first_pass
  File "lxml.etree.pyx", line 2743, in lxml.etree.fromstring (src/lxml/lxml.etree.c:52665)
  File "parser.pxi", line 1573, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:79932)
  File "parser.pxi", line 1445, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:78709)
  File "parser.pxi", line 920, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:75083)
  File "parser.pxi", line 564, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:71739)
  File "parser.pxi", line 645, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:72614)
  File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: input conversion failed due to input error, bytes 0xBB 0xDB 0x0E 0x00

Parsing file '001.html' as HTML
File '001.html' does not appear to be (X)HTML
File '001.html' appears to be a HTML fragment
Forcing 001.html into XHTML namespace
File '001.html' missing <head/> element
Merging user specified metadata...
Detecting structure...
Auto generated TOC with 0 entries.
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Parsing stylesheet.css ...
Found 1 items of level: p_1
Ignoring level p_1
Cleaning up manifest...
Trimming unused files from manifest...
Creating EPUB Output...
	Looking for large trees in 001.html...
	No large trees found
This EPUB file has no Table of Contents. Creating a default TOC
EPUB output written to c:\users\avggeek\appdata\local\temp\calibre_0.8.5_tmp_zyu6tp\calibre_0.8.5_xvuuek.epub
This is on calibre 0.8.5, running under Windows 7 64-bit.
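
The bytes named in the XMLSyntaxError above (0xBB 0xDB 0x0E 0x00) include a NUL byte, which hints that the extracted 001.html may not be plain HTML text at all. Below is a minimal Python sketch for checking that before adjusting any conversion settings. The function names and the 4096-byte sample size are illustrative assumptions, not part of calibre, and the heuristic would wrongly reject legitimate UTF-16 HTML.

```python
from pathlib import Path


def looks_like_html(data: bytes) -> bool:
    """Rough heuristic: plain (X)HTML has no NUL bytes and contains a tag marker."""
    head = data[:4096]  # sample size is an arbitrary assumption
    if b"\x00" in head:
        # NUL bytes suggest binary (or UTF-16) content rather than 8-bit HTML text
        return False
    lowered = head.lower()
    return any(marker in lowered for marker in (b"<html", b"<body", b"<!doctype"))


def file_looks_like_html(path) -> bool:
    """Apply the same heuristic to a file on disk."""
    return looks_like_html(Path(path).read_bytes())
```

If the check returns False for the extracted file, the problem is more likely in the CHM itself (or in its extraction) than in the EPUB output stage.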
Old 07-03-2011, 11:16 AM   #2
kovidgoyal
creator of calibre
CHM files are often a total mess (seems to be a theme with all Microsoft-derived formats). I would suggest you convert the CHM to HTML, edit it, and then convert the HTML using calibre.
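
One way to script the two mechanical steps of that route on Windows, as a rough sketch only: it assumes the HTML Help executable hh.exe (with its -decompile switch) and calibre's ebook-convert command are both on the PATH. The helper names and the entry page passed to chm_to_epub are hypothetical, and the manual clean-up of the HTML still has to happen between the two steps.

```python
import subprocess
from pathlib import Path


def decompile_cmd(chm, out_dir):
    """Command line that asks Windows' hh.exe to unpack a CHM into out_dir."""
    return ["hh.exe", "-decompile", str(out_dir), str(chm)]


def convert_cmd(html, epub):
    """Command line that asks calibre's ebook-convert to turn HTML into EPUB."""
    return ["ebook-convert", str(html), str(epub)]


def chm_to_epub(chm, work_dir, entry_page, epub):
    """Unpack the CHM, then convert the chosen entry page.

    entry_page is the file name of the top-level HTML page inside the
    unpacked CHM; it differs per book, so the caller has to supply it.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(decompile_cmd(chm, work_dir), check=True)
    # Edit or repair the extracted HTML here before converting.
    subprocess.run(convert_cmd(work_dir / entry_page, epub), check=True)
```

On other platforms a general archive tool that understands CHM could stand in for hh.exe; only decompile_cmd would need to change.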
Old 11-22-2011, 09:47 AM   #3
gnv15
chm to html

Hi Kovid,
I am a newbie with very little knowledge of CHM, HTML, etc. I was trying to convert a large CHM file (330 MB) to PDF but failed. I came across the above post and converted the file to HTML. Now I am left with quite a few folders with several files in them. These files have an MS Word icon on them, and clicking on one opens a web page. How can I regroup them in calibre as a single PDF file? Please help.
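
A possible approach to the regrouping question, sketched in Python: calibre's HTML input follows the links it finds in the file it is given, so a single index page that links to every extracted page can act as the entry point for one conversion. The build_index helper and the index.html file name below are hypothetical, and the alphabetical ordering is an assumption that may not match the book's real chapter order.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(root, index_name="index.html"):
    """Write an HTML page in root that links to every .htm/.html file below it."""
    root = Path(root)
    pages = sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in {".htm", ".html"}
        and p.name != index_name
    )
    items = []
    for page in pages:
        rel = page.relative_to(root).as_posix()
        # quote() escapes spaces etc. in the link; html.escape() protects the label
        items.append(f'<li><a href="{quote(rel)}">{html.escape(rel)}</a></li>')
    doc = (
        "<html><head><title>Contents</title></head><body><ul>\n"
        + "\n".join(items)
        + "\n</ul></body></html>\n"
    )
    index = root / index_name
    index.write_text(doc, encoding="utf-8")
    return index
```

After reordering the list entries by hand if needed, the index page can be added to calibre or converted from the command line with ebook-convert index.html book.pdf.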
All times are GMT -4.

