Old 04-30-2012, 05:35 AM   #1
m4mmon
Member
m4mmon began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Mar 2012
Device: PRS-T1
html to epub CLI conversion / html input

Hello,

I have searched a lot without luck, possibly because my queries were not the right ones, so here is my question.

I use the calibre CLI to convert an HTML file to EPUB. The options I use are taken from the logs produced by the calibre GUI, so I expected to get the same result.
But I am missing something: the resulting EPUB sometimes has unresolved links to footnotes and/or missing page breaks. I tried fine-tuning the options, but that did not change anything.

My HTML source is a document saved from Word as filtered HTML.
If I convert it directly to EPUB with ebook-convert and "my" options, I get the problem described above.

If I first import the HTML file into calibre, take the resulting ZIP file, and then run ebook-convert on it with exactly the same options as above, I get the expected result, identical to what a conversion in the calibre GUI produces.

So my conclusion is that I am missing something on the HTML input side. Importing into calibre does some processing on the file, but I don't know how to replicate it with the CLI, since I cannot see any log related to the import.

I also tried using ebook-convert to go from HTML to ZIP first, but the resulting ZIP is completely different from the one created by importing into calibre.


Can someone provide a tip, some information, or a link to the relevant documentation section or an existing forum thread?
Thank you.

Last edited by m4mmon; 04-30-2012 at 06:27 AM. Reason: better problem description
Old 04-30-2012, 06:36 AM   #2
m4mmon
I have found the missing step. Before converting the HTML to EPUB, I need to perform an HTML to OEB conversion first. So this is a two-step conversion, as I had suspected:

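Code:
rem Step 1: the output name "oeb" has no extension, so an OEB directory is written
ebook-convert.exe book.htm oeb
rem Step 2: convert the HTML file inside that directory to EPUB with the usual options
ebook-convert.exe oeb\book.htm book.epub %opts%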
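For anyone who wants to script the two-step sequence above outside a batch file, here is a minimal Python sketch. It is only an illustration, not part of calibre: the function names `two_step_commands` and `convert` are made up for this example, it assumes `ebook-convert` is on the PATH, and it assumes the HTML file inside the OEB directory keeps the name of the source file, as it does in `oeb\book.htm` above. The contents of `%opts%` are not shown in the post, so options are left as a parameter.

```python
import subprocess
from pathlib import Path


def two_step_commands(source, epub, oeb_dir="oeb", options=()):
    """Build the two ebook-convert command lines used in the post above."""
    source = Path(source)
    oeb_dir = Path(oeb_dir)
    if oeb_dir.suffix:
        # ebook-convert only writes an OEB directory when the output
        # name has no extension, so refuse names such as "oeb.zip".
        raise ValueError("OEB output directory must have no extension")
    # Step 1: HTML -> OEB directory (no options, as in the post).
    step1 = ["ebook-convert", str(source), str(oeb_dir)]
    # Step 2: HTML file inside the OEB directory -> EPUB, with options.
    # Assumption: the file keeps the source name, e.g. oeb\book.htm.
    step2 = ["ebook-convert", str(oeb_dir / source.name), str(epub), *options]
    return [step1, step2]


def convert(source, epub, oeb_dir="oeb", options=(), dry_run=True):
    """Print the commands (dry run) or execute them one after the other."""
    commands = two_step_commands(source, epub, oeb_dir, options)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the sequence if a step fails.
            subprocess.run(command, check=True)
    return commands
```

Calling `convert("book.htm", "book.epub")` only prints the two command lines; pass `dry_run=False` to actually run them, and put whatever you normally have in `%opts%` into `options` as a list of strings.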
So my problem is solved. If I had not been so "desperate", I would not have tried the first step, since the documentation suggests it should make no difference:
Quote:
Finally, if output_file has no extension, then it is treated as a directory and an “open ebook” (OEB) consisting of HTML files is written to that directory. These files are the files that would normally have been passed to the output plugin.
Whatever the documentation says, performing those two steps gives a slightly different result from this:
Code:
ebook-convert.exe book.htm book.epub %opts%
For those who do not believe this makes a difference, I have attached an archive containing a sample document, the batch file that performs the conversions, and the resulting EPUB files.
Maybe someone will point out a mistake, but since I get exactly the same result as when converting with the calibre GUI, I consider my problem solved.
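To check the difference between the direct and the two-step result without opening both books, the two EPUB files can be compared as ZIP archives, since an EPUB is a ZIP container. The following is a rough Python sketch using only the standard library; the helper names `epub_members` and `compare_epubs` and the file names in the usage note are hypothetical, not taken from the attached archive.

```python
import zipfile


def epub_members(path):
    """Map each file name inside an EPUB (a ZIP archive) to its CRC-32."""
    with zipfile.ZipFile(path) as archive:
        return {info.filename: info.CRC for info in archive.infolist()}


def compare_epubs(first, second):
    """Return (only in first, only in second, in both but different)."""
    a = epub_members(first)
    b = epub_members(second)
    only_first = sorted(set(a) - set(b))
    only_second = sorted(set(b) - set(a))
    # Same name in both archives, but the stored checksum differs.
    changed = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_first, only_second, changed
```

For example, `compare_epubs("direct.epub", "two_step.epub")` lists which internal files exist in only one of the books and which ones differ. Metadata files may show up as changed between any two runs, so the HTML and CSS entries are the ones worth inspecting for missing footnote links or page breaks.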
Attached Files
File Type: zip CLI_tests.zip (64.9 KB, 68 views)

Last edited by m4mmon; 05-05-2012 at 03:08 AM.
Old 05-05-2012, 03:10 AM   #3
m4mmon
Problem solved; please read the previous message.
All times are GMT -4.