05-25-2016, 06:14 AM | #1 |
Member
Posts: 12
Karma: 10
Join Date: Jun 2015
Device: Kobo H2O
|
Best Way To Convert MHT to EPub?
{deleted}
Last edited by ReddFour; 05-23-2020 at 05:39 AM. |
05-25-2016, 09:38 AM | #2 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Word can open MHT files. Try opening the file in Word, and then re-saving in DOCX format. Calibre should then be able to open and convert the DOCX to ePub.
|
05-25-2016, 09:53 AM | #3 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
You could also export directly to ePUB from Word with my add-in.
|
05-25-2016, 10:02 AM | #4 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
An even better solution!
|
05-25-2016, 12:28 PM | #5 |
Member
Posts: 12
Karma: 10
Join Date: Jun 2015
Device: Kobo H2O
|
I tried opening the file in Word and then converting the DOCX with Calibre, but the result was not satisfactory. For now I have decided to use EPubPress directly from the Chrome browser, which does a pretty decent job and is far more automated.
Thanks for the help. |
05-25-2016, 12:53 PM | #6 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Why exactly must you use MHT?
Just save the web page itself; I think calibre will download the necessary remote resources when you import the HTML file. If not, then you can do what you should have done in the first place. Even a lousy browser like Internet Explorer can still save as "Web Page, complete", which calibre can certainly handle and which you did not in fact do! Instead you saved as a "Web Archive, Single File", which is a Microsoft-specific format, so it is completely unsurprising that calibre doesn't know how to handle it. ... calibre can import HTML files, and will automatically collect referenced resources and pack them into an HTMLZ ebook. You can have calibre set to auto-convert added books to another format -- e.g. EPUB. |
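For anyone who would rather script the HTML route described above than click through the GUI, here is a minimal Python sketch. It assumes calibre's command-line tools are installed and that `ebook-convert` is on the PATH; the function names and the `page.html` filename are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(html_path, epub_path=None):
    """Return the argument list for calibre's ebook-convert CLI.

    If no output path is given, the EPUB is written next to the HTML file.
    """
    src = Path(html_path)
    dst = Path(epub_path) if epub_path else src.with_suffix(".epub")
    return ["ebook-convert", str(src), str(dst)]


def convert(html_path, epub_path=None):
    """Run the conversion; needs calibre's command-line tools on the PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is calibre installed?")
    subprocess.run(build_convert_command(html_path, epub_path), check=True)
```

Calling `convert("page.html")` on a page saved as "Web Page, complete" should leave `page.epub` beside it. How clean the result looks still depends on the input HTML.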
05-25-2016, 02:10 PM | #7 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
It's a lot easier to help if you ask the right question. It sounds as if the question that you really want answering is "How can I convert a web page to ePub?" rather than "How can I convert an MHT file to ePub?"
|
05-26-2016, 08:19 AM | #8 | |
Member
Posts: 12
Karma: 10
Join Date: Jun 2015
Device: Kobo H2O
|
Quote:
Also, on the tests I have done with just converting HTML, Calibre is completely messing it up. For example, during the conversion to HTML, the different background colours "leak" into areas they shouldn't.
No, my original question was correct. See above. The ePubPress was just a test. I would rather not go through all my existing files re-doing them if I can. |
|
05-26-2016, 10:13 AM | #9 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
https://sourceforge.net/projects/mht2htm
Quote:
I suspect even with all the conversions you would probably still have to go in and tweak/fix the CSS. |
||
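As one example of the kind of CSS tweak mentioned above, here is a blunt Python sketch that removes every background declaration from a stylesheet before conversion. It assumes you are happy to lose all background colours and images, and the function name is hypothetical. It is meant for stylesheet text only, not raw HTML, and a regex like this will trip over values that contain a semicolon (for example `data:` URIs).

```python
import re

# Matches "background" and its longhands (background-color, background-image, ...)
# together with the value, up to the next ";" or "}".
# The lookbehind skips selectors such as ".background" or "#background".
_BACKGROUND_DECL = re.compile(
    r"(?<![\w.#-])background(?:-[a-z-]+)?\s*:[^;}]*;?\s*", re.IGNORECASE
)


def strip_backgrounds(css_text):
    """Remove background declarations from a CSS string (regex-based, not a parser)."""
    return _BACKGROUND_DECL.sub("", css_text)
```

For instance, `strip_backgrounds("p { color: red; background-color: #ff0; }")` leaves only the `color` rule in place.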
05-26-2016, 10:39 AM | #10 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
As Tex2002ans said, calibre cannot automagically fix badly-coded HTML. In fact, by default it doesn't try to fix it at all, other than flattening CSS and e.g. normalizing font sizes and line spacing.
Colors are the fault of the input HTML, certainly. As for MHT, it may have been convenient for you in the past, when you wanted to save webpages for later viewing in Internet Explorer, but now you are learning why it isn't convenient as a general principle. MHT is Microsoft's custom internal solution. Fortunately, it appears someone else was equally bothered by that lack of foresight, and created the application Tex2002ans referenced above. Of course, it still won't save you from badly-coded webpages. ePubPress likely has a lot of code on their server devoted to cleaning up common badly-formed HTML, much like Pocket, Readability, Instapaper, etc. That is outside of calibre's intended functionality. |
05-26-2016, 12:32 PM | #11 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Microsoft devised it, but note that it is actually an open standard (RFC 2557) and many different browsers support it, either natively or through plugins.
|
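Because MHT is a MIME multipart container (the RFC 2557 format mentioned above), Python's standard `email` package can unpack one without any third-party tool. The sketch below is an illustration under that assumption, not how mht2htm or calibre work; the function name, the `partN` file naming, and the small extension table are all made up here.

```python
import email
from email import policy
from pathlib import Path

# Hypothetical, deliberately small mapping; anything else is saved as ".bin".
_EXTENSIONS = {
    "text/html": ".html",
    "text/css": ".css",
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
}


def extract_mht(mht_bytes, out_dir):
    """Write each MIME part of an MHT archive into out_dir; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    message = email.message_from_bytes(mht_bytes, policy=policy.default)
    written = []
    for part in message.walk():
        if part.is_multipart():
            continue  # container part, carries no payload of its own
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        ext = _EXTENSIONS.get(part.get_content_type(), ".bin")
        target = out / f"part{len(written) + 1}{ext}"
        target.write_bytes(payload)
        written.append(target)
    return written
```

Usage would be along the lines of `extract_mht(Path("saved.mht").read_bytes(), "unpacked")`. The extracted HTML still refers to its images and stylesheets by their original URLs (each part's `Content-Location` header), so those links would need rewriting before the page displays offline or converts cleanly.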
05-27-2016, 03:59 AM | #12 | |
Member
Posts: 12
Karma: 10
Join Date: Jun 2015
Device: Kobo H2O
|
Quote:
I think I have decided to forget about all my mht files and just start again from scratch. I'll try out various browser plugins like EPubPress and dotEpub and see which does the best job. I tried Grabmybooks in Firefox but that messes all the formatting up. Are there any other recommended ones? I am particularly interested in one that does a good job with articles with code samples (mainly in C++ and C#). Most of the ones I've tried tend to mess up the formatting of code sample blocks. EPubPress is the best so far in this regard but not perfect.
Last edited by ReddFour; 05-27-2016 at 05:49 AM. |
|