03-15-2019, 01:27 AM | #1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2019
Device: Calibre
|
Need help on converting Epub to HTML
Is there a way to convert EPUB to HTML without having some of the classes? Or CSS on it? Or only have it with plain HTML tags?
Or is there any other tool you recommend? Thanks in advance. |
03-15-2019, 06:02 PM | #2 |
Evangelist
Posts: 401
Karma: 1597305
Join Date: Mar 2010
Device: Ipod G4, MacOS 10.12, Calibre, Pocketbook Touch HD 3
|
I don't know if there's a "quick" way, but I would open in Sigil and create new HTML files from the entrails (using cut & paste). I'd then search & replace styles etc.
As I said, it's not a quick way. |
03-15-2019, 06:28 PM | #3 |
Well trained by Cats
Posts: 29,901
Karma: 55267620
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
If the book uses a CSS stylesheet, simply unlink it from the HTML files. Even though the code still contains class= attributes, no styling will be applied.
If it uses inline styles, a regex should help remove them: style="(.+?)" |
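A minimal Python sketch of the two steps above (unlinking the stylesheet and stripping inline styles). The function name `unstyle` and the sample markup are made up for illustration, and the patterns assume the stylesheet is pulled in with a `<link rel="stylesheet" ...>` tag and that attribute values are quoted. Treat it as a starting point, not a tested recipe for every book.

```python
import re

# Matches <link ...> elements that reference a stylesheet
# (assumes rel="stylesheet" is written out inside the tag).
LINK_RE = re.compile(
    r'<link\b[^>]*\brel\s*=\s*["\']stylesheet["\'][^>]*>\s*',
    re.IGNORECASE,
)

# Matches an inline style attribute in either quote style,
# together with the whitespace in front of it.
STYLE_ATTR_RE = re.compile(
    r'\s+style\s*=\s*(?:"[^"]*"|\'[^\']*\')',
    re.IGNORECASE,
)


def unstyle(html: str) -> str:
    """Remove stylesheet links and inline style attributes; class= is left alone."""
    html = LINK_RE.sub("", html)
    return STYLE_ATTR_RE.sub("", html)


sample = (
    '<head><link rel="stylesheet" href="style.css"/></head>'
    '<p class="x" style="color:red">Hi</p>'
)
print(unstyle(sample))
```

The leftover class="x" is harmless once nothing defines that class, which is the point made in the post above.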
03-15-2019, 08:40 PM | #4 |
Grand Sorcerer
Posts: 12,230
Karma: 74000000
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
I've not tried this myself, but a conversion to Markdown format should produce a document with very basic formatting, which in turn should convert to a very simplified HTML document.
You might also see what a tool called Pandoc might do. https://pandoc.org/ |
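For anyone who wants to try the Markdown round trip with Pandoc, here is a rough Python sketch that shells out to it twice: EPUB to Markdown, then Markdown to HTML. The helper names and the file names (book.epub, book.md, book.html) are placeholders, and it assumes pandoc is installed and on your PATH. How much markup survives the trip depends on the book; this has not been checked against any particular EPUB.

```python
import shutil
import subprocess


def pandoc_cmd(src: str, dst: str, frm: str, to: str) -> list[str]:
    """Build a pandoc command line: read src as frm, write dst as to."""
    return ["pandoc", "-f", frm, "-t", to, "-o", dst, src]


def epub_to_plain_html(epub: str, md: str = "book.md", html: str = "book.html") -> None:
    """Convert EPUB -> Markdown -> HTML by running pandoc twice."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_cmd(epub, md, "epub", "markdown"), check=True)
    subprocess.run(pandoc_cmd(md, html, "markdown", "html"), check=True)


# Example call (placeholder file name):
# epub_to_plain_html("book.epub")
```

The intermediate Markdown file is kept on disk so you can inspect or hand-edit it before the second step.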
03-15-2019, 08:54 PM | #5 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
|
03-15-2019, 10:00 PM | #6 |
Addict
Posts: 389
Karma: 1638210
Join Date: May 2013
Location: Ontario, Canada
Device: Kindle KB, Oasis, Pop_Os!, Jutoh, Kobo Forma
|
You could simply open it in the Editor, delete any files you don't want in the result, then:
-> Arrange the header of the top file as you desire, including removing CSS links if you like.
-> Select all the text files you want, right-click, and merge them.
-> Use some regex to get rid of all the IDs, attributes, spans, divs, and what-have-you.
-> Run another regex search to turn all the various <p class="whatever"> into simply <p>, or maybe two or three searches if you want several variants in your result. Same with the <hn...> lines. (If you have a book based entirely on <div>s instead of <p>s, adjust accordingly.)
That should leave you with one file, as simplified as you desire. Just export it.
Given all the stuff in books, I doubt you'd ever automate this, but go through one and save your searches, and the next ones should literally take only minutes. I basically do this when faced with some ancient, amateur scanned book that looks like a ransom note. Then I re-format and split the single file into chapters or whatever. But the basic clean-up only takes a couple of minutes with saved searches.
Last edited by retiredbiker; 03-15-2019 at 10:05 PM. |
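The regex clean-up described above could look roughly like this in Python, if you would rather script it than use saved searches in the Editor. The function name `simplify` and the sample markup are invented for illustration. It assumes a book built on <p> and <h1> to <h6> tags; in a <div>-based book, removing the divs would flatten the paragraphs, so adjust the patterns first.

```python
import re

# Remove <span> and <div> opening and closing tags, keeping the text inside them.
WRAPPER_RE = re.compile(r'</?(?:span|div)\b[^>]*>', re.IGNORECASE)

# Drop every attribute from <p> and <h1>..<h6> opening tags,
# e.g. <p class="x" id="y"> becomes <p>. The \b keeps <pre> untouched.
BLOCK_TAG_RE = re.compile(r'<(p|h[1-6])\b[^>]*>', re.IGNORECASE)


def simplify(html: str) -> str:
    """Strip wrapper tags, then reduce paragraph and heading tags to bare tags."""
    html = WRAPPER_RE.sub("", html)
    return BLOCK_TAG_RE.sub(r"<\1>", html)


sample = (
    '<div class="c1"><h2 id="ch1" class="title">One</h2>'
    '<p class="indent"><span class="i">Hello</span> world</p></div>'
)
print(simplify(sample))
```

Note that dropping spans also drops any italics or bold done through classes, so keep those patterns separate if you want to map them to <i> or <b> first.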
Tags |
calibre, epubtohtml |