12-03-2014, 07:26 AM | #1 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
HTMLZ output plugin?
If I recall correctly, user_none made the calibre converter for ePUB-->HTMLZ. Perhaps it can be quite easily converted into a Sigil output plugin?
|
12-03-2014, 09:55 PM | #2 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi Toxaris,
If you can describe the format, I am pretty sure I could create an output plugin for it. As I remember, it was a zip containing an index.html file which points to all of the other html files, and some sort of abbreviated opf called a metadata.opf or something along those lines? Or am I misremembering? Is the htmlz format documented in the MR wiki? Kevin |
12-03-2014, 10:30 PM | #3 |
null operator (he/him)
Posts: 20,579
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
|
12-04-2014, 08:59 AM | #4 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
It is basically one big XHTML file, but it goes beyond that. AFAIK it takes into account inline stylesheets and potential duplication of style names. Then again, I could be mistaken...
|
12-05-2014, 11:49 AM | #5 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi Toxaris,
If you truly feel this type of output file is needed, I would be happy to take a stab at it when I get some free time. The code in calibre that user_none wrote is very easy to follow. But as far as I know, nothing except calibre even knows what an htmlz file is or how to play with it.
Also, in calibre there are at least 3 different ways that styles are handled, depending on calibre preferences:
1. remove as many as possible, replace them with simple html tags wherever possible, and leave the rest as inline styles
2. convert all to styles in the head of the document, getting rid of any external css
3. convert all to external css and remove all inline styles
All of these approaches are fraught with problems if separate html pages end up being merged into one big html file, because as separate xhtml files they might easily import from different css stylesheets that have conflicting definitions. So it may not be a lossless process at all.
But before I start on something like this, I would love to know what else uses htmlz and how. If it is only calibre, then simply loading the book into calibre and running its htmlz conversion would probably be for the best.
It would in fact be easy to write an output plugin for Sigil that hands the book off to calibre (if calibre is installed on your system) or that invokes calibre's nice ebook-convert tools from the command line to convert to anything that calibre supports. Perhaps a plugin like that would be more useful?
Take care, KevinH |
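The command-line hand-off suggested in the post above could be sketched roughly as follows. This is only an illustration, not code from Sigil or calibre: the function names `build_convert_command` and `convert_with_calibre` are hypothetical, and the sketch assumes that calibre's `ebook-convert` tool takes an input file and an output file and picks the output format from the output file's extension.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_convert_command(converter: str, source: Path, target: Path) -> List[str]:
    """Return the argument list for: ebook-convert <source> <target>."""
    return [converter, str(source), str(target)]


def convert_with_calibre(source: Path, target: Path,
                         converter: Optional[str] = None) -> Path:
    """Convert source to the format implied by target's extension.

    Hypothetical helper: 'converter' is the full path to ebook-convert.
    When it is omitted, the tool is looked up on the system PATH.
    """
    exe = converter or shutil.which("ebook-convert")
    if exe is None:
        raise FileNotFoundError("ebook-convert not found; is calibre installed?")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(exe, source, target), check=True)
    return target
```

Calling `convert_with_calibre(Path("book.epub"), Path("book.htmlz"))` would then ask calibre for an HTMLZ file, and any other extension calibre supports could be used the same way.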
12-05-2014, 02:48 PM | #6 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
It was just an idea actually. Just thinking about possible output plugins and I remembered that user_none created that one. I personally have no direct use for it, so don't put in too much effort.
|
12-05-2014, 06:24 PM | #7 |
null operator (he/him)
Posts: 20,579
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
|
12-05-2014, 07:58 PM | #8 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi BetterRed,
Happy to. I only have access to a Mac and I know where the calibre command-line tools get installed on that platform. But can you tell me where they are installed on Windows and/or Linux? Once I find those calibre command-line tools the rest is quite straightforward. Thanks, Kevin |
12-05-2014, 09:44 PM | #9 |
null operator (he/him)
Posts: 20,579
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Hi Kevin
On my Windows they're in:
C:\Program Files\Calibre2 (64bit)
C:\Program Files (x86)\Calibre2 (32bit)
C:\Program Files Portable\Calibre Portable 2.9\Calibre
D:\_Sandpit\Calibre Portable 1.48\Calibre
I believe some calibre users install Windows portable on a cloud drive, e.g. Dropbox, and some will have it on a thumb drive. So... it could get a new location (drive letter) depending on which computer they're using. I suggest the location be configurable and have it persist.
My Mint box is inaccessible (up at the farm), but I know it's not in the 'standard location'. So... again I would suggest the location be configurable and have it persist; although I doubt I'll ever use Sigil on that box.
That said, my thumbs up didn't indicate I was anxious for such a PI, just that your idea seemed more sensible than creating an Output PI for the 'proprietary' HTMLZ format, which is readily and freely available 'next door'. Someone would inevitably ask, "Oi, wot about Output PIs for PDF and RB!"
BR |
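A configurable, persistent location along the lines suggested above might look something like this. It is a sketch under stated assumptions: the settings are kept in a plain JSON file chosen by the caller, the key name `calibre_dir` and all function names are made up for illustration, and the two default folders are simply the non-portable Windows paths quoted in the post, which other systems may not share.

```python
import json
import os
from pathlib import Path
from typing import Iterable, Optional

# Install folders quoted in the post above; real systems may differ.
DEFAULT_CANDIDATES = [
    r"C:\Program Files\Calibre2",
    r"C:\Program Files (x86)\Calibre2",
]


def load_calibre_dir(prefs_file: Path) -> Optional[str]:
    """Return the saved calibre folder, or None if nothing usable is saved."""
    try:
        data = json.loads(prefs_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed settings file.
        return None
    if not isinstance(data, dict):
        return None
    return data.get("calibre_dir")


def save_calibre_dir(prefs_file: Path, calibre_dir: str) -> None:
    """Persist the chosen calibre folder so it survives between runs."""
    prefs_file.write_text(json.dumps({"calibre_dir": calibre_dir}),
                          encoding="utf-8")


def resolve_calibre_dir(prefs_file: Path,
                        candidates: Iterable[str] = DEFAULT_CANDIDATES
                        ) -> Optional[str]:
    """Prefer the saved folder; otherwise probe candidates and remember a hit."""
    saved = load_calibre_dir(prefs_file)
    if saved and os.path.isdir(saved):
        return saved
    for folder in candidates:
        if os.path.isdir(folder):
            save_calibre_dir(prefs_file, folder)
            return folder
    # Nothing found: the caller would have to ask the user for the folder.
    return None
```

Because the saved folder is re-checked with `os.path.isdir` on every run, a portable install that has moved to a different drive letter falls through to the candidate list (or to asking the user) instead of failing later.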
12-05-2014, 09:54 PM | #10 |
Resident Curmudgeon
Posts: 74,027
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
The way to do this would be...
Take all the XML files and make them one file. Link the stylesheet(s) to this XML file. If there are images and/or fonts, they go with it. HTMLZ is just a ZIP container with .htmlz as the extension. Basically it's taking all the split XML and putting them together and packaging up all the other files that go with the eBook. |
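The merge-and-zip recipe described above can be sketched in a few lines. This is a deliberately naive illustration and not calibre's actual HTMLZ writer: `merge_bodies` and `write_htmlz` are hypothetical names, the body extraction uses a regular expression rather than a real parser, and the sketch ignores metadata, links between the original pages, and any conflicts between stylesheets.

```python
import re
import zipfile
from pathlib import Path

# Naive: grabs whatever sits between <body ...> and </body>.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.DOTALL | re.IGNORECASE)


def merge_bodies(pages, stylesheets):
    """Concatenate the <body> contents of each page into one HTML document."""
    links = "\n".join(
        f'<link rel="stylesheet" type="text/css" href="{href}"/>'
        for href in stylesheets
    )
    parts = []
    for page in pages:
        match = BODY_RE.search(page)
        # Pages without a <body> element are skipped in this sketch.
        if match:
            parts.append(match.group(1).strip())
    body = "\n".join(parts)
    return ("<html>\n<head>\n" + links + "\n</head>\n<body>\n"
            + body + "\n</body>\n</html>\n")


def write_htmlz(target, index_html, extra_files):
    """Write index.html plus supporting files into a ZIP named *.htmlz.

    extra_files maps archive names (e.g. "style.css") to str or bytes.
    """
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("index.html", index_html)
        for name, data in extra_files.items():
            archive.writestr(name, data)
    return Path(target)
```

For example, `write_htmlz("book.htmlz", merge_bodies(pages, ["style.css"]), {"style.css": css_text})` would produce a two-file archive, with images and fonts added to `extra_files` in the same way.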
12-05-2014, 10:15 PM | #11 |
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Yep. That's all.
|
12-06-2014, 02:24 AM | #12 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
How about if multiple stylesheets are used with overlapping class names? That would seriously mess up that simple, straightforward approach... How about internal styles? There are still a lot of sgc-x styles out there that may be different in each xhtml file...
|
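The concern about overlapping class names can at least be checked before any merge is attempted. The following is a rough sketch, assuming the CSS text of each stylesheet (or each file's internal style block) has already been collected into a dictionary keyed by its source name. The names `find_conflicts` and `normalise` are hypothetical, and the regular expression only recognises simple `.class { ... }` rules, so it is no substitute for a real CSS parser.

```python
import re
from collections import defaultdict

# Naive matcher for simple rules such as ".name { ... }"; not a CSS parser.
RULE_RE = re.compile(r"\.([A-Za-z_][\w-]*)\s*\{([^}]*)\}")


def normalise(declarations):
    """Reduce a declaration block to a comparable, order-independent form."""
    pairs = []
    for decl in declarations.split(";"):
        if ":" in decl:
            prop, value = decl.split(":", 1)
            pairs.append((prop.strip().lower(), " ".join(value.split())))
    return tuple(sorted(pairs))


def find_conflicts(stylesheets):
    """Map class name -> {source: declarations} where the definitions differ."""
    seen = defaultdict(dict)
    for source, css in stylesheets.items():
        for name, block in RULE_RE.findall(css):
            # A later rule for the same class in one source overwrites
            # the earlier one in this sketch.
            seen[name][source] = normalise(block)
    return {
        name: by_source
        for name, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }
```

Any class reported by `find_conflicts` would need to be renamed in one of the sources (and in the pages that use it) before the pages could safely share a single stylesheet.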
01-06-2015, 07:39 AM | #13 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jul 2012
Device: none
|
I use the htmlz format all the time. Primarily because it does merge all the ebook pages into a single html file. I run tidy on that to clean it up. I know there could be cases where different css files and inline styling could conflict, but I haven't run into that yet. The nice thing about htmlz is that you get one big index.html file that you can then edit instead of all the individual page files. I change the (whatever) html to html5 usually, and eventually convert back to an epub with Sigil. However, if Sigil could create an htmlz file, I would use that, because of one very important issue with Calibre's conversion: it changes all the photo names each time it converts! So, once I get an htmlz file, I stay with it and don't let Calibre touch it again, so the photo names are left alone.
|