Quote:
Originally Posted by kovidgoyal
@BR: Attach one of these docx files that shows a size increase of the epub compared to the docx. I'm guessing the size increase is because of the generated cover, but it would be helpful to have a sample to be sure.
I think you're right - I don't keep covers in epubs. When I did the first conversion with the new DOCX handler I think I forgot to do a Modify to delete the cover.
The difference I initially noticed was all the HTML span tags, and that led me to look at file sizes; I added 2+2 and got 5. The compression reduces all the
<span class="calibre2"></span> and similar pairs to a fraction of their actual size - I should have realised that.
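For anyone who wants to see the effect, here is a minimal sketch. It assumes the epub's ZIP container uses deflate (the usual case) and stands in for it with Python's zlib; the repeated-span input is made up for illustration and is not taken from my actual file.

```python
import zlib


def deflate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size, using zlib's deflate."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical markup: the same empty span pair repeated many times.
repeated_spans = b'<span class="calibre2"></span>' * 1000

ratio = deflate_ratio(repeated_spans)
print(f"{len(repeated_spans)} bytes raw, compressed/raw ratio {ratio:.4f}")
```

Real HTML has text between the tags, so it won't shrink as dramatically as this artificial case, but the repeated tag pairs themselves cost very little once compressed.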
I just did the conversions on that same document again - taking care to remove the cover on BOTH epubs.
The built-in handler produced an epub of 67,506 bytes, a smidgin larger than the DOCX_Input epub (66,458 bytes). And in terms of disk space, they're exactly the same (69,632 bytes).
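The identical size on disk is what you'd expect if the filesystem allocates space in 4,096-byte clusters. That cluster size is my assumption, not something I checked, but 69,632 is exactly 17 x 4,096, so it fits. A rough sketch of the arithmetic:

```python
CLUSTER_SIZE = 4096  # assumed allocation unit; not confirmed for this disk


def size_on_disk(file_size: int, cluster_size: int = CLUSTER_SIZE) -> int:
    """Round a file size up to the next whole multiple of the cluster size."""
    clusters = -(-file_size // cluster_size)  # ceiling division on integers
    return clusters * cluster_size


for label, size in (("new handler epub", 67_506), ("DOCX_Input epub", 66_458)):
    print(f"{label}: {size} bytes -> {size_on_disk(size)} bytes on disk")
```

Both files need 17 clusters, so the 1,048-byte difference disappears once they're stored.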
Attachment shows the difference in HTML file sizes (I moved the HTML files into folders) - lots of markup I don't really need.
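The same comparison can be made without unpacking anything, since an epub is a ZIP archive. This is only a sketch using Python's standard zipfile module: the `html_sizes` helper and the in-memory sample archive are hypothetical, and you'd pass the path to a real epub instead.

```python
import io
import zipfile

HTML_SUFFIXES = (".html", ".xhtml", ".htm")


def html_sizes(epub):
    """Return (uncompressed, compressed) byte totals for the HTML files.

    `epub` may be a filesystem path or a file-like object.
    """
    raw = packed = 0
    with zipfile.ZipFile(epub) as archive:
        for info in archive.infolist():
            if info.filename.lower().endswith(HTML_SUFFIXES):
                raw += info.file_size
                packed += info.compress_size
    return raw, packed


# Hypothetical in-memory stand-in for a real epub, just to exercise the function.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("index.html", '<span class="calibre2"></span>' * 500)
    archive.writestr("stylesheet.css", ".calibre2 { }")
sample.seek(0)

raw, packed = html_sizes(sample)
print(f"HTML: {raw} bytes uncompressed, {packed} bytes inside the archive")
```

Running it on both epubs would show the uncompressed HTML differing a lot while the stored sizes stay close.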
If I had a choice between reducing the markup from DOCX conversions and having an option in EPUB Output to NOT include the cover, then I would choose the latter.
I run Modify after every conversion, I only use Tweak Book occasionally, and if it's 'too hard' I can use Sigil, and if that's too hard I can go back to Word.
BR