Old 06-12-2013, 03:19 AM   #10
BetterRed
null operator (he/him)
Posts: 21,779
Karma: 30237628
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by kovidgoyal View Post
@BR: Attach one of these docx files that show a size increase of the epub compared to the docx. I'm guessing the size increase is because of the generated cover, but it would be helpful to have a sample to be sure.
I think you're right - I don't keep covers in epubs. When I did the first conversion with the new DOCX handler I think I forgot to do a Modify to delete the cover.

The difference I initially noticed was all the HTML span tags, and that led me to look at file sizes; I added 2+2 and got 5. The compression reduces all the <span class="calibre2"></span> and similar pairs to a fraction of their actual size - I should have realised that.
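
A rough sketch of why the extra spans cost so little once zipped. It assumes the epub's HTML is stored with ordinary deflate compression (approximated here with Python's zlib), and the sample text is made up purely for illustration:

```python
import zlib

def raw_and_deflated(text):
    """Return (uncompressed size, deflate-compressed size) in bytes."""
    data = text.encode("utf-8")
    return len(data), len(zlib.compress(data, 9))

# Hypothetical sample: the same 500 words, with and without span wrappers.
words = [f"word{i}" for i in range(500)]
plain = " ".join(words)
marked_up = " ".join(f'<span class="calibre2">{w}</span>' for w in words)

plain_raw, plain_packed = raw_and_deflated(plain)
marked_raw, marked_packed = raw_and_deflated(marked_up)

# The markup inflates the raw HTML far more than it inflates the compressed data,
# because every repeated tag pair becomes a short back-reference.
print(f"raw:      {plain_raw} vs {marked_raw} bytes")
print(f"deflated: {plain_packed} vs {marked_packed} bytes")
```

Real book text is less repetitive than this sample, so the exact numbers will differ, but the repeated tag pairs compress the same way.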

I just did the conversions on that same document again - taking care to remove the cover on BOTH epubs.

The built-in handler produced an epub of 67,506 bytes, a smidgin larger than the DOCX_Input epub (66,458 bytes). And in terms of disk space, they're exactly the same (69,632 bytes).
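
The identical disk-space figure is what you'd expect if the file system allocates space in whole clusters. A small sketch of the arithmetic, assuming a 4,096-byte cluster (an assumption, though 69,632 = 17 x 4,096 is consistent with it):

```python
def size_on_disk(size_bytes, cluster_bytes=4096):
    """Round a file size up to a whole number of clusters (cluster size is assumed)."""
    clusters = -(-size_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes

for label, size in (("built-in handler", 67506), ("DOCX_Input", 66458)):
    print(label, size, "->", size_on_disk(size))
```

Both epubs need 17 clusters, so the 1,048-byte difference in file size disappears in the allocation.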

Attachment shows the difference in HTML file sizes (I moved the HTML files into folders) - lots of markup I don't really need.
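
The same comparison can be made without unpacking anything, since an epub is a zip archive. A minimal sketch using Python's zipfile module; the helper names and the in-memory demo archive are made up for illustration, and a real epub path could be passed to html_sizes instead:

```python
import io
import zipfile

def html_sizes(epub):
    """Return (name, unpacked_bytes, packed_bytes) for each HTML entry in the archive."""
    rows = []
    with zipfile.ZipFile(epub) as zf:
        for info in zf.infolist():
            if info.filename.lower().endswith((".html", ".xhtml", ".htm")):
                rows.append((info.filename, info.file_size, info.compress_size))
    return rows

def build_demo_epub():
    """Build a tiny stand-in archive in memory (not a valid epub, just a zip)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("index.html", '<span class="calibre2">text</span>' * 500)
        zf.writestr("stylesheet.css", ".calibre2 { }")
    buf.seek(0)
    return buf

for name, unpacked, packed in html_sizes(build_demo_epub()):
    print(f"{name}: {unpacked} bytes unpacked, {packed} bytes in the archive")
```

Running it over both epubs would show the unpacked HTML differing a lot while the packed sizes stay close.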

If I had a choice between reducing the markup from DOCX conversions, and having an option in EPUB Output to NOT include the cover then I would choose the latter.

I run Modify after every conversion and only use Tweak Book occasionally; if it's 'too hard' I can use Sigil, and if that's too hard I can go back to Word.

BR
Attached Thumbnails
Name:	Capture.JPG
Views:	448
Size:	54.2 KB
ID:	106905  

Last edited by BetterRed; 06-12-2013 at 03:22 AM. Reason: forgot attachment