Old 07-17-2014, 06:42 AM   #1
Joques
Addict
Joques is my name, but call me Ishmael.
Posts: 294
Karma: 107414
Join Date: May 2013
Device: Kobo Glo
Bloated epub file sizes?

I am curious, and a quick search on the forum didn't yield an answer:

Why is there so little correlation between the length of a book and its file size? I am not talking about books containing lots of pictures here, just formatted text in a regular novel. One book of average length might weigh in at 400 kB, while another, of similar length and also having no images in there, might be 2.5 MB.

What is it that bloats these file sizes so?
Old 07-17-2014, 06:47 AM   #2
shalym
Wizard
shalym ought to be getting tired of karma fortunes by now.
Posts: 3,058
Karma: 54671821
Join Date: Feb 2012
Location: New England
Device: PW 1, 2, 3, Voyage, Oasis 2 & 3, Fires, Aura HD, iPad
Quote:
Originally Posted by Joques View Post
I am curious, and a quick search on the forum didn't yield an answer:

Why is there so little correlation between the length of a book and its file size? I am not talking about books containing lots of pictures here, just formatted text in a regular novel. One book of average length might weigh in at 400 kB, while another, of similar length and also having no images in there, might be 2.5 MB.

What is it that bloats these file sizes so?
I'm not sure, because I don't edit epubs, but I would say probably bad formatting? I would think that if the styles are declared for every page instead of just once for the whole book it would cause bloating. There could also be embedded fonts in the book.

I'm sure someone who DOES edit epubs will come along soon and either confirm what I said or ridicule me for not knowing what I'm talking about

Shari
Old 07-17-2014, 06:56 AM   #3
HarryT
eBook Enthusiast
HarryT ought to be getting tired of karma fortunes by now.
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Please post in the correct forum. Moved to the ePub file format forum.
Old 07-17-2014, 07:01 AM   #4
HarryT
eBook Enthusiast
HarryT ought to be getting tired of karma fortunes by now.
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
You need to edit the book in an ePub editor such as Sigil or Calibre and see what's causing the bloat. It's probably caused by an over-proliferation of CSS styles.
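For a quick look at where the bytes go before opening an editor, something like the following might help. It is only a sketch, assuming Python 3 and relying on the fact that an EPUB is an ordinary ZIP archive. The function names and the file name book.epub are made up for illustration and have nothing to do with Sigil or Calibre.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def size_by_extension(epub):
    """Return {extension: total compressed bytes} for the files in an EPUB."""
    totals = defaultdict(int)
    with zipfile.ZipFile(epub) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Files without an extension (such as `mimetype`) are grouped together.
            ext = PurePosixPath(info.filename).suffix.lower() or "(none)"
            totals[ext] += info.compress_size
    return dict(totals)


def largest_entries(epub, count=10):
    """Return up to `count` (compressed bytes, name) pairs, biggest first."""
    with zipfile.ZipFile(epub) as zf:
        entries = [(info.compress_size, info.filename)
                   for info in zf.infolist() if not info.is_dir()]
    return sorted(entries, reverse=True)[:count]


# Hypothetical usage:
#   print(size_by_extension("book.epub"))
#   print(largest_entries("book.epub", 5))
```

If most of the total sits under .ttf or .otf, fonts are the likely culprit; if it sits under .jpg or .png, look at the cover and other images; if it sits under .xhtml or .css, the markup itself is probably bloated.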
Old 07-17-2014, 07:45 AM   #5
Toxaris
Wizard
Toxaris ought to be getting tired of karma fortunes by now.
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
If there is a cover, then I would look at the cover size for sure. An average text-only book cannot be 2.5MB. Using a different compression level will have some influence.
Bad formatting will have an impact for sure, but for an average book to reach 2.5MB it must be seriously badly formatted...
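To try out the compression-level point, here is a minimal sketch that rewrites an EPUB with every entry deflated at the highest level. It assumes Python 3.7 or later (for the compresslevel argument), and repack_epub is a made-up name, not a function from any EPUB tool. The mimetype entry is written first and left uncompressed, which is what the EPUB container format expects.

```python
import zipfile


def repack_epub(src, dst, level=9):
    """Copy the EPUB at `src` to `dst`, deflating every entry at `level`.

    `src` and `dst` must be different files (or file-like objects).
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        names = zin.namelist()
        if "mimetype" in names:
            # Must come first and stay uncompressed.
            zout.writestr("mimetype", zin.read("mimetype"),
                          compress_type=zipfile.ZIP_STORED)
        for name in names:
            if name == "mimetype" or name.endswith("/"):
                continue  # already written, or a directory entry
            zout.writestr(name, zin.read(name),
                          compress_type=zipfile.ZIP_DEFLATED,
                          compresslevel=level)
```

Expect the gain to come mostly from text, CSS and uncompressed font files. JPEG and PNG images are already compressed, so they will barely shrink this way.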
Old 07-17-2014, 08:15 AM   #6
Doonge
Connoisseur
Doonge ought to be getting tired of karma fortunes by now.
Posts: 80
Karma: 1184732
Join Date: Nov 2013
Device: Kobo Glo
Embedded fonts bloat books.
Old 07-17-2014, 08:18 AM   #7
doubleshuffle
Unicycle Daredevil
doubleshuffle ought to be getting tired of karma fortunes by now.
Posts: 13,944
Karma: 185432100
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
Yep. Especially CharisSIL, which is embedded in many retail epubs.
Old 07-17-2014, 08:21 AM   #8
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 79,756
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by doubleshuffle View Post
Yep. Especially CharisSIL, which is embedded in many retail epubs.
But using Calibre to subset embedded fonts really does cut down the file size. Also, a cover that is very large or only lightly compressed can account for much of the file size, so reducing the cover's dimensions and recompressing it can help.
Old 07-17-2014, 08:29 AM   #9
doubleshuffle
Unicycle Daredevil
doubleshuffle ought to be getting tired of karma fortunes by now.
Posts: 13,944
Karma: 185432100
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
Quote:
Originally Posted by JSWolf View Post
But using Calibre to subset embedded fonts really does cut down the file size.
True. Though I usually throw it out completely. I've got CharisSIL sideloaded on my reader anyway, no need to have it in lots of different books over and over again.
Old 07-17-2014, 08:37 AM   #10
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
  • Overly large/badly compressed covers/images.
  • A "phantom cover"
    • I have come across a few purchased books where there is a useless second cover inside of the book, which isn't referenced or anything, just takes up space.
  • When the EPUB was packaged, they didn't compress it, or didn't compress to the maximum level (this one just baffles me).
    • I have come across a few purchased books which had zero compression on them.
  • Bad code/formatting could bloat the EPUB (oh the horrors I have seen).
  • Fonts that are not subsetted
  • Fonts that aren't used/referenced
    • For example, including a bold font when no bold is used, or a smallcaps font when no smallcaps are used.
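Two of the items in the list above can be checked mechanically. The sketch below, assuming Python 3 and made-up function names, reports entries packaged with zero compression and files that sit in the archive but are not listed in the OPF manifest. It only catches leftovers missing from the manifest; a second cover or spare font that is listed in the manifest but never used by any page would need a scan of the XHTML and CSS as well.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
OPF_NS = "{http://www.idpf.org/2007/opf}"


def stored_entries(epub):
    """Names of entries saved with no compression (`mimetype` is exempt)."""
    with zipfile.ZipFile(epub) as zf:
        return [info.filename for info in zf.infolist()
                if not info.is_dir()
                and info.filename != "mimetype"
                and info.compress_type == zipfile.ZIP_STORED]


def unmanifested_files(epub):
    """Names of files in the archive that the OPF manifest does not list."""
    with zipfile.ZipFile(epub) as zf:
        # container.xml points at the OPF package file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(f".//{CONTAINER_NS}rootfile").get("full-path")
        opf_dir = posixpath.dirname(opf_path)
        package = ET.fromstring(zf.read(opf_path))
        # Manifest hrefs are relative to the folder holding the OPF.
        listed = {
            posixpath.normpath(posixpath.join(opf_dir, unquote(item.get("href"))))
            for item in package.iter(f"{OPF_NS}item")
        }
        return sorted(
            name for name in zf.namelist()
            if not name.endswith("/")
            and name != "mimetype"
            and name != opf_path
            and not name.startswith("META-INF/")
            and name not in listed
        )
```

Anything returned by unmanifested_files is a candidate for deletion, but it is worth checking by hand before removing it.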
Old 07-17-2014, 09:16 AM   #11
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 79,756
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by Tex2002ans View Post
  • Overly large/badly compressed covers/images.
  • A "phantom cover"
    • I have come across a few purchased books where there is a useless second cover inside of the book, which isn't referenced or anything, just takes up space.
  • When the EPUB was packaged, they didn't compress it, or didn't compress to the maximum level (this one just baffles me).
    • I have come across a few purchased books which had zero compression on them.
  • Bad code/formatting could bloat the EPUB (oh the horrors I have seen).
  • Fonts that are not subsetted
  • Fonts that aren't used/referenced
    • For example, including a bold font when no bold is used, or a smallcaps font when no smallcaps are used.
Modify ePub can get rid of unused images, and subsetting gets rid of unused fonts and reduces the size of the fonts that are used.

Thumbnails are a common waste of space and Modify ePub deletes those no problem.
Old 07-17-2014, 11:28 AM   #12
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
Posts: 31,057
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Tex2002ans View Post
  • Overly large/badly compressed covers/images.
  • A "phantom cover"
    • I have come across a few purchased books where there is a useless second cover inside of the book, which isn't referenced or anything, just takes up space.
  • When the EPUB was packaged, they didn't compress it, or didn't compress to the maximum level (this one just baffles me).
    • I have come across a few purchased books which had zero compression on them.
  • Bad code/formatting could bloat the EPUB (oh the horrors I have seen).
  • Fonts that are not subsetted
  • Fonts that aren't used/referenced
    • For example, including a bold font when no bold is used, or a smallcaps font when no smallcaps are used.
Your list is fairly complete, I see you have been working on this issue a few times

I will add: Retailer demanded Bloat. Thumbnail covers, Hi-Def covers.

The best gain on sub-setting a font would be for those 'Display' fonts that might be only used for Chapter titles, Initial Letters.

Is there an EASY way to determine later whether a book's font file has been subsetted? (I was thinking of a Quality Check plugin type of test.)
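One rough heuristic for that question, offered only as a sketch: count the glyphs in each embedded font. The code below assumes Python 3 and plain TrueType/OpenType files, reads the glyph count from the font's maxp table, and uses made-up function names. What count separates "subsetted" from "complete" is a judgement call; a full text face with broad language coverage usually has glyphs in the thousands, while a subset for a single novel is likely to have far fewer. Obfuscated fonts will simply come back as None.

```python
import struct
import zipfile

SFNT_MAGIC = (b"\x00\x01\x00\x00", b"OTTO", b"true")


def glyph_count(font_bytes):
    """Return numGlyphs from the `maxp` table, or None if it cannot be read."""
    if len(font_bytes) < 12 or font_bytes[:4] not in SFNT_MAGIC:
        return None
    try:
        (num_tables,) = struct.unpack_from(">H", font_bytes, 4)
        for index in range(num_tables):
            # Each 16-byte table record: tag, checksum, offset, length.
            tag, _checksum, offset, _length = struct.unpack_from(
                ">4sIII", font_bytes, 12 + 16 * index)
            if tag == b"maxp":
                # maxp starts with a 4-byte version, then the glyph count.
                (count,) = struct.unpack_from(">H", font_bytes, offset + 4)
                return count
    except struct.error:
        return None  # truncated or damaged font data
    return None


def font_glyph_counts(epub):
    """Map each .ttf/.otf entry in an EPUB to its glyph count (or None)."""
    with zipfile.ZipFile(epub) as zf:
        return {name: glyph_count(zf.read(name))
                for name in zf.namelist()
                if name.lower().endswith((".ttf", ".otf"))}
```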
Old 07-17-2014, 02:21 PM   #13
dwig
Wizard
dwig ought to be getting tired of karma fortunes by now.
Posts: 1,613
Karma: 6718541
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
Quote:
Originally Posted by theducks View Post
Your list is fairly complete, I see you have been working on this issue a few times

I will add: Retailer demanded Bloat. Thumbnail covers, Hi-Def covers.

The best gain on sub-setting a font would be for those 'Display' fonts that might be only used for Chapter titles, Initial Letters.
...
... or any OpenType font containing multiple character sets, especially those containing unused complex sets (e.g. the various Japanese kana, Chinese, ...) which can be huge when complete. With these, subsetting a font used for the main body text can result in a massive size reduction.
Old 07-17-2014, 02:49 PM   #14
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 79,756
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
CharisSIL is a large font because of how many extended characters it contains. Most of them are unused. So when you subset, you get rid of the fonts not used and the characters not used. Most of the time, you will get rid of the bold italic version of a font. You can cut down the size of CharisSIL (on average) to between 200-300K versus about 1.2-1.3MB per font file.
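The arithmetic behind those figures can be sketched as below. The function name is made up, 1 MB is taken as 1000 kB for simplicity, and the sizes are those of the font files themselves; since the EPUB is compressed, the saving in the final file will differ somewhat.

```python
def subsetting_savings_kb(files, full_kb=(1200, 1300), subset_kb=(200, 300)):
    """Return the (least, most) kB saved by subsetting `files` font files.

    `full_kb` and `subset_kb` are (low, high) size ranges per font file,
    defaulting to the CharisSIL figures quoted above.
    """
    least = files * (full_kb[0] - subset_kb[1])
    most = files * (full_kb[1] - subset_kb[0])
    return least, most


# One font file: between 900 and 1100 kB saved.
print(subsetting_savings_kb(1))
# Three font files kept after dropping an unused bold italic (an assumed setup).
print(subsetting_savings_kb(3))
```

On those numbers, a book that embeds even two or three full CharisSIL files could plausibly account for the 2.5 MB novel from the first post.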

As for the large graphics, this is because people read tablets with high resolution screens and tiny 800x600 sized graphics just don't cut it all that well.
Old 07-18-2014, 03:07 AM   #15
Joques
Addict
Joques is my name, but call me Ishmael.
Posts: 294
Karma: 107414
Join Date: May 2013
Device: Kobo Glo
Thanks everybody for the responses! I am sorry I posted in the wrong forum - but it was because though I personally use epubs, I assumed this was a universal problem that plagued all formats.

I _do_ try to find the highest-res cover art that I can for all my books, but when converted to B/W they don't take up all that much space. I'll experiment a little with removing embedded fonts with Calibre, but cleaning up bloated CSS styles for 900 books is not going to happen :-)
All times are GMT -4.