Reducing the size of epub files?


Corpsegoddess
01-02-2013, 08:06 AM
Hey all...

I tried doing a search but couldn't find anything.

Is there a way to reduce the size of larger epub files? I have a few that seem to be very large, even though they don't contain illustrations or anything like that. They're generally the kepubs I buy from Kobo--I've noticed those tend to be on the larger side.

Not sure if this is a silly question or not, but I figured it couldn't hurt to ask. :p

mrmikel
01-02-2013, 08:38 AM
Are you sure they contain no images? Open them up with a zip program or Sigil and see if any images are shown. It may be that some pages have been saved as images instead of converted to text.
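If you would rather not click through a zip program, the same check can be sketched in a few lines of Python, since an epub is an ordinary zip archive. This is only a rough sketch: the function name `list_images` and the list of image extensions are assumptions made for illustration, not part of Sigil or any other tool mentioned here.

```python
import zipfile

# Assumed set of image extensions; extend it if your books use other formats.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".svg")


def list_images(epub_path):
    """Return (name, compressed_bytes) for every image in the epub, largest first."""
    with zipfile.ZipFile(epub_path) as book:
        images = [
            (info.filename, info.compress_size)
            for info in book.infolist()
            if info.filename.lower().endswith(IMAGE_EXTENSIONS)
        ]
    # Biggest files first, so whole-page scans stand out immediately.
    return sorted(images, key=lambda item: item[1], reverse=True)
```

Calling `list_images("book.epub")` (with a hypothetical file name) on a DRM-free epub gives an empty list if the book really has no images at all.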

You might also check the stylesheet and see how many items are there. If it is converted poorly from Word, it could be bloated with huge numbers of extra styles which take up space in the stylesheet and even more in the text.
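To get a quick feel for how bloated a stylesheet is, something like the following could count the rule blocks in each CSS file. It is a crude sketch under my own assumptions: `count_css_rules` is a made-up name, and counting opening braces is only an approximation (it also counts `@font-face` and `@media` blocks).

```python
import re
import zipfile


def count_css_rules(epub_path):
    """Return {stylesheet name: rough number of rule blocks} for an epub."""
    counts = {}
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if name.lower().endswith(".css"):
                css = book.read(name).decode("utf-8", errors="replace")
                # Drop /* ... */ comments so braces inside them are not counted.
                css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
                # Each "{" opens one block; a rough stand-in for "one style".
                counts[name] = css.count("{")
    return counts
```

A hand-made book might show a few dozen blocks, so a count in the hundreds or thousands would point to a poor conversion.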

Corpsegoddess
01-02-2013, 08:46 AM
Hmm.

I'm very sure that the ones I have in mind don't contain illustrations, so I'll have to figure out working with the stylesheet, which I haven't played with yet. And I'm new to Sigil, so that will take a bit of mucking about as well.

Thanks for the response!

DSpider
01-02-2013, 08:47 AM
There's also the issue with embedded fonts. Acrobat can subset glyphs from a font for PDF files, whereas ePub files will contain the entire font. This means that if you're only using one character from a 600 KB font file, the entire font file will be embedded, where it would only take 6-10 KB in a subset. And if you use 10 different 600 KB font families... You get the idea.

But in general, images usually take up the most space.
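To put rough numbers on the font example, here is the arithmetic as a tiny script. It assumes, for simplicity, that each of the 10 font families is a single 600 KB file and that each subset lands in the 6-10 KB range mentioned above.

```python
FULL_FONT_KB = 600        # size of one complete font file
SUBSET_KB_RANGE = (6, 10)  # low and high estimate for one subset font
FONT_COUNT = 10

full_total_kb = FULL_FONT_KB * FONT_COUNT
subset_total_kb = tuple(size * FONT_COUNT for size in SUBSET_KB_RANGE)

print(f"Full fonts:   {full_total_kb} KB")
print(f"Subset fonts: {subset_total_kb[0]}-{subset_total_kb[1]} KB")
```

That works out to 6000 KB of full fonts against 60-100 KB of subsets, before zip compression is taken into account.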

mrmikel
01-02-2013, 11:34 AM
The images may not be illustrations, but whole pages. Or there may be hidden spacers.

Just start Sigil, open the file, and look on the left side in the Book Browser. There will be a folder for Styles. If there is a plus on it, click it and any stylesheets will appear. Double-click on any one of them and it will open in the main window.
Then you can look at it to see what is there.

Otherwise, click on any + next to the Images folder and see what pops up.

Ditto with the fonts folder.

You don't have to know what to do about anything you find, but at least it will put you on the track of what might be making the file large.
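The same Styles / Images / Fonts walk-through can be approximated outside Sigil by adding up the compressed size of each kind of file. This is a sketch only: `size_by_kind` and the extension-to-folder mapping in `KINDS` are assumptions chosen to mirror the folder names above, and real books may use other extensions.

```python
import os
import zipfile
from collections import Counter

# Assumed mapping from file extension to the folder names used above.
KINDS = {
    ".css": "Styles",
    ".jpg": "Images", ".jpeg": "Images", ".png": "Images",
    ".gif": "Images", ".svg": "Images",
    ".ttf": "Fonts", ".otf": "Fonts", ".woff": "Fonts",
    ".xhtml": "Text", ".html": "Text", ".htm": "Text",
}


def size_by_kind(epub_path):
    """Return {kind: total compressed bytes} for the files inside an epub."""
    totals = Counter()
    with zipfile.ZipFile(epub_path) as book:
        for info in book.infolist():
            ext = os.path.splitext(info.filename)[1].lower()
            # Anything unrecognised (mimetype, OPF, NCX, ...) goes under "Other".
            totals[KINDS.get(ext, "Other")] += info.compress_size
    return dict(totals)
```

Whichever kind has the largest total is the first place to look.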

DiapDealer
01-02-2013, 12:03 PM
Corpsegoddess wrote: "They're generally the kepubs I buy from Kobo--I've noticed those tend to be on the larger side."
Kepub is not epub, no matter how similar the names sound. Kepubs contain database-like elements for all kinds of extra "features." If file size is that much of a concern, then re-download the books from Kobo in the standard epub format instead of their proprietary kepub format. I think you'll find the size is more like what you would expect, based on content.

JSWolf
01-08-2013, 06:39 PM
DSpider wrote: "There's also the issue with embedded fonts. Acrobat can subset glyphs from a font for PDF files, whereas ePub files will contain the entire font. This means that if you're only using one character from a 600 KB font file, the entire font file will be embedded, where it would only take 6-10 KB in a subset. And if you use 10 different 600 KB font families... You get the idea.

But in general, images usually take up the most amount of space."

Calibre allows font subsetting. It works very well and can reduce the size of the embedded fonts significantly. Also, if you have fonts embedded that do not get used, they get removed for an even greater size reduction.

What I do is take the finished ePub, load it into Calibre, and convert it to ePub with subsetting turned on. Then I take the font directory out of the converted ePub, use it to replace the font directory in the finished ePub, and fix the CSS/OPF as needed to match the fonts. Now I have a much smaller ePub with subsetted fonts.
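The conversion step can also be run from a script instead of the Calibre GUI. The sketch below assumes a Calibre version whose `ebook-convert` command-line tool accepts the `--subset-embedded-fonts` option and that the tool is on your PATH; the function names and file names are made up for illustration.

```python
import shutil
import subprocess


def subset_command(source_epub, output_epub):
    """Build the ebook-convert command line for an ePub-to-ePub conversion with subsetting."""
    return ["ebook-convert", source_epub, output_epub, "--subset-embedded-fonts"]


def convert_with_subsetting(source_epub, output_epub):
    """Run the conversion; raises if Calibre's command-line tool is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    # check=True makes a failed conversion raise CalledProcessError.
    subprocess.run(subset_command(source_epub, output_epub), check=True)
```

This covers only the conversion. Swapping the font directory into the finished ePub and fixing the CSS/OPF are still manual steps.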

Toxaris
01-09-2013, 05:39 AM
Calibre does? That would be nice. Perhaps I should look up the part in the source that does the subsetting and see if I can use it outside Calibre.

JSWolf
01-09-2013, 09:14 PM
Toxaris wrote: "Calibre does? That would be nice. Perhaps I should look up the part in the source that does the subsetting and see if I can use it outside Calibre."

It does and it works in both ADE and KF8 (at least on eInk Kindles).
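For anyone experimenting with subsetting outside Calibre, the first thing any subsetter needs is the set of characters the book actually uses. The sketch below is not Calibre's code, just an illustration under simple assumptions: it treats every .xhtml/.html/.htm file as UTF-8 text, ignores which font applies to which passage, and uses made-up names (`characters_used`, `_TextCollector`).

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects every character that appears in text nodes."""

    def __init__(self):
        super().__init__()
        self.chars = set()

    def handle_data(self, data):
        # Entities such as &amp; arrive here already decoded.
        self.chars.update(data)


def characters_used(epub_path):
    """Return the set of characters found in the text files of an epub."""
    used = set()
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                collector = _TextCollector()
                collector.feed(book.read(name).decode("utf-8", errors="replace"))
                collector.close()  # flush any buffered text
                used |= collector.chars
    # Note: text inside <title> or <style> is included too, so this is a superset.
    return used
```

A real subsetter would work per font, keeping only the glyphs for the characters styled with that font. The whole-book set here is a safe upper bound.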