Old 11-17-2022, 09:35 PM   #1
tatagi
Connoisseur
Posts: 52
Karma: 10
Join Date: Oct 2022
Device: none
Are PDFs smaller in size than EPUBs in general?

For example, my EPUB file with 5 embedded fonts is as large as 20 MB, while the PDF of the same book, with an almost identical appearance and the fonts included, is just 2 MB.

The book is a plain mystery novel and contains almost no images (cover and subcover only).

Is this normal?
Old 11-17-2022, 09:50 PM   #2
Turtle91
A Hairy Wizard
Posts: 3,353
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
It should not be considered 'normal' in a properly created ePub... but it all depends on the settings of your PDF and what you have included in your ePub.

- Have you subset the fonts so they include only the glyphs needed?
- Have you cleaned all the coding 'bloat' from your HTML pages?
- Have you checked the file size of your images: are they in the proper format and dimensions, with unnecessary EXIF data removed?

I would think the last bullet is your primary culprit... IIRC, PDF export will automatically compress images?

I am currently working on an ePub with over 200 chapters (over 17 million characters), plus basic front matter and back matter, and it is only 2.5 MB... which I personally consider too large.

Last edited by Turtle91; 11-17-2022 at 09:53 PM.
Old 11-18-2022, 09:14 AM   #3
Quoth
Still reading
Posts: 14,045
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
What Turtle writes ^^^^^

Something is wrong with the epub. Perhaps it has Asian fonts or fonts with Asian glyphs not subsetted.

Or maybe the cover & subcover are huge and got resized in the conversion.

Simplified
1) An epub is a zip file with an index file, a file that's a list of files used, 1 or more HTML files, optional CSS files, optional font files, optional image files. You can rename it to .zip and extract to see the sizes. Or use the Calibre Editor. Regular epub has no concept of pages. The app or device's renderer creates pages on the fly a little like setting a paper size and print preview in a Web Browser. No built in headers or footers. Each HTML file is purely sequential and a control file has the order for the display of the files.

2) A PDF is a kind of envelope. It can have pages and layers. Content can be any mix of raster images, vector images, Postscript, text, fonts. Content isn't sequential but has information on which page and layer. Pages have a physical size. Layers can have transparency. A particular resource might not be displayed sequentially but the items (even down to a single letter) can have placement instructions.

So if all else is equal (and it almost never is), there is little difference between a PDF with text content rather than image based pages and an epub.
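
Since an epub is an ordinary zip container, the "extract to see the sizes" step in point 1 can also be done without extracting anything. The following is only a minimal sketch using Python's standard `zipfile` module; the function name `largest_entries` and the file name `book.epub` are made up for illustration.

```python
import zipfile

def largest_entries(epub_path, top=10):
    """Return (name, uncompressed_bytes, compressed_bytes) for the biggest files in an epub."""
    with zipfile.ZipFile(epub_path) as zf:
        rows = [
            (info.filename, info.file_size, info.compress_size)
            for info in zf.infolist()
            if not info.is_dir()
        ]
    # Sort by uncompressed size, largest first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]

# Example usage (hypothetical file name):
# for name, raw, packed in largest_entries("book.epub"):
#     print(f"{raw / 1e6:8.2f} MB raw  {packed / 1e6:8.2f} MB zipped  {name}")
```

If fonts or images dominate the top of that list, they are the place to look first.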

Some of my files for one book
  1. odt for docx & epub 2.1 M (fonts not embedded). Primary version. Only this is edited for content.
  2. docx for epub 2.1 M (fonts not embedded)
  3. epub 3.4M (has embedded fonts)
  4. mobi 3.0 M
  5. azw3 4.2 M (has embedded fonts)
  6. jpeg cover 0.9 M (the upload cover for ebook publishing is huge, 7.5 M and for paperback wrap cover is 16M, source is 26 M).
  7. odt for export as PDF 4.1 M (fonts not embedded). Only format or styles are edited.
  8. pdf for paperback is 7.3 M (no cover, includes fonts).

So not much difference. There are 10 images in epub (some are small like QR code & logo). About 160,450 words.
The odt for PDF export has headers, footers and is about 590 pages excluding front matter & contents (about 10 pages).


If you have a 20 MB epub and a 2 MB PDF, then one or the other is broken.

Also what is the PDF for? An epub is superior for all screens (or mobi for Kindle before 3.4 otherwise azw3 for Kindle). Use Lithium (or Pocketbook if text to speech needed) on Android, some equivalent on iOS (not Apple app "Books"), or some other Android app for Android 3 or 4!

The only use for a PDF version of an ebook is paper printing or paper publishing. Or maybe for a reMarkable or Sony Digital Paper as those are PDF only eink tablets for annotating PDFs with scribbles.
Old 11-18-2022, 11:01 AM   #4
phillipgessert
Addict
Posts: 316
Karma: 3200000
Join Date: Oct 2015
Location: Madison, WI
Device: Kindle 5th Gen
I assume your PDF lacks the cover; and the subcover, which I assume is the title page, is possibly not an image at all in that one. Could just be live text, vector paths, things like that. Additionally, 5 embedded fonts for a plain mystery… this is kind of a weird leap, but was this epub made for you by someone using InDesign? Asking because that setup seems a little unusual for pretty much any workflow other than that one. If so, there are a bunch of other variables that could be in play. I believe InDesign converts certain types of floating objects into images on epub export, for example. And I know you said there are only two images, but if there’s something like a fleuron, it might be in there over and over again as a bunch of duplicate images in the ebook.

20 MB is still pretty big even with all that stuff in mind though. I’m betting that cover and title page are bigger than they have to be.

Edit: another weird one, but it’s not fixed-layout, is it? I could see that being bigger than reflowable, though I honestly have no clue how their file sizes compare to PDF.

Last edited by phillipgessert; 11-18-2022 at 11:03 AM.
Old 11-18-2022, 01:00 PM   #5
Quoth
InDesign was originally for paper and may be a good solution for magazines or newspapers. It is a total pain for epubs, whereas direct output from Word or LO Writer to PDF can be perfect for novels, theses, etc.

Quote:
but if there’s something like a fleuron, it might be in there over and over again as a bunch of duplicate images in the ebook.
Ought to exist once unless the ebook creation was stupid.
Old 11-18-2022, 08:02 PM   #6
tatagi
Quote:
Originally Posted by Turtle91, Quoth, phillipgessert

Thank you all for pointing out things I might have missed.

The reason my epub is so much bigger than the PDF turned out to be obvious. I unzipped the epub and checked all the files inside. The 5 fonts alone took up 25 MB (inside the fonts folder; that is bigger than the total epub size because it's uncompressed), the two images took about 1 MB (in the images folder), and the ncx, opf, and css files didn't exceed a few kilobytes. Other than that, there are the xhtml files that make up the text, which didn't take more than 1.5 MB.
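
The gap between the 25 MB fonts folder and the 20 MB epub comes from zip compression, and it can be seen per file type without unzipping. This is only a rough sketch with Python's standard library; the function name `size_by_extension` and the file name `book.epub` are hypothetical.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

def size_by_extension(epub_path):
    """Map file extension -> [uncompressed_bytes, compressed_bytes] inside the epub."""
    totals = defaultdict(lambda: [0, 0])
    with zipfile.ZipFile(epub_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Files without an extension (e.g. "mimetype") are grouped under "(none)".
            ext = PurePosixPath(info.filename).suffix.lower() or "(none)"
            totals[ext][0] += info.file_size      # size after extraction
            totals[ext][1] += info.compress_size  # size as stored in the zip
    return dict(totals)

# Example usage (hypothetical file name):
# for ext, (raw, packed) in sorted(size_by_extension("book.epub").items()):
#     print(f"{ext:8} {raw / 1e6:8.2f} MB raw  {packed / 1e6:8.2f} MB zipped")
```

The "zipped" column is what adds up to the epub's size on disk; the "raw" column is what you see after extracting.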

I would like to upload my file so you could inspect it in more depth, but it's copyrighted content, so I don't think I can. But I think you guys know what went wrong.

It was obviously the fonts that made the file size ridiculously big.
Since the book is written in Korean, it's no wonder they are big (11,172 glyphs in total), and a lot of those glyphs are in use, so subsetting didn't help much.
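
The 11,172 figure matches the size of the Unicode Hangul Syllables block (U+AC00 to U+D7A3). How many of those a particular book actually uses, which is roughly what a subset font has to keep, can be counted directly. This is a minimal sketch under some assumptions: the text files are UTF-8 encoded (X)HTML, only precomposed Hangul syllables are counted (not Latin letters, punctuation, or Hanja), and the function name `hangul_syllables_used` is made up for illustration.

```python
import html
import zipfile

# Unicode Hangul Syllables block: 0xD7A3 - 0xAC00 + 1 = 11,172 code points.
HANGUL_FIRST, HANGUL_LAST = 0xAC00, 0xD7A3

def hangul_syllables_used(epub_path):
    """Return the set of distinct precomposed Hangul syllables in the epub's (X)HTML files."""
    used = set()
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                text = zf.read(name).decode("utf-8", errors="replace")
                text = html.unescape(text)  # turn entities such as &#xAC00; into characters
                used.update(ch for ch in text if HANGUL_FIRST <= ord(ch) <= HANGUL_LAST)
    return used

# Example usage (hypothetical file name):
# used = hangul_syllables_used("book.epub")
# print(len(used), "of", HANGUL_LAST - HANGUL_FIRST + 1, "syllables in use")
```

Comparing that count with 11,172 gives a rough idea of how much a correct subset could save for each font.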

What I still don't understand is that all the fonts render correctly in the PDF too, and yet it's only 2 MB. Perhaps PDFs work differently and don't need embedded fonts, and the conversion from epub just imitates all the content (including the shape of each letter), so no font files need to be embedded? I don't know.

Last edited by tatagi; 11-18-2022 at 08:30 PM. Reason: edited to quote commenters
Old 11-19-2022, 08:06 AM   #7
Quoth
Embedded fonts are optional in word processors, PDFs and ePubs (also Kindle AZW3). If the ereader, program or app has access to suitable fonts, they can be used instead of the embedded font. The CSS in an epub can have a list of possibilities for a font. Also, if there is no embedded font and no font on the device or system matches the CSS, then there may be a GUI option for the user to select a font.

If your ereader or computer normally displays Korean, then the files may work without embedded fonts.

Embedding a font is an option. It should always be done with a PDF unless that PDF is known to be only used on a system that already has the exact same fonts. But the only reason to make PDF from an epub is for printing on paper or publishing.

Quote:
all the fonts are rendered correctly in pdf format too, and still it's only 2MBs. Probably PDFs work differently and do not need embedded fonts and it just imitates all the contents(including how each letters are shaped) when converted from epub therefore no need for font files to be embedded?
It just means that whatever you are viewing/reading on already has the Korean fonts. All PDFs, wordprocessors, ebooks either use system fonts or embedded fonts.

The only way a PDF can work without the fonts is if each page is an image of the original. That will save space for a small Asian document but not for a book.

Since you can read the PDF you have some suitable fonts on the system with the PDF viewer.

It makes no sense to convert a copyright epub to PDF except to read it on an eink that only does PDFs, like a reMarkable or Sony Digital Paper. All smart phones and tablets can have a good epub viewer. Almost all eReaders either do epub or a Kindle format far better than PDF.

Last edited by Quoth; 11-19-2022 at 08:09 AM.
Old 11-21-2022, 06:26 AM   #8
JSWolf
Resident Curmudgeon
Posts: 79,758
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
My guess is the fonts for the PDF are subset and not in the ePub. You can subset the fonts with Sigil or Calibre.
Old 11-22-2022, 02:21 PM   #9
Quoth
Quote:
Originally Posted by JSWolf
My guess is the fonts for the PDF are subset and not in the ePub. You can subset the fonts with Sigil or Calibre.
The OP tried that. Korean. The simplest explanation is that the system has the Korean fonts and the PDF has no embedded fonts.
MobileRead.com is a privately owned, operated and funded community.