MobileRead Forums - View Single Post

tatagi · 02-28-2023, 01:24 AM

As most people are well aware, an EPUB contains all the files that make up the whole document (text, images, fonts, metadata, table of contents, etc.) in compressed form. An EPUB is basically a zipped folder.
This isn't the case for PDF files. A PDF embeds all of its elements as objects inside one file rather than as separate files in an archive, so we can't simply replace image1 with image2 using Windows' built-in copy and paste.
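
To illustrate the "EPUB is basically a zipped folder" point, here is a minimal Python sketch that lists the image files inside an EPUB using only the standard `zipfile` module. The function names, the extension list and the demo file names are my own assumptions for illustration, and the demo archive is only a stand-in, not a valid EPUB.

```python
import io
import zipfile

# Assumed list of image extensions; extend it if your books use other formats.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp")


def list_epub_images(epub_file):
    """Return (name, stored size, original size) for each image in an EPUB."""
    with zipfile.ZipFile(epub_file) as archive:
        return [
            (info.filename, info.compress_size, info.file_size)
            for info in archive.infolist()
            if info.filename.lower().endswith(IMAGE_EXTENSIONS)
        ]


def build_demo_epub():
    """Build a tiny in-memory ZIP that mimics an EPUB layout (hypothetical names)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/chapter1.xhtml", "<html><body>Hello</body></html>")
        # Placeholder bytes, not real image data.
        archive.writestr("OEBPS/images/cover.png", b"\x89PNG" + b"\x00" * 200)
        archive.writestr("OEBPS/images/image07.jpg", b"\xff\xd8" + b"\x00" * 200)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, stored, original in list_epub_images(build_demo_epub()):
        print(f"{name}: {stored} bytes stored, {original} bytes original")
```

Passing the path of a real .epub file to `list_epub_images` instead of the demo buffer should work the same way, since `zipfile.ZipFile` accepts either a path or a file-like object.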

So if you want to shrink an EPUB file, you just open its image files in any image compression tool found on SourceForge or GitHub and run the compression. Easy-peasy.
Most tools work in much the same way. For JPG files, we decide how much quality loss is acceptable; this is most useful when the image is very complex (like a nature photo), so a human can't easily tell the difference from the original. For PNG files, which are already losslessly compressed, the tools instead reduce the number of colors used (most commonly from 24-bit color to an 8-bit palette), which works very well for reducing the file size of images that aren't very colorful, such as scanned text, line art, graphs and the like.
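
As a rough illustration of why reducing the color depth helps, here is a small Python sketch that computes the size of the raw pixel data at different bit depths, before any compression is applied. The page size (an A4 scan at 300 dpi, 2480 x 3508 pixels) and the function name are my own assumptions, and a real PNG will be smaller than these figures because the pixel data is compressed afterwards.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Size of uncompressed pixel data, with each row padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


# Assumed example: an A4 page scanned at 300 dpi.
WIDTH, HEIGHT = 2480, 3508

truecolor = raw_size_bytes(WIDTH, HEIGHT, 24)  # 24-bit RGB
palette = raw_size_bytes(WIDTH, HEIGHT, 8)     # 8-bit palette (up to 256 colors)
bilevel = raw_size_bytes(WIDTH, HEIGHT, 1)     # 1-bit black and white

if __name__ == "__main__":
    print(f"24-bit: {truecolor:>10,} bytes")   # 26,099,520
    print(f" 8-bit: {palette:>10,} bytes")     #  8,699,840
    print(f" 1-bit: {bilevel:>10,} bytes")     #  1,087,480
```

So before the compressor even starts, an 8-bit version has a third of the data of the 24-bit one, and a 1-bit version has one twenty-fourth. An 8-bit image also needs a palette of up to 256 colors, which is small (at most 768 bytes) next to the pixel data.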

But what about PDF files? Since there's no "cover.png" or "image07.jpg" in a PDF, how do the tools know which algorithm would work best for each image?
For example, scanned text looks much better as an 8-bit PNG, or sometimes even a 1-bit PNG, than as a heavily compressed JPG, which has an unavoidable problem: artifacts.
Can a compression tool apply the most suitable compression method to each of the different images in a PDF file?

And if possible, please recommend a good compression tool for PDF and EPUB.

Thanks.

02-28-2023, 01:24 AM	#1
tatagi · Connoisseur · Posts: 52 · Karma: 10 · Join Date: Oct 2022 · Device: none
How does PDF compression work?
Last edited by tatagi; 02-28-2023 at 01:28 AM.