Old 06-27-2017, 03:54 AM   #1
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Converting ePub to djvu

I apologize for hopping from thread to thread with this djvu subject. Hopefully this thread will be the right one.

Question: I would like to convert an EPUB to a fixed-size djvu file as easily as I produce a 9×12 cm PDF with the Prince PDF plugin, where a single click triggers a conversion script that handles the options.

1. - about the qualities of the djvu format

What motivates this search is that a djvu file can be much more compact than a PDF, at least for some illustrated books. Judging from my limited experience (I use only Prince PDF to convert Epub to PDF), a text-only PDF can become very large compared to an XHTML one. I consider a 200% to 300% size increase acceptable on average, but sometimes I get much higher percentages. An alternative could sometimes be useful, and that is why I am trying to make up my mind about the usability of djvu.

As far as readability is concerned, if one selects a good-enough resolution (300 dpi is the default for bitonal images), I don't think the bitmapped images in a djvu file are a problem. Have a look at the wonderful Alice in Wonderland that Wikimedia chose to illustrate the capabilities of this format. Furthermore, even if it cannot compare with the gazillion tools available for the PDF format, using djvu is still comfortable, at least on Linux. Koreader, my reader of choice on my PW3, reads both PDF and DJVU files very well. Note also that the text is easily searchable, at least with tools like Djview4.

Alice in Wonderland (attached below). This is a 3.5 MB book of 114 pages with 57 images (14 of them full screen), which averages about 31 KB per page. I wonder whether one could achieve such a result even with an optimized PDF. I fail to see why it would not be advisable to convert some kinds of EPUB books like this one directly, if that gives a size advantage over PDF.
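The per-page average quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic, and it assumes 1 MB = 1024 KB; the variable names are illustrative.

```python
# Figures quoted above for the Alice in Wonderland djvu file.
size_mb = 3.5
pages = 114

# Assumption: 1 MB = 1024 KB.
avg_kb_per_page = size_mb * 1024 / pages
print(f"{avg_kb_per_page:.1f} KB per page")
```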

There are dissenting opinions about this, though, for example from willus:

Quote:
I don't recommend the djvu format for converting your epubs into fixed page format unless your epubs are mostly images. With text-based epubs, converting to djvu creates a bitmap for each page, whereas converting to PDF should store the text from the epub directly as text strings (as convertio does) without the need for bitmaps. This results in a very small PDF file size when converting large, mostly text epubs unless those epubs have a lot of different fonts that get embedded into the PDF. This is probably why the pdf2djvu utility ends up creating a larger djvu file than the original PDF file. There is also the added benefit that the text in the PDF file will render perfectly (with smooth edges) at any magnification, whereas the text in the djvu bitmapped pages will not.

The djvu format is optimized for archiving scanned documents, not for converting epubs.
The same author acknowledges that the most efficient compression for PDF (JPEG-2000, or JPX encoding) is reported to be a "viewer killer" because of its slow rendering. A PDF that renders quickly would be five times this size.

2. - about producing djvu files

This is a summary of what I tried.
Many organizations use djvu to store electronic documents because of its size-saving features. I ran a few tests to check this. As I am a Linux user, I used the handy pdf2djvu conversion tool at a resolution of 300 dpi. I got the following results:
- a 7.2 MB pdf became a 13.3 MB djvu. It contains 99.8% text and one cover image.
- a 26.5 KB pdf black and white image (300 dpi) became a 37.6 KB djvu.
- a 2.8 MB pdf colour image (600 dpi) produced by my scanner printer became a 176 KB djvu image (300 dpi)

Apart from the third test, these results are not very encouraging as far as size saving is concerned.
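To put the three results on the same scale, the relative size change of each conversion can be computed as below. This is a rough sketch using only the sizes reported above; it assumes 1 MB = 1024 KB, and the helper name `size_change_percent` is made up for the illustration.

```python
# Sizes reported above, converted to KB (assumption: 1 MB = 1024 KB).
tests = [
    ("text PDF with one cover image", 7.2 * 1024, 13.3 * 1024),
    ("black and white image PDF", 26.5, 37.6),
    ("colour scan PDF (600 dpi)", 2.8 * 1024, 176.0),
]


def size_change_percent(before_kb, after_kb):
    """Relative size change in percent; positive means the djvu is larger."""
    return (after_kb - before_kb) / before_kb * 100


for label, pdf_kb, djvu_kb in tests:
    print(f"{label}: {size_change_percent(pdf_kb, djvu_kb):+.0f}%")
```

The first two conversions come out positive (the djvu is larger than the PDF) and only the colour scan comes out strongly negative.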

Using djvudigital (and a compiled gsdjvu from an AUR package), a 1.2 MB 9×12 PDF resulted in a 1.6 MB djvu (300 dpi) which is marginally better than the first try but not yet satisfactory. It contains 99.8% text and one cover image. However, a Gallica PDF scan went down from 22 MB to 18.7 MB when converting to djvu with djvudigital.

So, for the time being, it does not seem to make much sense to start from a PDF to produce a djvu, save for some rare use cases like the unoptimized colour image mentioned above.

The problem is that I do not know how to convert an EPUB directly to a customized (9×12 cm) djvu format. So far, I have found that the online site convertio converts epub directly into nice djvu documents but does not seem to offer any free choice of dimensions (it produces standard A4 files). It sells a "conversion API" option, but I am not sure whether its "output format" setting also applies to djvu.

Other sites claiming to convert Epub to djvu do much worse. Sobolsoft uses a two-step process, converting first to a temporary PDF and then converting that to djvu. As you can read above, it is easy to do the same, for example using Jellby's plugin and then the excellent open-source pdf2djvu or djvudigital software. But as the output djvu file exceeds the PDF in size, this defeats my initial purpose...
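For reference, the second step of that two-step route (PDF to djvu) can be scripted roughly as follows. This is a minimal sketch, not a tested workflow: it assumes pdf2djvu is installed and on the PATH, it uses the `-d` (dpi) and `-o` (output file) options of pdf2djvu, and the file names are hypothetical. The EPUB to PDF step is left to the plugin.

```python
import shutil
import subprocess


def build_pdf2djvu_command(pdf_path, djvu_path, dpi=300):
    """Build the pdf2djvu argument list (-d sets the dpi, -o the output file)."""
    return ["pdf2djvu", "-d", str(dpi), "-o", djvu_path, pdf_path]


def convert_pdf_to_djvu(pdf_path, djvu_path, dpi=300):
    """Run pdf2djvu; fail early if the tool is not installed."""
    if shutil.which("pdf2djvu") is None:
        raise FileNotFoundError("pdf2djvu not found on PATH")
    subprocess.run(build_pdf2djvu_command(pdf_path, djvu_path, dpi), check=True)


# Hypothetical file names, for illustration only.
print(build_pdf2djvu_command("book_9x12.pdf", "book_9x12.djvu"))
```

Calling `convert_pdf_to_djvu("book_9x12.pdf", "book_9x12.djvu")` would then run the conversion at 300 dpi, the resolution used in the tests above.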

Hopefully, a new plugin will one day let us go directly from epub to djvu format.
Attached Files
File Type: zip Alice_in_Wonderland.djvu.zip (3.38 MB, 163 views)

Last edited by roger64; 06-27-2017 at 06:48 AM. Reason: small mistakes
Old 06-29-2017, 04:19 AM   #2
roger64
Hi

Well, the encephalogram is pretty flat.

I tried just one more thing using the convertio site. I converted a 9×12 odt book to djvu. After a few minutes, I was able to download a 2.4 MB 9×12 djvu.
The original odt file has a size of 353 KB. The 9×12 pdf produced with LibreOffice is a 1.6 MB file. So the resulting djvu file is way bigger than the pdf.

Furthermore, while the pdf is a very exact copy of the odt, the djvu shows some defects in the display of some titles, small caps, etc.

So, unless some new information comes in, I think it's about time to give up my idea.
Old 06-29-2017, 01:21 PM   #3
dwig
Wizard
Posts: 1,613
Karma: 6718479
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
Quote:
Originally Posted by roger64
...
The same author acknowledges that the most efficient compression for PDF (JPEG-2000, or JPX encoding) is reported to be a "viewer killer" because of its slow rendering. A PDF that renders quickly would be five times this size. ...
If your Alice in Wonderland has really good copies of the John Tenniel illustrations, you should be using binary bitmap images and CCITT compression when generating a PDF.
Old 06-29-2017, 07:57 PM   #4
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Originally Posted by roger64
Hi

Well, the encephalogram is pretty flat.

I tried just one more thing using the convertio site. I converted a 9×12 odt book to djvu. After a few minutes, I was able to download a 2.4 MB 9×12 djvu.
The original odt file has a size of 353 KB. The 9×12 pdf produced with LibreOffice is a 1.6 MB file. So the resulting djvu file is way bigger than the pdf.

Furthermore, while the pdf is a very exact copy of the odt, the djvu shows some defects in the display of some titles, small caps, etc.

So, unless some new information comes in, I think it's about time to give up my idea.
This is the second group you have posted the same idea to, and you didn't pay attention to the answer given to you before. If the PDF is pictures of text instead of text, then the DjVu would indeed be smaller. But if the text in the PDF is actual text, then the PDF will be smaller. It would take a lot of black and white line drawings to favor DjVu. A PDF can be either pictures of text (archiving of actual books) or it can be text. It can sometimes be hard to tell them apart. Did you look in the wiki as I told you to do?

Dale
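One rough way to tell the two kinds of PDF apart is to extract the text layer (for example with Poppler's pdftotext) and count how many characters it yields per page. The sketch below is only a heuristic under stated assumptions: the function names and the 200 characters per page threshold are invented for the illustration, and a scanned PDF that carries an OCR text layer would still look text-based to it.

```python
def chars_per_page(extracted_text, page_count):
    """Average number of non-whitespace characters per page."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    visible = sum(1 for ch in extracted_text if not ch.isspace())
    return visible / page_count


def looks_text_based(extracted_text, page_count, min_chars_per_page=200):
    """Guess whether a PDF stores real text (hypothetical threshold)."""
    return chars_per_page(extracted_text, page_count) >= min_chars_per_page


# Illustrative input: text extracted from a 10-page PDF, supplied by the caller.
sample_text = "word " * 1000
print(looks_text_based(sample_text, 10))
```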
Old 06-30-2017, 12:45 AM   #5
roger64
@DaleDe

I did apologize in the first post of this thread for "hopping from thread to thread" and I did quote willus. I apologize once more. For completeness' sake, here are your two quotes:

Quote:
Originally Posted by DaleDe
There are two kinds of PDFs. One is an image (with or without some search text) and the second is pure text, which is likely the kind you are testing with. Typically DJVU shines with a set of images that look like text. DJVU looks at the images and breaks them up into reusable graphics items that look for all intents and purposes like a set of fonts and then just references these as needed. This is typically better compression than an image would be. Also, the latest PDF is now using better (JP2000) compression than it did formerly.
Dale
Quote:
Originally Posted by DaleDe
You might want to start with our DJVU wiki page.
Dale
I did look at this page and many others, not only from the MR wiki, believe me. This one, which I also read carefully, is probably better suited to beginners.

My initial hope was that, since one online service (convertio) offered to convert Epub files directly to djvu, it might be possible to use this format and benefit from its generally acknowledged size-saving qualities.

The tests I ran and reported above made me realize that I was wrong. These qualities seem to apply only when converting an image or a set of images. Indeed, I obtained the expected size saving only with the PDF colour image and with the Gallica PDF scan (I guess it would have been the same for other image formats, except maybe for JPEG 2000, but that one carries its own drawbacks). It is a significant achievement for the djvu format in this PDF-everywhere world, but it did not fulfill my initial purpose.

Last edited by roger64; 06-30-2017 at 01:28 AM. Reason: for djvu format