Old 02-05-2011, 04:38 AM   #25
review
Addict
review got an A in P-Chem.
Posts: 315
Karma: 6448
Join Date: Nov 2010
Device: 903
Quote:
Originally Posted by skydive
My question is: is it better to take PDF files and convert them to DJVU format?
OK, a little rant:

PDF files are containers. They can store virtually anything.
If I scan a book at a decent page resolution (200-400 dpi), the tiff images are extremely large. One A4 page scanned at 300 dpi in 24-bit colour and stored as uncompressed tiff comes to about 26 MB. So all pdf creators use some way of reducing the file size (because this is how people measure the efficiency of the program). Some reduce the colour space, some reduce the resolution, etc., but the files still contain just plain pictures. Some do OCR and put the text as an invisible layer on top of the pictures so that you can effectively search in those files.
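If you want to check the 26 MB figure yourself, here is a rough sketch in Python. It assumes 24-bit colour (3 bytes per pixel), A4 as 210 x 297 mm, and 1 MB = 1,000,000 bytes; the function name is made up for illustration and headers or metadata in the tiff are ignored.

```python
MM_PER_INCH = 25.4


def uncompressed_page_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Rough size of one scanned page stored as raw pixels, no compression."""
    # Convert the paper size to whole pixels at the chosen scan resolution.
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # A4 page at the resolutions mentioned in the post.
    for dpi in (200, 300, 400):
        size_mb = uncompressed_page_bytes(210, 297, dpi) / 1e6
        print(f"A4 at {dpi} dpi: about {size_mb:.0f} MB")
```

A greyscale scan (1 byte per pixel) would be a third of that, which is one reason why reducing the colour space helps so much.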
If, however, the pdf file was created by "printing" from a Word document, a web page or something like this, the content is passed on in an abstract way: the plain text, the font and the position are handed to the pdf program. Those pdf files are very, very small, as they primarily contain only the plain text and not a single image. The fonts used can even be "embedded" in the file so that the pdf looks the same on every computer.

So the two kinds of pdf files mentioned above don't have a lot in common with each other. It just so happens that both are called pdf.

Now, in very general terms I would say the following: if the document was scanned from a book and the file is available both as pdf and djvu, I usually tend to use the djvu file (also, you will notice that on the ebook reader it is so much faster, the dictionary works, etc.). The two files will be of the same order of magnitude in size, but much, much smaller than the uncompressed files. 200 pages of A5 size scanned at 300 dpi would be around 2.6 GB uncompressed. The pdf creator might reduce that by a factor of 100-150 to something like 20 MB. The djvu file would be about the same. It might be 30 MB, but that is still heavily compressed compared to the original 2.6 GB, and the djvu file might give superior results compared to the pdf.
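The book estimate works out the same way. Again just a sketch under my own assumptions: A5 as 148 x 210 mm, 24-bit colour, decimal megabytes and gigabytes, and a made-up function name. The factors 100 and 150 are the ones from the paragraph above, not measured values.

```python
MM_PER_INCH = 25.4


def uncompressed_book_bytes(pages, width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Rough size of a whole scanned book stored as raw pixels."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return pages * width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    raw = uncompressed_book_bytes(200, 148, 210, 300)
    print(f"uncompressed: about {raw / 1e9:.1f} GB")
    # Compression factors taken from the post above.
    for factor in (100, 150):
        print(f"compressed {factor}x: about {raw / factor / 1e6:.0f} MB")
```

So a factor of 100-150 lands somewhere between roughly 17 and 26 MB, which is where the "something like 20 MB" comes from.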

Now coming to your original question: if the book is scanned and you have it only as pdf, should you convert it?
In general terms: I wouldn't convert it. Sure, you can try it. Go to http://any2djvu.djvuzone.org/, upload your pdf and check the result. But from my little rant above you will know that, in order to store it as a small pdf, the images had to be compressed and modified significantly. You shouldn't expect any converter to make something "bad" good. You have to go a step back to the original uncompressed files and convert those to djvu. Then you can compare the djvu file with the pdf file.

So, in summary:
  • converting pdf to djvu: don't do it. A highly compressed pdf as input won't lead to great results for the djvu file.
  • if you scan a book and have the uncompressed original files, don't use a pdf converter to store that book; use a djvu converter. The result would be far superior at a similar file size.
  • if the book is not scanned but comes from another application, use pdf as the file format.


Quote:
Originally Posted by skydive
Is there a book site where I can download DJVU files?
All the djvu files I have on my computer were either downloaded from http://www.archive.org/details/texts
or created by me. If you want to find djvu files via Google, just add the following magic words to your search:
ext:djvu
All results will be djvu files.