Old 09-22-2013, 12:19 AM   #8
Tex2002ans
Quote:
Originally Posted by RbnJrg
I don't agree with you; .gif images are maybe the best option. Of course, .jpg images are the worst choice.
I just want to stress again... avoid JPG for artificial images (tables/charts/graphs)!!!

GIF is good, but PNG is great!

There are only 2.5 areas where GIF has an advantage over PNG:
  • Animation
    • Not relevant in ebooks
  • Very small icons with no transparency
  • (.5) Works in ancient browsers. (GIF was created in 1987, PNG in 1996)
    • There are potentially some very old ereaders that can't handle PNG
    • PNG is a part of the EPUB spec though, so these devices are most likely pre-EPUB.

https://en.wikipedia.org/wiki/Portab...ompared_to_GIF

For some lively discussion about the topic: http://stackoverflow.com/questions/1...ges-gif-or-png

There is only one bug that you have to keep in mind with PNG images (only applicable to Kindles): Kindlegen cannot handle transparency in PNGs (converts transparency to a black background).
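If you want to find the PNGs that might trip this up before running Kindlegen, here is a minimal sketch that checks whether a PNG declares any transparency. It only reads the file's chunk headers (an alpha color type in IHDR, or a tRNS chunk); it does not check whether any pixel is actually transparent. The function name `png_has_transparency` is my own, not part of any tool.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_has_transparency(data: bytes) -> bool:
    """Return True if the PNG bytes declare an alpha channel or a tRNS chunk."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            # IHDR layout: width(4) height(4) bit depth(1) color type(1) ...
            color_type = body[9]
            if color_type in (4, 6):  # grayscale+alpha or RGBA
                return True
        elif chunk_type == b"tRNS":
            return True  # palette/gray/RGB image with a transparency chunk
        elif chunk_type in (b"IDAT", b"IEND"):
            break  # tRNS has to come before the image data
        pos += 8 + length + 4
    return False
```

Usage would be something like `png_has_transparency(open("figure.png", "rb").read())`, where `figure.png` is a placeholder name. Any file that returns True is a candidate for flattening onto a solid background in your image editor first.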

I have not stumbled upon one case of an artificial image (table, chart, diagram, figure) where GIF was better than PNG. PNG can handle every case where a GIF can be used, PLUS more.

Quote:
Originally Posted by Section8
Thanks for all the info. The PDF I am converting was downloaded from http://cybertracker.org. It was released under a Creative Commons license and I am converting it mainly for my own use, but might upload it to the MobileRead library.
Fantastic!!!

At the company I work for, everything is CC3.0 (or public domain).

Most of my EPUB work is OCRing PDFs of older book scans (Black & White), but I also help convert newer publications (so I deal with color charts/graphs/diagrams), and if I am lucky, I get the actual vector source. (Those nice charts I had in my ScriptPNG post were generated from the vector files.)

Quote:
Originally Posted by Section8
In the original pdf, I think these tables are images (not searchable or selectable as text).
Well, when I run across these, I OCR them and convert them to their HTML equivalents (for the advantages stated above). It takes a nice chunk of time, but I see it as: I spend the time to convert it to HTML ONCE, and it will never have to be converted again.
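Once the OCR text of a table has been cleaned up by hand, turning it into markup is the easy part. Here is a rough sketch of that last step; the function name and the sample rows are made up for illustration, and a real book would probably want classes and a caption as well.

```python
from html import escape

def rows_to_html_table(header, rows):
    """Build a plain XHTML table from a header row and a list of data rows."""
    lines = ["<table>", "  <thead>", "    <tr>"]
    lines += [f"      <th>{escape(str(cell))}</th>" for cell in header]
    lines += ["    </tr>", "  </thead>", "  <tbody>"]
    for row in rows:
        lines.append("    <tr>")
        lines += [f"      <td>{escape(str(cell))}</td>" for cell in row]
        lines.append("    </tr>")
    lines += ["  </tbody>", "</table>"]
    return "\n".join(lines)

# Hypothetical table contents, standing in for hand-corrected OCR output.
header = ["Crop", "Output (tons)"]
rows = [["Wheat & rye", "1,200"], ["Corn", "3,450"]]
print(rows_to_html_table(header, rows))
```

Escaping every cell matters because OCR'd text is full of stray `&` and `<` characters that would otherwise break the XHTML in an EPUB.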

I mean... why would you want to lower the quality of your EPUB version because someone made a bad decision when they exported the PDF (exporting tables/charts/graphs as non-vector formats)?

If the author is still alive, and this PDF was created recently (within the last few decades)... perhaps try to get in contact with the author himself. Perhaps he still has the source files sitting around, and you can generate higher quality tables!

Quote:
Originally Posted by Section8
These tables and several diagrams came back as .pngs, and the photographs are jpegs.
Great, great. You do have to watch out, though: sometimes these PNGs went through a lossy conversion somewhere along the line. Once you go lossy, you can never go back!

Here is a real life example of the horrible conversion/JPG artifacting you might run into (and the HUGE filesize of JPG compared to PNG):

The original image came from the author, and was probably generated by some sort of crappy PDF -> image conversion (800+ KB JPG). Artifacting left and right, and that filesize should make you gasp!!

[Attachment: pg209Original.jpg, 858.0 KB]

I got the source document from the author, and was able to generate a PNG (42.9 KB):

[Attachment: pg209PNG.png, 43.0 KB]

I must admit, it was a "lossy" PNG conversion, since I indexed it to 4 gray colors.
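To show what "indexing to 4 grays" does to the pixel values, here is a minimal sketch. It assumes four evenly spaced gray levels and no dithering, which is a simplification: an image editor's indexed mode usually picks the palette from the image itself. The function name is hypothetical.

```python
def quantize_gray(value: int, levels: int = 4) -> int:
    """Snap an 8-bit gray value (0-255) to the nearest of `levels` evenly spaced grays."""
    if not 0 <= value <= 255:
        raise ValueError("value must be in 0..255")
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)  # 85.0 for 4 levels: 0, 85, 170, 255
    return round(round(value / step) * step)

print([quantize_gray(v) for v in (0, 40, 100, 200, 255)])
```

With only 4 possible values, each pixel needs 2 bits instead of 8, and long runs of identical pixels compress very well. That is the "lossy" part: the in-between grays are gone for good, but for a black-and-white chart nobody will miss them.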

Grayscale JPG (90 quality) (257 KB):

[Attachment: pg209[90].jpg, 257.1 KB]

Grayscale JPG (80 quality) (203 KB):

[Attachment: pg209[80].jpg, 203.6 KB]

Artifact comparison between the PNG, the 90 quality JPG, and the 80 quality JPG:

[Attachment: ArtifactingFredPNG.png, 4.1 KB]
[Attachment: ArtifactingFredJPG90.png, 12.1 KB]
[Attachment: ArtifactingFredJPG80.png, 12.8 KB]

As you can see, the "halo"ing gets worse and worse the lower quality you go with JPG.

A GIF would look exactly like the PNG version (no haloing artifacts), BUT the GIF will have a larger filesize.
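To put the file sizes from the pg209 example in perspective, here is a quick bit of arithmetic using the attachment sizes listed in this post. The helper name `size_ratios` is just for illustration.

```python
# Attachment sizes in KB, as listed in this post.
sizes_kb = {
    "pg209Original.jpg": 858.0,
    "pg209[90].jpg": 257.1,
    "pg209[80].jpg": 203.6,
    "pg209PNG.png": 43.0,
}

def size_ratios(sizes, baseline):
    """Return how many times larger each file is than the baseline file."""
    base = sizes[baseline]
    return {name: round(size / base, 1) for name, size in sizes.items()}

for name, ratio in size_ratios(sizes_kb, "pg209PNG.png").items():
    print(f"{name}: {ratio}x the PNG")
```

So the original JPG is roughly 20 times the size of the PNG, and even the 80 quality JPG is still close to 5 times larger while looking worse.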

Anyway, this entire topic reminded me of this book with a very large Appendix FULL of tables. One of these days, I will go back and "verticalize" them.

PDF Scan:
[Attachment: OriginalPDF.png, 97.1 KB]

EPUB with Images of Tables:
[Attachment: ImageEPUB.png, 173.3 KB]

EPUB with HTML Tables:
[Attachment: HTMLEPUB.png, 38.4 KB]

The HTML table also has the advantage of footnotes being linked back/forth.
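For anyone wondering how the back/forth linking works, it is just a pair of anchors that point at each other's ids. Here is a small sketch that generates both halves; the `fnref`/`fn` id scheme and the function name are my own assumptions, not what the book above necessarily uses.

```python
from html import escape

def footnote_pair(number: int, note_text: str):
    """Return (reference_html, note_html) that link to each other by id."""
    ref_id, note_id = f"fnref{number}", f"fn{number}"
    # Goes inside the table cell: jumps down to the note.
    reference = f'<a id="{ref_id}" href="#{note_id}"><sup>{number}</sup></a>'
    # Goes below the table: the number jumps back up to the cell.
    note = f'<p id="{note_id}"><a href="#{ref_id}">{number}</a>. {escape(note_text)}</p>'
    return reference, note

ref, note = footnote_pair(1, "Hypothetical note text.")
print(ref)
print(note)
```

None of this is possible when the table is an image, since there is nothing inside the picture for a link to land on.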

I can attach both versions of the EPUBs if anyone is interested.

EPUB with images: 1.41 MB
EPUB with HTML: 611 KB

Side Note: PDF is just about the WORST format to work backwards from.

Last edited by Tex2002ans; 09-22-2013 at 12:50 AM.