Old 01-22-2012, 02:09 PM   #2
DSpider
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
Ugh... PDF is the worst possible format to convert FROM. It was designed as an output format. This subject has been beaten to death around here because a lot of PDFs aren't tagged PDFs, meaning that letters (and often small groups of letters) are stored like floating objects on a blank page, each with its own coordinates and extra baggage. So it's very difficult to get a 1:1 conversion. A lot of formatting will be lost, some will get interpreted wrong, etc. Doing this in batches without taking the time to check each result properly is a bad idea. Why do you need them as ePub anyway, knowing that the conversion could ruin the formatting?


The closest thing to what I think you're looking for is Adobe Acrobat's "Save As - Optimized PDF - Audit space usage". An information window will pop up, and if it says that images take up some crazy amount like 98-100%, chances are that the PDF is "image-based". But then again, if the book is chock-full of pictures, the file size is usually a good indicator too... And you don't need to spend $200 for that; you can simply right-click the PDF file and choose Properties.
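
If you want to apply that rule of thumb to numbers you've read off the audit window, here is a minimal sketch. The function names and the 0.98 cutoff are my own assumptions (the cutoff just mirrors the "98-100%" figure above); the byte counts are whatever you note down yourself, nothing here reads the PDF.

```python
def image_share(image_bytes: int, total_bytes: int) -> float:
    """Fraction of the file's size taken up by images (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return image_bytes / total_bytes


def looks_image_based(image_bytes: int, total_bytes: int,
                      threshold: float = 0.98) -> bool:
    """Heuristic only: True when images account for nearly the whole file.

    The default threshold is an assumption taken from the 98-100% rule of
    thumb; a picture-heavy but text-based book can still trip it.
    """
    return image_share(image_bytes, total_bytes) >= threshold
```

For example, `looks_image_based(49_500_000, 50_000_000)` flags a 50 MB file where images take 49.5 MB, while a file where images are only a tenth of the size is not flagged.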

Also, in any PDF viewer you could press Ctrl+A to select everything and just scroll down a few pages. I'd say if the text in the first 10 pages or so is highlighted in blue (or whatever theme you have set), it's "text-based".

If some pages are images, some are text, then it's a sh*tty PDF.
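
For batches, the Ctrl+A check can be roughly sketched in code. This assumes you already have the extracted text of each page as a list of strings from whatever extraction tool you use; `page_texts`, `sample_pages` and `min_chars` are hypothetical names and values I picked for illustration, not part of any particular library.

```python
def classify_pdf(page_texts, sample_pages=10, min_chars=20):
    """Classify a PDF from the text extracted from its first few pages.

    page_texts   -- list of strings, one per page (empty string if the
                    extractor found no text on that page)
    sample_pages -- how many leading pages to look at ("first 10 or so")
    min_chars    -- assumed cutoff below which a page counts as having
                    no selectable text
    """
    sample = page_texts[:sample_pages]
    if not sample:
        raise ValueError("no pages to inspect")

    # A page "has text" if enough non-whitespace-trimmed characters came out.
    has_text = [len(text.strip()) >= min_chars for text in sample]

    if all(has_text):
        return "text-based"
    if not any(has_text):
        return "image-based"
    return "mixed"  # some pages are images, some are text
```

A "mixed" result is the case to pull out of the batch and check by hand. Note that a blank page or a full-page illustration inside an otherwise text-based book will also come back as "mixed", so treat this as a first filter, not a verdict.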