|01-26-2012, 10:43 AM||#1|
Join Date: Jan 2012
resolution and format conversion
I scanned a textbook on my company's Xerox machine into multi-page searchable PDF files at 400 DPI. Each file contained about two chapters. I then downloaded the PDFs and separated, collated, and renamed the files using PDFill tools and IrfanView, so in the end I have a bunch of single-page PDF scans. I then converted the PDFs to TIFF files using IrfanView and wanted to import the TIFFs into Scan Tailor to do the post-processing work. But now a pop-up window asks me to fix the DPI before I import the scans into Scan Tailor. It says I need to fix all pages and offers a custom DPI of 96 x 96. The checkbox "Fix DPI even if they look OK" is not selected. I wasn't sure what to select as the DPI, and since 96 x 96 seems very low I went with 400 DPI. But the scans are running through Scan Tailor very fast, with more errors than the previous book, which I scanned straight to TIFF format.
Any converting and renaming of the files was done with lossless LZW compression.
The TIFF files now have a horizontal and vertical resolution of 96 DPI.
Did I lose any resolution by converting from a searchable PDF to TIFF format?
Any thoughts will be greatly appreciated.
|01-26-2012, 01:29 PM||#2|
Join Date: Nov 2009
Device: iPod touch 2G (16 GB)
You're doing it wrong! Don't scan straight to PDF.
Try to get as much raw data from the scanner as possible, meaning just the images. Processing comes later. Sure, you could get away with JPG straight from the scanner, if, say, you're only interested in OCR-ing it (extracting the text) using ABBYY FineReader for instance (and proofreading it either in FineReader or side-by-side with the actual thing). But don't use JPG if you're aiming for Scan Tailor processing because most scanners are set by default to something like 85% quality for JPG compression, which could result in unnecessary grain or fuzzy text.
If the book uses images sparingly, I usually just scan everything to uncompressed TIFF; I don't bother switching to JPG for the text-only pages just to save a few megabytes, although that can help if you're not planning on immediate processing. More room for other projects.
Edit: Oh, I forgot to mention that Scan Tailor finished very quickly because there was a lot less data to work with. The images were either down-sampled by the scanner's software during the PDF packaging (to save space), or IrfanView chose 96 DPI by default (look for a checkbox or an option there).
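A rough way to tell those two cases apart is to ignore the DPI tag and look at the pixel dimensions instead: pixels divided by inches of paper gives the resolution the image really has. The sketch below is only an illustration. The function names are made up, the page size is assumed to be US Letter (8.5 x 11 in), and the pixel dimensions are example values, so substitute whatever your image viewer reports for your own files.

```python
# Sketch: tell "wrong DPI tag" apart from "really down-sampled" using pixel size.
# Assumptions (not from the thread): US Letter pages, 8.5 x 11 in; the pixel
# dimensions used below are made-up examples, not measurements.

def effective_dpi(width_px, height_px, page_width_in, page_height_in):
    """Return the (horizontal, vertical) DPI implied by pixel and paper size."""
    return width_px / page_width_in, height_px / page_height_in


def diagnose(width_px, height_px, page_width_in=8.5, page_height_in=11.0,
             scan_dpi=400, tolerance=0.10):
    """Classify a page as 'full resolution' or 'down-sampled'.

    A page counts as full resolution when its lowest effective DPI is within
    `tolerance` (as a fraction) of the DPI it was supposedly scanned at.
    """
    h_dpi, v_dpi = effective_dpi(width_px, height_px,
                                 page_width_in, page_height_in)
    lowest = min(h_dpi, v_dpi)
    if lowest >= scan_dpi * (1 - tolerance):
        return "full resolution"
    return "down-sampled"


if __name__ == "__main__":
    # A Letter page that really holds 400 DPI worth of pixels.
    print(diagnose(3400, 4400))
    # A Letter page that only holds 96 DPI worth of pixels.
    print(diagnose(816, 1056))
```

If the pages come out as full resolution, only the DPI tag is wrong and typing 400 into Scan Tailor's dialog is the right fix. If they come out as down-sampled, the detail is already gone, and the conversion (or the scan) has to be redone at a higher resolution.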
Last edited by DSpider; 01-26-2012 at 01:45 PM.