Old 02-15-2013, 01:04 AM   #5
Tex2002ans
Wizard
Quote:
Originally Posted by shmendrapolk
Scan them in to the PC as jpegs (or as a pdf)
You don't want to save scanned documents as JPG. JPG is a lossy format, and it is pretty atrocious on text documents. Since the Xerox already outputs PDF, I would recommend that. Any of the lossless image formats, such as PNG or TIFF, can also be used for the original scans.
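
To make the lossy vs. lossless point concrete, here is a rough Python sketch. It is only an illustration under some assumptions: the "page" is synthetic (black bars standing in for text), `zlib` stands in for the DEFLATE compression that PNG uses, and the lossy step is plain pixel quantization, which is NOT how JPEG actually works (JPEG quantizes frequency blocks), just a simple way to show that lossy coding throws information away for good.

```python
import zlib


def make_page(width=200, height=100):
    """Build a synthetic 8-bit grayscale 'scan': white page, black bars as fake text."""
    rows = []
    for y in range(height):
        row = bytearray([255] * width)
        if y % 10 < 3:  # a fake 'line of text' every 10 rows
            for x in range(20, width - 20):
                if x % 7 < 4:
                    row[x] = 0
        rows.append(bytes(row))
    return b"".join(rows)


def lossless_roundtrip(pixels):
    """Compress with DEFLATE (the scheme PNG builds on) and decompress again."""
    packed = zlib.compress(pixels, 9)
    return packed, zlib.decompress(packed)


def lossy_roundtrip(pixels, step=64):
    """Crude stand-in for lossy coding: snap every pixel down to a multiple of `step`."""
    return bytes((p // step) * step for p in pixels)


page = make_page()
packed, restored = lossless_roundtrip(page)
degraded = lossy_roundtrip(page)

print("raw bytes:", len(page), "compressed bytes:", len(packed))
print("lossless copy identical to original:", restored == page)
print("lossy copy identical to original:", degraded == page)
```

The lossless copy comes back byte for byte the same while still being much smaller, whereas the lossy copy can never be turned back into the original.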

Quote:
Originally Posted by shmendrapolk
I'm sure this sounds tedious to you, but trust me, when it came time to writing my dissertation (2006) having all my material scanned into the computer (and having two monitors) made life considerably easier.
Not at all! I understand completely. Digital files that are properly OCRed are much easier to use than physical books. Searching through documents or entire books is a breeze! So many times with a physical book I got stuck on "well, I remember him mentioning something about topic X... now which page was that on?"

Quote:
Originally Posted by shmendrapolk
no serious time wasted transcribing hundreds of quotes, half of which I didn't end up using; and having all my material stored in a flash drive so I could write wherever I was.
Being able to copy and paste alone probably saves massive amounts of time. It is so boring having to type out a paragraph or two of text from a physical book!

Quote:
Originally Posted by shmendrapolk
Some things to note:
-It's the xerox machine that sends it as a "compact pdf". It's one of the settings. What it does exactly I have no idea, but an otherwise 10-15mb file becomes less than 2mb if I select compact. i can see no difference in the results. And I had no trouble running the compact pdf through OCR.
What "compact PDF" most likely does is just run some lossless compression on the scans, resulting in no loss in quality, while exporting as a "normal PDF" would store the uncompressed image files.

Quote:
Originally Posted by shmendrapolk
But can someone explain the differences between "searchable text image," and "editable text" and what is at stake between choosing one over the other?
I cannot make one bit of sense out of the documentation (I see what you mean by "not being very helpful"):

http://nitropdf.helpmax.net/en/tasks...-existing-pdf/

Quote:
Originally Posted by shmendrapolk
And whether removing embedded fonts matters or not
Well, since this is only for your own personal use, if you are OK without the embedded fonts, then remove them for much smaller files.

Quote:
Originally Posted by shmendrapolk
I know a jpeg will never be an issue.
Oh yes it will be! Soon your eyes will go bad, you will want to zoom in on the text, and all you will see is hideous pixelated blobs.

Quote:
Originally Posted by shmendrapolk
But with these PDFs, I have no idea.
I don't see PDFs disappearing any time soon.