Summary of some points

Yes, this thread is kind of a tutorial, but it is still nice to get feedback from readers. So here is a summary of some points.

1. Scanner vs. digital camera

A scanner is OK for single-page documents, and a scanner with a feeder is very good for a pile of single-sheet documents. Scanners can also be used to digitize your old photos. They do not require any lighting arrangements and they take care of the white balance (they usually render a white background as white, but not always). However, getting the right colors with a scanner requires as much care and processing as with a digital camera. You can learn a lot about scanning from the author of Scanview and the links on his pages:

I have an HP scanner on my desk that I don't use anymore (I don't use a photocopier anymore either). The main reason is that the scanner is very slow in color mode. The mono mode is faster but still too slow for me. The other reason is that you cannot spread a book flat on the scanner, even if you put a heavy weight on the scanner's cover. If a book page is not flat, the lines of text will be curved, and OCR will be unreliable.

Sometimes (e.g. for professional printing) you may need a scanner resolution of 1200 dpi or more. Few (very expensive) digital cameras can match that resolution for A4 pages today.
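To put the 1200 dpi figure in perspective, here is a rough sketch of how many pixels a camera would need to cover a whole A4 page at a given resolution. The function name `megapixels_needed` is made up for this illustration, and the calculation assumes the page fills the frame exactly, ignoring the mismatch between the sensor's aspect ratio and the paper's.

```python
MM_PER_INCH = 25.4

def megapixels_needed(page_mm, dpi):
    """Megapixels required to cover a page of page_mm = (width, height) at dpi.

    Assumes the page fills the frame exactly (an idealization).
    """
    width_px = page_mm[0] / MM_PER_INCH * dpi
    height_px = page_mm[1] / MM_PER_INCH * dpi
    return width_px * height_px / 1e6

a4 = (210, 297)  # millimeters
print(f"A4 at 1200 dpi: about {megapixels_needed(a4, 1200):.0f} Mpixel")
print(f"A4 at 300 dpi: about {megapixels_needed(a4, 300):.1f} Mpixel")
```

Under these assumptions an A4 page at 1200 dpi works out to roughly 139 Mpixel, while 300 dpi needs only about 8.7 Mpixel.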

2. Digital cameras

I started with a 1.2 Mpixel digital camera only five years ago. Totally useless. (Even before that I had an opportunity to casually talk about the idea of using digital cameras as scanners with some high level HP executives. They didn't "get the picture".)

My next camera was a 5 Mpixel Canon PowerShot G5: 1/1.8 inch CCD image sensor, image size up to 2592 x 1944 pixels. Very good.

My current camera is an 8 Mpixel Canon PowerShot Pro 1: 2/3 inch CCD image sensor, image size up to 3264 x 2448 pixels. Canon does not want my money for a successor to the Pro 1. Apparently it would compete with their high-priced reflex cameras.

Most Canon cameras can be connected to a computer to shoot pictures (not just to download them). Most reviewers neglect this advantage; they also do not understand the need for higher resolution and never refer to repro applications in their reviews. While 8 Mpixels are good enough for A4 pages (210x297 millimeters) and more than good enough (with zoom) for A5 pages (you can calculate the resulting resolution from image size and paper size), it would be good to have as high a resolution as possible for A3 and even A2 pages. As a matter of fact, with my Canon I can shoot a picture of a distant object over the heads of a crowd and read text which I cannot see well with my own eyes. It is also the only way to read the small-print instructions that come with some products (recently I used my camera LCD to read a serial number on my Archos 704 wifi). Cameras with higher resolution (in terms of Mpixels) should come with larger sensors. Higher resolution with the same size sensor is sometimes a disadvantage for repro applications. A nice feature for repro applications would be manual focus. It is very awkward on my Canon (requires three hands).
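The resolution calculation mentioned above (image size and paper size) can be sketched as follows. The function name `effective_dpi` is made up for this example. It assumes the page is zoomed to fill the frame as far as the aspect ratios allow, so the side that runs out of pixels first sets the limit; a real shot with margins around the page will come out somewhat lower.

```python
MM_PER_INCH = 25.4

def effective_dpi(image_px, page_mm):
    """Best-case dpi when a page of page_mm is shot into an image of image_px.

    Both arguments are (a, b) pairs in any order; long sides are matched
    to long sides. The result is limited by the tighter of the two sides.
    """
    img_long, img_short = max(image_px), min(image_px)
    page_long, page_short = max(page_mm), min(page_mm)
    dpi_long = img_long / (page_long / MM_PER_INCH)
    dpi_short = img_short / (page_short / MM_PER_INCH)
    return min(dpi_long, dpi_short)

pro1 = (3264, 2448)  # 8 Mpixel image size quoted above
pages = {"A5": (148, 210), "A4": (210, 297), "A3": (297, 420), "A2": (420, 594)}

for name, size_mm in pages.items():
    print(f"{name}: about {effective_dpi(pro1, size_mm):.0f} dpi")
```

With the 3264 x 2448 image this gives roughly 279 dpi for A4 and roughly 395 dpi for A5, and the figure halves from A4 to A2, which is why more pixels matter for the larger formats.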

When shooting, always use a neutral mode (no color or contrast enhancement in the camera; you can always improve it later with proper software) and the lowest ISO possible for the given lighting conditions. In my experience, it is better to underexpose than to overexpose. Always take care of proper focus. For all practical purposes, you cannot fix that once the shot is taken.

As for cheaper compact cameras, judging from the specifications, the 10 Mpixel Canon A640 or a successor may be good for repro applications.

Some cameras have a special text mode. It is just a preset selection of camera settings. Good for novices.

My other two cameras, the 8 Mpixel Casio Ex-Z850 and the 10 Mpixel Casio Ex-Z1000, provide the text mode. They have a very nice 9-point focusing screen. Since they use the same size sensor, the 8 Mpixel Casio is better for repro applications. It is more difficult to focus with the higher resolution Casio.

3. Lighting

With the v-cradle design, use a bright light as high over the cradle as possible. A diffuser that spreads the light and makes it more uniform is useful. A soft flash (possibly bounced off the ceiling) can be used for black and white pages. The most uniform lighting you can get is sunlight outdoors, with the original paper document at a proper angle to the sun.

4. OCR

ABBYY FineReader 8 is a marvelous piece of software, but sometimes it is vicious and puts strange and obscene words in the recognized text (it is just the way it interprets the picture). It is very risky to use an OCR-ed document without correcting the misspellings manually. I never waste time on correction, since I produce the output as a picture-true PDF copy of the original with the OCR-ed text layer underneath. That way you can index and search the documents on your computer and always read and print a perfect copy of the original document.
