10-02-2017, 12:28 PM   #1
Aesys
Device: Google Books / Kindle App
Post-processing scanned PDFs

Hi


I was recommended to post this on MobileRead, so hopefully you lovely peeps can help.

A couple of questions that actually converge with regard to post-processing.

1 - I downloaded a load of out-of-copyright books from Archive.org.
See an example at https://archive.org/details/receiptbook00rolf
In the above example, the book has been photographed and its paper colour is included within the PDF. When I scan my own books, my Brother MFC scanner has a background removal feature that does a great job, but I can't find an equivalent for things that have already been scanned.
Foxit PhantomPDF has a 'convert to' function, but with the old fonts and lower scan quality it screws things up big time.
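
To make the background removal question concrete, here is a rough sketch of the kind of per-page operation meant. It is not how the Brother feature works (that is unknown); it simply assumes most of a page is blank paper, takes a bright percentile of the grey levels as the paper shade, and rescales so that shade becomes white. The function name, the percentile, and the nested-list input are all hypothetical, standing in for a page decoded to greyscale by whatever imaging library is used.

```python
def remove_paper_tint(gray_rows, paper_percentile=0.90):
    """Rescale grey levels so the estimated paper shade maps to pure white.

    gray_rows: list of rows, each a list of ints in 0..255 (hypothetical
    stand-in for one page decoded to greyscale).
    """
    flat = sorted(v for row in gray_rows for v in row)
    # Assumption: most of the page is blank paper, so a high percentile of
    # the brightness values approximates the paper shade.
    paper = flat[int(paper_percentile * (len(flat) - 1))]
    if paper == 0:
        # Entirely black page: nothing sensible to rescale against.
        return [row[:] for row in gray_rows]
    # Linear stretch, clamped so anything brighter than the paper stays white.
    return [[min(255, round(v * 255 / paper)) for v in row] for row in gray_rows]


# Tiny made-up "page": yellowed paper around 200, one dark ink pixel at 40.
page = [[200, 198, 40], [201, 199, 202]]
cleaned = remove_paper_tint(page)
```

A linear stretch like this lightens the paper but leaves stains and uneven lighting in place, so a real tool would likely need something more local than one global paper value.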

2 - There have been a couple of items and books that I have destructively scanned through said Brother MFC printer scanner. The only scan options on it are black & white or colour. However, when you have a black & white printed book and nearly every page has a colour image, the whole scan needs to be colour, which over a couple of hundred pages increases the size of the PDF.
I remember an old scanner which had a combined 'black & white and colour' setting, but this new one doesn't.
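
As an illustration of what an automated 'black & white and colour' decision might look like, here is a minimal sketch that classifies a whole page by how many of its pixels are clearly coloured rather than grey. The function name and both thresholds are hypothetical and would need tuning on real scans. It only helps with pages that contain no colour at all; a page that mixes text with a colour image would need region-level handling, which this does not attempt.

```python
def page_needs_colour(rgb_pixels, chroma_threshold=40, min_fraction=0.01):
    """Return True if enough pixels are clearly coloured rather than grey.

    rgb_pixels: flat list of (r, g, b) tuples for one page (hypothetical
    stand-in for pixel data read with an imaging library).
    """
    if not rgb_pixels:
        return False
    # A grey pixel has near-equal channels; a coloured one has a large spread.
    coloured = sum(
        1 for r, g, b in rgb_pixels
        if max(r, g, b) - min(r, g, b) > chroma_threshold
    )
    return coloured / len(rgb_pixels) >= min_fraction


# Made-up pages: slightly warm paper with black text, and one with a red plate.
text_page = [(250, 248, 245)] * 95 + [(20, 20, 22)] * 5
plate_page = [(250, 248, 245)] * 80 + [(180, 40, 30)] * 20
text_is_colour = page_needs_colour(text_page)
plate_is_colour = page_needs_colour(plate_page)
```

The chroma threshold is deliberately loose so that yellowed paper is not mistaken for a colour image.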

3 - The Brother MFC printer scanner scans strictly at A4 or A3. So if a book or document is a little smaller, there are edge markings / gaps. I have spoken to Brother, and the non-cropping is a feature of the device (lol), not a setting to change. Any recommendations on how I post-process these?
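
For the cropping question, one simple approach is to work out the crop box from the known page size instead of detecting edges. The sketch below assumes the book page sits against the top-left corner of the scanner glass and that the scan resolution is known; the function name and the optional inset (to shave off edge shadows) are hypothetical.

```python
MM_PER_INCH = 25.4


def crop_box_pixels(page_width_mm, page_height_mm, dpi, inset_mm=0.0):
    """Return (left, top, right, bottom) in pixels for a page of known size.

    Assumption: the page is aligned to the top-left corner of the scan.
    inset_mm trims the same amount from every edge to drop shadow lines.
    """
    def to_px(mm):
        return round(mm / MM_PER_INCH * dpi)

    left = top = to_px(inset_mm)
    right = to_px(page_width_mm - inset_mm)
    bottom = to_px(page_height_mm - inset_mm)
    return (left, top, right, bottom)


# An A5-sized page (148 x 210 mm) scanned on an A4 bed at 300 dpi.
box = crop_box_pixels(148, 210, 300)
```

The same box could then be applied to every page of a scan in a loop, which is the part that is tedious to do by hand in Photoshop.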

I have Foxit PhantomPDF, which does a great job of OCR and re-sizing, but I cannot seem to convert books to black & white, or get it to change the spec to black & white and colour, or even crop a PDF.
I am aware that Photoshop can do these tasks manually from an image extract of the PDF, but how do I automate a variable process, especially when there are a couple of hundred pages?

Are there other processes or software that can be used? I have the above, but am unwilling to spend on Acrobat or other software until I know my issues 'will' be resolved, and of course free or cheaper solutions are preferred.

I currently Boot Camp Windows as my default OS, so am happy to triple-boot Linux, but some strict instructions are necessary, as the last time I got involved in the command line I had to re-install my Windows partition.

I would like to non-destructively photo-scan a number of my books, but the above is worrying me, and I would like to resolve it before I invest time and money building a book scanner.


Ta

Dropbox link for a scanned example I would like to clean:
https://www.dropbox.com/s/87fpnx96lg...ement.pdf?dl=0

Last edited by Aesys; 10-02-2017 at 12:34 PM. Reason: Added DB Link for example to clean