
MobileRead Forums > E-Book Formats > PDF

Old 10-02-2017, 12:28 PM   #1
Aesys
Junior Member
Aesys began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Oct 2017
Device: Google Books / Kindle App
Post-processing scanned PDFs

Hi


I was recommended to post this on MobileRead, so hopefully you lovely peeps can help.

A couple of questions, which all converge on post-processing.

1 - I downloaded a load of out-of-copyright books from Archive.org.
See an example at https://archive.org/details/receiptbook00rolf
In the above example, the book has been photographed and its paper colour is included within the PDF. When I scan my own books, my Brother MFC scanner has a background removal feature that does a great job, but I can't find an equivalent for things that have already been scanned.
Foxit PhantomPDF has a 'convert to' function, but with the old fonts and lower scan quality it screws things up big time.

2 - There have been a couple of items and books that I have destructively scanned through said Brother MFC printer scanner. The only options on its scan are either black and white or colour. However, when you have a black-and-white printed book and nearly every page has a colour image, the whole scan needs to be colour, which over a couple of hundred pages increases the size of the PDF.
I remember an old scanner which had a combined black and white / colour setting, but this new one does not.

3 - The Brother MFC printer scanner scans strictly A4 or A3, so if a book or document is a little smaller, there are edge markings / gaps. I have spoken to Brother and the non-cropping is a feature of the device (lol), not a setting to change. Any recommendations on how I post-process these?
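
For what it is worth, the background removal in question 1 and the cropping in question 3 both come down to simple per-pixel arithmetic once a page has been extracted as a greyscale image. The following is a minimal Python sketch of that idea, not what Foxit, the Brother driver, or any other tool does internally. It assumes each page is already available as a list of rows of 0-255 grey values; the names `whiten_background`, `content_box` and `crop`, and the thresholds `paper_level` and `ink_level`, are made up for illustration and would need tuning per book.

```python
from typing import List, Tuple

# Assumption: a page is a list of rows of grey values, 0 = black, 255 = white.
Gray = List[List[int]]


def whiten_background(page: Gray, paper_level: int = 200) -> Gray:
    """Turn anything at least as bright as the paper into pure white.

    Darker values are stretched proportionally so ink stays dark.
    paper_level is a hypothetical threshold, not a measured value.
    """
    return [
        [255 if v >= paper_level else v * 255 // paper_level for v in row]
        for row in page
    ]


def content_box(page: Gray, ink_level: int = 128) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) around pixels darker than ink_level.

    bottom and right are exclusive, so they can be used directly in slices.
    A page with no ink at all is returned uncropped.
    """
    height, width = len(page), len(page[0])
    rows = [y for y in range(height) if any(v < ink_level for v in page[y])]
    cols = [x for x in range(width) if any(row[x] < ink_level for row in page)]
    if not rows or not cols:
        return 0, 0, height, width
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(page: Gray, box: Tuple[int, int, int, int], margin: int = 0) -> Gray:
    """Cut the page down to box, keeping an optional margin of pixels around it."""
    top, left, bottom, right = box
    top = max(0, top - margin)
    left = max(0, left - margin)
    bottom = min(len(page), bottom + margin)
    right = min(len(page[0]), right + margin)
    return [row[left:right] for row in page[top:bottom]]
```

A page would go through `whiten_background` first and then `crop(page, content_box(page), margin=...)`. Note that dark edge markings from the scanner count as ink here and would stop the box from shrinking, so real scans would probably need their edges masked or ignored first.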

I have Foxit PhantomPDF, which does a great job for OCR and re-sizing, but I cannot seem to convert books to black and white, get it to change the spec to mixed black and white and colour, or even crop a PDF.
I am aware that Photoshop can do these tasks manually from an image extract of the PDF, but how do I automate a variable process, especially when there are a couple of hundred pages?
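
One way to automate the "variable" part is to decide page by page whether colour is really needed, so that text-only pages can be saved as black and white and only the pages with pictures stay in colour. The sketch below is only an illustration of that decision in Python. It assumes each page has already been extracted as rows of `(r, g, b)` tuples and that the paper is close to white (a yellowed page would count as coloured unless its background is whitened first); `colour_fraction`, `choose_mode`, `tolerance` and `min_fraction` are hypothetical names and values, not settings from any existing tool.

```python
from typing import List, Tuple

# Assumption: a page is a list of rows of (red, green, blue) values, each 0-255.
Pixel = Tuple[int, int, int]
RGBPage = List[List[Pixel]]


def colour_fraction(page: RGBPage, tolerance: int = 24) -> float:
    """Fraction of pixels whose channels differ by more than tolerance.

    Grey, black and white pixels have nearly equal channels, so a large
    spread between the brightest and darkest channel suggests real colour.
    """
    total = 0
    coloured = 0
    for row in page:
        for r, g, b in row:
            total += 1
            if max(r, g, b) - min(r, g, b) > tolerance:
                coloured += 1
    return coloured / total if total else 0.0


def choose_mode(page: RGBPage, tolerance: int = 24,
                min_fraction: float = 0.01) -> str:
    """Return "colour" if enough of the page is coloured, otherwise "bw"."""
    if colour_fraction(page, tolerance) >= min_fraction:
        return "colour"
    return "bw"
```

Running `choose_mode` over every extracted page gives a list of which pages to keep in colour and which to convert, which could then drive whatever tool does the actual conversion.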

Are there other processes or software that can be used? I have the above, but am unwilling to spend on Acrobat or other software until I know my issues 'will' be resolved, and of course free or cheaper solutions are preferred.

I currently Boot Camp Windows as my default OS, so am happy to triple-boot Linux, but some strict instructions are necessary, as the last time I got involved in the command line, I had to re-install my Windows partition.

I would like to non-destructively photo-scan a number of my books, but the above is worrying me, so I am spending time trying to resolve it before I invest time and money in building a book scanner.


Ta

Dropbox link to a scanned example I would like to clean:
https://www.dropbox.com/s/87fpnx96lg...ement.pdf?dl=0

Last edited by Aesys; 10-02-2017 at 12:34 PM. Reason: Added DB Link for example to clean
Old 10-02-2017, 01:10 PM   #2
orebmur
Veteran Linux user
orebmur ought to be getting tired of karma fortunes by now.
 
Posts: 144
Karma: 678910
Join Date: Mar 2017
Location: Barcelona/Spain
Device: Boyue Likebook Note & Mimas, Hisense A5, hopefully soon a PineNote
If you have never heard of Scan Tailor before, it is about time.
Great software; it enabled me to successfully split and clean up some scanned PDF files.
Check out github.com/scantailor/scantailor/wiki for documentation and some screenshots.

Last edited by orebmur; 10-02-2017 at 01:12 PM.
Old 10-03-2017, 03:42 PM   #3
Aesys
Junior Member
Aesys began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Oct 2017
Device: Google Books / Kindle App
Quote:
Originally Posted by orebmur View Post
If you have never heard of Scan Tailor before, it is about time.
Great software; it enabled me to successfully split and clean up some scanned PDF files.
Check out github.com/scantailor/scantailor/wiki for documentation and some screenshots.

Many thanks. I will be trying this; it looks perfect.
Similar Threads
Thread Thread Starter Forum Replies Last Post
chm to pdf, appears as scanned in some pdf softwares syriaccj Calibre 0 05-19-2013 02:51 PM
scanned pdf excalibra PDF 5 04-08-2011 04:41 AM
Any way to open a PDF in ABBYY 9.0 without actually processing the pages? Ea Workshop 3 03-07-2010 05:52 AM
processing scanned data into nice pdfs. axel77 iRex 17 03-20-2008 08:33 PM


All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.