#1
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2017
Device: Google Books / Kindle App
Post-processing scanned PDFs
Hi

I was recommended to post this on MobileRead, so hopefully you lovely peeps can help. I have a couple of questions that converge on post-processing.

1 - I downloaded a load of out-of-copyright books from Archive.org (see, for example, https://archive.org/details/receiptbook00rolf ). In that example the book has been photographed, and its paper colour is included in the PDF. When I scan my own books, my Brother MFC scanner has a background removal feature that does a great job, but I can't find an equivalent for things that have already been scanned. Foxit PhantomPDF has a 'convert to' function, but with the old fonts and lower scan quality it screws things up big time.

2 - There have been a couple of items and books that I have destructively scanned through said Brother MFC printer/scanner. The only scan options on it are black and white or colour. However, when you have a book printed in black and white and nearly every page has a colour image, the whole scan needs to be colour, which over a couple of hundred pages increases the size of the PDF. I remember an old scanner that had a combined black-and-white and colour setting, but this new one does not.

3 - The Brother MFC printer/scanner scans strictly A4 or A3, so if a book or document is a little smaller, there are edge markings and gaps. I have spoken to Brother, and the lack of cropping is a feature of the device (lol), not a setting I can change.

Any recommendations on how to post-process these? I have Foxit PhantomPDF, which does a great job of OCR and resizing, but I cannot seem to convert books to black and white, get it to change the spec to mixed black and white and colour, or even crop a PDF. I am aware that Photoshop can do these tasks manually from an image extract of the PDF, but how do I automate a variable process, especially when there are a couple of hundred pages? Are there other processes or software that can be used?

I have the above, but am unwilling to spend on Acrobat or other software until I know my issues 'will' be resolved, and of course free or cheaper solutions are preferred. I currently run Windows through Boot Camp as my default OS, so I am happy to triple-boot Linux, but some strict instructions are necessary, as the last time I got involved with the command line I had to reinstall my Windows partition.

I would like to non-destructively photo-scan a number of my books, but I am worried about how much time the issues above will take to resolve, and I want to sort them out before I invest time and money in building a book scanner.

Ta

DB link for a scanned example I would like to clean: https://www.dropbox.com/s/87fpnx96lg...ement.pdf?dl=0

Last edited by Aesys; 10-02-2017 at 12:34 PM. Reason: Added DB Link for example to clean
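Question 2 above comes down to deciding, page by page, whether colour is actually needed. A minimal sketch of that decision follows, assuming each page has already been extracted from the PDF as a grid of RGB pixels (the extraction step needs an image or PDF library and is not shown). The function names, the `tolerance` of 30 and the `min_fraction` of 1% are illustrative assumptions, not settings from any scanner or from Foxit PhantomPDF; yellowed paper like the Archive.org example would probably need a higher tolerance.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255
Page = List[List[Pixel]]      # a page as rows of pixels


def is_colour_pixel(pixel: Pixel, tolerance: int = 30) -> bool:
    """Treat a pixel as coloured when its channels differ by more than `tolerance`.

    Grey, black and white pixels have nearly equal R, G and B values.
    """
    return max(pixel) - min(pixel) > tolerance


def page_needs_colour(page: Page, tolerance: int = 30,
                      min_fraction: float = 0.01) -> bool:
    """Return True when at least `min_fraction` of the pixels look coloured."""
    total = sum(len(row) for row in page)
    if total == 0:
        return False  # an empty page never needs colour
    coloured = sum(is_colour_pixel(p, tolerance) for row in page for p in row)
    return coloured / total >= min_fraction


def plan_scan_modes(pages: List[Page]) -> List[str]:
    """Label each page 'colour' or 'black/white' for later per-page saving."""
    return ["colour" if page_needs_colour(p) else "black/white" for p in pages]


if __name__ == "__main__":
    # Tiny hypothetical pages: one text-only, one containing coloured pixels.
    text_page = [[(255, 255, 255), (20, 20, 20)],
                 [(250, 250, 250), (0, 0, 0)]]
    photo_page = [[(255, 255, 255), (200, 40, 30)],
                  [(30, 90, 200), (0, 0, 0)]]
    print(plan_scan_modes([text_page, photo_page]))
```

Pages labelled black/white could then be saved as greyscale or bitonal images and only the rest kept in colour, which is where the size saving would come from.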
#2
Veteran Linux user
Posts: 150
Karma: 1000000
Join Date: Mar 2017
Location: Barcelona/Spain
Device: Boyue Likebook Note & Mimas, Hisense A5, hopefully soon a PineNote
If you have never heard of ScanTailor, it is about time.

Great software; it enabled me to successfully split and clean up some scanned PDF files. Check out github.com/scantailor/scantailor/wiki for documentation and some screenshots.

Last edited by orebmur; 10-02-2017 at 01:12 PM.
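The cropping and background clean-up asked about in questions 1 and 3 of the first post can be illustrated with a rough sketch. This is not how ScanTailor works internally; it only shows the basic idea on a page that is assumed to be already available as a grid of greyscale values. The function names and both thresholds (128 for ink, 200 for paper) are hypothetical choices, and a real scan with dark edge shadows would need those shadows handled before a simple bounding box gives a sensible crop.

```python
from typing import List, Optional, Tuple

GreyPage = List[List[int]]       # rows of greyscale values, 0 = black, 255 = white
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def content_box(page: GreyPage, ink_threshold: int = 128) -> Optional[Box]:
    """Find the smallest box holding every pixel darker than `ink_threshold`.

    Returns None for a page with no dark pixels at all.
    """
    rows = [y for y, row in enumerate(page)
            if any(v < ink_threshold for v in row)]
    cols = [x for row in page
            for x, v in enumerate(row) if v < ink_threshold]
    if not rows:
        return None
    return rows[0], min(cols), rows[-1], max(cols)


def crop(page: GreyPage, box: Box, margin: int = 0) -> GreyPage:
    """Cut the page down to `box`, keeping `margin` extra pixels where available."""
    top, left, bottom, right = box
    top = max(top - margin, 0)
    left = max(left - margin, 0)
    # Slices that run past the page edge are simply truncated by Python.
    return [row[left:right + 1 + margin]
            for row in page[top:bottom + 1 + margin]]


def whiten_background(page: GreyPage, paper_threshold: int = 200) -> GreyPage:
    """Push every pixel at least as light as `paper_threshold` to pure white."""
    return [[255 if v >= paper_threshold else v for v in row] for row in page]


if __name__ == "__main__":
    # Hypothetical 4x4 page: off-white paper with two dark pixels in the middle.
    page = [[255, 255, 255, 255],
            [255, 10, 220, 255],
            [255, 230, 0, 255],
            [255, 255, 255, 255]]
    box = content_box(page)
    if box is not None:
        print(whiten_background(crop(page, box)))
```

Running the same three steps over every extracted page is what makes the process automatic rather than a manual Photoshop job, although a dedicated tool does this with far more care.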
#3
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2017
Device: Google Books / Kindle App
Many thanks. I will be trying this; it looks perfect.