|09-03-2011, 06:13 PM||#1|
Join Date: Jun 2010
Cropping Double-Paged Scans for PDF
I have been scanning books for over 10 years now - using the traditional flat-bed scanners (not the hyper sophisticated ones they have for library archives). Thus I have had books that turn out with the two pages open before you for each page in a PDF file.
I have had my own work-around for cropping them and then converting them into single page PDFs for years using Photoshop (see below). I also discovered using Ubuntu that there is a program that specifically performs this function (but I can't recall the name at the moment - and am not at an Ubuntu machine - but see below for how you can identify that program).
So I was searching today to see if there were any programs or tricks out there others knew about for PC's or Macs (particularly PC's).
I saw this post - from three years ago - and then a few others about that old:
"Originally Posted by joeanne12 2009
Hi, I have searched for an answer but can't find one. I have a couple of PDF books that have two pages printed on the same page and cannot convert them successfully. I can't even separate the pages. Does anyone have any idea how this can be done?"
This comes YEARS after this thread. But I thought I'd add something - as I am currently looking for a solution for use on a PC.
So here are my two solutions (PC and Mac using Photoshop & Ubuntu - but can't recall the program name). Below that I post a few other suggestions that I discovered but have yet to try. I am going to try this in ABBYY FineReader first. But there are also apparently Briss, Snapter, and PaperCrop that might work, and someone offered a Perl script (if the link still works). There is also, at the bottom, a method using Acrobat to print to a new PDF file - which I was trying to figure out before I found these instructions. I'll give it a shot, as the description below is very clear.
I have been scanning my books for years. The way I had performed this reduction from the two-page scan to single pages was to create a batch process in Photoshop that divides the page 50% left and saves it - with a sequential page number and an ".l" - and then running a second batch that divides the page 50% right - and similarly saves it with an ".r"
So I get a directory of docX.001.l, docX.001.r, ... docX.n.l, docX.n.r - I then simply import the image files into Acrobat.
It takes some time - but with the batch processes (I haven't been able to create one that does both left and right with one process - but I am not that sophisticated in programming photoshop) it is fairly easy and painless. It will just take a while for a lengthy book to be split into twice as many image files and saved - and then imported into acrobat.
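For anyone who would rather script the split than build Photoshop batches, here is a minimal sketch of the two pieces described above: the 50% left/right crop boxes and the docX.001.l / docX.001.r naming. It is not the Photoshop action itself. The function names and the 2480x1754 scan size are made up for illustration, and the boxes would still have to be fed to whatever image tool does the actual cropping.

```python
def split_spread(width, height):
    """Return (left_box, right_box) as (x0, y0, x1, y1) pixel boxes."""
    mid = width // 2  # with an odd width the extra column goes to the right half
    return (0, 0, mid, height), (mid, 0, width, height)


def half_names(stem, page_count):
    """Yield names like docX.001.l, docX.001.r in reading order."""
    digits = max(3, len(str(page_count)))  # zero-pad so names sort correctly
    for n in range(1, page_count + 1):
        for side in ("l", "r"):
            yield f"{stem}.{n:0{digits}d}.{side}"


# Hypothetical scan size in pixels, and a two-scan "book".
left, right = split_spread(2480, 1754)
names = list(half_names("docX", 2))
print(left, right)
print(names)
```

Because the numbers are zero-padded and "l" sorts before "r", a plain alphabetical sort of the file names keeps the pages in reading order when they are imported into Acrobat.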
NOW - I am still looking to see if anyone has come up with a better way - and I am going to try ABBYY FineReader. I have a copy - but have never used it - so I will see if it can do it.
BUT I do have GOOD NEWS (just not as complete as I wish). On the computers where I run Linux (Ubuntu) there is an application that does exactly this. It's just that I cannot remember the name. I do recall, though, that if you use Ubuntu's built-in software installer and search for PDF - you will find it that way.
Hope this helps. I will post a new post as well so others might be able to benefit from this - or if others have new information.
Here were the other suggestions given to date - I haven't tried them yet:
Finereader Pro 9 (http://www.abbyy.com) and Snapter (http://www.snapter.atiz.com) can do this.
I managed to get some readable documents by using PaperCrop (see http://www.mobileread.com/forums/showthread.php?t=31677)
You could try BRISS ( https://sourceforge.net/projects/briss/ )
Old 06-28-2010, 08:46 AM #4
If you have Perl installed on your machine, you can use this script of mine.
You can use Adobe Acrobat (not the Reader):
1. Choose Adobe PDF printer
2. Set Page Scaling to "Tile Large Pages", set Tile Scale to 100% and overlap to 0
3. Print the document with Adobe PDF printer as a new .pdf file
|09-03-2011, 09:43 PM||#3|
Join Date: Jun 2010
OK - the results for anyone interested. I am never going to use my manual photoshop method again!!! Nor will I try the somewhat convoluted internal acrobat method I posted (although it might work - I also found someone post another method http://www.mobileread.com/forums/sho...d.php?t=135660).
But that is because I found the PERFECT solutions.
First - ABBYY FineReader is far more interesting than I thought - and I am likely going to switch to it from Omnipage for OCR and scanning from now on. (That is unless I find a way to get a BookDrive scanning system. I just discovered it, and I wish I had seen someone selling their mini unit a few months back - I would have scooped it up and then tried their software.) The BookDrive makers have a scanning and an editing program - and they also publish Snapter. Snapter looks very promising for cleaning up book scans - particularly those with bent pages (the bend when the book is open flat on two pages) - but I don't see how it crops ... at least not quickly and easily. From what I gathered in trying it (not getting far, though) it does a single page at a time.
The nice thing about ABBYY FineReader is that it will take a double-paged PDF scan, recognize it, and automatically crop/split the pages. I am working on a 600 page book right now - so I haven't seen the final product - but it is taking so long because of the OCR, not necessarily the crop.
But for simply cropping - two of the tools that I posted I have tried and they both work like a charm.
I'd give them both two thumbs up. I used both using the graphical interface - although I believe they can both be operated via the command line.
Of the two - one is completely free (Briss), the other (A-PDF Page Crop) is effectively so - but the $35 license is certainly reasonable if you are going to use it often. It works well enough to justify purchasing it.
I'll try and compare these two then: The output of these is identical. The speed of processing is nearly so - but I believe A-PDF Page Crop produced the final PDF a bit quicker (but we are talking seconds here - so either way it was quick for a 600 page book, which was originally 300 double-page scans).
Both were very easy to use. Briss (which is downloadable through SourceForge) does not have as pretty or intuitive a user interface. Very basic. But it has an interesting feature. It analyzes the whole document in terms of the two facing pages - and then asks you to set the crop margins - by overlaying multiple pages that are similar on top of each other. You then adjust a grey selection box to set the margins you want to keep in the crop.
That is an amazing feature - not in A-PDF. If you have one of those books that has extreme variations in the original scan (especially if you have large books and old books and you were careful with the spine in scanning them - then in the middle you often have different dimensions ... i.e. the bend in the book causes, for example, the outer margins to be closer to the center) ... this is the perfect tool. Also for when you have scans where there is a good deal and variety of skewing of the page.
So in my 300 converted to 600 page book, as an example, it reduced all 300 double sided images to three categories for setting the crop for the two new pages to be produced. And it tells you exactly what pages (when you mouse over) are included in each set. I set each of the three sets of margins accordingly - and pushed crop - and in less than a minute I had a new, perfect, single paged document.
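To give a feel for what "reducing 300 images to three categories" can mean, here is a rough sketch that groups pages whose dimensions are nearly the same. This is only an illustration of the idea and not how Briss actually works internally. The function name, the 5-unit tolerance, and the sample sizes are all assumptions.

```python
def group_pages(sizes, tolerance=5):
    """Group 1-based page numbers whose (width, height) are within `tolerance`.

    Each page joins the first group whose reference size is close enough;
    otherwise it starts a new group.
    """
    groups = []  # list of (reference_size, [page_numbers])
    for page_no, (w, h) in enumerate(sizes, start=1):
        for (ref_w, ref_h), members in groups:
            if abs(w - ref_w) <= tolerance and abs(h - ref_h) <= tolerance:
                members.append(page_no)
                break
        else:
            # No existing group matched, so this page becomes a new reference.
            groups.append(((w, h), [page_no]))
    return [members for _, members in groups]


# Hypothetical page sizes: two clearly different widths with small variations.
sizes = [(1000, 700), (1002, 699), (960, 700), (1001, 701), (958, 702)]
clusters = group_pages(sizes)
print(clusters)
```

One crop setting per group is then enough, which is why a long book needs only a handful of adjustments.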
The controls are limited. Two menu items (File and Action). Pretty much you load the PDF file, it scans it in (it took longer than generating the final PDF to do this - which makes sense given its analysis of each page into the area with text in sets) and gives you an initial crop box, you then adjust those, and push crop and in a few seconds you have a perfect PDF.
It has one other useful feature - you can exclude pages. You just set the numbers or ranges. Here is the link for downloading: https://sourceforge.net/projects/briss/
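The "numbers or ranges" idea is easy to sketch. The comma-and-hyphen syntax below is an assumption chosen for illustration and may not match the exact format Briss expects in its dialog. The function name is hypothetical too.

```python
def parse_page_spec(spec):
    """Turn a string such as "1-3, 7" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"bad range: {part}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


# Exclude the cover pages and one foldout from a hypothetical 10-page scan.
excluded = parse_page_spec("1-3, 7")
kept = [p for p in range(1, 11) if p not in excluded]
print(excluded)
print(kept)
```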
I give it a 9/10. The one difficulty I had at first was figuring out how to set the crop margins. It puts them there first - and it numbers them - but in my first attempt it had one square across both pages. I couldn't initially figure out how to draw a new crop square - although the instructions say to use left click and scroll. I couldn't get that to work. But I was able to reduce the original square to the first page, then simply select, copy, and paste it to use for the second. I did the same for the three sets - and it worked perfectly. Otherwise it would be closer to a 9.8!! Especially because it is totally free.
Now for A-PDF Page Crop. Again you can operate it via a command line - but I used the graphical interface. It is much more advanced/complicated than Briss. It has a traditional menu bar as well as a viewing area and thumbnail area very similar to Acrobat.
You can also tweak it a bit more in several different ways - that could come in handy for other than a routine cropping.
Again you simply select your PDF and it opens it up. Here you scroll down the pages - just like in Acrobat. Find a good average double page. Then you click a tool to draw a crop square. Do it for both sides of the double page.
Then it has a nice feature. You can tell it to use that crop setting for all pages (or a range). Thus you only have to do it once - and it will uniformly apply to all pages.
Of course - if you have a document that varies internally - you can set the pages individually. You get to see the exact image of each page. You can also draw rectangles to divide the page up into quarters, for example.
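The "one crop setting for all pages, with individual pages set separately where needed" workflow can be pictured like this. It is only a sketch of the logic, not A-PDF Page Crop's own rule format or command line. The function name and the box coordinates are made up.

```python
def plan_crops(page_count, default_boxes, overrides=None):
    """Map each source page to the crop boxes that will be cut from it.

    default_boxes: list of (x0, y0, x1, y1) boxes applied to every page.
    overrides: optional {page_number: boxes} for pages that need their own.
    """
    overrides = overrides or {}
    return {
        page: overrides.get(page, default_boxes)
        for page in range(1, page_count + 1)
    }


# Hypothetical boxes drawn once on an average double page.
left = (40, 50, 600, 800)
right = (640, 50, 1200, 800)

# Page 3 was scanned slightly off, so it gets its own left-hand box.
plan = plan_crops(4, [left, right], overrides={3: [(30, 60, 590, 810), right]})
output_pages = sum(len(boxes) for boxes in plan.values())
print(output_pages)
```

Drawing four boxes instead of two on a page would split it into quarters in the same way, since each box becomes one output page.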
It also has options for automatically drawing a bleed area - and for trimming blank areas. I haven't played with that yet - but a nice feature.
There is also a way to write rules for importing and exporting - but I haven't explored that.
I'd give this a 9.5. If it were entirely free I'd bump it up to 9.9.
Here is the link for it: http://www.a-pdf.com/buy.htm
All in all - these two last options work perfectly - I wish I had known about them all these years I have been doing this manually with photoshop. Would have saved me probably two years on my life!!
|09-03-2011, 10:27 PM||#4|
Join Date: Jun 2010
I forgot to mention. You can generate a PREVIEW document using Briss. That was a nice feature. This would also be a nice feature if it were added to A-PDF Page Crop - although it is less essential because of the difference in the way the program views/shows you the PDF in the main screen. As I mentioned - a nice little feature of Briss is the batch analysis it does for you of all the pages - so that where the scan isn't necessarily uniform - it is easy to set different margins for the different sets quickly and easily. With A-PDF you instead get effectively the view of Acrobat - a main viewing/reading area (where you see the particular page you are on - and can scroll through the entire document) and a sidebar with thumbnails. That too is a nice feature. So you can effectively see what the PDF is going to look like before you process it - unlike in Briss. I still think - for "double checking" purposes - a preview would make a nice addition to A-PDF.
Also let me share with you what I found useful going back and starting to use the latter for more intricate editing choices before generating the new file (like bleed and trim). A very nice feature of the Acrobat-like interface is that it has on the menu bar a set of alignment tools as well as the alignment coordinates. So you can be EXACT in creating the two crops (each page) if you want/need.
There is also a feature in generating the output that you can set - to either use the dimensions of the crop rectangles - or to set the page dimensions manually (and it then would center your crop rectangle within that). This too will be nice when I start going back and editing some of my old scans. Particularly of the very ancient books - where I couldn't lay the book perfectly flat face-down on the flatbed scanner.
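The centering option is simple arithmetic: whatever space is left over on the fixed-size page is split evenly on both sides of the crop. A small sketch, where the function name and the sizes (in PDF points, 72 to the inch) are assumptions for illustration:

```python
def center_in_page(crop_w, crop_h, page_w, page_h):
    """Return the (x, y) offset that centers a crop on a fixed-size page."""
    if crop_w > page_w or crop_h > page_h:
        raise ValueError("crop is larger than the output page")
    return (page_w - crop_w) / 2, (page_h - crop_h) / 2


# Hypothetical example: a 400x600 pt crop placed on a 6x9 inch (432x648 pt) page.
offset = center_in_page(400, 600, 432, 648)
print(offset)
```

Using one fixed page size this way gives every output page the same dimensions even when the crop rectangles differ slightly from scan to scan.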
I hope this helps people. I have come across this topic posted in the forums here at MobileRead numerous times. I hope that I have been able to provide a good resource for others looking for this information in the future. [I am actually excited - it is rare that I find such perfect solutions to my problems - and can thus improve on my other improvised techniques!! I am not much of a programmer - so I can't turn my ideas quickly and easily into applications (I'm a philosopher!!!). But clearly the people who created Briss and A-PDF Page Crop both knew what a user like me was looking for and how to deliver it. I cannot begin to express how much time this is going to save me - and how much of the work I have left aside (in terms of OCR) for over a year now because I didn't want to crop/divide the double pages. It took a lot of time, patience, and disc space to first save the PDF to image files, then photoshop them into two sets, then reimport them into a new PDF ... !!!!! I was dealing with hundreds of books (which I was supposed to be reading, annotating, and analyzing ... not just scanning!!!!!). Now I can get back to work!]
|10-11-2011, 04:17 AM||#5|
Join Date: Oct 2011
|01-09-2012, 12:53 AM||#6|
Join Date: Jan 2012
Briss is Great
A big thank you to Philosopher for pointing us in the right direction.
I have tried Briss and it does exactly what I need.
Shame on Adobe for not implementing real cropping in Acrobat Pro.
|06-28-2013, 04:43 AM||#8|
Join Date: Sep 2012
Device: sony prs t1 kindle dx ipad
As it says in the link:
Another similar script
Last edited by markom; 06-28-2013 at 05:46 AM.