MobileRead Forums > E-Book Formats > PDF

Old 05-23-2012, 04:01 PM   #1
hernep
Enthusiast
hernep began at the beginning.
 
Posts: 30
Karma: 42
Join Date: Oct 2010
Location: Finland
Device: iRiver Story, iPad 2
Remove images from clearscanned PDF

Hi,

Sorry if this is obvious to others, but I am having trouble removing the images from every page of a PDF. I just want to strip them all out. Can this be done as a batch job?

I know I can use Acrobat's tool to remove images one by one, but that is a pain when you have more than 10 pages to go. Books don't usually have fewer than 10 pages.

My procedure is this:
1. I scan the page(s) and save them as color TIFF(s).
2. Combine the TIFF(s) into a single PDF.
3. Run Recognize Text, choosing ClearScan.
4. After recognition, the ClearScan text is overlaid on top of the color TIFF(s).
5. I then remove the background color TIFF(s) one by one...

I am currently trying Adobe Acrobat X (24 days to go) and I am wondering whether I am stuck removing images one by one... This is actually crucial to deciding whether to purchase Acrobat or not. I love ClearScan so much.
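
For what it's worth, here is a rough sketch of what a batch job would have to do on each page. It is not an Acrobat feature or any particular tool's method. In a PDF, an image is painted by a `/Name Do` operator in the page's content stream, so deleting those operators stops the background from being drawn. The function names and the sample stream below are made up for illustration, and the sketch assumes you already know which XObject names are images (form XObjects use `Do` as well). Reading the streams out of a real file and writing them back would still need a PDF library, and the simple regex does not guard against a match inside a text string.

```python
import re
import zlib
from typing import Set

# Matches "/Name Do": a PDF name followed by the paint-XObject operator.
_DO_PATTERN = re.compile(rb"/([^\s/\[\]<>(){}%]+)\s+Do\b")


def strip_image_draws(content: bytes, image_names: Set[str]) -> bytes:
    """Remove every `/Name Do` whose name is listed in image_names.

    content is an uncompressed page content stream. Names that are not
    in image_names (for example form XObjects) are left untouched.
    """
    def drop_if_image(match):
        name = match.group(1).decode("latin-1")
        return b"" if name in image_names else match.group(0)

    return _DO_PATTERN.sub(drop_if_image, content)


def strip_from_flate_stream(compressed: bytes, image_names: Set[str]) -> bytes:
    """Same as above for a FlateDecode (zlib) compressed content stream."""
    raw = zlib.decompress(compressed)
    return zlib.compress(strip_image_draws(raw, image_names))


# Hypothetical page: a full-page background image, then some text.
sample_page = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"BT /F1 12 Tf 72 700 Td (Hello) Tj ET\n"
)
print(strip_image_draws(sample_page, {"Im0"}))
```

Running the same function over every page's content stream is the "batch" part. The image data itself would still sit in the file until the unused XObjects are also dropped, so the file size would not shrink from this step alone.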
Old 05-29-2012, 02:05 PM   #2
DSpider
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
I'm not familiar with "Clearscan". Is it something that came bundled with the scanner? Um, I don't know... Maybe check the options?

For ABBYY FineReader (considered to be the best OCR-ing software around) you can find it here:
[Attached screenshot: Untitled.png, 42.3 KB]
Old 06-02-2012, 11:34 AM   #3
hernep
ClearScan is an Adobe Acrobat technique that vectorizes the fonts and creates "on-the-fly" fonts from them. The output is then very similar to the original, but the file size is smaller and printing is faster. Very nice when OCR'ing old books while retaining the original layout and feel.

What I have been thinking is that I should convert the scanned images to B/W images and then OCR them with the ClearScan technique. I think this will work. It seems it would then be easier to delete the unwanted backgrounds, because there would be none, or only a few.
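
A minimal sketch of the B/W conversion idea, just to show the arithmetic: each color pixel is reduced to a brightness value with the usual Rec. 601 luma weights and then compared with a cutoff. The function name, the pixel layout (rows of RGB tuples) and the fixed cutoff of 128 are assumptions for illustration. A real scan would be read with an image library, and an uneven or yellowed page may need an adaptive threshold instead.

```python
def to_black_and_white(pixels, threshold=128):
    """Convert rows of (R, G, B) tuples (0-255) to rows of 0 or 255.

    0 is black and 255 is white. The threshold is an assumed default,
    not a value tuned for any particular scanner.
    """
    result = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Rec. 601 luma: perceived brightness of a color pixel.
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(255 if luma >= threshold else 0)
        result.append(out_row)
    return result


# Hypothetical 2x2 patch: yellowish paper next to dark ink.
patch = [
    [(250, 245, 230), (40, 35, 30)],
    [(200, 190, 170), (90, 80, 70)],
]
print(to_black_and_white(patch))
```

With the paper pushed to pure white and the ink to pure black, there is little or no background left on the page to remove afterwards.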

I agree ABBYY is powerful for OCR'ing images to text. The downside is that it uses your installed fonts, so the output is not the same as the original. If you convert books to ePub or similar, that wouldn't be an issue. I am aiming for a "Xerox copy" of scanned books. That's why ABBYY is not what I am looking for.
By the way, when I use the options that you showed, I get a nice-looking PDF on the monitor, but when I transfer it to the iPad it looks horrible because of the reduced images. There might be something I am doing wrong, though.

Thank you very much for the reply!
All times are GMT -4.