03-06-2010, 08:15 AM | #1 |
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
|
Any way to open a PDF in ABBYY 9.0 without actually processing the pages?
... or just define the area template that must be used for the book before opening and processing?
I would like to avoid processing the book twice: once when opening it, and again when loading my area template. |
03-06-2010, 07:39 PM | #2 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
I don't know the answer, but I'd like to know what an area template is.
I was recently puzzled by FineReader 9 when processing image PDFs of old books (from Internet archives). I was getting awful results because the "reading" (OCR) process divided the page into recognition zones somewhat at random, sometimes with overlapping zones. So I extracted 50 pages from an image PDF book, let it do its dirty work, and then decided to redo it manually: deleting all the zones and drawing a text zone around the page. The result was excellent, but at the price of a lot of work. For these 50 pages, the "manual" processing speed was about two pages a minute... So you can guess why your "area template" looks tempting to me... |
03-07-2010, 02:59 AM | #3 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Well, I just found the solution in the FineReader manual.
It's called "Modèle de zone" in French, and I successfully used it to "read" a document (all pages selected). It is selected from the "zone" menu. I am now in the same position as you: when I open a document (image PDF), FineReader immediately starts to digitize and read it. So far, I have found no solution other than to accept this behaviour. Once it has finished, instead of saving, I just load the area template (a blk file), make sure all the pages of the document are selected, and then run "read" again. After that, I can save much improved results... It could have saved me some hours a few days ago... |
03-07-2010, 05:52 AM | #4 |
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
|
Glad you figured it out. It's a good way to cut out page headers and page numbers. But now you can see why it would be nice if the document was only read once. BTW, did you know that you can 'train' the character recognition? It helped me a lot when I had a scan of an old paperback with small print.
|