03-24-2009, 05:50 PM   #1
Student1
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
ABBYY FineReader and text formatting

Hi,

I always seem to turn to you guys for the best info on editing, conversions, and books, and I have another one for you if you have some experience with FineReader 9 Pro.

Each time I OCR a PDF and use the formatted-text option, the paragraphs get mixed up, or at least a lot of them do. So my next option is to use exact copy. That's great, but if I then save to HTML it gives me horrible results when converting to LRF or EPUB. So the next thing I tried was saving to DOC. In the DOC I get those horrible text boxes. I tried Ctrl-A to select everything, but nothing happens because all the text is inside the boxes. Saving to TXT doesn't give usable results either.

So does anyone have an idea how to either get formatted text out of FineReader without losing the paragraph order, or remove all the boxes in Word so I can copy to HTML and later convert to EPUB?

Any ideas are welcome!

Thanks!
03-24-2009, 06:16 PM   #2
slayda
Retired & reading more!
Posts: 2,743
Karma: 884247
Join Date: Sep 2006
Location: North Alabama, USA
Device: Kindle 1, iPad 4, iPhone 5
I use FineReader 9 Pro and have seen the text boxes you refer to, but only with PDFs that I got from somewhere else. Most of what I do is with PDFs from my ScanSnap scanner, and they work fine. One possible suggestion: you might try converting the PDF pages to an image format (e.g. JPEG) and feeding the images to FineReader. I have a program called "PDF to Image Converter" that I've used. You can get more info about it here.

I don't know if that will help. Where I've had the most problems with the text boxes is with brochures I've downloaded & OCRed. They are very frustrating.

Good luck.
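
For anyone who would rather script the PDF-to-image step than use a GUI converter, here is a minimal sketch. It assumes the free Poppler tool pdftoppm is installed and on the PATH; that is a swapped-in tool, not the "PDF to Image Converter" program mentioned above. The function names, the 300 dpi default, and the "page" file prefix are illustrative assumptions.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Return the pdftoppm argument list that renders each page as a JPEG."""
    return [
        "pdftoppm",
        "-jpeg",          # write JPEG files instead of the default PPM
        "-r", str(dpi),   # rendering resolution in dots per inch
        str(pdf_path),
        str(out_prefix),  # output files are named <prefix>-<page>.jpg
    ]


def pdf_to_jpegs(pdf_path, out_dir, dpi=300):
    """Render every page of pdf_path into out_dir and return the JPEG paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    prefix = out_dir / "page"  # hypothetical prefix, chosen for this sketch
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    return sorted(out_dir.glob("page-*.jpg"))
```

Calling pdf_to_jpegs("book.pdf", "pages") should leave one JPEG per page in the "pages" folder, and those images can then be opened in FineReader in place of the original PDF.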
03-24-2009, 07:31 PM   #3
Elfwreck
Grand Sorcerer
Posts: 5,140
Karma: 24387938
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Clié; PRS-505; EZR Pocket Pro, PRS-600, Kobo Mini
It helps to manually zone the Finereader batch; if it's zoning the pages with text boxes, they'll be text boxes in a Word doc, but if it's all zoned as one big block, that should be standard text on the page.
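
If the boxes have already ended up in the Word file, a script can at least pull the text out of them. This is a rough sketch, not a FineReader or Word feature. It assumes the file was saved as .docx (not the older binary .doc) and that the boxes are stored as w:txbxContent elements, which is how WordprocessingML holds text box content. The function names are made up for illustration, and the text comes out in document order, which may not match the reading order on the page.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
MC_FALLBACK = "{http://schemas.openxmlformats.org/markup-compatibility/2006}Fallback"


def paragraph_text(p):
    """Join the text runs (w:t) of one paragraph element."""
    return "".join(t.text or "" for t in p.iter(W + "t"))


def textbox_paragraphs(root):
    """Collect paragraph text found inside text boxes, in document order."""
    found = []

    def walk(element, inside_box):
        if element.tag == MC_FALLBACK:
            return  # newer Word files repeat each box here; skip the duplicate
        if element.tag == W + "txbxContent":
            inside_box = True
        if inside_box and element.tag == W + "p":
            found.append(paragraph_text(element))
            return
        for child in element:
            walk(child, inside_box)

    walk(root, False)
    return found


def textbox_paragraphs_from_docx(path):
    """Read word/document.xml from a .docx and extract its text box paragraphs."""
    with zipfile.ZipFile(path) as docx:
        root = ET.fromstring(docx.read("word/document.xml"))
    return textbox_paragraphs(root)
```

The returned list can be joined with blank lines and pasted into an HTML or text editor for cleanup before converting to EPUB.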
03-24-2009, 11:12 PM   #4
Student1
Quote:
Originally Posted by slayda
I use FineReader 9 Pro and have seen the text boxes you refer to, but only with PDFs that I got from somewhere else. Most of what I do is with PDFs from my ScanSnap scanner, and they work fine. One possible suggestion: you might try converting the PDF pages to an image format (e.g. JPEG) and feeding the images to FineReader. I have a program called "PDF to Image Converter" that I've used. You can get more info about it here.

I don't know if that will help. Where I've had the most problems with the text boxes is with brochures I've downloaded & OCRed. They are very frustrating.

Good luck.
Thanks. I usually do that with scans that have no OCR, but I'm doing this to convert to EPUB, so something like PDFRead is of no use in this case.
03-24-2009, 11:13 PM   #5
Student1
Quote:
Originally Posted by Elfwreck
It helps to manually zone the Finereader batch; if it's zoning the pages with text boxes, they'll be text boxes in a Word doc, but if it's all zoned as one big block, that should be standard text on the page.
That might be an idea. I'll check the settings; zoning everything in one block would work!
03-24-2009, 11:20 PM   #6
Student1
I can't seem to find any way to read the page as one block. Analyse will always select the regions to read. And formatted text does a horrible job, mixing up all the paragraphs... I don't understand; it shouldn't be too hard to see that what is scanned first doesn't come after what is scanned second... weird bug!
12-15-2011, 07:37 PM   #7
linnx88
Enthusiast
Posts: 27
Karma: 10
Join Date: Jul 2011
Device: Kindle Paperwhite
Quote:
Originally Posted by Student1
I can't seem to find any way to read the page as one block. Analyse will always select the regions to read. And formatted text does a horrible job, mixing up all the paragraphs... I don't understand; it shouldn't be too hard to see that what is scanned first doesn't come after what is scanned second... weird bug!
Student,

What you do is select the "Formatted Text" layout in the top bar. I have ABBYY FineReader 11 Pro, so I'm not sure if your version has it, but version 11 is simply amazing. I selected that, and it got rid of all the stupid boxes and put everything in a nice flat format. It also lets you convert the file straight to EPUB, and it even has an option to send to Kindle (through email)!!

I converted a low-quality image PDF about PHP and SQL programming with VERY few errors, which were easy to correct!

Last edited by linnx88; 12-15-2011 at 09:03 PM.