09-15-2007, 06:49 AM   #127
ereszet
Zealot
Posts: 118
Karma: 306
Join Date: Sep 2007
Device: Sony PRS-500 Archos 704 wifi
I have followed all your pdflrf releases with growing amazement at what you have achieved and how quickly you have responded to new demands and challenges. Apart from all the options pdflrf offers, it is extremely fast. Believe me, I have tried scores of programs and utilities (DOS, Windows, Ubuntu) to process PDF/DjVu page scans of old books (like Google Books) before OCR-ing them with FineReader, and none comes even close to your program. Thank you.
And now my humble suggestion: could you add PDF as an output format? The Sony Reader is only one of many toys for reading books, while the PDF format is universal. It would be very useful to improve the readability of PDF files before OCR-ing them and storing them in my laptop library, or before reading them on the Archos 704 (I just ordered it and hope that its 7" screen will make a difference compared with the Sony's 6").
For your information, my workflow before discovering pdflrf was: 1. loading page images or PDFs into FineReader, 2. recognizing blocks of text and images, 3. saving images containing only those blocks (without the surrounding white space), 4. loading those images back into FineReader, 5. OCR-ing, 6. saving to PDF (text under image). Of course, the original page images need a lot of cleaning before going into FineReader, because otherwise all the black margins and blobs would be recognized as blocks and prevent removal of the white space surrounding the text.
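To make step 3 concrete, here is a minimal Python sketch of the general idea of cropping a page to its content. It is not how pdflrf or FineReader does it; the function names, the list-of-rows page representation, and the threshold of 128 are all my own assumptions for illustration. A page is treated as rows of grayscale values (0 is ink, 255 is white paper), and the crop is simply the bounding box of everything darker than the threshold.

```python
def content_bbox(page, threshold=128):
    """Return (top, left, bottom, right) of all pixels darker than threshold.

    bottom and right are exclusive. Returns None for a page with no dark pixels.
    """
    rows = [y for y, row in enumerate(page) if any(p < threshold for p in row)]
    if not rows:
        return None
    cols = [x for x in range(len(page[0]))
            if any(row[x] < threshold for row in page)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop_to_content(page, threshold=128):
    """Return the page cropped to its dark content (unchanged if it is blank)."""
    box = content_bbox(page, threshold)
    if box is None:
        return page
    top, left, bottom, right = box
    return [row[left:right] for row in page[top:bottom]]


# A tiny 5x6 "page": W is white paper, B is ink.
W, B = 255, 0
page = [
    [W, W, W, W, W, W],
    [W, W, B, B, W, W],
    [W, W, B, W, W, W],
    [W, W, W, W, W, W],
    [W, W, W, W, W, W],
]
cropped = crop_to_content(page)  # keeps only rows 1-2 and columns 2-3

# The same page with a black scanner margin down the left edge.
dirty = [[B] + row[1:] for row in page]
dirty_box = content_bbox(dirty)  # now stretches to the top, left and bottom edges
```

The `dirty` page shows why the cleaning matters: a single black margin counts as content, so the bounding box grows to the page edge and the white space around the text is no longer removed.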