MobileRead Forums > E-Book Formats > Workshop

12-18-2009, 03:23 AM   #1
ficbot
Wizard
Posts: 2,409
Karma: 4132096
Join Date: Sep 2008
Device: Kindle Paperwhite/iOS Kindle App
Any options besides PDF for mixed language documents?

I picked up a cheap scanner today and I am disappointed. I tried scanning a paperback book in English, and it did a terrible job, with lots of weird symbols all over the place. Then I tried the teaching guides, which are the main reason I wanted the scanner. What a mess! The problem seems to be that the text is half in French and half in English (it has prompts in English telling you what to say in French to the kids, for example "say 'je suis ici' while pointing at yourself"). So when I set the scanner to OCR mode with the language set to English, I got gibberish. When I set it to French, things improved a little and it got much of the text, but the result still needed a lot of cleaning up.

I thought maybe it was just that the software which came with the scanner was not that great. So I downloaded a few utilities which claim to extract text from PDFs. They had great reviews. They totally choked on the French parts.

The PDF looks fine (I made a two-page sampler for testing purposes), but it displays a bit too small for easy reading on the Sony. I uploaded it as a PDF, an LRF and an EPUB separately. The EPUB could not zoom at all (i.e. the page looked the same no matter what zoom level I chose). The LRF looked just like the PDF at the lowest zoom, but when I tried to zoom in, the text got garbled, just as it had when I tried to extract it from the PDF.

So, there are three possibilities here:

1) The scanner is not that great
2) The scanner is fine and I just need better software
3) Dual-language files are too hard and I am stuck with PDF

What do you think? Is there anything I can do here, or will I go to all this work just to wind up with itty bitty text in a PDF file? If so, it may not be worth scanning them all...
12-19-2009, 06:06 PM   #2
Jack Tingle
Punctuation Fetishist
Posts: 557
Karma: 1070000
Join Date: Nov 2008
Location: The Bluest Commonwealth In East America
Device: Kindle PW, Nexus 7 (2013), Galaxy S5 phone, Galaxy Tab 4 8.0
Probably 3. Maybe 2. All the scanner does is make a picture, so if you can see the picture, 1 is not it.

OCR software tries to recognize characters from the picture, and then turn them into words from a dictionary. You need two dictionaries, and some kind of referee to tell the OCR which one to use. There may not be such a thing.
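
To make the "two dictionaries and a referee" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular OCR package works: the word lists `ENGLISH` and `FRENCH` and the functions `referee` and `tag_words` are made up for this example, and a real engine would use full dictionaries and look at more than one word at a time.

```python
import re

# Toy word lists (hypothetical; real OCR software would ship full dictionaries).
ENGLISH = {"say", "while", "pointing", "at", "yourself", "i", "am", "here"}
FRENCH = {"je", "suis", "ici", "bonjour", "merci"}


def referee(word):
    """Decide which dictionary recognizes the word: 'en', 'fr', or 'unknown'."""
    w = word.lower()
    if w in ENGLISH:
        return "en"
    if w in FRENCH:
        return "fr"
    return "unknown"


def tag_words(text):
    """Split text into runs of letters and label each word with a language."""
    words = re.findall(r"[^\W\d_]+", text)
    return [(w, referee(w)) for w in words]


if __name__ == "__main__":
    for word, lang in tag_words("say 'je suis ici' while pointing at yourself"):
        print(word, lang)
```

Words that land in "unknown" are the ones a person would still have to clean up by hand, which matches the experience of getting gibberish when only one language is selected.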

Good Luck,
Jack Tingle
All times are GMT -4.

