Old 04-09-2020, 08:42 AM   #1
mzel
Member
mzel began at the beginning.
 
Posts: 15
Karma: 10
Join Date: Apr 2016
Device: Kindle PW2
OCR problem in a PDF file

I recently installed KOReader on a Forma. When I try to look up a word or reflow a page in a PDF, I get the error message "No OCR results or no language data..." I installed the Tesseract data files but still see the same error. Is it possible that the document's language setting affects this? In Book Information, Language is N/A. Can I override it somewhere?
Old 04-09-2020, 02:22 PM   #2
Frenzie
Guru
Frenzie ought to be getting tired of karma fortunes by now.
 
Posts: 976
Karma: 330014
Join Date: Oct 2014
Location: Antwerp
Device: Kobo Aura H2O
The document language doesn't really mean anything.

A file like https://github.com/tesseract-ocr/tes...ng.traineddata should go in the /mnt/onboard/.adds/koreader/data folder. If you get that message it probably means you accidentally put it in not quite the right location.

What does the full path to your file look like?
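
The location mentioned above can be double-checked by listing what is actually in that folder. A minimal Python sketch, assuming the device is mounted on a computer: the mount point `/media/KOBOeReader` and the helper name `find_traineddata` are hypothetical, and only the `.adds/koreader/data` part of the path comes from this thread.

```python
from pathlib import Path


def find_traineddata(data_dir):
    """Return the sorted names of *.traineddata files directly inside data_dir."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.traineddata") if p.is_file())


if __name__ == "__main__":
    # Hypothetical mount point; on the device itself the folder is
    # /mnt/onboard/.adds/koreader/data
    data_dir = Path("/media/KOBOeReader") / ".adds" / "koreader" / "data"
    found = find_traineddata(data_dir)
    if found:
        print("Found:", ", ".join(found))
    else:
        print("No .traineddata files directly in", data_dir)
```

An empty result would suggest the files ended up in a subfolder or somewhere else entirely, which is the situation described above.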
Old 04-09-2020, 08:08 PM   #3
mzel
As far as I can see, it is there (see the attachment).
What is even stranger is that the PDF has its original text layer, which Nickel sees but KOReader does not.
Attachment: Annotation 2020-04-09 200347.png
Old 04-09-2020, 09:08 PM   #4
mzel
OK, I missed two things:
1) A restart was needed after I installed those files.
2) I had forced OCR turned on.
This solved the problems above.
Old 04-09-2020, 09:19 PM   #5
mzel
However, none of the above got me to the desired end point. I have a magazine in PDF form with two columns of text, and with reflow on I get a single column scaled to the full screen width. Am I expecting too much?
Old 04-10-2020, 01:33 PM   #6
Frenzie
Hard to say, I'm not sure what else you'd expect from it? That's kind of its raison d'être.

(That being said, personally I typically prefer fit to content width.)
Old 04-11-2020, 10:50 AM   #7
mzel
My idea was to have free-flowing text that adjusts to the width of the screen according to the font size.
Old 04-11-2020, 12:12 PM   #8
Frenzie
So your complaint is that the column will only get smaller if you increase the font size but it won't get larger with a smaller font size? I guess that'd be a bug. You could see if you can manage something with k2pdfopt by itself.
Old 04-12-2020, 12:33 PM   #9
mzel
My complaint is that the text is not treated as text. The page is treated as a collage of two or three images, and each of those images is scaled to the width of the page. I'd expect it to treat text as text, where the number of characters per line depends on the width of the screen and the font size, the same way it is handled in EPUB or TXT files.
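
To make that expectation concrete, here is a rough Python sketch of what "text as text" reflow means: the number of characters per line is derived from the screen width and the font size, and the text is re-wrapped to match. This is only an illustration of the idea, not how KOReader or k2pdfopt work internally; the helper names and the assumed average glyph width of 0.5 em are made up for the example.

```python
import textwrap


def chars_per_line(screen_width_px, font_size_px, avg_char_em=0.5):
    """Estimate how many characters fit on one line.

    avg_char_em is an assumed average glyph width as a fraction of the font size.
    """
    return max(1, int(screen_width_px / (font_size_px * avg_char_em)))


def reflow(text, screen_width_px, font_size_px):
    """Re-wrap text so the line length follows screen width and font size."""
    width = chars_per_line(screen_width_px, font_size_px)
    # Collapse the original line breaks first, then wrap to the new width.
    return textwrap.wrap(" ".join(text.split()), width=width)


if __name__ == "__main__":
    sample = "The page is not a picture; the words should flow to fit the screen."
    for size in (20, 40):
        lines = reflow(sample, screen_width_px=600, font_size_px=size)
        print(f"font {size}px -> {len(lines)} lines")
```

Scaling a column image, by contrast, keeps the original line breaks and only changes the zoom factor.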
Old 04-12-2020, 01:47 PM   #10
Frenzie
See the attachments for how it behaves for me.
Attachments: Screenshot_20200412_194455.png, Screenshot_20200412_194459.png
Old 04-15-2020, 03:11 PM   #11
mzel
No, I see different behavior. I see two different variants: one for the small font, and one for the default and any larger size. For a font larger than the third setting from the left, it does not scale. See also the snapshot of the entire page.
Attachments: Reader_2020-Apr-15_110327.png, Reader_2020-Apr-15_110609.png, Reader_2020-Apr-15_110723.png, Reader_2020-Apr-15_110654.png
Old 04-15-2020, 03:48 PM   #12
Frenzie
In the future, could you please open with that?

You could try playing around with it in k2pdfopt on your PC. Also is this file something that's publicly available?
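
For experimenting on a PC, a small Python wrapper can make it easier to try different column settings. This is a hedged sketch: the flags used (`-ui-`, `-x`, `-col`, `-w`, `-h`, `-o`) reflect a general understanding of k2pdfopt's command line and should be checked against the tool's own usage text before relying on them. The file names, the helper `build_k2pdfopt_cmd`, and the 1440x1920 screen size are placeholders.

```python
import shutil
import subprocess


def build_k2pdfopt_cmd(src, dst, max_columns=2, width=1440, height=1920):
    """Build a k2pdfopt argument list (flags are assumptions; verify locally)."""
    return [
        "k2pdfopt",
        "-ui-",                    # assumed: skip the interactive menu
        "-x",                      # assumed: exit when the conversion is done
        "-col", str(max_columns),  # assumed: maximum number of columns to detect
        "-w", str(width),          # assumed: output width in pixels
        "-h", str(height),         # assumed: output height in pixels
        "-o", dst,                 # assumed: output file name
        src,
    ]


if __name__ == "__main__":
    cmd = build_k2pdfopt_cmd("magazine.pdf", "magazine_reflowed.pdf")
    if shutil.which("k2pdfopt") is None:
        print("k2pdfopt not found on PATH; would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
```

Changing `max_columns` and comparing the outputs is a quick way to see how the column detection reacts to a given file.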
Old 04-19-2020, 02:44 AM   #13
viceant
Addict
viceant ought to be getting tired of karma fortunes by now.
 
Posts: 257
Karma: 549458
Join Date: May 2016
Location: Spain... is pain.. :-(
Device: Sony prs-t1, Boyue Likebook Plus (T80s)
Excuse me, but on my e-reader, a Likebook Plus (T80s), OCR doesn't show any problem. However, I can't change the text size because it doesn't seem to be enabled: the available sizes are shown greyed out.

Edit: sorry, I figured out how to enable it.

Last edited by viceant; 04-19-2020 at 04:23 AM.
Old 04-19-2020, 06:34 PM   #14
mzel
Here is the publicly available fragment:
https://litportal.ru/trial/pdf/38836104.pdf
The full file shows even more problems.
Old 04-20-2020, 06:06 AM   #15
Frenzie
That document seems to be dealt with quite well, except it's a bit of a difficult one in that the number of columns isn't stable. Just setting it to three seems fine though, even on pages where there are only two columns.

Were you by any chance using a nightly instead of the stable release? There's a reason the GH bug report template asks for that kind of information. If so, you need the 2020.04.1 release to avoid some minor cache mismatch issues that were present for a couple of weeks.