#1
Zealot
Posts: 103
Karma: 519930
Join Date: Apr 2016
Device: Kobo Forma
OCR problem in a PDF file
I recently installed KOReader on a Forma. When I try to look up a word or reflow a page in a PDF, I get the error message "No OCR results or no language data..." I did install the Tesseract data files, but I still see the same error. Is it possible that the document's language setting affects this? In Book Information, Language is N/A. Can I override it somewhere?
#2
Wizard
Posts: 1,742
Karma: 730681
Join Date: Oct 2014
Location: Antwerp
Device: Kobo Aura H2O
The document language doesn't really mean anything.
A file like https://github.com/tesseract-ocr/tes...ng.traineddata should go in the /mnt/onboard/.adds/koreader/data folder. If you get that message, it probably means the file accidentally ended up in not quite the right location. What does the full path to your file look like?
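One way to answer the "full path" question is to list every `.traineddata` file under the folder in question. The following is only a minimal sketch: the helper name `list_traineddata` is hypothetical, and the folder is passed in as an argument because the path named above only applies on the device itself (on a PC the Kobo is mounted somewhere else).

```python
from pathlib import Path


def list_traineddata(root):
    """Return relative paths of all *.traineddata files found under root.

    Hypothetical helper for illustration; it only reports where files are,
    it does not decide whether a location is the right one.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # Folder missing or not mounted: nothing to report.
        return []
    return sorted(
        p.relative_to(root_path).as_posix()
        for p in root_path.rglob("*.traineddata")
        if p.is_file()
    )


if __name__ == "__main__":
    # Path taken from the post above; adjust it to wherever the device is mounted.
    for rel in list_traineddata("/mnt/onboard/.adds/koreader/data"):
        print(rel)
```

An empty result would suggest the data files were copied somewhere else entirely.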
#3
Zealot
As far as I can see, it is there.
But what is even stranger is that the PDF has its original text layer, which Nickel sees but KOReader does not.
#4
Zealot
OK, I missed two things:
1) A restart was needed after I installed those files.
2) I had forced OCR on.
This solved the above problems.
#5
Zealot
However, all of the above did not get me to the desired end point. I have a magazine in PDF form with two columns of text, and with reflow on I get a single column scaled to the full screen width. Am I expecting too much?
#6
Wizard
Hard to say, I'm not sure what else you'd expect from it? That's kind of its raison d'être.
(That being said, personally I typically prefer fit to content width.)
#7
Zealot
My idea was to have free-flowing text that adjusts to the width of the screen according to the font size.
#8
Wizard
So your complaint is that the column will only get smaller if you increase the font size, but it won't get larger with a smaller font size? I guess that'd be a bug. You could see if you can manage something with k2pdfopt by itself.
#9
Zealot
My complaint is that the text is not treated as text. The page is treated as a collage of 2-3 images, and each of those images is scaled to the width of the page. I'd expect it to treat text as text, where the number of characters in a line depends on the width of the screen and the font size, the same way it is handled in EPUB/TXT.
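To make that expectation concrete, here is a rough sketch of text-style reflow, where the line length follows from screen width and font size. It is not how KOReader or k2pdfopt work internally; the function names, the pixel values, and the assumed average glyph width (half the font size) are all hypothetical.

```python
import textwrap


def chars_per_line(screen_width_px, font_size_px, avg_char_ratio=0.5):
    """Estimate how many characters fit on one line.

    avg_char_ratio is an assumed average glyph width as a fraction of the
    font size; real fonts vary, so this is only an approximation.
    """
    return max(1, int(screen_width_px / (font_size_px * avg_char_ratio)))


def reflow(text, screen_width_px, font_size_px):
    """Re-wrap text so each line fits the estimated character budget."""
    width = chars_per_line(screen_width_px, font_size_px)
    # Collapse the original line breaks first, then wrap to the new width.
    return textwrap.wrap(" ".join(text.split()), width=width)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog. " * 4
    for size in (20, 40):  # hypothetical font sizes in pixels
        lines = reflow(sample, 400, size)
        print(f"font {size}px -> {len(lines)} lines")
```

With this model, a larger font gives fewer characters per line and more lines, instead of a fixed image of the column being scaled.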
#10
Wizard
See the attachments for how it behaves for me.
#11
Zealot
No, I see different behavior. I see two different variants: one for the small font, and one for the default and any larger size. For a font larger than the third setting from the left, it does not scale. See also a snapshot of the entire page.
#12
Wizard
In the future, could you please open with that?
You could try playing around with it in k2pdfopt on your PC. Also, is this file something that's publicly available?
#13
Addict
Posts: 373
Karma: 557596
Join Date: May 2016
Location: Spain... is pain.. :-(
Device: Sony prs-t1, Boyue Likebook Plus (T80s), Boyue Likebook Mimas
Excuse me, but on my e-reader (Likebook Plus T80s) OCR doesn't show any problem. However, I can't change the text size because it seems not to be enabled: the available sizes are greyed out.
Edit: sorry, I figured out how to enable it.
Last edited by viceant; 04-19-2020 at 04:23 AM.
#14
Zealot
Here is the publicly available fragment:
https://litportal.ru/trial/pdf/38836104.pdf
The full file shows even more problems.
#15
Wizard
That document seems to be dealt with quite well, except it's a bit of a difficult one in that the number of columns isn't stable. Just setting it to three seems fine though, even where there are only two columns.
Were you by any chance using a nightly instead of the stable release? There's a reason the GH bug report template asks for that kind of information.