Old 12-31-2018, 12:33 PM   #1636
willus
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by msh2050 View Post
Dear willus
Thanks for the help; it works now. That is great.
Please note that I tried to run the OCR, but the program closes as soon as it starts to use Tesseract. (I tried with and without the GUI, and I tried the 64-bit, 32-bit, and old-CPU versions.)

Please find attached the cmd error that I get before it closes.

regards
Thanks. I'll post a fix soon. It has to do with the Tesseract v4 code not being especially resilient when compiling/detecting SSE/AVX capability.
Old 12-31-2018, 10:18 PM   #1637
willus
Quote:
Originally Posted by willus View Post
Thanks. I'll post a fix soon. It has to do with the Tesseract v4 code not being especially resilient when compiling/detecting SSE/AVX capability.
I have posted some beta Windows builds [link removed since release of v2.51]. I'd appreciate it if somebody could try them on a relatively modern PC with Tesseract OCR and let me know (1) whether the OCR works and (2) what the header says (SSE? AVX?).

Last edited by willus; 01-04-2019 at 11:40 PM. Reason: Removed link
Old 01-01-2019, 10:48 AM   #1638
axet
Junior Member
Posts: 6
Karma: 42208
Join Date: Feb 2018
Device: android phone
Hello!

I did not expect a new release; otherwise, I would have informed you about another issue with the source code.

The problem happens sometimes: when there is only one element recognized on a text line (this happens for titles and right-aligned epigraphs), 'wrmap' is not aligned properly. As a result, the text is detected correctly and the page is formed normally, but when I ask for the back coordinates (the original coordinates on the source image), I get wrong results. This happens because 'wrmap' is malformed during parsing.

Check this fix:

* https://gitlab.com/axet/android-k2pd...33f1ae7ec17540
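To illustrate the kind of back-coordinate lookup described above, here is a minimal Python sketch. It assumes a simplified region map in which each entry records where a rectangle on the reflowed page came from on the source image. The names `RegionMap` and `to_source`, and all of the numbers, are hypothetical; this is not k2pdfopt's actual 'wrmap' structure, nor the code in the linked fix.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RegionMap:
    """Hypothetical stand-in for one region-map entry (not k2pdfopt's real type)."""
    dst_x: float   # left edge of the region on the reflowed page
    dst_y: float   # top edge of the region on the reflowed page
    width: float   # region width on the reflowed page
    height: float  # region height on the reflowed page
    src_x: float   # left edge of the same region on the source image
    src_y: float   # top edge of the same region on the source image
    scale: float   # reflowed-page pixels per source-image pixel


def to_source(maps: List[RegionMap], x: float, y: float) -> Optional[Tuple[float, float]]:
    """Map a point on the reflowed page back to source-image coordinates."""
    for m in maps:
        inside_x = m.dst_x <= x < m.dst_x + m.width
        inside_y = m.dst_y <= y < m.dst_y + m.height
        if inside_x and inside_y:
            return (m.src_x + (x - m.dst_x) / m.scale,
                    m.src_y + (y - m.dst_y) / m.scale)
    return None  # the point is not covered by any mapped region


# A title line whose source offset was recorded correctly...
good = RegionMap(dst_x=100, dst_y=50, width=200, height=40,
                 src_x=400, src_y=120, scale=2.0)
# ...and the same line with a wrong source offset (assumed here to be 0).
bad = RegionMap(dst_x=100, dst_y=50, width=200, height=40,
                src_x=0, src_y=120, scale=2.0)

print(to_source([good], 150, 60))  # (425.0, 125.0)
print(to_source([bad], 150, 60))   # (25.0, 125.0)
```

In this sketch the text and the page layout are unaffected by the bad entry; only the lookup back to the source image goes wrong, which matches the symptom reported above.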
Old 01-01-2019, 10:58 AM   #1639
axet
It would be even better if you added Android logging support:

* https://gitlab.com/axet/android-k2pd...82164e7d85f09b
Old 01-01-2019, 10:59 AM   #1640
willus
Quote:
Originally Posted by axet View Post
Hello!

I did not expect a new release; otherwise, I would have informed you about another issue with the source code.

The problem happens sometimes: when there is only one element recognized on a text line (this happens for titles and right-aligned epigraphs), 'wrmap' is not aligned properly. As a result, the text is detected correctly and the page is formed normally, but when I ask for the back coordinates (the original coordinates on the source image), I get wrong results. This happens because 'wrmap' is malformed during parsing.

Check this fix:

* https://gitlab.com/axet/android-k2pd...33f1ae7ec17540
Thanks--I'll take a look before the next release.
Old 01-01-2019, 03:40 PM   #1641
willus
Quote:
Originally Posted by axet View Post
Hello!

I did not expect a new release; otherwise, I would have informed you about another issue with the source code.

The problem happens sometimes: when there is only one element recognized on a text line (this happens for titles and right-aligned epigraphs), 'wrmap' is not aligned properly. As a result, the text is detected correctly and the page is formed normally, but when I ask for the back coordinates (the original coordinates on the source image), I get wrong results. This happens because 'wrmap' is malformed during parsing.

Check this fix:

* https://gitlab.com/axet/android-k2pd...33f1ae7ec17540
Can you post an example source PDF and the conversion settings where this fix made a difference? I'd like a test case.
Old 01-02-2019, 07:59 AM   #1642
axet
You can choose any PDF with centered or right-aligned text. The one I'm using is a DjVu file (I hope Russian and DjVu are not an issue): https://drive.google.com/open?id=1rH...2pJAFuKN-y9men

Page 2 has the title text and the ISBN number, center- and right-aligned. When you parse this page and request coordinates for the title (Бэтман Аполло) and ISBN (ISBN 978-5-699-63446 -0) text lines, it returns left-aligned coordinates.

If the user clicks on the screen using those coordinates, the text on the source image will not be selected properly because the coordinates are incorrect.

Conversion settings are the defaults, with screen size = Android screen size. The exact size is not important; any small screen, such as 1080x1920, will suffice.
Attached Thumbnails: Screenshot_1546433541.png
Old 01-02-2019, 08:22 AM   #1643
axet
I can give you a visual example of the image regions: you can see that the center- and right-aligned regions are mispositioned; with the fix, everything seems normal.
Old 01-02-2019, 08:23 AM   #1644
axet
images attached
Attached Thumbnails: 111.png, 222.png
Old 01-02-2019, 08:31 AM   #1645
willus
Quote:
Originally Posted by axet View Post
You can choose any PDF with centered or right-aligned text. The one I'm using is a DjVu file (I hope Russian and DjVu are not an issue): https://drive.google.com/open?id=1rH...2pJAFuKN-y9men

Page 2 has the title text and the ISBN number, center- and right-aligned. When you parse this page and request coordinates for the title (Бэтман Аполло) and ISBN (ISBN 978-5-699-63446 -0) text lines, it returns left-aligned coordinates.

If the user clicks on the screen using those coordinates, the text on the source image will not be selected properly because the coordinates are incorrect.

Conversion settings are the defaults, with screen size = Android screen size. The exact size is not important; any small screen, such as 1080x1920, will suffice.
Axet--can you please send me a private message with your e-mail address? I'd like to have further discussion but don't want to clutter the thread.
Old 01-02-2019, 12:07 PM   #1646
msh2050
Enthusiast
Posts: 27
Karma: 122330
Join Date: Sep 2017
Device: ipad , Kindle PW3
Quote:
Originally Posted by willus View Post
I have posted some beta Windows builds. I'd appreciate it if somebody could try them on a relatively modern PC with Tesseract OCR and let me know (1) whether the OCR works and (2) what the header says (SSE? AVX?).
Dear willus,
It is working now.
I checked the OCR in all editions (64, 32, 32g).

regards
Mustafa
Old 01-02-2019, 09:20 PM   #1647
willus
Quote:
Originally Posted by msh2050 View Post
Dear willus,
It is working now.
I checked the OCR in all editions (64, 32, 32g).

regards
Mustafa
Thank you. I'll post an official release after resolving the issue with axet.
Old 01-03-2019, 04:13 AM   #1648
abe1
Junior Member
Posts: 9
Karma: 84406
Join Date: Jan 2019
Device: Kindle 5 (2012)
What am I doing wrong?



Pdf Link: https://drive.google.com/file/d/1wuR...ew?usp=sharing

Any help is appreciated
Old 01-03-2019, 08:41 AM   #1649
willus
Quote:
Originally Posted by abe1 View Post
What am I doing wrong?

Pdf Link: https://drive.google.com/file/d/1wuR...ew?usp=sharing

Any help is appreciated
See the k2pdfopt FAQ, fourth from last question: "Sometimes I get multiple rows of text at smaller magnification than the rest of the document. Why?" The bottom line is you'll want to put something like:

-gtr 0.1

in the "Additional Options" box. This will encourage k2pdfopt to break lines apart more readily. You may have to adjust the number up or down. The default is 0.006. Larger values give more encouragement to break the lines apart.
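For anyone running k2pdfopt from a script rather than the GUI, the same option can be passed on the command line. The following is a minimal Python sketch that only builds and prints candidate command lines for a few -gtr values, starting from the default of 0.006, so they can be tried one at a time. The helper name `build_k2pdfopt_command`, the executable name `k2pdfopt`, and the file name `book.pdf` are assumptions for illustration.

```python
import shlex
from typing import List


def build_k2pdfopt_command(pdf_path: str, gtr: float,
                           executable: str = "k2pdfopt") -> List[str]:
    """Return an argument list that passes -gtr before the input file.

    The executable name is an assumption; use the actual name or path
    of your k2pdfopt binary.
    """
    return [executable, "-gtr", str(gtr), pdf_path]


# Print commands for a few values: the default (0.006) and some larger
# ones, which encourage k2pdfopt to break lines apart more readily.
for gtr in (0.006, 0.05, 0.1, 0.2):
    print(shlex.join(build_k2pdfopt_command("book.pdf", gtr)))
```

The sketch does not launch k2pdfopt itself; each printed line can be pasted into a terminal, or the argument list can be handed to `subprocess.run`.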

Last edited by willus; 01-03-2019 at 08:44 AM.
Old 01-04-2019, 04:56 PM   #1650
willus
k2pdfopt v2.51 released

K2pdfopt v2.51 is released. This fixes an issue in v2.50 where the Tesseract OCR would not run on modern PCs and enhances the accuracy of the Tesseract v4.0.0 OCR. See details at the web site.