02-18-2023, 11:38 AM   #16
rkomar
Wizard
Posts: 2,986
Karma: 18343081
Join Date: Oct 2010
Location: Sudbury, ON, Canada
Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633
I am running 64-bit Linux, but because of Illegal Instruction errors, I have been running the 32-bit version of k2pdfopt-2.53. Unfortunately, the first command above fails after a dozen pages because k2pdfopt cannot allocate enough memory (it fails while trying to allocate a 273 MB buffer). The resident memory usage must have hit the limit for 32-bit programs. I have 24 GB of RAM in the system, so this is a frustrating roadblock.

The memory usage for that first command is way higher than for the command given in post #6. Is it "-ocrd p" that causes the usage to skyrocket?
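
Since the 32-bit versus 64-bit build matters here, one way to confirm which kind of binary is actually being run is to read the class byte in its ELF header. This is a minimal sketch: the function name `elf_class` is made up for illustration, and it relies only on the standard `od` and `tr` utilities.

```shell
#!/bin/sh
# Report whether an ELF executable is 32-bit or 64-bit.
# Byte 4 of the ELF header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
elf_class() {
    # First check the 4-byte ELF magic number: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    # Read the single class byte at offset 4 as an unsigned decimal.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}
```

For example, `elf_class "$(command -v k2pdfopt)"` would report the class of whichever k2pdfopt is first on the PATH, assuming it is installed under that name.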
02-18-2023, 06:08 PM   #17
willus
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Sent you a PM. Please check. Would like to get your doc so I can validate the memory usage.
02-18-2023, 07:19 PM   #18
willus
Quote:
Originally Posted by rkomar View Post
I am running 64-bit Linux, but because of Illegal Instruction errors, I have been running the 32-bit version of k2pdfopt-2.53. Unfortunately, the first command above fails after a dozen pages because k2pdfopt cannot allocate enough memory (it fails while trying to allocate a 273 MB buffer). The resident memory usage must have hit the limit for 32-bit programs. I have 24 GB of RAM in the system, so this is a frustrating roadblock.

The memory usage for that first command is way higher than for the command given in post #6. Is it "-ocrd p" that causes the usage to skyrocket?
Try adding -nt 1. This will queue up only 1 image at a time for OCR. Otherwise k2pdfopt tries to queue up multiple images so that it can perform multi-threaded OCR, which is faster, but consumes much more memory. BTW, even with -nt 8, k2pdfopt used at most 1.5 GB RAM on my system. Seems strange it should run out if you have 24 GB available.

On a virtual Fedora 37 machine with 16 GB RAM, with k2pdfopt v2.54, I get the following results from this command:

k2pdfopt -nt <XX> -mode copy -dpi 600 -ocr t -ocrd p src.pdf

32-bit, <XX> = 8 failed on page 13 trying to allocate a 1-GB bitmap
32-bit, <XX> = 4 same result as above
32-bit, <XX> = 2 completed successfully
64-bit, <XX> = 8 completed successfully (consumed up to 5.5 GB during the run)

With k2pdfopt v2.53:

64-bit fails in Fedora 37 because it was compiled on an earlier Linux kernel.
32-bit, <XX> = 8 and 4 fail on page 13 trying to allocate a 270-MB image
32-bit, <XX> = 2 fails on page 14
32-bit, <XX> = 1 fails on page 17

v2.53 has a known memory leak, which is fixed in v2.54.
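
The sweep over -nt values above could be scripted along these lines. This is only a dry-run sketch: it prints each command rather than running it, the helper name `k2_cmd` is hypothetical, and `src.pdf` is the same placeholder used above. The flags are copied unchanged from the command in this post.

```shell
#!/bin/sh
# Dry run: print the k2pdfopt command for each -nt value instead of executing it.
k2_cmd() {
    # $1 = value for -nt, $2 = source PDF (placeholder name, no spaces)
    printf 'k2pdfopt -nt %s -mode copy -dpi 600 -ocr t -ocrd p %s\n' "$1" "$2"
}

# Same -nt values as in the results listed above.
for nt in 8 4 2 1; do
    k2_cmd "$nt" src.pdf
done
```

Piping the output to `sh` would run the four conversions in turn. The sketch does not deal with how later runs treat the output file left by earlier ones.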

Last edited by willus; 02-18-2023 at 08:13 PM.
02-18-2023, 09:02 PM   #19
willus
One note--you don't really need to use 600 dpi for the first command even if your source document is at 600 dpi. The purpose of the first command is solely to get an accurate OCR conversion, and with a typical font size of 10-12 points, Tesseract seems to be the most accurate right around 300 dpi. As the dpi is increased it actually becomes slightly less accurate.
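
As a rough illustration of why the dpi matters for memory as well as accuracy: pixel count grows with the square of the dpi, so a 600 dpi page bitmap has four times the pixels of a 300 dpi one. The sketch below assumes a US Letter page and an uncompressed 24-bit RGB bitmap. Both are assumptions made for the arithmetic, not figures taken from k2pdfopt, and `bitmap_bytes` is a hypothetical helper.

```shell
#!/bin/sh
# Bytes in one uncompressed 24-bit RGB bitmap of a US Letter page (8.5 x 11 in).
# Page size and colour depth are assumptions, not values read from k2pdfopt.
bitmap_bytes() {
    res=$1                          # resolution in dots per inch
    width=$(( 85 * res / 10 ))      # 8.5 in, kept in integer arithmetic
    height=$(( 11 * res ))
    echo $(( width * height * 3 ))  # 3 bytes per pixel
}

for d in 300 600; do
    echo "$d dpi: $(bitmap_bytes "$d") bytes"
done
# prints:
# 300 dpi: 25245000 bytes
# 600 dpi: 100980000 bytes
```

Under these assumptions that is about 25 MB per page at 300 dpi versus about 101 MB at 600 dpi. The buffers k2pdfopt really allocates may be sized differently.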

Last edited by willus; 02-19-2023 at 03:03 PM. Reason: Updated Tesseract accuracy link
02-18-2023, 09:02 PM   #20
rkomar
Yes, that was the problem. I don't know why I was running 2.53 when I went to the download page less than a week ago, but somehow I ended up with that (most likely my fault). Version 2.54 works well for me. Thanks for figuring out what I did wrong.
02-18-2023, 09:04 PM   #21
willus
The k2pdfopt download page was temporarily failing to list v2.54. It has been fixed.