04-03-2013, 05:53 AM | #376 |
Junior Member
Posts: 4
Karma: 10
Join Date: Apr 2013
Location: france
Device: koboglo
|
Thank you for your quick answer.
I discovered the "crop box"; the explanation is clear. For this document (no columns), I can use Calibre to convert the PDF to EPUB (k2pdfopt is better for some details), but Calibre can't convert multi-column documents. I'll see what happens next time, with a multi-column doc. |
04-03-2013, 10:09 PM | #377 |
Wizard
Posts: 2,607
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Hi
A question about internal margins.
Thanks for this wonderful software. I am a Linux user (64-bit) and I have a Kobo Glo. I tried k2pdfopt on one of my own PDFs (just a ten-page trial). There is a header. I used the following command:
Code:
/opt/k2pdfopt brune.pdf -w 748 -h 1024 -odpi 213 -m 0.5 -p 10-20
I would like my text to be centered, that is, to have a 5px margin on both sides. I learned there is an -omb parameter, but I could get no result out of it. What would be your advice to improve the left and right margin display of this text on the Kobo?
Last edited by roger64; 04-03-2013 at 10:18 PM. |
04-04-2013, 12:35 AM | #378 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
/opt/k2pdfopt brune.pdf -w 758 -h 1024 -odpi 213 -m 0.5 -oml 0.025 -omr 0.025
The -oml and -omr options set the left and right output device margins in inches. |
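Since -oml and -omr take inches while the request was for 5px, the value has to be converted using the output resolution. A minimal sketch of that arithmetic, assuming the -odpi value is the right scale factor for output margins; px_to_inches is a hypothetical helper, not part of k2pdfopt:

```shell
#!/bin/sh
# Convert a margin in output pixels to inches for -oml / -omr.
# px_to_inches is a hypothetical helper, not a k2pdfopt feature.
px_to_inches() {
    # $1 = margin in pixels, $2 = output dpi (the -odpi value)
    awk -v px="$1" -v dpi="$2" 'BEGIN { printf "%.4f\n", px / dpi }'
}

px_to_inches 5 213   # prints 0.0235
```

At 213 dpi, 5px comes to about 0.0235 inches, which is close to the 0.025 used in the command above.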
|
04-04-2013, 03:33 AM | #379 |
Wizard
Posts: 2,607
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
@willus
Thanks for your explanation of how to compute the text margins. k2pdfopt is an impressive and amazingly precise tool. The conversion of a standard book like mine may need two kinds of commands: one for the cover page (and any other pages without margins), and another for normal pages with margins. So we may have several output files for one book. I can use pdfsam to merge these output files.
Last edited by roger64; 04-04-2013 at 03:36 AM. |
04-04-2013, 08:36 AM | #380 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
04-04-2013, 05:41 PM | #381 |
Connoisseur
Posts: 71
Karma: 18500
Join Date: Apr 2013
Device: Kindle Touch, Paperwhite
|
Bilingual texts?
Is there any way to OCR pages in multiple languages, for example a dictionary page, which is (usually) bilingual? I don't know whether Tesseract allows this, so it might be impossible to achieve.
|
04-04-2013, 11:08 PM | #382 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
Note that to see the Russian characters correctly, you need to copy and paste the Russian PDF page into a Unicode-aware application (like the Google Translate box in a modern browser). K2pdfopt does not use the correct Cyrillic font. The commands I used were:
k2pdfopt -mode copy -ocr t -ocrvis t multilingual.pdf -ocrlang eng -o multi_eng.pdf
k2pdfopt -mode copy -ocr t -ocrvis t multilingual.pdf -ocrlang rus -o multi_rus.pdf
Last edited by willus; 04-04-2013 at 11:14 PM. |
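The two commands above differ only in the language code and the output name, so the same pass can be scripted once per language. A minimal sketch, assuming the flags behave exactly as in the commands above; ocr_commands is a hypothetical helper, the output naming is an illustrative convention, and the script only prints the commands rather than running k2pdfopt:

```shell
#!/bin/sh
# Print one k2pdfopt OCR command per language, mirroring the commands above.
# ocr_commands is a hypothetical helper; it only echoes, it does not run k2pdfopt.
ocr_commands() {
    src="$1"; shift
    base="${src%.pdf}"
    for lang in "$@"; do
        # Output name is <source base>_<lang>.pdf (an assumed naming convention).
        echo "k2pdfopt -mode copy -ocr t -ocrvis t $src -ocrlang $lang -o ${base}_${lang}.pdf"
    done
}

ocr_commands multilingual.pdf eng rus
```

Each printed line can be checked by eye and then pasted into a terminal, which gives one OCR'd output file per language.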
|
04-06-2013, 01:31 PM | #384 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
04-07-2013, 06:51 AM | #385 | |
Junior Member
Posts: 2
Karma: 10
Join Date: Mar 2013
Device: Sony PRS-T1
|
Quote:
-m 0 -col 1 -fc- -wrap-
to prevent any wrapping / layout changes or text resizing. It worked pretty well, since the text size happens to fit the reader's (wide) screen quite well (so basically just dumb luck, plus switching to landscape orientation). I will try your solution to see how that works out. Thanks again for your help!
Last edited by Kornholio; 04-07-2013 at 07:00 AM. |
|
04-07-2013, 09:01 AM | #386 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
-m 0 -mode fw
The -mode fw option is a shortcut for several options. See my command-line options help page for the details. (Actually, you don't need -m 0 anymore with v1.65. It's now the default.) If you haven't tried the above command, you should. It's a good solution if you don't need text re-flow. |
|
04-07-2013, 03:03 PM | #387 |
Enthusiast
Posts: 29
Karma: 81500
Join Date: Apr 2013
Device: Kindle 4
|
Hi all,
just wanted to let you know that I have also updated my Windows GUI for k2pdfopt with a few of the new options of k2pdfopt, most importantly the OCR functions. The GUI contains links to all Tesseract training files, so downloading them is pretty easy. The GUI sets the respective environment variable; you only have to specify the path where you extracted the language files. I did not want to build the download and extraction procedure into the GUI because of possible safety concerns users might have ("Why does that program connect to the internet?!"), so that part is handled by your trusted browser. ;-) The new version 1.04.1 is available at my homepage.
Great work with the updates, Willus. Thank you once more. |
04-08-2013, 02:26 PM | #388 |
Enthusiast
Posts: 30
Karma: 2848
Join Date: Feb 2013
Location: Lithuania
Device: Kobo Glo
|
I have a problem with a three-column text. On one page, an image is placed so that it overlaps two of the columns, and the program does not read the page correctly: it does not recognize the text as three-column. It renders the second page fine. If I cut the image out by specifying a 3.4" bottom margin, the columns get recognized (although the lines separating the columns do not get ignored, which is a minor problem).
Could something be done about pages like this, or is it just too much play? Here is the file I have a problem with: http://www.nzidinys.lt/files/various...iene%20txt.pdf
Last edited by dgvirtual; 04-08-2013 at 03:02 PM. |
04-08-2013, 10:27 PM | #389 | |
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
k2pdfopt -col 4 -cgr .4 -evl 1 -sm -mb 1.1 -ch 0.5 Az.pdf
-col 4 enables detection of up to 4 columns (2 levels of recursion).
-cgr .4 limits the horizontal search range for the column divider. The value of .4 gets k2pdfopt to treat the left column divider as the first divider, which is the key to correct layout on page 1.
-evl 1 erases the vertical lines, which helps k2pdfopt find the column dividers.
-sm shows you how k2pdfopt is flowing your document (in the ..._marked.pdf file). You can take that out on the final conversion since it slows things down considerably.
-mb 1.1 ignores the page numbers / footer on the bottom of each page by cropping off the bottom 1.1 inches from each source page.
-ch 0.5 allows regions as short as 0.5 inches in height to be separated into multiple columns, which is important for page 1 (the default is 1.5 inches). |
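Because -sm is only useful while tuning and slows the conversion down, it can help to keep a draft and a final variant of the same command. A minimal sketch, using only the flags and values from the post above; build_cmd and its "draft" argument are hypothetical, and the script prints the command instead of running k2pdfopt:

```shell
#!/bin/sh
# build_cmd is a hypothetical helper: it prints the command, it does not run k2pdfopt.
# $1 = "draft" keeps -sm (marked output); anything else omits it for the final run.
build_cmd() {
    cmd="k2pdfopt"
    cmd="$cmd -col 4"    # detect up to 4 columns
    cmd="$cmd -cgr .4"   # limit the horizontal search range for the column divider
    cmd="$cmd -evl 1"    # erase vertical lines before looking for dividers
    if [ "$1" = "draft" ]; then
        cmd="$cmd -sm"   # also write the ..._marked.pdf file (slow)
    fi
    cmd="$cmd -mb 1.1"   # crop 1.1 inches off the bottom of each source page
    cmd="$cmd -ch 0.5"   # allow regions as short as 0.5 inches to be split into columns
    cmd="$cmd Az.pdf"
    echo "$cmd"
}

build_cmd draft
build_cmd final
```

The draft line matches the command in the post; the final line is the same command with -sm left out.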
|
04-13-2013, 09:07 AM | #390 |
Junior Member
Posts: 8
Karma: 8900
Join Date: Jan 2013
Device: kindle 4 nt
|
|