Old 08-28-2012, 09:37 AM   #136
sgtmac
Junior Member
Posts: 1
Karma: 1000
Join Date: Aug 2012
Location: Rancho Cucamonga CA
Device: ipad 3-Kindel Fire-Kindel first edition
I'm Sgt Mac, a newbie. This site looks like it will be very informative.
Old 08-28-2012, 09:46 AM   #137
knc1
Going Viral
Posts: 17,212
Karma: 18210809
Join Date: Feb 2012
Location: Central Texas
Device: No K1, PW2, KV, KOA
Quote:
Originally Posted by sgtmac View Post
I'm Sgt Mac, a newbie. This site looks like it will be very informative.
Be certain to explore our index system:
https://wiki.mobileread.com/wiki/Prefix_Index
Far from complete, but it does index a lot of useful stuff.
Old 08-28-2012, 09:51 AM   #138
geekmaster
Carpe diem, c'est la vie.
Posts: 6,433
Karma: 10773668
Join Date: Nov 2011
Location: Multiverse 6627A
Device: K1 to PW3
Quote:
Originally Posted by knc1 View Post
Be certain to explore our index system:
https://wiki.mobileread.com/wiki/Prefix_Index
Far from complete, but it does index a lot of useful stuff.
@sgtmac: If you forget the URL, you can find it in the Master Index sticky thread at the top of this Kindle Developer's Corner forum.
Old 09-03-2012, 12:26 AM   #139
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Merged k2pdfopt threads

I apologize if this inconveniences folks. I asked about putting up a cross-sticky to this thread in the PDF forum, and the solution decided on by the moderator ended up being to merge the two main k2pdfopt threads, with the merged thread residing in this PDF forum, which I do think makes the most sense. I do want to recognize WangMinh12, who started the k2pdfopt thread in the Kindle Development forum where it saw a lot of activity over the last year. Thank you, WangMinh12.
Old 09-03-2012, 08:55 AM   #140
twobob
( ͡° ͜ʖ ͡°){ʇlnɐɟ ƃǝs}Týr
Posts: 6,586
Karma: 6299991
Join Date: Jun 2012
Location: uti gratia usura (Yao ying da ying; Mo ying da yieng)
Device: PW-WIFI|K5-3G+WIFI| K4|K3-3G|DXG|K2| Rooted Nook Touch
I used this product on a couple of technical PDFs that were a complete nightmare on the Kindle.

Aside from a couple of gotchas (hanging indents, I'm looking at you) that could all be fixed with enough fiddling about with flags, it worked perfectly.

So 100% support on this; glad you got what you wanted in the end. And yes, thanks WangMinh12: the product has not gone unnoticed.
Old 09-05-2012, 05:03 PM   #141
markom
Banned
Posts: 488
Karma: 1080260
Join Date: Sep 2012
Device: sony prs t1 kindle dx ipad
First of all, many thanks to Willus for this great application.

I read a lot of scanned PDFs on 6" e-ink readers, and k2pdfopt has been my main weapon for about a year now.

Before k2pdfopt I used Briss, Pdfscissors, ABBYY FineReader 10, Scan Tailor, etc. to crop margins, and then printed the cropped PDF from Adobe Acrobat in Tile page mode (now also possible in Adobe Reader as Poster mode) at landscape dimensions (about 120x85 mm).

Now I usually use Briss together with k2pdfopt: first I crop the PDF with Briss, then I convert it in landscape mode with k2pdfopt, with wrap turned off, margins at zero, and output at 185 ppi instead of 167.

What I would like to see in a new version of k2pdfopt is the possibility to crop an OCR-ed PDF image at exactly the text width!

Is it possible to somehow use the OCR coordinates to crop the PDF image?

It is a matter of seconds to crop a textual PDF with many of the viewers or editors out there, so why is there no such command for a PDF image with OCR in the background?

For those who didn't know, there is a nice free online PDF cropping service for files under 10 MB that I sometimes use for OCR-ed PDF images.

You can split your PDF if it is too big and then upload the parts.

http://stripdf.com/

Last edited by markom; 09-05-2012 at 07:54 PM.
Old 09-05-2012, 11:04 PM   #142
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Cropping OCR'd text

Quote:
Originally Posted by markom View Post
What I would like to see in a new version of k2pdfopt is the possibility to crop an OCR-ed PDF image at exactly the text width!
Markom--I'm not quite sure what you're getting at since the whole objective of k2pdfopt is to magnify the text and/or to crop out excess white space and margins. Maybe you want to e-mail me via my web site and we can discuss it offline?
Old 09-06-2012, 12:30 PM   #143
markom
Banned
Posts: 488
Karma: 1080260
Join Date: Sep 2012
Device: sony prs t1 kindle dx ipad
Quote:
Originally Posted by willus View Post
Markom--I'm not quite sure what you're getting at since the whole objective of k2pdfopt is to magnify the text and/or to crop out excess white space and margins. Maybe you want to e-mail me via my web site and we can discuss it offline?
Sometimes removing black margins close to the text (due to bad scanning) is not perfect with k2pdfopt or any other application out there.
I mean, it is always at least good enough for reading on my small e-ink reader in landscape mode and I'm very happy with it, but on a small screen every millimeter sometimes matters, so I prefer to use Briss to crop the PDF image as close to the text as possible and then run the cropped PDF through k2pdfopt. Even then, the result is sometimes not perfect, or it takes more time.

I'm talking about PDF scans with OCR in the background (searchable image) here.

If there were a tool to automatically crop such a PDF at the text width (size), perhaps by using the already existing OCR layer in the background to determine exactly where to cut the front image, there would be no need to manually draw rectangles as in Briss or Pdfscissors, or to try different margin values in k2pdfopt for different pages, and the result would always be near perfect.

So, yes, this wish of mine is not directly connected with k2pdfopt itself, but as you've mentioned on your pages:

"... A future release might also have an option for a different type of output that would use cropping instructions rather than rasterizing to generate the converted file (similar to what is done in Cut2Col, SoPDF, and the latest version of PaperCrop, which all leave the text in searchable form if it started that way in the original file)."

maybe you can figure things out and grant us another great cropping tool.

Last edited by markom; 09-06-2012 at 02:08 PM.
Old 09-06-2012, 11:56 PM   #144
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Cropping to OCR'd text

Quote:
Originally Posted by markom View Post
If there were a tool to automatically crop such a PDF at the text width (size), perhaps by using the already existing OCR layer in the background to determine exactly where to cut the front image, there would be no need to manually draw rectangles as in Briss or Pdfscissors, or to try different margin values in k2pdfopt for different pages, and the result would always be near perfect.
I think I get what you want and it depends on whether I can figure out how to have MuPDF render only text primitives from the PDF file. I will add it to my k2pdfopt wish list. Thanks for the idea.
Old 09-07-2012, 04:07 AM   #145
RefUser
himself
Posts: 15
Karma: 5308
Join Date: Jul 2012
Device: PRS-T1,Nokia N900
Quote:
Originally Posted by willus View Post
I think I get what you want and it depends on whether I can figure out how to have MuPDF render only text primitives from the PDF file. I will add it to my k2pdfopt wish list. Thanks for the idea.
I'm not sure whether MuPDF allows rendering of page regions only, but it does provide bounding rectangles for text regions (see http://www.mupdf.com/doc/source/fitz.h#1583). So even if you have to render the entire page, you know where the text is. I use this in my PDF viewer to detect (or, at the moment, rather guess) columns and to crop and zoom accordingly (https://gitorious.org/flatboat/flatb...ge.cpp#line116).
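
A minimal sketch of the idea discussed here: if a library reports bounding rectangles for the text on a page, the crop box can be taken as the union of those rectangles, so the whole page is rendered and then cropped. This is neither MuPDF's API nor k2pdfopt code; the `(x0, y0, x1, y1)` rectangle format, the name `text_crop_box`, and the `pad` parameter are assumptions made for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical rectangle format: (x0, y0, x1, y1), origin at the top-left.
Rect = Tuple[float, float, float, float]


def text_crop_box(text_rects: Iterable[Rect], page: Rect, pad: float = 0.0) -> Rect:
    """Return the union of all text rectangles, padded and clamped to the page."""
    rects = list(text_rects)
    if not rects:
        # No text found (e.g. a scan without an OCR layer): keep the full page.
        return page
    x0 = min(r[0] for r in rects) - pad
    y0 = min(r[1] for r in rects) - pad
    x1 = max(r[2] for r in rects) + pad
    y1 = max(r[3] for r in rects) + pad
    # Never let the crop box extend past the page itself.
    return (max(page[0], x0), max(page[1], y0), min(page[2], x1), min(page[3], y1))
```

The same box could be computed per page, which would avoid drawing rectangles by hand for pages whose scanned margins differ.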
Old 09-07-2012, 10:13 PM   #146
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
k2pdfopt v1.50 released

K2pdfopt v1.50 is released. The major new feature is optical character recognition (OCR--English only), but there are several other new features that various users have requested. I've also released the source code. See the web site for more details.
Old 09-08-2012, 07:57 AM   #147
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Fitting tall figures to the page

Quote:
Originally Posted by cscat View Post
1. Can't you preserve images somehow, without changing their size? I mean if the image is large, don't split it just fit it into the page (or next blank page)? because right now, it splits big images. If you don't do that, then the reader can zoom in for those pictures (given this case happens rarely, it won't be too inconvenient for readers).
Cscat--In v1.50, use the command line option: -f2p -1
(or interactive menu option "bp"). This will fit arbitrarily tall figures to the device page size.
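
To illustrate what fitting a tall figure to the page involves, here is a rough sketch of the scaling arithmetic. It is not k2pdfopt's implementation: the function name `fit_to_page` is hypothetical, and the choice never to enlarge a figure that already fits is an assumption.

```python
from typing import Tuple


def fit_to_page(fig_w: int, fig_h: int, page_w: int, page_h: int) -> Tuple[int, int]:
    """Scale a figure down, keeping its aspect ratio, so it fits on one page."""
    # The tighter of the two constraints decides the scale.
    # Capping at 1.0 means small figures are left at their original size (assumption).
    scale = min(page_w / fig_w, page_h / fig_h, 1.0)
    return (round(fig_w * scale), round(fig_h * scale))
```

For example, a 560x1470 figure on a 560x735 page is limited by its height, so it is halved to 280x735 instead of being split across two pages.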
Old 09-08-2012, 08:59 AM   #148
markom
Banned
Posts: 488
Karma: 1080260
Join Date: Sep 2012
Device: sony prs t1 kindle dx ipad
Quote:
Originally Posted by willus View Post
I think I get what you want and it depends on whether I can figure out how to have MuPDF render only text primitives from the PDF file. I will add it to my k2pdfopt wish list. Thanks for the idea.
Idea number two, but please don't laugh too loud rolling on the floor if this sounds silly to you.

Now that 6" e-ink readers are pretty cheap, I read some of those problematic-size PDFs (magazines, newspapers, A4 or bigger books, etc.) on two e-readers placed side by side.

Two Kindles are about 18 cm wide in portrait mode and 24 cm wide in landscape position, which in landscape is considerably wider than a Kindle DX at about 20 cm.

So there is not necessarily any need to crop the margins, only to cut the PDF vertically in half for the left and right Kindle.

I've already done it a number of times using k2pdfopt.
For landscape position: display dimensions 560x1470, landscape, wrap off, column detection off, margin close to zero, output about 190 ppi.
For two Kindles in portrait position: (-h 735 -w 1120 -wrap- -m 0 -col 1 -vb -1)

After k2pdfopt I use free cutting tools (Pdfscissors, A-PDF Page Cut, etc.) to cut the k2pdfopt pages in half in a few seconds.

Reading PDFs this way is not problematic at all.
It helps, though, to put some cardboard or similar beneath the readers as a shim, to make it easier to hold them and keep them fixed together.

Anyway, those interested should now know that they can use k2pdfopt for this purpose too, but it would be nice if we could do it all within k2pdfopt.

Last edited by markom; 09-08-2012 at 12:57 PM.
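
The display dimensions quoted in this post can be reproduced with a little arithmetic. The sketch below assumes a usable area of 560x735 pixels per reader, which is inferred from the post's own numbers (1120 / 2 and 1470 / 2) and is not stated outright; the function name `combined_size` is hypothetical.

```python
from typing import Tuple


def combined_size(w: int, h: int, count: int = 2, landscape: bool = False) -> Tuple[int, int]:
    """Width and height of `count` identical readers placed side by side."""
    if landscape:
        # Turning each reader on its side swaps its width and height.
        w, h = h, w
    return (w * count, h)


print(combined_size(560, 735))                  # two readers in portrait
print(combined_size(560, 735, landscape=True))  # two readers in landscape
```

The portrait case gives 1120x735, matching `-w 1120 -h 735`. The landscape case gives 1470x560 as seen by the reader, which the post writes as 560x1470 together with the landscape setting.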
Old 09-08-2012, 01:32 PM   #149
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Double the fun

Quote:
Originally Posted by markom View Post
Now that 6" e-ink readers are pretty cheap, I read some of those problematic-size PDFs (magazines, newspapers, A4 or bigger books, etc.) on two e-readers placed side by side.

... it would be nice if we could do it all within k2pdfopt only.
Interesting concept. It's not hard to implement if you don't need a clean gap between the two side-by-side images (just save each half of the output image to a separate PDF file), but I'd think it would be distracting to have words or even individual letters split across the two Kindles. It seems a lot more cumbersome and less convenient than just getting a bigger e-reader or tablet.
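
One way to reduce the split-letter problem is to cut at the whitest pixel column near the middle of the page and not at the exact center. The following is only a sketch under assumptions, not k2pdfopt code: the page is a grayscale bitmap stored as a list of rows with 255 meaning white, and the names `find_split_column`, `split_page`, and `radius` are hypothetical.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of grayscale pixels, 255 = white (assumption)


def find_split_column(page: Bitmap, radius: int, white: int = 255) -> int:
    """Return the column within `radius` of the center with the most white pixels."""
    width = len(page[0])
    center = width // 2
    best_col, best_score = center, -1
    # Search outward from the center so that ties go to the most central column.
    for offset in range(radius + 1):
        for col in (center - offset, center + offset):
            if 0 < col < width:
                score = sum(1 for row in page if row[col] >= white)
                if score > best_score:
                    best_col, best_score = col, score
    return best_col


def split_page(page: Bitmap, col: int) -> Tuple[Bitmap, Bitmap]:
    """Cut the bitmap into a left part and a right part at column `col`."""
    left = [row[:col] for row in page]
    right = [row[col:] for row in page]
    return left, right
```

Each half could then be written to its own output file. When no blank column exists within the search radius, the cut still passes through text, so the concern about split letters remains for dense pages.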
Old 09-08-2012, 05:37 PM   #150
RefUser
himself
Posts: 15
Karma: 5308
Join Date: Jul 2012
Device: PRS-T1,Nokia N900
Quote:
Originally Posted by willus View Post
K2pdfopt v1.50 is released. The major new feature is optical character recognition (OCR--English only), but there are several other new features that various users have requested. I've also released the source code. See the web site for more details.
Great news! Thank you very much.
It compiles well on ARM btw (OCR switched off for now).
I tested it on a 25 page document, which took about 10.5 minutes.