|
|
#1 |
|
Guru
Posts: 736
Karma: 1123
Join Date: Dec 2009
Device: PRS-505, PRS-600, iPad 16GB Wifi
|
Advice for scanned PDF
Is there a program that can enhance or somehow re-render the text in a scanned PDF without my having to deal with single pages? |
|
|
|
|
|
#2 |
|
PRS+ author
Posts: 1,637
Karma: 2446233
Join Date: Dec 2007
Device: Sony PRS-300, 505, 600, 650, 950
|
Did you try this?
http://www.mobileread.com/forums/showthread.php?t=13135
__________________
PRS+ project: Folders, Book History, Key Bindings(programmable keys) enabling downloads from the built-in browser (950) and other goodies for Sony PRS 505 & 300 & 600 (CLICK HERE to see it on youtube). Now also on 350/650/950 models. |
|
|
|
|
|
|
|
|
#3 |
|
Connoisseur
Posts: 94
Karma: 999884
Join Date: Jun 2009
Device: prs700, i-mate JAMin, smartq v7, GeeksPhone Zero, iPad 3rd Gen
|
Try scantailor with color/grayscale output selected.
Regards |
|
|
|
|
|
#4 |
|
Amateur Radio
Posts: 3,936
Karma: 2506296
Join Date: Sep 2006
Location: USA
Device: Kindle Touch, iPad 3, iPhone 5
|
I've yet to view a scanned PDF that was very readable on a reader. Unless you can use OCR software, most of which does a lousy job of converting scanned images to reflowable text, you will generally have crap on your screen. If you can use OCR software and create reflowable text, and then if you spend hours editing to correct the copious mistakes of the OCR software and to format the book so that images and tables appear where and as they should, you might wind up with a decent ePub. It is a lot of work that might take as long as reading the doc in printed format. Notebook paper sized PDF files are not designed to be viewed on a book reader. They are designed to be printed. They really don't even work that well on large computer monitors.
Bottom line: If you can create truly reflowable text from the scanned doc, then with some work you can create a very readable ebook. If your scanned doc looks like a copy of a copy of a copy of a copy of a copy, fuzzy and difficult to read even when printed, then you probably won't be able to create a usable ebook.
__________________
Jack Currently: Kindle Touch, iPhone 5, and iPad 3. |
|
|
|
|
|
#5 |
|
Guru
Posts: 736
Karma: 1123
Join Date: Dec 2009
Device: PRS-505, PRS-600, iPad 16GB Wifi
|
I just gave this one a try, but it looks like all pages have to be single pictures. I tried it on a picture I had in my files, but I'm not sure what the program did. I made a project, applied some different settings and saved the project. How do I save the result?
On the other hand, processing every picture of a file is tedious, even with batch processing. I would also have to extract all the pages from my PDFs first, so it's quite a bit of work. As stated in my first post, my PDFs are mostly readable. I would have liked to increase the contrast of the text, but I'm not too eager to jump through hoops as long as the files are sufficiently readable. It is obvious that PDFs made for letter-sized prints or monitors are not the best thing for readers. Still, having the portability of the files on my reader is just plain awesome. I'm not looking for the 100% solution. I was just checking whether I could raise my current 80% to somewhere between 85% and 90% without too much effort. Anyway, thanks a lot for the help |
|
|
|
|
|
#6 | |
|
Connoisseur
Posts: 94
Karma: 999884
Join Date: Jun 2009
Device: prs700, i-mate JAMin, smartq v7, GeeksPhone Zero, iPad 3rd Gen
|
Quote:
I apologize for the oversight. I thought that you were starting from single images, not from a PDF. Anyway, pdftk can "burst" a multipage PDF into single-page PDFs. Then with "convert" (www.imagemagick.org) you can convert them to png, jpg ... and process them with scantailor. After processing you will have a set of TIFF files. From these a PDF can be built with tiff2ps and ps2pdf or, if you prefer, you can build a cbr (convert the set to png or jpg, rar the whole set, and rename the archive) and filter it through calibre or pdflrf. I have done this with a badly scanned comic with good results, but not with text. I will try some experiments and post the results to this thread. Regards |
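The chain described above might be sketched as a small shell script like the one below. It is only a rough sketch, not a tested recipe: the directory layout, the 300 dpi density, the function names and the `run` dry-run helper are all my own assumptions, and the scantailor step is left manual because it is normally done in its GUI. Also note that some ImageMagick installs disable PDF input through their security policy, in which case the convert step will fail until that is changed.

```shell
#!/bin/sh
# Sketch: scanned PDF -> single pages -> PNG -> (scantailor) -> PDF.
# Assumes pdftk, ImageMagick (convert), libtiff (tiff2ps) and
# Ghostscript (ps2pdf) are installed. Set DRY_RUN=1 to only print commands.

# Hypothetical helper: print the command, and run it unless DRY_RUN=1.
run() {
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Step 1: split the PDF into one PDF per page, then rasterize each to PNG.
# $1 = source PDF, $2 = working directory for the pages.
split_to_png() {
    src=$1; dir=$2
    run mkdir -p "$dir"
    run pdftk "$src" burst output "$dir/pg_%04d.pdf"
    for page in "$dir"/pg_*.pdf; do
        [ -e "$page" ] || continue      # glob matched nothing
        run convert -density 300 "$page" "${page%.pdf}.png"
    done
}

# Step 2 (manual): open the PNGs in scantailor and let it write TIFFs.

# Step 3: join the processed TIFFs into one PDF via PostScript.
# $1 = directory holding the TIFFs, $2 = name of the PDF to create.
tiffs_to_pdf() {
    dir=$1; dest=$2
    ps="${dest%.pdf}.ps"
    run tiff2ps -a -O "$ps" "$dir"/*.tif
    run ps2pdf "$ps" "$dest"
}
```

A possible use, with made-up names: `split_to_png Sample.pdf pages`, then process `pages/*.png` in scantailor with its output going to `pages/out`, then `tiffs_to_pdf pages/out Sample_clean.pdf`. Running it first with `DRY_RUN=1` shows the commands without touching any files.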
|
|
|
|
|
|
#7 |
|
Guru
Posts: 736
Karma: 1123
Join Date: Dec 2009
Device: PRS-505, PRS-600, iPad 16GB Wifi
|
Having a comparison would be awesome. I would appreciate your input.
Still, it sounds tedious!?!? |
|
|
|
|
|
#8 | |
|
Connoisseur
Posts: 94
Karma: 999884
Join Date: Jun 2009
Device: prs700, i-mate JAMin, smartq v7, GeeksPhone Zero, iPad 3rd Gen
|
Quote:
Yes, it is tedious. Let's begin with a PDF, Sample.pdf, in a Linux environment. In the end I managed to do the whole thing without scantailor.
Obtain the individual pages with pdftk:
    pdftk Sample.pdf burst
This writes pg_0001.pdf, pg_0002.pdf and so on to the same directory; in this case it ends with the third page.
Adjust the contrast of each page and perform some filtering:
    for a in $(seq -w 1 3); do convert -contrast -enhance pg_000$a.pdf eq_pg_000$a.pdf; done
Replace 3 with the last page number. Take care with the number of zeroes: the -w in seq pads the numbers with zeroes on the left, so with more than 9 and fewer than 100 pages use pg_00$a.pdf as input and eq_pg_00$a.pdf as output, and so on.
After this we have eq_pg_0001.pdf, eq_pg_0002.pdf and eq_pg_0003.pdf in the working directory. Now build a new PDF from the processed pages:
    pdftk eq_pg_*.pdf cat output eqSample.pdf
And that's all. convert is a cross-platform tool from ImageMagick: http://www.imagemagick.org/script/index.php. I don't know how to write scripts in MSDOS. I could try good old Digital DCL in VMS :-))
Hope this helps, regards. |
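The caveat about counting zeroes can probably be sidestepped by looping over the page files themselves instead of over numbers from seq. Below is a minimal sketch, assuming the pages were produced by `pdftk Sample.pdf burst`; the function names `enhance_pages` and `join_pages` and the `DRY_RUN` switch are my own inventions, not part of pdftk or ImageMagick.

```shell
#!/bin/sh
# Sketch: enhance every burst page without counting zeroes.
# Set DRY_RUN=1 to print the commands instead of running them.

# Apply the same ImageMagick operators as above to each pg_*.pdf in
# directory $1 (default: current directory), writing eq_pg_*.pdf there.
enhance_pages() {
    dir=${1:-.}
    for page in "$dir"/pg_*.pdf; do
        [ -e "$page" ] || continue          # no pages found
        out="$dir/eq_${page##*/}"           # pg_0001.pdf -> eq_pg_0001.pdf
        printf '%s\n' "convert -contrast -enhance $page $out"
        if [ "${DRY_RUN:-0}" != "1" ]; then
            convert -contrast -enhance "$page" "$out" || return 1
        fi
    done
}

# Rebuild one PDF from the enhanced pages. The shell expands the glob in
# sorted order, which matches page order as long as names are zero-padded.
# $1 = directory (default .), $2 = output file (default eqSample.pdf).
join_pages() {
    dir=${1:-.}; dest=${2:-eqSample.pdf}
    printf '%s\n' "pdftk $dir/eq_pg_*.pdf cat output $dest"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        pdftk "$dir"/eq_pg_*.pdf cat output "$dest"
    fi
}
```

After bursting, calling `enhance_pages` followed by `join_pages` in the working directory should give the same eqSample.pdf as the manual loop, whatever the page count. The pattern pg_*.pdf only matches names that start with pg_, so already processed eq_pg_ files are not picked up again on a second run.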
|
|
|
|