Old 03-09-2012, 04:17 PM   #1
crackhammer
Enthusiast
crackhammer began at the beginning.
 
Posts: 47
Karma: 10
Join Date: Jun 2009
Device: Nook touch, iPad, Xoom
Enhancing text in scanned images

So I am in the process of scanning a book which is a mix of lots of images and text. I scanned the book with the Epson Scan program in home mode, using the text/art option, to 300 dpi TIFF.

Now that the book is processed through ScanTailor, the text in the images doesn't look sharp. Is there any way that I can sharpen the images to enhance the text?

Thanks in advance.
Old 03-09-2012, 04:44 PM   #2
mr ploppy
Feral Underclass
mr ploppy ought to be getting tired of karma fortunes by now.

Posts: 3,622
Karma: 26821535
Join Date: Jan 2010
Location: Yorkshire, tha noz
Device: 2nd hand paperback
It'll be the de-screening option that makes the text fuzzy, but you'll need it for the pictures or you'll get a moire pattern. The best thing to do depends on what you have in mind for a final product. If it's OCR, scan the text separately without any de-screening. If you just want a set of pictures with text on them, scan them at double the size you want them to be without the de-screen, apply a gaussian blur to the images only, then resize the whole thing down 50%. That's pretty much what the de-screener on cheap scanners does in memory, but if you do it yourself you can undo any results you don't like.
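The blur-then-halve idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the picture region is already cropped out and held as a list of rows of 0-255 grayscale values, the blur is a fixed 3x3 Gaussian kernel, and the helper names (`gaussian_blur_3x3`, `downscale_half`, `descreen`) are made up for this example. It is not what any scanner driver actually runs; a real workflow would use an image editor or an imaging library.

```python
# Hypothetical helpers: an image is a list of rows of 0-255 grayscale values.

def gaussian_blur_3x3(img):
    """Blur with the kernel [1 2 1; 2 4 2; 1 2 1] / 16, clamping at the edges."""
    h, w = len(img), len(img[0])
    weights = (1, 2, 1)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                yy = min(max(y + dy, 0), h - 1)
                for dx in (-1, 0, 1):
                    xx = min(max(x + dx, 0), w - 1)
                    total += weights[dy + 1] * weights[dx + 1] * img[yy][xx]
            row.append(total // 16)
        out.append(row)
    return out


def downscale_half(img):
    """Halve each dimension by averaging every 2x2 block (an odd edge is dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) // 4
            for x in range(w)
        ]
        for y in range(h)
    ]


def descreen(img):
    """Blur first, then shrink to 50%, as described in the post above."""
    return downscale_half(gaussian_blur_3x3(img))


# A 4x4 checkerboard stands in for a halftone screen scanned at double size.
halftone = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
smooth = descreen(halftone)
```

The checkerboard swings between 0 and 255 before processing; afterwards the 2x2 result sits in a narrow band of mid greys, which is the point of de-screening. Text run through the same steps loses edge contrast in the same way, which is why the post suggests blurring the images only.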
Old 03-09-2012, 04:59 PM   #3
osnova
Kindler of the Flame
osnova ought to be getting tired of karma fortunes by now.

Posts: 582
Karma: 646016
Join Date: Oct 2009
Location: US of A
Device: K DX,3,KT,KP,KF, KFHD; Nook C, PRS600, iPad, Xoom, N900, N810, Zaurus
I responded in another thread, but I can add a few words here too. For the sharpest text, scan to 300 dpi grayscale TIFF with all processing turned off in the scanning app, then process the scans through ScanKromsator (SK) and upscale to 600 dpi B/W. SK is a pain to learn, though. This is the only way I've found to get razor-sharp text.

Last edited by osnova; 03-09-2012 at 05:56 PM.
Old 03-09-2012, 05:08 PM   #4
crackhammer
Enthusiast
crackhammer began at the beginning.
 
Posts: 47
Karma: 10
Join Date: Jun 2009
Device: Nook touch, iPad, Xoom
@ mr ploppy,
The final product I have in mind is a PDF file with searchable text. The reason is that I usually just highlight the text that interests me, and that's about it. I am not interested in perfect OCR.

@ osnova,
The program you mentioned seems to have loads of settings. I am pretty comfortable with ScanTailor, although I admit that I don't 'understand' all of its options. I will give your program a try on my next project. Thanks for the heads-up.
The current book that I am scanning has loads of images (and they are necessary for the text in the book), so there is no black and white option for this book; it has to be color.
Old 03-09-2012, 05:14 PM   #5
frostschutz
Linux User
frostschutz ought to be getting tired of karma fortunes by now.

Posts: 2,279
Karma: 6123806
Join Date: Sep 2010
Location: Heidelberg, Germany
Device: none
Upscaling doesn't sound right to me...

If the text is blurry or small without postprocessing, scan at a higher DPI. Otherwise just find the right threshold, i.e. turn it into pure b&w without gray.
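One way to "find the right threshold" automatically is Otsu's method, which picks the grey level that best separates the dark pixels from the light ones. The sketch below swaps that technique in as an illustration; the post itself does not name a method, and picking the threshold by eye in an editor works too. The image is assumed to be a list of rows of 0-255 grayscale values, and the function names are made up for this example.

```python
def otsu_threshold(img):
    """Return the grey level that best splits the pixels into dark and light."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark = 0    # pixels at or below the candidate threshold
    sum_dark = 0  # sum of their grey levels
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue
        if n_light == 0:
            break
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        # Between-class variance (up to a constant factor).
        between = n_dark * n_light * (mean_dark - mean_light) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(img, threshold):
    """Pixels at or below the threshold become black (0), the rest white (255)."""
    return [[0 if v <= threshold else 255 for v in row] for row in img]


# Dark grey "ink" (60-70) on light grey "paper" (200-220).
page = [[200, 60, 210],
        [70, 220, 65]]
t = otsu_threshold(page)
bw = binarize(page, t)
```

On this tiny sample the chosen threshold falls between the ink and paper levels, so every ink pixel goes to 0 and every paper pixel to 255. On a real page with pictures, a single global threshold would wreck the images, so it would only be applied to the text areas.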
Old 03-09-2012, 05:16 PM   #6
osnova
Kindler of the Flame
osnova ought to be getting tired of karma fortunes by now.

Posts: 582
Karma: 646016
Join Date: Oct 2009
Location: US of A
Device: K DX,3,KT,KP,KF, KFHD; Nook C, PRS600, iPad, Xoom, N900, N810, Zaurus
Quote:
Originally Posted by crackhammer View Post
The current book that I am scanning has loads of images (and they are necessary for the text in the book), so there is no black and white option for this book; it has to be color.
SK's advanced features can break the page into zones and process them separately. So you could, say, scale the text up to 600 dpi B/W while leaving the pictures in 300 dpi color, and then save to PDF directly from SK. I've seen SK gurus do this but have never done it myself.
Old 03-09-2012, 05:19 PM   #7
osnova
Kindler of the Flame
osnova ought to be getting tired of karma fortunes by now.

Posts: 582
Karma: 646016
Join Date: Oct 2009
Location: US of A
Device: K DX,3,KT,KP,KF, KFHD; Nook C, PRS600, iPad, Xoom, N900, N810, Zaurus
Quote:
Originally Posted by frostschutz View Post
Upscaling doesn't sound right to me...
Maybe I am using the wrong terminology, but I have seen with my own eyes so-so grayscale text turn into crisp, sharp 600 dpi B/W text after SK. I actually used to do it myself quite often.

Under the hood, SK uses some advanced math algorithms. One drawback of the program is that it takes a long time to run even on fast hardware; I am talking hours at times. If I am guessing correctly, the author is some math guy from Russia.

Last edited by osnova; 03-09-2012 at 05:22 PM.
Old 03-09-2012, 06:11 PM   #8
mr ploppy
Feral Underclass
mr ploppy ought to be getting tired of karma fortunes by now.

Posts: 3,622
Karma: 26821535
Join Date: Jan 2010
Location: Yorkshire, tha noz
Device: 2nd hand paperback
Quote:
Originally Posted by crackhammer View Post
@ mr ploppy,
The final product I have in mind is a PDF file with searchable text. The reason is that I usually just highlight the text that interests me, and that's about it. I am not interested in perfect OCR.
For searchable text you will need to OCR it, so scan the images separately and put it all together in a layout program to convert it to PDF.
Old 03-09-2012, 06:23 PM   #9
mr ploppy
Feral Underclass
mr ploppy ought to be getting tired of karma fortunes by now.

Posts: 3,622
Karma: 26821535
Join Date: Jan 2010
Location: Yorkshire, tha noz
Device: 2nd hand paperback
Quote:
Originally Posted by osnova View Post
Maybe I am using the wrong terminology, but I have seen with my own eyes so-so grayscale text turn into crisp, sharp 600 dpi B/W text after SK. I actually used to do it myself quite often.
You might mean bitmapping, where it changes it to pure black and white (2 colours)? That would definitely need an increase in resolution for it to work, but for normal greyscale increasing the resolution would only increase the blockiness of any text.
Old 03-09-2012, 06:29 PM   #10
Penforhire
Wizard
Penforhire ought to be getting tired of karma fortunes by now.
 
Posts: 2,230
Karma: 7145404
Join Date: Nov 2007
Location: Southern California
Device: Kindle Voyage & iPhone 7+
Sounds like you should be able to find some scan/post-scan settings to improve it. Otherwise it is cumbersome but possible to improve such TIFF images in a good photo editor like Photoshop. There are filter tools like Unsharp Mask, image 'adjustments' like Threshold, and direct tonal adjustment (e.g. Levels dialog box).
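For readers without Photoshop, here is roughly what two of those tools do to the pixel values, sketched in plain Python. The assumptions: the image is a list of rows of 0-255 grayscale values, the blur inside the unsharp mask is a simple 3x3 box average rather than the Gaussian a photo editor would use, and the names `levels`, `box_blur` and `unsharp_mask` are made up for this example.

```python
def levels(img, black, white):
    """Map `black` and below to 0, `white` and above to 255, linear in between."""
    span = white - black
    return [
        [min(255, max(0, (v - black) * 255 // span)) for v in row]
        for row in img
    ]


def box_blur(img):
    """Average each pixel with its neighbours in a 3x3 window (smaller at edges)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [
                img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out


def unsharp_mask(img, amount=1.0):
    """Add back the difference between the image and a blurred copy of it."""
    blurred = box_blur(img)
    return [
        [min(255, max(0, round(v + amount * (v - b)))) for v, b in zip(row, brow)]
        for row, brow in zip(img, blurred)
    ]


# One row crossing a soft edge from dark grey ink to light grey paper.
edge = [[60, 60, 110, 190, 190]]
stretched = levels(edge, black=60, white=190)
sharpened = unsharp_mask(edge)
```

Levels stretches the dark grey ink to true black and the paper to true white. The unsharp mask pushes the pixels on either side of the edge further apart, which reads as sharper text but also exaggerates noise, so a light touch is usually enough.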
Old 03-09-2012, 06:58 PM   #11
osnova
Kindler of the Flame
osnova ought to be getting tired of karma fortunes by now.

Posts: 582
Karma: 646016
Join Date: Oct 2009
Location: US of A
Device: K DX,3,KT,KP,KF, KFHD; Nook C, PRS600, iPad, Xoom, N900, N810, Zaurus
Quote:
Originally Posted by mr ploppy View Post
You might mean bitmapping, where it changes it to pure black and white (2 colours)? That would definitely need an increase in resolution for it to work, but for normal greyscale increasing the resolution would only increase the blockiness of any text.
Yes, SK has a selection for the "binarization" algorithm and threshold, but there is more to it. The way I understand it, SK uses the grayscale data to extrapolate pixels for binarization at the increased dpi. So it is not just some dumb grayscale to B/W conversion. When the conversion (300 dpi grayscale to 600 dpi B/W) is done, the "blockiness" of the text is reduced, not increased, and blurry text becomes sharp.
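A toy version of "interpolate the grayscale first, then binarize at the higher resolution" is sketched below. To be clear, this is not SK's algorithm, which nobody in this thread has described in detail; it is plain bilinear 2x upscaling followed by a fixed threshold, with made-up function names, on an image held as a list of rows of 0-255 values. It only shows why the grey levels at an edge let the black/white boundary land between the original pixels.

```python
def upscale_2x_bilinear(img):
    """Double both dimensions, interpolating linearly between pixel centres."""
    h, w = len(img), len(img[0])
    out = []
    for oy in range(2 * h):
        # Position of this output pixel's centre in source coordinates.
        sy = min(max((oy + 0.5) / 2 - 0.5, 0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for ox in range(2 * w):
            sx = min(max((ox + 0.5) / 2 - 0.5, 0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


def binarize(img, threshold=127):
    """Pixels at or below the threshold become black (0), the rest white (255)."""
    return [[0 if v <= threshold else 255 for v in row] for row in img]


# One row: black ink, a dark grey edge pixel, then white paper.
soft = [[0, 100, 255]]
low_res = binarize(soft)
high_res = binarize(upscale_2x_bilinear(soft))
```

Binarized at the original resolution, the grey pixel simply turns black, so two of three pixels are ink. Binarized after upscaling, three of six pixels are ink: the edge has moved by half an original pixel, guided by how grey that edge pixel was. No new detail is created, but along curved or diagonal strokes this placement is what makes the outline look less blocky.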

It seems like magic when text that is in shadow (say, close to the spine) comes out clear, without the shadow and at increased resolution.

===

I also saw very good results from people processing scans through Corel PaintShop Pro but it is beyond my skills/abilities. They practically separated the text layer from the background layer.

Last edited by osnova; 03-09-2012 at 07:11 PM.
Old 03-09-2012, 07:28 PM   #12
frostschutz
Linux User
frostschutz ought to be getting tired of karma fortunes by now.

Posts: 2,279
Karma: 6123806
Join Date: Sep 2010
Location: Heidelberg, Germany
Device: none
Quote:
Originally Posted by osnova View Post
When the conversion (300 dpi grayscale to 600 dpi B/W) is done, the "blockiness" of the text is reduced, not increased while blurry text becomes sharp.
Sounds a bit like what this scaler does:

http://www.hiend3d.com/hq4x.html

While those examples look impressive, you should be aware that it's just eyewash. Such a process does not add real detail, and there are cases when such a filter actually makes things worse.

If you want 600 dpi, you have to scan 600 dpi. There is no other way.
Old 03-09-2012, 07:29 PM   #13
rkomar
Wizard
rkomar ought to be getting tired of karma fortunes by now.
 
Posts: 2,986
Karma: 18343081
Join Date: Oct 2010
Location: Sudbury, ON, Canada
Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633
I had this same problem when scanning some of my old physics books. The problem is that the scanned text is usually dark grey rather than black, and the poorer contrast of E-Ink displays makes it hard to read. I ended up writing a Java application for selecting rectangular regions in the scans to leave as grayscale, and changing everything else to black or white. The SK program that @osnova describes sounds better than my home-built job. My only point is that you should convert the text to black and white to improve the contrast.
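The core of that approach fits in a short function. This is a sketch in Python, not the Java application mentioned in the post, whose code is not shown in this thread. It assumes the scan is a list of rows of 0-255 grayscale values, that the regions to keep are given as (left, top, right, bottom) rectangles with the right and bottom edges exclusive, and that a fixed threshold is good enough; the function name is made up for this example.

```python
def binarize_outside(img, keep_regions, threshold=127):
    """Force every pixel to black or white unless it lies in a kept rectangle.

    keep_regions is a list of (left, top, right, bottom) tuples, with the
    right and bottom edges exclusive. Pixels inside any of them keep their
    original grey value.
    """
    def kept(x, y):
        return any(
            left <= x < right and top <= y < bottom
            for (left, top, right, bottom) in keep_regions
        )

    return [
        [
            v if kept(x, y) else (0 if v <= threshold else 255)
            for x, v in enumerate(row)
        ]
        for y, row in enumerate(img)
    ]


# Left half is dark grey text on light grey paper; right half is a picture.
page = [[40, 200, 90, 160],
        [50, 210, 100, 150]]
result = binarize_outside(page, keep_regions=[(2, 0, 4, 2)])
```

In the result, the two left-hand columns are pure 0 and 255, while the picture columns keep their original greys.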
Old 03-09-2012, 07:35 PM   #14
frostschutz
Linux User
frostschutz ought to be getting tired of karma fortunes by now.

Posts: 2,279
Karma: 6123806
Join Date: Sep 2010
Location: Heidelberg, Germany
Device: none
Quote:
Originally Posted by rkomar View Post
I ended up writing a Java application for selecting rectangular regions in the scans to leave as grayscale, and changing everything else to black or white.
I did that once for a document with mixed text and images. Except I didn't write a Java application, I just used Gimp. It's lots of manual labor, but it's worth the effort when you can adjust the contrast of each image individually. And having everything but the images in pure black & white helps a lot in getting the PDF down to a manageable size.
Old 03-09-2012, 08:09 PM   #15
rkomar
Wizard
rkomar ought to be getting tired of karma fortunes by now.
 
Posts: 2,986
Karma: 18343081
Join Date: Oct 2010
Location: Sudbury, ON, Canada
Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633
Quote:
Originally Posted by frostschutz View Post
I did that once for a document with mixed text and images. Except I didn't write a Java application, I just used Gimp. It's lots of manual labor, but it's worth the effort when you can adjust the contrast of each image individually. And having everything but the images in pure black & white helps a lot in getting the PDF down to a manageable size.
I left the scans as grayscale data (with the text areas forced to be black (0) or white (255)). So there were some savings in size because the black and white areas compressed better than the grayscale, but they weren't huge. Were you able to store the text as monochrome (i.e. 1-bit) data for better size savings? If so, how? Could you save them as different layers on the page?
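On the raw size difference alone: an 8-bit grayscale image spends one byte per pixel even when every pixel is 0 or 255, while 1-bit data fits eight pixels into each byte. The sketch below shows that packing in plain Python. It assumes the image is a list of rows of 0-255 values, uses a made-up function name, and says nothing about how a PDF would layer text and pictures; formats that store bilevel images usually add a dedicated compression scheme on top of this, which the sketch does not attempt.

```python
def pack_1bit(img, threshold=127):
    """Pack each row into bytes: 8 pixels per byte, leftmost pixel in the top bit.

    A set bit means black (value at or below the threshold). Rows whose width
    is not a multiple of 8 are padded with white (zero bits) on the right.
    """
    packed = bytearray()
    for row in img:
        for start in range(0, len(row), 8):
            byte = 0
            for i, v in enumerate(row[start:start + 8]):
                if v <= threshold:
                    byte |= 0x80 >> i
            packed.append(byte)
    return bytes(packed)


# One 16-pixel row: four black pixels followed by twelve white ones.
row = [0] * 4 + [255] * 12
as_8bit = bytes(row)         # 16 bytes, one per pixel
as_1bit = pack_1bit([row])   # 2 bytes, eight pixels each
```

Here 16 bytes of grayscale shrink to 2 bytes before any compression is applied, which is where the larger savings for pure black and white text come from.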