03-01-2020, 01:35 PM   #23
j.p.s
Grand Sorcerer
Quote:
Originally Posted by j.p.s
One form of archive.org book files has a mask image for each page that can be used to make white regions of the page completely white. Have you ever used those to help clean up the page?
Quote:
Originally Posted by willus
Interesting. No--I had not heard of this before.
It's been a couple of years since I've worked with them, so I'm fuzzy on the details. I mentioned it in passing in post #3 in the thread:
https://www.mobileread.com/forums/sh...d.php?t=178155
Some scripts working with the mask images are in the first attachment.

The images (of pages of text) in at least some archive.org PDF files are combinations of two PBM images and a PGM image. One of the PBM images is the mask. I discovered this when I ran a utility to extract images from a PDF, and I have no idea how the PDF standard addresses this or how PDF libraries and utilities make use of the mask images.
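For anyone who wants to experiment, here is a rough sketch of the idea in Python. It is not one of the scripts from the attachment. It assumes the extracted files are raw Netpbm (P4 for PBM, P5 for PGM with maxval below 256), that the mask and the grayscale image have the same dimensions, and that a mask bit of 1 marks a pixel to keep. I have not checked that polarity against real archive.org files, so the hypothetical `keep` parameter is there to flip it. All function names are made up for illustration.

```python
def _read_header(data, count):
    """Read `count` integers after the 2-byte magic; return (values, raster offset)."""
    pos, values = 2, []
    while len(values) < count:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":            # skip a header comment line
            pos = data.index(b"\n", pos) + 1
            continue
        end = pos
        while data[end:end + 1].isdigit():
            end += 1
        values.append(int(data[pos:end]))
        pos = end
    return values, pos + 1                        # exactly one whitespace byte precedes the raster


def read_pbm(data):
    """Decode a raw (P4) PBM from bytes into rows of 0/1 bits."""
    if data[:2] != b"P4":
        raise ValueError("not a raw PBM")
    (width, height), off = _read_header(data, 2)
    row_bytes = (width + 7) // 8                  # rows are padded to whole bytes
    return [[(data[off + y * row_bytes + x // 8] >> (7 - x % 8)) & 1
             for x in range(width)] for y in range(height)]


def read_pgm(data):
    """Decode a raw (P5) PGM with maxval < 256 into (rows of gray values, maxval)."""
    if data[:2] != b"P5":
        raise ValueError("not a raw PGM")
    (width, height, maxval), off = _read_header(data, 3)
    if maxval > 255:
        raise ValueError("16-bit PGM not handled in this sketch")
    rows = [list(data[off + y * width: off + (y + 1) * width]) for y in range(height)]
    return rows, maxval


def whiten(gray, mask, white=255, keep=1):
    """Force every pixel whose mask bit is not `keep` to `white` (assumed polarity)."""
    return [[g if m == keep else white for g, m in zip(grow, mrow)]
            for grow, mrow in zip(gray, mask)]
```

Usage would be something like `gray, maxval = read_pgm(open("page.pgm", "rb").read())`, `mask = read_pbm(open("page.pbm", "rb").read())`, then `whiten(gray, mask, white=maxval)`, where the file names are placeholders. If the result comes out with the text wiped and the background kept, pass `keep=0`.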

Last edited by j.p.s; 03-01-2020 at 01:38 PM.