Quote:
Originally Posted by BetterRed
Have a look for imaging forensic tools.
Hmmm... that may be another angle to research.
I know those tools often "average" the pixel colors of entire rows/columns to get semi-unique fingerprints. Perhaps something like that could be used to detect lines too.
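To make the row-averaging idea concrete, here is a rough sketch in plain Python. It is only an illustration under my own assumptions: the image is a toy list of grayscale rows (0 = black, 255 = white), and the names row_means, dark_rows and the max_mean cutoff are made up for this example, not taken from any forensic tool.

```python
def row_means(image):
    """Average brightness of each row (image is a list of rows of 0-255 ints)."""
    return [sum(row) / len(row) for row in image]


def dark_rows(image, max_mean=64):
    """Indices of rows whose average brightness is at or below max_mean."""
    return [y for y, mean in enumerate(row_means(image)) if mean <= max_mean]


# Toy 5x8 grayscale "scan": a white page with one black line across row 2.
page = [[255] * 8 for _ in range(5)]
page[2] = [0] * 8

print(dark_rows(page))  # -> [2]
```

An averaging approach like this would only catch lines that run almost exactly along a row or column. A line on a slightly skewed scan gets smeared across several rows, which is where Hough Lines should do better.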
* * *
Tonight I was dabbling a bit more with Hough Lines.
I had quite a bit of success locating the line through the text.
ImageMagick Hough Lines
Original Image:
Step 1: Run the scan through ImageMagick's -canny edge detector, which outputs white edges on a black background (see fmwconcepts link in Post #1):
Code:
convert test.png -canny 0x1+10%+40% test_inverse.png
Step 2: Then calculate Hough Lines:
From testing on this specific book, I found that a threshold between 500 and 700 worked well:


The higher the threshold, the more "false positives" disappeared.
Step 3: Overlay Hough Lines with image:


Here are the same steps applied to another page:



* * *
Side Note: To see what a Hough Line calculation is actually doing, I found this part of the video did a decent job explaining it visually:
https://youtu.be/4zHbI-fFIlI?t=219
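It goes row-by-row looking for white pixels. For each one it finds, it spins a line through that pixel across the full range of angles and records a vote for every candidate line. Plotting the votes gives points of varying strength, and the strongest points tell you the probable locations and angles of lines.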
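The voting idea can be sketched in a few lines of plain Python. This is a toy illustration under my own assumptions, not ImageMagick's implementation: the function names hough_votes and strong_lines are made up, the white pixels are given as a list of (x, y) coordinates, and angles run from 0 to 179 degrees (the usual parameterisation, since a line at 200 degrees is the same line as at 20 degrees).

```python
import math
from collections import Counter


def hough_votes(white_pixels, angle_step=1):
    """Vote in (theta_degrees, rho) space for every line through each white pixel."""
    votes = Counter()
    for x, y in white_pixels:
        for theta in range(0, 180, angle_step):
            t = math.radians(theta)
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(t) + y * math.sin(t))
            votes[(theta, rho)] += 1
    return votes


def strong_lines(votes, threshold):
    """Keep only the (theta, rho) cells with at least `threshold` votes."""
    return sorted(cell for cell, count in votes.items() if count >= threshold)


# Toy edge map: 20 white pixels forming the horizontal line y = 5.
pixels = [(x, 5) for x in range(20)]
votes = hough_votes(pixels)

print(votes[(90, 5)])                 # -> 20, every pixel agrees on this line
print(len(strong_lines(votes, 10)))   # low threshold: many near-duplicate lines
print(len(strong_lines(votes, 20)))   # high threshold: only the strongest survive
```

In this toy case, raising the threshold prunes the weaker, slightly tilted candidates that only some of the pixels agree on. That matches what I saw above, with the "false positives" disappearing at higher thresholds.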