A quick look at your results shows that the first few pages have 100% correct "recognition" of paragraphs, but the rest of the document has 0%, which tells me that you corrected the first few pages by hand, just to "fake" the results.
And you are right, this is not a good example of a PDF, because it is untagged, and you cannot get better results using the copy/paste method.