Originally Posted by arnoud999
Are you still working on this program at the moment? Because I have a lot of PDFs where your program can't find the text to be highlighted, and instead adds it as a note to the page. I think this often has to do with line breaks. I was thinking, would it be possible to 'break up' highlights, line by line, so that problem is averted? Or, if line breaks are not identifiable, just making the program try every possible way of breaking the highlighted text up into different segments.
I certainly mean to be working on it, but as you see, I haven't done much recently.
I'm hoping to get the pdfloc method for identifying highlighted text working, since this should solve the problems you've been having. But I haven't had a free weekend to sit down and work it out. If you check out that bug report, you'll see a link to a small program to spit out the pdfloc information. You're welcome to give that a go and see if you can find a pattern.
If we can't figure out this new method, something like what you suggest is a good idea that should fix many of the problems.
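For what it's worth, here is a rough sketch of how the "try every way of breaking it up" fallback might look. It is not code from the program: the function name `split_highlight` and the idea of having the page text available as a list of line strings are assumptions made for illustration. It splits the highlight at word boundaries and looks for the fewest segments that each fit on a single line of the page.

```python
from functools import lru_cache


def split_highlight(highlight, page_lines):
    """Hypothetical helper: split `highlight` into the fewest word-aligned
    segments such that each segment occurs within one line of `page_lines`.
    Returns a list of segments, or None if no split works."""
    words = highlight.split()
    # Normalise whitespace and pad with spaces so matches fall on word
    # boundaries. Punctuation attached to a word is treated as part of it.
    lines = [" " + " ".join(line.split()) + " " for line in page_lines]

    def on_one_line(segment):
        return any(" " + segment + " " in line for line in lines)

    @lru_cache(maxsize=None)
    def best(start):
        # Best split of words[start:], as a tuple of segments.
        if start == len(words):
            return ()
        result = None
        # Try the longest candidate segment first, then shorter ones.
        for end in range(len(words), start, -1):
            segment = " ".join(words[start:end])
            if not on_one_line(segment):
                continue
            rest = best(end)
            if rest is None:
                continue
            candidate = (segment,) + rest
            if result is None or len(candidate) < len(result):
                result = candidate
        return result

    found = best(0)
    return list(found) if found is not None else None


page = ["The quick brown fox", "jumps over the lazy", "dog and runs away."]
print(split_highlight("brown fox jumps over the lazy dog", page))
```

Each returned segment could then be highlighted separately. This sketch doesn't check that the segments land on consecutive lines, so a real version would want that check to avoid matching the same words elsewhere on the page.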
Thanks for the kind words and suggestions!