Old 05-30-2012, 10:00 AM   #7
Schrollini
Device: Sony PRS-T1
Hi kopytozpakone,

Glad to hear that PRSAnnots is sort of working for you. We'll see if we can fix your problems, but I suspect you may have run up against a current limitation in PRSAnnots.

First, I don't think you're having a problem with PDFMiner. If you were, PRSAnnots would probably exit with a confusing error message. But if you want to be sure, you can open a python interpreter, type import pdfminer, and press Enter. If you don't get an error message, it means that it's installed okay.
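If you'd rather run the check as a script, something like the following should do the same thing. It's only a sketch: the helper name can_import is made up for this example and is not part of PRSAnnots or PDFMiner, and it assumes you run it with the same Python interpreter that you use for PRSAnnots.

```python
def can_import(module_name):
    """Return True if the named module imports without an ImportError."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


# Hypothetical usage: report whether PDFMiner is visible to this interpreter.
print("pdfminer importable:", can_import("pdfminer"))
```

If this prints False, PDFMiner isn't installed for that interpreter, even if it is installed for another copy of Python on your machine.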

The problem is that, right now [1], PRSAnnots tries to find the text to highlight by looking for the string of text that the ereader thinks is highlighted. However, text in PDFs is just a bunch of characters placed on the page -- there's no unique way to turn those characters into strings. Sometimes the ereader and PDFMiner, the library I use, will do this in different ways. When this happens, PRSAnnots may not be able to find the text that the ereader says is highlighted, so it gives you that error message.
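To make the mismatch concrete, here is a toy illustration. It is not PRSAnnots' actual code, and the strings and function names are invented for the example. It shows two tools splitting the same characters into words differently, so an exact string search fails while a search that ignores whitespace succeeds.

```python
import re


def squash(text):
    """Remove all whitespace so only the sequence of characters is compared."""
    return re.sub(r"\s+", "", text)


def contains_ignoring_spaces(page_text, highlight):
    """Return True if highlight occurs in page_text once whitespace is ignored."""
    return squash(highlight) in squash(page_text)


# Made-up data: the same characters, grouped into words in two different ways.
reader_text = "the quick brown fox"           # what the ereader might report
extracted_text = "the qui ck brownfox jumps"  # what a PDF library might extract

exact_match = reader_text in extracted_text
loose_match = contains_ignoring_spaces(extracted_text, reader_text)
print(exact_match, loose_match)
```

Ignoring whitespace only covers one kind of disagreement. If the two tools differ on ligatures, hyphenation, or the order of characters on the page, even this looser comparison can fail.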

What it is supposed to do is tell you the text it was looking for after that error message. Do you see anything following the error message? (It probably looks like textwithnospaces.) Does this happen for every bit of highlighted text, or just some of them? Does it happen on all PDFs, or just a few? If it's an intermittent problem, then you've likely run up against the limitation described above. But if it happens every time, then there may be something else wrong.

[1] I think I know a better way to do this, but it will take some reverse engineering. It's described here, and I welcome all the help I can get.