Thanks for this plugin!
I had problems with some books. With one of them I got this exception:
XMLSyntaxError: PCDATA invalid Char value 24, line 159, column 54
After adding some print statements, I noticed that the XML generated by the function _read_pdf_text in scan.py contained some invalid characters.
So I modified it to replace most non-printable characters with an underscore ('_').
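
For illustration, here is a minimal sketch of that kind of replacement. It is not the attached patch itself: the function name `replace_invalid_xml_chars` and the regex are assumptions made up for this example, and it only covers the control characters that XML 1.0 rejects (char value 24, i.e. 0x18, is one of them).

```python
import re

# Control characters that XML 1.0 does not allow in character data.
# Tab (0x09), newline (0x0A) and carriage return (0x0D) are valid and kept.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def replace_invalid_xml_chars(text, replacement="_"):
    """Return text with XML-invalid control characters replaced.

    Hypothetical helper for illustration; the real change lives in
    _read_pdf_text and may differ.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    sample = "some text\x18 from a pdf\n"
    print(repr(replace_invalid_xml_chars(sample)))
```

Something like this would be applied to the extracted text before it is handed to the XML parser.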
I'm attaching a diff of my changes.
patch.scan.py.txt