I have found some PDFs that have this problem. Calibre uses the pdftohtml tool to pull the text out of a PDF, and for some reason that can fail. If you take the PDF out of Calibre and run pdftohtml on it from the command line, you get nothing, but if you run pdftotext on the same file you usually do get the text. I've never seen an explanation of why text that is definitely there is not picked up by pdftohtml. Another example of the evil behaviour of PDFs!
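
If you want to check a file yourself, here is a rough Python sketch of the test described above: try pdftohtml first, and fall back to pdftotext if no visible text comes back. It assumes both Poppler tools are on your PATH. The function names (`run_tool`, `visible_text`, `extract_text`) are made up for this example, and the flags (`-stdout`, `-noframes`, `-i`) are just the ones I would pick, not necessarily what Calibre passes internally.

```python
import html
import re
import subprocess


def run_tool(args):
    """Run a command and return its stdout as text ('' if it fails or is missing)."""
    try:
        result = subprocess.run(
            args, capture_output=True, text=True, errors="replace", check=False
        )
    except OSError:
        # Tool not installed or not on PATH.
        return ""
    return result.stdout if result.returncode == 0 else ""


def visible_text(markup):
    """Strip tags from HTML and collapse whitespace, to see if any text survived."""
    # Drop <head>, <script> and <style> blocks so titles and CSS don't count as text.
    markup = re.sub(r"(?is)<(script|style|head)\b.*?</\1>", " ", markup)
    # Remove the remaining tags, then decode entities such as &#160;.
    text = html.unescape(re.sub(r"<[^>]+>", " ", markup))
    return " ".join(text.split())


def extract_text(pdf_path, run=run_tool):
    """Return (tool_name, text), preferring pdftohtml and falling back to pdftotext."""
    # Assumed flags: write a single HTML document to stdout and ignore images.
    text = visible_text(run(["pdftohtml", "-stdout", "-noframes", "-i", pdf_path]))
    if text:
        return "pdftohtml", text
    # "-" as the output file makes pdftotext write to stdout.
    return "pdftotext", run(["pdftotext", pdf_path, "-"]).strip()
```

Calling `extract_text("book.pdf")` on one of the problem files should report `pdftotext` as the tool that produced the text, if the file behaves the way I described. The `run` parameter is only there so the logic can be tried out without the real tools installed.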