MobileRead Forums - View Single Post - Kindle clippings from pdfs have no whitespace

bmf · 09-01-2011, 05:37 PM

Hi - this PDF certainly has genuine text as opposed to scanned pages. I can run pdftohtml on it and see the text in the resulting HTML file with the whitespace intact.

Dumping just the offending line from the clippings file to another file and then running chardet against it just reports the encoding as ascii. I have a bad feeling my Kindle is somehow misinterpreting the PDF and I'm losing the whitespace at the time the highlight is written to clippings.txt, and therefore I'm stuffed!
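
For anyone wanting to repeat that check, here is a minimal sketch in Python using only the standard library. It is not the chardet run itself: it just confirms whether the line is pure ASCII and whether any space characters (plain or exotic, such as a no-break space) survive in it. The function name and the sample bytes are made up for illustration.

```python
import unicodedata


def inspect_highlight(raw: bytes) -> dict:
    """Summarise the bytes of one highlight line (hypothetical helper)."""
    # Decode leniently so a stray byte does not abort the inspection.
    text = raw.decode("utf-8", errors="replace")
    return {
        # True when every byte is in the 7-bit ASCII range.
        "is_ascii": all(b < 0x80 for b in raw),
        # Number of ordinary U+0020 spaces.
        "ascii_spaces": text.count(" "),
        # Names of any other Unicode space separators (category Zs).
        "other_space_chars": sorted(
            {
                unicodedata.name(ch, "UNKNOWN")
                for ch in text
                if ch != " " and unicodedata.category(ch) == "Zs"
            }
        ),
    }


# Made-up sample standing in for the offending clippings line.
sample = b"Thisisahighlightwithnowhitespace"
report = inspect_highlight(sample)
print(report)
```

If the line is pure ASCII with zero spaces of any kind, the whitespace is not hiding behind an odd encoding; it was never written to the file.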

Post #3 · bmf · Device: Kindle 3