12-06-2013, 12:43 AM   #1
Genre fan
Member
Posts: 13
Karma: 16858
Join Date: Nov 2013
Location: USA
Device: Sony PRS-950, PRS-350
Character Encoding: How to fix it?

I have Sony PRS-950 and PRS-350 devices.

Over the last year, I've been getting books with odd characters instead of punctuation, which make the books/chapters difficult to read. By playing around with the View -> Encoding menus in my browsers, I have figured out that it has something to do with the character encoding within the epub files.

Example: I’ve is printed instead of I've.

’ for apostrophe
“ the opening of a quotation,
� for closing the quotation,
and I think — is for a dash.
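
These sequences match what you get when text stored as UTF-8 is read as Windows-1252 instead. A minimal Python sketch, assuming that is the cause here (the function name `garble` is made up for illustration):

```python
# Assumption: the odd characters come from UTF-8 bytes being decoded as
# Windows-1252 (cp1252). This only demonstrates the effect; it fixes nothing.

def garble(ch: str) -> str:
    """Return what a character looks like when its UTF-8 bytes are read as cp1252."""
    # errors="replace" is needed because byte 0x9D (the last byte of the
    # closing quote) has no cp1252 character and becomes U+FFFD.
    return ch.encode("utf-8").decode("cp1252", errors="replace")

samples = [
    ("apostrophe", "\u2019"),
    ("opening quote", "\u201c"),
    ("closing quote", "\u201d"),
    ("long dash", "\u2014"),
]

for name, ch in samples:
    print(f"{name}: {ch} shows up as {garble(ch)}")
```

Each curly punctuation mark is three bytes in UTF-8, which is why one character turns into three. The closing quote is the odd one out because its third byte has no Windows-1252 equivalent.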

When a sentence had “’m for " 'm at the beginning of a speech (when the character was slurring his words) it took me a while to figure out how it was supposed to read.

This was in one recent book:

“’Sides, ’tis only for a moon. That ain’t long.�

Translation: " 'Sides, 'tis only for a moon. That ain't long."
See what I mean about it being really hard to read?


I buy books from several ebook stores and I borrow from the library.
The problem sometimes affects the entire book, but it is usually restricted to a few chapters. It usually covers a whole chapter rather than part of one, although on rare occasions the encoding changes within a chapter, and the affected chapters are not necessarily consecutive.

It occurs whether the book is downloaded directly to my 950 reader or loaded onto either reader from my computers, which all run Mac OS X in versions ranging from 10.4 to Mountain Lion.
Since it happens when the book is downloaded directly, I figure the operating system of my computer is not relevant.

There are several publishers involved, though http://www.baenebooks.com/ (no DRM!) has not so far been one of them, IIRC. I haven't actually purchased Baen ebooks from any source except the publisher, so there is a slight possibility that the problem is dependent on the store/source and not the epub file as originally published. I know I get this in books from the Sony Reader store and from Kobo Books. I haven't purchased enough books from other vendors in the last few months to have a large enough sample to say anything about other stores.

However, if I view the books with any viewer on my computer, the encoding is the same. I've read them in Calibre (after stripping DRM -- for my personal use only! -- so that I can actually look at the books in the viewer), in the Sony Reader App, and in Adobe Digital Editions 2.0. It's always the same.

I believe the encoding is inherent to the files. If I can, I would like to fix this to make the books I've purchased more enjoyable to read on my ereaders.
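
If the garbled characters really are saved inside the files, one possible repair is to swap each garbled sequence back to the punctuation it stands for and write a corrected copy of the book. The following is only a sketch, assuming an unencrypted epub whose chapter files are UTF-8 HTML/XHTML; the names `repair` and `repair_epub` are hypothetical, and only a handful of punctuation marks are covered.

```python
import zipfile

# Assumption: the damage is UTF-8 punctuation that was decoded as cp1252 and
# then saved again. Only the characters listed here are handled.
PUNCTUATION = "\u2018\u2019\u201c\u201d\u2013\u2014\u2026"

def build_table() -> dict[str, str]:
    """Map each garbled three-character sequence back to its punctuation mark."""
    table = {}
    for ch in PUNCTUATION:
        garbled = ch.encode("utf-8").decode("cp1252", errors="replace")
        table[garbled] = ch
    # The closing quote's last byte (0x9D) has no cp1252 character, so it may
    # also survive as the raw control character U+009D.
    table["\u00e2\u20ac\u009d"] = "\u201d"
    return table

TABLE = build_table()

def repair(text: str) -> str:
    """Replace every known garbled sequence in text."""
    for garbled, ch in TABLE.items():
        text = text.replace(garbled, ch)
    return text

def repair_epub(src: str, dst: str) -> None:
    """Copy an unencrypted epub to dst, repairing its HTML/XHTML files."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():  # same order, so 'mimetype' stays first
            data = zin.read(info)
            if info.filename.lower().endswith((".html", ".xhtml", ".htm")):
                data = repair(data.decode("utf-8")).encode("utf-8")
            zout.writestr(info, data)  # reuses each entry's compression setting
```

A call would look like `repair_epub("book.epub", "book-fixed.epub")`, with both file names as placeholders. It writes a new file and leaves the original alone. If a closing quote has already been reduced to a lone replacement character with nothing in front of it, this sketch cannot tell what it was and leaves it as is.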

Any ideas?

BTW, to paraphrase Bones McCoy, "I'm a doctor, not a software engineer!" It would be really helpful if suggestions didn't assume that I know what you are talking about. Links or specific steps would be very helpful. I have Calibre on my computer, but I am very much a beginner at using it. I can add books and change some of the obvious metadata, but that's about it. I looked a bit at the online user manual and it has stuff about converting books, but it's not clear to me if that's what I should try to do.