View Full Version : EPUBreader for pure ASCII only?


paulpeer
12-05-2009, 06:21 AM
I've tried to read some non-English books and all accented characters are scrambled. Does EPUBreader support foreign languages? Or should I indicate that the source is in UTF-8 character set somewhere?

mikelv
12-05-2009, 07:43 AM
Hi Paul, thanks for your feedback!

EPUBReader also supports foreign languages. It works without problems with, for example, Hebrew or Cyrillic epubs.

Could you please post the epub itself or a link? I'd like to check this.

paulpeer
12-05-2009, 10:21 AM
Thanks for your quick reply! But I feel a little stupid now :blink: The books that I looked at a few hours ago now display perfectly in EPUBReader. I'm very sure that the accented characters looked like "7%" or something similar. And I assure you that I wasn't drunk!

Anyhow, problem solved, although I cannot figure out what happened ...

Sorry to have bothered!

Paul

mikelv
12-05-2009, 12:27 PM
Hi Paul, no problem :)! If the problem occurs again, please let me know.

brendanl79
12-09-2009, 02:39 PM
The first ePub I tried had Unicode problems with quotes and apostrophes. Table of Contents renders fine but in the page viewer I see things like "doesn’t". Afraid I cannot provide a sample due to my "inept"-itude.

Book renders fine in Calibre, confirmed with iconv that it's valid UTF-8, manual override of encoding to UTF-8 in Firefox does not fix it.

Halp?

mikelv
12-09-2009, 03:38 PM
Hi brendanl, thanks for your feedback!

Without an example, I don't think I can help you. Please post the epub you tried or a link where I can download it. What prevents you from providing a sample?

brendanl79
12-10-2009, 11:33 AM
OK mikel, sent example and screenshot to your project gmail.

mikelv
12-10-2009, 12:07 PM
Thanks!

The problem is caused by two "content" meta tags that declare different character sets:

<meta content="text/html; charset=iso-8859-1"/>
<meta content="http://www.w3.org/1999/xhtml; charset=utf-8" http-equiv="Content-Type"/>

If you delete the first one, everything is okay.
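For anyone who wants to check their own books for this, here is a rough sketch of a scan for conflicting charset declarations. It assumes the epub is an ordinary zip archive whose content pages end in .html, .xhtml or .htm; the function names and the regular expression are illustrative only and are not part of EPUBReader or Calibre.

```python
import re
import zipfile

# Hypothetical pattern: captures the value of "charset=" inside any <meta> tag.
META_CHARSET = re.compile(
    rb'<meta\b[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def declared_charsets(markup: bytes) -> list[str]:
    """Return every charset named in a <meta> tag, in document order."""
    return [m.group(1).decode("ascii").lower() for m in META_CHARSET.finditer(markup)]


def find_conflicts(epub_path) -> dict[str, list[str]]:
    """Map each content page that declares more than one distinct charset
    to the list of charsets it declares."""
    conflicts = {}
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                charsets = declared_charsets(book.read(name))
                if len(set(charsets)) > 1:
                    conflicts[name] = charsets
    return conflicts
```

Calling find_conflicts("book.epub") (a placeholder file name) returns an empty dict when every page is consistent. For a page containing the two tags above, it would list ["iso-8859-1", "utf-8"].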

I saw that the epub was generated by Calibre. Somebody else reported the same problem with another epub, which is hosted at Mobileread: http://www.mobileread.com/forums/showthread.php?t=62489

I don't know whether these duplicate meta tags occur because the creator made an error or because of a bug in Calibre itself. I'll post this in the Calibre thread and see what they say.

mrmikel
12-10-2009, 12:34 PM
Mike,

He might want to use the format of the first line but substitute utf-8 if the epub is to be used on a device that is not web connected. My 505 would not be able to use the information since it cannot get on the web.

But if it is only for a web-connected device, no problem.

mikelv
12-10-2009, 12:56 PM
Thanks for your posting!

Here is an example of the text found in the content pages:

We’ve all heard the Spider-Man saying “...

As you can see, there are many weird characters that are not part of the Latin-1 character set (iso-8859-1). So in my opinion the first content meta tag is wrong and should be deleted.
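To illustrate the point, here is a small sketch of how this kind of garbling arises. It assumes the page viewer treats the "iso-8859-1" label as windows-1252 (cp1252), as browsers commonly do; the helper name fits_latin1 is made up for this example.

```python
# "doesn't" written with a typographic apostrophe (U+2019), as in the book.
original = "doesn\u2019t"

# The file on disk really is UTF-8, so the apostrophe is stored as three bytes.
utf8_bytes = original.encode("utf-8")

# Assumption: the reader follows the stale tag and decodes as windows-1252,
# turning each of the three bytes into its own character.
garbled = utf8_bytes.decode("cp1252")
print(garbled)  # doesn’t

# Reversing the wrong step recovers the text, which shows no data was lost.
assert garbled.encode("cp1252").decode("utf-8") == original


def fits_latin1(text: str) -> bool:
    """True if every character of text exists in iso-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# The typographic apostrophe has no Latin-1 code point at all.
print(fits_latin1(original))  # False
```

Since the apostrophe cannot be represented in iso-8859-1, a page containing it cannot honestly be labelled with that character set.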

Perhaps I misunderstood your point. If so, it would be great if you could explain it again :).

mikelv
12-10-2009, 02:33 PM
I asked the creator of the epub I mentioned above why he used two content meta tags with different character sets. Here is his answer:

"I converted the html from iso to utf and didn't have a look at it again as I thought the first content meta tag was automatically gone. I'm gonna change that!"

I guess exactly the same thing happened to the epub brendanl mentioned, and in that case the first meta tag should also be removed.
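If a page was converted to UTF-8 but still carries the old tag, the cleanup could be sketched roughly like this. It is only an illustration under the assumption that the file really is UTF-8 (the function refuses to touch it otherwise); the function name and pattern are hypothetical, and this is not something Calibre or EPUBReader does.

```python
import re

# Hypothetical pattern: a whole <meta ...> tag that names a charset,
# plus any whitespace that follows it.
META_TAG = re.compile(
    rb'<meta\b[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_\-]+)[^>]*>\s*',
    re.IGNORECASE,
)


def drop_stale_charset_tags(markup: bytes, actual: str = "utf-8") -> bytes:
    """Remove <meta> tags whose declared charset differs from the real one."""
    # Safety check: raises UnicodeDecodeError if the bytes are not
    # valid in the encoding we are about to trust.
    markup.decode(actual)

    def keep_or_drop(match):
        declared = match.group(1).decode("ascii").lower()
        return match.group(0) if declared == actual else b""

    return META_TAG.sub(keep_or_drop, markup)
```

Applied to a page with the two tags quoted earlier, this keeps only the utf-8 declaration. Work on a copy of the epub, since the sketch does no backup.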

mrmikel
12-11-2009, 05:12 AM
When creating books for my PRS505, I try to eliminate all web references, since it can't reach the web. That was why I suggested using the second, but changing it to utf-8. But it probably doesn't make any difference, since I don't think it actually goes to the web to access the encoding. It was more a general idea about creating all-purpose epubs than a remark about his particular case: his device is web connected anyway, so doing as you suggest solves his problem.