View Full Version : Esperanto characters?


pensaro1
03-03-2010, 10:26 AM
I have tried various tools for converting from one format to another, such as Calibre and Sigil. I can save in ePub from Open Office but get question marks instead of Esperanto characters. How do I correct that?

JSWolf
03-03-2010, 11:00 AM
You could embed a font that has those characters you want.

dmapr
03-03-2010, 11:02 AM
You need a font that includes the glyphs for these characters. Check http://www.mobileread.com/forums/showthread.php?t=36361 for information on substituting fonts.

HarryT
03-03-2010, 12:18 PM
The problem is probably not the fonts, but the character encoding. Esperanto uses accented letters (ĉ, ĝ, ĥ, ĵ, ŝ, ŭ) which are not in code page 1252, which is what's most commonly used. They work fine in UTF-8, but not in 8-bit character sets unless you use ISO 8859-3 (Latin-3); code page 1250 ("central European") does not include them either.
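
One way to check which encodings can actually hold the Esperanto letters, rather than relying on memory, is to try encoding them. This is only a minimal sketch using Python's built-in codecs; the helper name `can_encode` is made up for the illustration.

```python
# Esperanto's six accented letters, upper and lower case.
ESPERANTO = "ĈĉĜĝĤĥĴĵŜŝŬŭ"

def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Compare the code pages mentioned in this thread with ISO 8859-3 and UTF-8.
for enc in ("cp1252", "cp1250", "iso8859_3", "utf-8"):
    print(f"{enc:10} {can_encode(ESPERANTO, enc)}")

# A lossy conversion replaces every letter the target code page lacks
# with a question mark, which matches the symptom described above.
print("ĉapelo".encode("cp1252", errors="replace").decode("cp1252"))
```

If a converter quietly falls back to an 8-bit code page somewhere along the way, the last line shows the kind of output to expect.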

pensaro1
03-03-2010, 12:29 PM
JSWolf wrote: "You could embed a font that has those characters you want."

Hi there, and thanks for trying to help, but as a novice I'm not sure where I should embed the fonts or even where to get them! I do have EO fonts in Windows on my computer. I guess I get them from there, so I need to know where to embed them. Sorry if I seem ignorant but I don't do much of this.

wallcraft
03-03-2010, 07:33 PM
I think OpenOffice will use UTF-8 by default for its standard ODT (http://wiki.mobileread.com/wiki/ODT) output. So you could try saving in ODT and then converting to ePub using Calibre.

If this isn't enough to get it working with (say) Adobe Digital Editions, then you need to embed a Unicode font in the ePub. Practically any Unicode font should support Esperanto. You can test your font by reading the ePub using FBReader (which allows user-specified fonts) and selecting the font from its gear wheel icon and the Styles tab. Once you know it works, I think Sigil can embed a font for you. Any of the GPL licenced fonts at Unicode typefaces (http://en.wikipedia.org/wiki/Unicode_typefaces) should be suitable.

pensaro1
03-04-2010, 05:44 AM
Hi there, and thanks for the advice. I converted from ODT to ePub in Calibre, and it looks OK when viewed in Calibre but not when viewed in ADE. I'm still not sure how to embed the fonts in the ePub!

P.S. To show Esperanto characters on my blog I use ISO 8859 1. Does that help?

P.P.S. I have just downloaded FBReader and it shows Esperanto characters with no problem. Does that mean the problem is with ADE? Also, I have found that in FBReader I can copy the text and paste it elsewhere. If I copy-protect the ODT file, will that protection carry over to an ePub file?

charleski
03-04-2010, 11:49 AM
An ePub must be in UTF-8. Some readers are more forgiving and don't follow the spec closely, but ADE may well complain. Frankly, I think the best thing is for you to make a short ePub and upload it so we can see what's going on.

No, copy-protecting the ODT will not copy-protect the ePub and will probably just cause more problems. Forget about copy-protection unless you have $$$$$$ to spend on a content server for a DRM-scheme that's already cracked.

Valloric
03-04-2010, 05:30 PM
charleski wrote: "An ePub must be in UTF-8."

It can also be in UTF-16, both LE and BE.
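
To see which encoding the files inside an ePub really use, something like the following could help. It is a rough sketch, not a validator: the function names `sniff_encoding` and `report` are invented for this example, and a UTF-16 file saved without a BOM would be misreported.

```python
import codecs
import zipfile

def sniff_encoding(data: bytes) -> str:
    """Guess a file's encoding from its BOM, falling back to a trial UTF-8 decode."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 (with BOM)"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    try:
        data.decode("utf-8")
        return "utf-8 (no BOM)"
    except UnicodeDecodeError:
        return "not UTF-8 or UTF-16"

def report(epub_path: str) -> dict:
    """Map each text-like file inside the ePub (a zip archive) to its guessed encoding."""
    result = {}
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm", ".opf", ".ncx")):
                result[name] = sniff_encoding(zf.read(name))
    return result
```

Calling `report("book.epub")` (the file name is a placeholder) returns a dictionary you can print and compare against what the reader is complaining about.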

pensaro1
03-05-2010, 07:53 AM
Hi to charleski. I have been experimenting. I can create in OpenOffice and save as epub using a plugin tool. The result can be read properly in FBReader but not in ADE. I have also created a file and saved as odt which I then convert using Calibre. This also reads OK in FBReader but not in ADE. How do I upload the files so you can see?

wallcraft
03-05-2010, 08:11 AM
pensaro1 wrote: "How do I upload the files so you can see?"

If there are no copyright issues, just attach the epub to a post.

I don't think there is any doubt that the problem is with the default Adobe Digital Editions font. If you want to solve this problem for all devices (i.e. have your ePub work everywhere) then you need to embed a font in the ePub. If you just want to read on your own Reader, then it may be possible to copy a font to the device and slightly modify the ePub's CSS to use that font without actually embedding it.

Any Unicode font that is licensed to allow embedding will likely do, see Unicode typefaces (http://en.wikipedia.org/wiki/Unicode_typefaces) and look in the table for a named license type (GPL, OFL, ...). I think all you need is Latin Extended-A, so any of these fonts will work.
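
As a quick sanity check on the Latin Extended-A point, this small sketch lists the code points of the Esperanto letters and confirms they all sit in that Unicode block (U+0100 to U+017F). It says nothing about whether a particular font file contains the glyphs; that still has to be tested in a reader.

```python
import unicodedata

# Normalise to NFC so each accented letter is a single code point.
ESPERANTO = unicodedata.normalize("NFC", "ĈĉĜĝĤĥĴĵŜŝŬŭ")
LATIN_EXTENDED_A = range(0x0100, 0x0180)  # Unicode block U+0100..U+017F

for ch in ESPERANTO:
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")

all_in_block = all(ord(ch) in LATIN_EXTENDED_A for ch in ESPERANTO)
print("All in Latin Extended-A:", all_in_block)
```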

pensaro1
03-05-2010, 01:48 PM
I have come to the conclusion that the problem is with ADE, but for what it's worth I am attaching the 2 files I created. Perhaps I'll follow this up with Adobe, but it's not now so important. I will also check out the fonts as suggested. Thanks to everyone for advice.

charleski
03-05-2010, 06:59 PM
The problem lies with the plugin you're using to create the epubs.

Firstly, it's not creating a valid structure with all the necessary components.
Secondly, it's doing something screwy with the encoding that is stopping ADE from recognising it (I noticed the xhtml was encoded without its BOM, which may be part of the issue).

I recreated the epub in Sigil and the characters show fine in ADE. My advice is to ditch this plugin and use something else, Sigil or Atlantis.

wallcraft
03-05-2010, 09:37 PM
charleski wrote: "I recreated the epub in Sigil and the characters show fine in ADE."

They do show up, but this is using an embedded font:

@font-face {
  font-family: "Charis";
  font-style: normal;
  font-weight: normal;
  src: url(../fonts/font004.ttf);
}

@font-face {
  font-family: "Charis";
  font-style: normal;
  font-weight: bold;
  src: url(../fonts/font001.ttf);
}

@font-face {
  font-family: "Charis";
  font-style: italic;
  font-weight: normal;
  src: url(../fonts/font003.ttf);
}

@font-face {
  font-family: "Charis";
  font-style: italic;
  font-weight: bold;
  src: url(../fonts/font002.ttf);
}

body {
  font-family: "Charis";
}

Nothing wrong with that, but it is why the characters show up correctly. Does Sigil automatically include an embedded font?

charleski
03-06-2010, 02:54 AM
wallcraft wrote: "They do show up, but this is using an embedded font:"

The characters in question are not included in ADE's encoding set (as specified in appendix D of http://www.adobe.com/devnet/pdf/pdfs/PDFReference16.pdf), so you have to embed a font for the glyphs. The problem is that despite embedding a font in the epub he posted the characters still don't show up, even though the font is being used. The plugin's output is mangled in some way that is confusing ADE.

Given that the plugin is also messing up some other things (there's no ncx file and the opf is missing required metadata entries) I don't think it's worth using.
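
The structural problems mentioned here (no ncx, missing metadata) can be spotted with a short script. This is a sketch under assumptions: it checks only for `dc:title`, `dc:identifier` and `dc:language` in the opf and for an ncx entry in the manifest, and the function name `check_epub` is made up. It is no substitute for a real validator.

```python
import zipfile
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
REQUIRED_DC = ("title", "identifier", "language")

def check_epub(epub) -> list:
    """Return a list of human-readable problems found in an EPUB 2 file."""
    problems = []
    with zipfile.ZipFile(epub) as zf:
        names = zf.namelist()
        if "META-INF/container.xml" not in names:
            return ["missing META-INF/container.xml"]
        # container.xml tells the reader where the .opf package file lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        if rootfile is None or rootfile.get("full-path") not in names:
            return ["container.xml does not point at an existing .opf file"]
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))
        # Each required Dublin Core entry must be present and non-empty.
        for tag in REQUIRED_DC:
            element = opf.find(f".//{{{DC_NS}}}{tag}")
            if element is None or not (element.text or "").strip():
                problems.append(f"opf metadata is missing dc:{tag}")
        # The ncx table of contents is identified by its media type.
        media_types = [item.get("media-type") for item in opf.iter(f"{{{OPF_NS}}}item")]
        if "application/x-dtbncx+xml" not in media_types:
            problems.append("manifest lists no ncx file")
    return problems
```

An empty list from `check_epub("book.epub")` (placeholder file name) only means these particular checks passed.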

pensaro1
03-06-2010, 11:31 AM
Hi again and thanks again for all the advice. I have again tried using Sigil. Again the result looks fine in Sigil or FBReader but I still get ??? marks in ADE. Maybe it is something to do with the embedded fonts. Do I need to do something in Sigil before creating and saving the file? The truth is I am rapidly losing interest in the epub idea.

wallcraft
03-06-2010, 11:51 AM
pensaro1 wrote: "Hi again and thanks again for all the advice. I have again tried using Sigil. Again the result looks fine in Sigil or FBReader but I still get ??? marks in ADE. Maybe it is something to do with the embedded fonts. Do I need to do something in Sigil before creating and saving the file?"

Are you using the beta version? See Sigil 0.2.0 betas available (http://www.mobileread.com/forums/showthread.php?p=818353#post818353) for a recent discussion of embedded fonts now available in Sigil.

Note that charleski demonstrated that Charis (http://scripts.sil.org/cms/SCRIPTs/page.php?site_id=nrsi&item_id=CharisSILFont) is one of the fonts that supports Esperanto.

charleski
03-06-2010, 12:04 PM
To see Esperanto characters you must embed a font that shows them.
Just take it step by step:
1) Does the epub I posted show up properly in ADE? It does on my machine.
2) If 1) shows properly, try opening the epub I posted in Sigil and then just paste in your text and save it. Does that show up properly? Sigil 0.18 will preserve any embedded fonts but you have to add them manually (I suspect this is the problem you're still having). If this works then you can just use that epub as a template and paste in text for your final epubs. It has CharisSIL embedded in it, which is a free font that you can distribute without any hassles.

The latest version of Atlantis Word Processor handles embedded fonts and the next version of Sigil will as well, though it's still in beta.

paulpeer
03-06-2010, 02:17 PM
pensaro1 wrote: "I have come to the conclusion that the problem is with ADE"

Yes, ADE can display Esperanto text only in the body of the text itself (and provided that you use embedded fonts), not in the tables of contents.

paulpeer
03-06-2010, 02:43 PM
Here are some free Esperanto books in the ePub format that you can use to test: http://www.esperanto.be/fel/but/e-libro.php#senpaga
You'll see that the body of the books shows the diacritics on the ĉ, ĝ, ŭ perfectly, because these books have embedded fonts. But you'll also see that the diacritics do not show in the table of contents. ADE doesn't use the embedded fonts for the TOCs, but its own fonts. The only solution as far as I know: do not use ADE programs but use e.g. EPUBReader.

pensaro1
03-07-2010, 12:13 PM
Hi again. It's great that so many people are trying to help with this.
To charleski - I tried your idea about pasting in Esperanto characters. Again it was OK in Sigil but not in ADE. Below is the code from the Sigil version, which is 0.1.4 by the way, not beta. Is there something in here that I should alter to force ADE to display properly?
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<style type="text/css">
/*<![CDATA[*/
@namespace h "http://www.w3.org/1999/xhtml";
h1
{margin-top:4em;
margin-bottom:1.5em;
page-break-after:avoid;
font-size:2em;
font-family: Georgia, Garamond, serif;
text-align: center;
color:#000000
}
h2
{margin-top:3em;
margin-right:0pt;
margin-bottom:1.5em;
margin-left:0pt;
page-break-after:avoid;
font-size:1.5em;
font-family: Georgia, Garamond, serif;
text-align: center;
color:#000000;
}
h3
{margin-top:2em;
margin-bottom:1.5em;
line-height:normal;
page-break-after:avoid;
font-size:1.25em;
font-family: Georgia, Garamond, serif;
text-align: center;
color:#000000;
}
p
{margin:0pt;
text-indent:1em;
font-family: Georgia, Garamond, serif;
text-align: justify;
color:#000000
}
ul
{text-indent:2em;
}
ol
{text-indent:2em;
}
a
{color:#000000;
text-decoration: underline
}

/*SG DO NOT MODIFY.
This style is used by Sigil.
It will be removed on export
along with the "sigilChapterBreak" HR tags. SG*/
hr.sigilChapterBreak {
border: none 0;
border-top: 3px double #c00;
height: 3px;
clear: both;
}
/*]]>*/
</style>
</head>

<body>
<div>
<hr />
</div>

<p>This file has been created in Open Office and converted to ePub using a plugin tool. The
Esperanto characters that follow are created using auto hotkey, eokey-unicode; Ĉ, Ĝ, Ĥ, Ĵ, Ŝ. If
you see them with their hats on they are being read properly. If not maybe its a reader
problem.</p>

<p><br /></p>
</body>
</html>

To paulpeer, thanks for your suggestion. I will try it. Al paulpeer, dankon. Mi provos ĝin.

paulpeer
03-07-2010, 02:18 PM
It's a bit more complicated than you describe above, Pensaro. Did you mention the fonts in the OPF package? And in the CSS file? I suggest that you read this article:
http://blog.threepress.org/2009/09/16/how-to-embed-fonts-in-epub-files/
In my opinion it gives a clear overview of all the steps you should follow.

charleski
03-07-2010, 06:24 PM
Read the post paulpeer linked, and there's a sticky thread about embedding fonts in this forum as well.

Basically, you need to do 4 things:
1) Insert your fonts into the epub file. Generally it's best to make a 'fonts' directory and put them there. I use WinRAR to open epub files for this, but you can use the free 7-Zip just as well.
2) Specify the location and type of the font with an @font-face declaration in the css, e.g.:

@font-face {
  font-family: "Charis";
  font-style: normal;
  font-weight: normal;
  src: url(../fonts/font004.ttf);
}

The location is relative to the css file in the epub structure. You need to do this separately for each font you're embedding, and italics and bold count as different fonts. Often embedding a font means embedding 4 different ttf files, each with their own @font-face declaration: normal, italic, bold and bolditalic. These 4 fonts then comprise a 'font family' and should be given the same 'font-family' name ('Charis' in the example above).
3) Tell the program that you want to use these fonts by adding the following:

body {
  font-family: "Charis";
}

Where 'Charis' is the font-family name used in the font declarations above. If your css already has a body entry, then just add the font-family line to that.
4) Tell the epub reader that the epub contains font files by opening up the .opf file that defines the epub and inserting the proper lines in the manifest section:
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookID" version="2.0">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
...
</metadata>
<manifest>
...
<item href="fonts/font001.TTF" id="CharisBold" media-type="application/octet-stream"/>
<item href="fonts/font002.TTF" id="CharisBoldItalic" media-type="application/octet-stream"/>
<item href="fonts/font003.TTF" id="CharisItalic" media-type="application/octet-stream"/>
<item href="fonts/font004.TTF" id="Charis" media-type="application/octet-stream"/>
</manifest>
...
For each item, set the href to the path to your font file (relative to the .opf file). The name used for the id doesn't really matter: it's required, but won't be used.
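
Steps 2 to 4 above are repetitive enough that they can be generated. The following is only a sketch of that idea: the helper names `font_face_css` and `manifest_items` and the generated ids are invented for illustration, the file names are taken from the example in this thread, and the media type simply repeats the one shown above. It does not copy the font files into the epub (step 1), and the file names, including their case, should match the files actually placed in the 'fonts' directory.

```python
from xml.sax.saxutils import quoteattr

# (file name, font-style, font-weight) for each face of the family.
FACES = [
    ("font004.ttf", "normal", "normal"),
    ("font001.ttf", "normal", "bold"),
    ("font003.ttf", "italic", "normal"),
    ("font002.ttf", "italic", "bold"),
]

def font_face_css(family, faces, css_to_fonts="../fonts/"):
    """Build one @font-face rule per face plus a body rule that uses the family."""
    rules = []
    for filename, style, weight in faces:
        rules.append(
            "@font-face {\n"
            f'  font-family: "{family}";\n'
            f"  font-style: {style};\n"
            f"  font-weight: {weight};\n"
            f"  src: url({css_to_fonts}{filename});\n"
            "}"
        )
    rules.append(f'body {{\n  font-family: "{family}";\n}}')
    return "\n\n".join(rules)

def manifest_items(family, faces, opf_to_fonts="fonts/"):
    """Build the <item> lines to paste into the manifest section of the .opf file."""
    lines = []
    for index, (filename, _style, _weight) in enumerate(faces, start=1):
        href = quoteattr(opf_to_fonts + filename)      # path relative to the .opf
        item_id = quoteattr(f"{family}Font{index}")    # any unique id will do
        lines.append(f'<item href={href} id={item_id} media-type="application/octet-stream"/>')
    return "\n".join(lines)

print(font_face_css("Charis", FACES))
print(manifest_items("Charis", FACES))
```

The two path prefixes differ on purpose: the css path is relative to the css file, the manifest path is relative to the .opf file.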

If this process is daunting, then I'd really advise you to get Atlantis Word Processor. It only costs $35 and can embed fonts automatically (all you need to do is check a box on the output dialog). Alternatively, you could try the 0.2.0 beta of Sigil, though that's still being refined.

[Edit] BTW, the reason you find that the letters show up in other readers is that they 'cheat' and use system fonts to render the text. ADE doesn't - it uses its own built-in font so that it can emulate what the epub looks like on a portable reader that uses ADE.

pensaro1
03-08-2010, 07:35 AM
Hello again. I must say that I am just about overwhelmed by everyone's willingness to help me with this. I am someone who lives on the outer fringes of codeworld so I get lost sometimes. Nevertheless I will persist. I copied the relevant info to my computer to study when I have time. Not much at the moment. It may be a while before I get back here but thanks again to everyone. I know this will be solved eventually.:)

Valloric
03-08-2010, 04:13 PM
charleski wrote: "BTW, the reason you find that the letters show up in other readers is that they 'cheat' and use system fonts to render the text. ADE doesn't - it uses its own built-in font so that it can emulate what the epub looks like on a portable reader that uses ADE."

Cheat? Automatic and per-character font substitution is one absolutely amazing technology. I've never heard anyone refer to it as "cheating", but I've heard many refer to it as "brilliant".

ADE should most definitely use it.

charleski
03-08-2010, 04:51 PM
No, that would be an extremely bad idea.

ADE's desktop app roughly mirrors the result that will be obtained when reading the same book on a portable reader using ADE (which is still the most common reading system on portable devices). While final proofing should always take place on a target device to check element positioning and paging, the desktop app is extremely useful for soft-proofing during the design process.

Desktop epub readers that rely on system fonts may look nice (sometimes, I've seen some really dodgy font rendering), but this 'feature' makes them worthless for getting an idea of how the book will look on a reader. It's only 'brilliant' if you plan on reading it on your PC...

paulpeer
03-09-2010, 03:54 AM
The points of view of Charleski and Valloric are very interesting. I was wondering about the sentence "ADE (which is still the most common reading system on portable devices)." Has anyone ever seen a list of devices that do not use ADE? This would be very interesting for people that want to buy a device, and are planning to read books in a language that ADE has problems with (not only Esperanto as in this thread, but also Polish, Rumanian, Latvian etc)

pensaro1
03-09-2010, 05:34 AM
Hello again. I am back earlier than I thought because I have decided to try out Atlantis. It does the job well and it saves me a lot of time and headache at very little price. Selecting for embedded fonts means Esperanto characters do show in ADE. Problem solved but thanks very, very much to all who have helped on this. :)

paulpeer
03-09-2010, 07:20 AM
pensaro1 wrote: "Selecting for embedded fonts means Esperanto characters do show in ADE."

I'm glad it's OK now. Can I ask two more questions? Which device are you using? And secondly, please have a look at the table of contents. Do the accented letters show well in it?
I'm asking because I'm writing an article for Monato about reading devices for people who want to read books in Esperanto.

Valloric
03-09-2010, 03:05 PM
charleski wrote: "No, that would be an extremely bad idea.

"ADE's desktop app roughly mirrors the result that will be obtained when reading the same book on a portable reader using ADE (which is still the most common reading system on portable devices). While final proofing should always take place on a target device to check element positioning and paging, the desktop app is extremely useful for soft-proofing during the design process.

"Desktop epub readers that rely on system fonts may look nice (sometimes, I've seen some really dodgy font rendering), but this 'feature' makes them worthless for getting an idea of how the book will look on a reader. It's only 'brilliant' if you plan on reading it on your PC..."

What you're asking for is a reference rendering. For that, sure, ADE should closely match what you'll see on an embedded Reading System.

But font substitution would be brilliant for a PC reader who doesn't care about the portable reading devices. I'm sure you're not dismissing the technology from the point of view of a reader. Font substitution is bad for reference rendering, but great for user-oriented rendering.

Basically, ADE should have a "use font substitution" option.

DaleDe
03-09-2010, 03:12 PM
Actually ADE does have a use-font option, sort of. ADE is not a drop-in program; it is implemented as an API. It has support for a user-defined CSS that will override the one in the book, and this can be used to support user fonts and for other things. However, the people implementing ADE on their hardware have failed to provide this. They should offer options for the user and then write the correct CSS for ADE to see.

Dale

pensaro1
03-12-2010, 05:32 AM
paulpeer wrote: "I'm glad it's OK now. Can I ask two more questions? Which device are you using? And secondly, please have a look at the table of contents. Do the accented letters show well in it?
I'm asking because I'm writing an article for Monato about reading devices for people who want to read books in Esperanto."

Hi there paulpeer. Sorry to be slow coming back. Actually I don't have any device except my laptop because I am basically just experimenting and investigating possibilities. The readers I have are ADE, FBReader, Sigil and I'm not sure if Calibre counts as a reader. The accented letters do seem to show OK in t.o.c. I hope this helps. :)