08-20-2008, 11:01 AM   #1
Jellby
&nbsp; lost when converting to mobi

Hello,

I'm using html2mobi to create Mobipocket files. In French typography it's customary to put spaces around punctuation (before colons, inside quotation marks, etc.). Ideally these would be thin non-breaking spaces, but since Mobipocket apparently does not support a thin-space entity, I've decided to use the normal non-breaking space (&nbsp;) instead. However, it seems html2mobi converts some &nbsp; into normal spaces, so I get line breaks in the wrong places when the mobi file is viewed in an ebook reader.

This is an example HTML file:

Code:
<HTML>
<HEAD>
</HEAD>
<BODY>

<DIV HEIGHT="2em">
Je demande pardon aux enfants d’avoir d&eacute;di&eacute; ce livre
&agrave; une grande personne. J’ai une excuse s&eacute;rieuse&nbsp;: cette
grande personne est le meilleur ami que j’ai au monde. J’ai une
autre excuse&nbsp;: cette grande personne peut tout comprendre, m&ecirc;me les
livres pour enfants. J’ai une troisi&egrave;me excuse&nbsp;: cette grande
personne habite la France o&ugrave; elle a faim et froid. Elle a besoin
d’&ecirc;tre consol&eacute;e. Si toutes ces excuses ne suffisent pas, je
veux bien d&eacute;dier ce livre &agrave; l’enfant qu’a
&eacute;t&eacute; autrefois cette grande personne. Toutes les grandes personnes
ont d’abord &eacute;t&eacute; des enfants. (Mais peu d’entre elles
s’en souviennent.) Je corrige donc ma d&eacute;dicace&nbsp;:
</DIV>

<P HEIGHT="1em">On disait dans le livre&nbsp;: &laquo;&nbsp;Les serpents boas
avalent leur proie tout enti&egrave;re, sans la m&acirc;cher. Ensuite ils ne
peuvent plus bouger et ils dorment pendant les six mois de leur
digestion&nbsp;&raquo;.</P>

<P>J’ai alors beaucoup r&eacute;fl&eacute;chi sur les aventures de la
jungle et, &agrave; mon tour, j’ai r&eacute;ussi, avec un crayon de
couleur, &agrave; tracer mon premier dessin. Mon dessin num&eacute;ro 1. Il
&eacute;tait comme &ccedil;a&nbsp;:</P>

</BODY>

</HTML>
If I convert the HTML to mobi and then back to HTML (with mobi2html), only the &nbsp; entities in the middle paragraph are preserved; the others are turned into normal spaces.
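
A quick way to see which blocks lose their non-breaking spaces is to count them per block in both the original and the round-tripped HTML. Below is a minimal Python sketch, not part of mobiperl: the function name count_nbsp_per_block and the sample strings are made up for illustration, and it assumes the DIV/P blocks are not nested and come out in the same order after the round trip.

```python
import re
from html import unescape

NBSP = "\u00a0"  # the character that &nbsp; and &#160; decode to


def count_nbsp_per_block(html_text):
    """Return the number of non-breaking spaces in each <div>/<p> block.

    Hypothetical helper for illustration; assumes blocks are not nested.
    """
    # Match each <div>...</div> or <p>...</p>, ignoring tag case.
    blocks = re.findall(
        r"<(div|p)\b[^>]*>(.*?)</\1\s*>",
        html_text,
        flags=re.IGNORECASE | re.DOTALL,
    )
    # unescape() decodes entities, so &nbsp; and a literal U+00A0 both count.
    return [unescape(body).count(NBSP) for _tag, body in blocks]


# Made-up sample: the first block loses its non-breaking spaces.
original = "<DIV>excuse&nbsp;: une autre&nbsp;: fin</DIV>\n<P>livre&nbsp;: oui</P>"
roundtrip = "<div>excuse : une autre : fin</div>\n<p>livre&nbsp;: oui</p>"

before = count_nbsp_per_block(original)
after = count_nbsp_per_block(roundtrip)
for index, (b, a) in enumerate(zip(before, after), start=1):
    status = "ok" if a == b else "LOST"
    print(f"block {index}: {b} before, {a} after ({status})")
```

To use it on real files, read the original HTML and the mobi2html output into strings and pass them in place of the two samples.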

Is this a known problem? Is there a workaround? Is it possible to fix that?

P.S. I'm using mobiperl 0.38 under Linux (Perl v5.8.6).