Old 08-13-2014, 01:51 AM   #20
Tex2002ans
Quote:
Originally Posted by BetterRed
I think member Tex2002ans also has some experience with LaTeX.
Heh, I just stumbled upon this topic while I was searching for some comparison images of me showing off LaTeX. Thanks for mentioning my name, I never venture into this section of MobileRead!

Side Note: This was the post I was looking for! https://www.mobileread.com/forums/sho...6&postcount=58

Front (LaTeX/Formula) Matter

I wouldn't say I am fantastic at LaTeX, but I have been dabbling enough to create some pretty dang good fiction PDFs + (decent) non-fiction. I still have a lot of tweaks to iron out though... so I wouldn't call any of the PDF stuff I work on production ready.

When I first started, I used LyX, which seems to be an ok introduction. I didn't find it any harder than fiddling around/learning with Sigil. I then read The Not So Short Introduction to LaTeX. (Now I just type everything directly in LaTeX. The GUI stuff felt too limiting.)

Another fantastic resource is the TeX Stack Exchange: https://tex.stackexchange.com/

and the LaTeX Wiki: https://en.wikibooks.org/wiki/LaTeX/

And let me give credit where credit is due. It was user Jellby who got me interested in this LaTeX stuff, and pointed me to lots of resources.

Getting nice formulas into EPUBs is definitely an area I am quite interested in. I mean, I wrote a Formula -> PNG tutorial: https://www.mobileread.com/forums/sho...d.php?t=223254

Plus I typed out a few tiny posts (well, tomes) discussing OCRing math books: https://www.mobileread.com/forums/sho...d.php?t=228413

I recently digitized ~81 more equations in LaTeX using the CodeCogs site that Toxaris mentioned in my Tutorial. I used CodeCogs to recreate each formula in LaTeX and export it as a PDF, then fed the PDFs into a .bat file that processes all of the formulas with ImageMagick. (See the attached ZIP file for the images, and further down in the post for a few samples.)

I still have to iron out the workflow, and will probably tack all that info into my Tutorial. Perhaps as a more "Advanced" method.

Meat and Potatoes

Anyway, the way I see it, there are 2 different kinds of formulas:
  • Simple
    • Can be reduced to basic math
    • Keep this in HTML (?)
    • 2 + 2 = 4
    • y = mx + b
    • dx/dy = 5^(2/3)z + (5x)/3
  • Complex
    • Can't be replicated easily using basic math symbols + parentheses.
    • Integrals/Summations
    • Complex equations in large fractions
    • Large Square Roots
    • Fractions over fractions
    • Complex math symbols
      • x hat
      • x with arrow above
      • x with dot above
      • Symbols with a superscript + subscript at the same time
    • Large + Small Parentheses/Brackets
    • ...
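
For illustration, here is roughly what the "Complex" items in the list above look like when typed in LaTeX. These are made-up snippets, not formulas from any particular book, and `\dfrac` assumes the amsmath package is loaded:

```latex
% Illustrative only; assumes \usepackage{amsmath}
\[ \int_0^1 x^2 \, dx \qquad \sum_{k=1}^{n} k \]           % integrals / summations
\[ \sqrt{\frac{a^2 + b^2}{c}} \]                            % large square root
\[ \frac{\dfrac{a}{b}}{\dfrac{c}{d}} \]                     % fraction over fraction
\[ \hat{x} \qquad \vec{x} \qquad \dot{x} \qquad x_i^2 \]    % hat, arrow, dot, super + subscript
\[ \left( \frac{a}{b} \right) \qquad \bigl[ (a+b) \bigr] \] % large + small brackets
```

None of these flatten into a single line of plain text without losing readability, which is the practical test for "Complex".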

Simple

You should be able to keep those as normal HTML. This allows them to scale with/follow the user's settings. It also keeps them easily searchable and copy/pastable.

I would just do something along these lines:

Code:
<div class="formula">
<p class="math"><span class="math">∂[U(φ).V(x)] → U(φ). ∂V(x),</span></p>
</div>
Embed a font, and in the CSS, just do something along these lines:

Code:
div.formula {
	margin-top: 1em;
	margin-bottom: 1em;
	page-break-inside: avoid;
	text-indent: 0;
	text-align: center;
}

p.math {
	text-indent: 0;
	text-align: center;
}

span.math {
	font-family: 'CharisSIL', serif;
}
Many people here on MobileRead use a font called CharisSIL: http://scripts.sil.org/cms/scripts/p...=charissilfont

Then you can subset the font using Calibre, which will try to minimize the size it adds to the EPUB. And if you use <span class="math"> sparingly, it shouldn't affect the look of the rest of your document.

Side Note: You may also want to use <span class="math"> in any inline equations as well.
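
To decide which spans actually need the embedded font, it can help to list the non-ASCII characters a formula uses. The following is a minimal Python sketch; the function name `chars_needing_font` is made up for this example, and "anything above U+007F" is only a rough stand-in for "characters a device's default font might be missing":

```python
import unicodedata

def chars_needing_font(text):
    """Return the distinct non-ASCII characters in text, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) > 0x7F})

# The "Simple" formula from the HTML example above.
formula = "∂[U(φ).V(x)] → U(φ). ∂V(x),"

for ch in chars_needing_font(formula):
    # Print the code point, the character, and its Unicode name.
    print(f"U+{ord(ch):04X} {ch} {unicodedata.name(ch, 'UNKNOWN')}")
```

If the list comes back empty, the formula is plain ASCII and probably does not need the `math` span at all.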

I also do something quite similar with a lot of the more obscure Greek accented words:

Quote:
<p><a href="#ft6" id="fn6">[6]</a> The English <i>money</i>, the Spanish <i>moneda</i>, the Portuguese <i>moeda</i>, the French <i>monnaie</i>, the Hebrew <i>maoth</i>, the Arabic <i>fulus</i>, the Greek <span class="greek" xml:lang="grc">νόμισμα</span>, &amp;c.</p>
Or there was one book where I had the authors' names in their original Chinese. I had to embed a font including the Chinese characters so it would display on ereaders.

Complex

There is no other way to handle these that would work across all devices besides an image of the formula (PNG or GIF). In this case, I would go for consistency, and just convert ALL formulas to images. It looks a little odd to have Complex formulas as images while the Simple ones are normal HTML. Although that is a decision left up to you.

Side Note: Ok I lied, maybe not ALL formulas. Those formulas that are in the normal flow of the text, I would try to keep/reduce those to their "Simple" equivalents. Inline PNGs/GIFs of formulas are HORRIBLE, and really mess with the flow of the document.

I attached a ZIP file of 81 formulas at 350dpi that I just generated using LaTeX (via the CodeCogs GUI) + ImageMagick.
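
The .bat file itself is not shown in this post, so the following is only a sketch of what the PDF -> PNG step could look like, written in Python instead of batch. It assumes ImageMagick's `convert` command is on the PATH (newer installs call it `magick`) and that Ghostscript is installed so ImageMagick can read PDFs. The function names and folder layout are hypothetical:

```python
import subprocess
from pathlib import Path

def convert_command(pdf_path, out_dir, dpi=350):
    """Build an ImageMagick command line that rasterizes one formula PDF to PNG."""
    pdf = Path(pdf_path)
    png = Path(out_dir) / (pdf.stem + ".png")
    return [
        "convert",
        "-density", str(dpi),   # rasterize the PDF at this resolution
        str(pdf),
        "-trim", "+repage",     # crop the empty page area around the formula
        str(png),
    ]

def convert_all(pdf_dir, out_dir):
    """Run the conversion for every PDF in pdf_dir (hypothetical folder layout)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        subprocess.run(convert_command(pdf, out_dir), check=True)
```

Calling `convert_all("pdf", "Images")` would then process a whole folder in one go. Note that `-density` has to come before the input file, since it controls how the PDF is rasterized.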

The same sort of thing can be accomplished using my Tutorial.... although I am now seeing the superiority of LaTeX in this case, because I can shove LaTeX right in the alt attribute.

This makes it super easy for me to plop it back into LaTeX, do any corrections, and generate a new formula if needed. In the future, it also lets me quickly convert the image into a more PROPER formula (SVG, MathML, straight LaTeX), or update math fonts/kerning/typography at the flip of a switch.
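
One thing to watch when putting LaTeX into an alt attribute: characters such as `&`, `<` and `"` have to be escaped or the XHTML will not validate. Here is a small Python sketch that builds the markup with escaping handled. The `div`/`img` wrapper mirrors the markup used in this post; the function name `formula_img` is made up for the example:

```python
from html import escape

def formula_img(latex, src):
    """Return formula markup with the LaTeX source stored in the alt attribute."""
    alt = escape(latex, quote=True)      # escape &, <, > and quotes for attribute use
    safe_src = escape(src, quote=True)
    return (
        '<div class="formula">\n'
        f'<div class="image"><img alt="{alt}" src="{safe_src}" /></div>\n'
        '</div>'
    )

print(formula_img(r"I = \frac{C}{i+a}", "../Images/QJAE10.3.1-pg200-2.png"))
```

Backslashes and braces pass through untouched, so typical LaTeX comes out exactly as typed.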

This is what I settled on, with the LaTeX used to get the formula stored in the alt attribute:

Quote:
<div class="formula">
<div class="image"><img alt="\frac{\displaystyle\frac{\text{widgets} / \text{elapsed-time}}{[\text{widgets} \times (\text{elapsed-time})^{(\alpha-1)}] \cdot [\text{labor-hours} / \text{elapsed-time}]^\phi}}{[\text{labor-hours}]^{\alpha\theta} \cdot [\text{labor-effort}]^{\alpha(1-\theta)}}" src="../Images/QJAE8.4.4-pg055-Formula3a.png" /></div>
</div>
[Image: QJAE8.4.4-pg055-Formula3a.png (11.6 KB)]

Quote:
<div class="formula">
<div class="image"><img alt="I = \frac{C}{1+i}\frac{1}{\left(1-\frac{1-a}{1+i}\right)} = \frac{C}{i+a}" src="../Images/QJAE10.3.1-pg200-2.png" /></div>
</div>
[Image: QJAE10.3.1-pg200-2.png (2.4 KB)]

Quote:
<div class="formula">
<div class="image"><img alt="\forall n \geq 2,i_n = \frac{I_{\text{KG}(n-1)}-I_n}{I_n} = \frac{(1-a)I_{n-1}}{I_n} = i" src="../Images/QJAE10.3.1-pg203-3.png" /></div>
</div>
[Image: QJAE10.3.1-pg203-3.png (3.9 KB)]

Quote:
<div class="formula">
<div class="image"><img alt="\pi_t^a = \gamma\pi_{t-1} + (1-\gamma)\pi_{t-1}^{a}" src="../Images/QJAE11.2.2-pg102-Formula10.png" /></div>
</div>
[Image: QJAE11.2.2-pg102-Formula10.png (2.2 KB)]
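
Because the LaTeX lives in the alt attribute, it can be pulled back out of the book later, for example to regenerate every image at a higher DPI. Below is a minimal sketch using Python's standard `html.parser`; the class and function names are made up for this example, and it simply assumes every `<img>` with a non-empty alt is a formula:

```python
from html.parser import HTMLParser

class AltCollector(HTMLParser):
    """Collect (src, alt) pairs from every <img> tag that has alt text."""

    def __init__(self):
        super().__init__()
        self.formulas = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags are routed here as well by default.
        if tag == "img":
            attributes = dict(attrs)
            if attributes.get("alt"):
                self.formulas.append((attributes.get("src"), attributes["alt"]))

def extract_latex(xhtml):
    """Return a list of (image path, LaTeX source) pairs found in the markup."""
    parser = AltCollector()
    parser.feed(xhtml)
    parser.close()
    return parser.formulas

sample = (
    '<div class="formula">'
    r'<div class="image"><img alt="\pi_t^a = \gamma\pi_{t-1} + (1-\gamma)\pi_{t-1}^{a}" '
    'src="../Images/QJAE11.2.2-pg102-Formula10.png" /></div>'
    '</div>'
)
for src, latex in extract_latex(sample):
    print(src, "<-", latex)
```

The parser also unescapes entities such as `&lt;` in attribute values, so the LaTeX comes back in its original form.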

Quote:
Originally Posted by phossler
On my Kindle's Font's menu, there is a [Publisher Font] option that is available IF there are embedded fonts in the document.

The Nook might be the same.
Kindles probably support more obscure/Unicode characters than the Nook... in my testing, Nook typically sticks with ASCII + a very small subset of accented chars. WHY these devices don't have at least a few fonts which cover much more of the Unicode spectrum makes zero sense to me.

Probably memory restrictions.

So my recommendation would be:
  • Images (PNG or GIF)
    • Pro: Maximum Compatibility
    • Pro: Will have fancy typography
    • Pro: Will display the formulas as you want them
    • Pro: Even works on the ancient Kindles
    • Con: Time Consuming to type in the equations
    • Con: Not Searchable/Scalable/Copy/Pastable
    • Con: Since the images are bitmap, they don't scale Up/Down well
    • Con: Doesn't follow user preferences
    • Con: Can bloat the filesize.
    • Con: In the future, the higher the DPI devices get, the smaller/worse these images will appear.
  • HTML + Embedded font
    • Pro: Should work pretty well across devices
      • Most of the books I work on have "Simple" equations, so I lean towards preferring HTML, even if it is not very typographically beautiful.
    • Pro: Smaller effect on filesize
    • Con: Depending on how complex, your equations might look hideous, or become quite hard to understand (because of multi-layered parentheses)
    • Con: Even if you DO embed a font, sometimes those devices default to Publisher Font OFF.
      • There is not much you can do about this besides hoping your readers are informed on how to use their device. (Maybe mentioning something about turning Publisher Fonts ON in the beginning of the book, and hope the reader listens).
  • Just keep it as HTML, no embedded font
    • Hope for the best, and people would be reading on a device with the characters in their font
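
To put the DPI con from the Images list in numbers: when a reader shows a bitmap pixel-for-pixel, its physical width is the pixel width divided by the screen's pixel density. The 600 px image and the two screen densities below are hypothetical, picked only for round numbers:

```latex
\[
  w_{\text{physical}} = \frac{w_{\text{pixels}}}{\text{ppi}},
  \qquad
  \frac{600\ \text{px}}{150\ \text{ppi}} = 4\ \text{in},
  \qquad
  \frac{600\ \text{px}}{300\ \text{ppi}} = 2\ \text{in}
\]
```

So the same PNG shrinks to half its width when the screen density doubles. Readers that scale images to a percentage of the screen width avoid the shrinking, but then the bitmap gets stretched instead.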

In the future, MathML seems like it would be a decent choice, although it won't be a good choice until long into the future, because barely any reading devices support it. (You have to know your customers, and who this book will be geared towards. If you knew everyone would read it in Calibre, AZARDI, or iBooks (?), then MathML would probably be a great choice. If everyone will be reading it on Kindles, Nooks, Kobos, etc., MathML is a non-choice.)

SVG would also be able to generate crisper/cleaner formulas than PNG/GIF, and would DEFINITELY be superior to the HTML versions. Although SVG support on devices isn't the greatest either, and you STILL have to do all the work to generate a PNG/GIF fallback.

We hashed out a lot of SVG + inline SVG stuff in this topic: https://www.mobileread.com/forums/sho...d.php?t=222825

Quote:
Originally Posted by elearner
Also, not sure what publisher font is. The Word document is an original document wrtten on Word.

But I guess what you are saying it that you wouldn't start from here! However, this is a 120 page document (written by my partner) which we would like to turn into an e-book. There are a large number of cross-references, it would be very difficult to start all over again.
My advice if you are going to be writing scientific literature? Abandon Word ASAP. The sooner you get out of it and into LaTeX, the better. You will save yourself MANY headaches.
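
Cross-references are a good example of what LaTeX takes care of automatically. The following is a minimal sketch, assuming the amsmath package for `\eqref`; the label name `eq:investment` is made up, and the formula is borrowed from one of the samples earlier in this post:

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

\begin{equation}
  I = \frac{C}{i+a}
  \label{eq:investment} % hypothetical label name
\end{equation}

As equation~\eqref{eq:investment} shows, \ldots

\end{document}
```

The equation number is filled in when the document is compiled (LaTeX usually needs two runs to resolve references), so adding or removing equations elsewhere never breaks the reference.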

"So many cross-references [...] difficult to start all over again"? Sounds to me like you are causing your own Word headaches already!



Can you do this in Word? (See topic, your head will EXPLODE): https://tex.stackexchange.com/questi...tures-show-off

Edit: Also, I had trouble Copying/Pasting equations out of your Word DOC. The symbols were completely mangled. I saved your DOC file in Word, and exported to HTML, and saw this HORROR as the first equation (the rest are equally abysmal):

Spoiler:
Quote:
<p class=MsoNormal style='margin-left:1.5in;text-indent:.5in'><span lang=EN-GB
style='font-family:Symbol;mso-ascii-font-family:"Times New Roman";mso-fareast-font-family:
"Arial Unicode MS";mso-hansi-font-family:"Times New Roman";color:black;
mso-ansi-language:EN-GB;mso-char-type:symbol;mso-symbol-font-family:Symbol'><span
style='mso-char-type:symbol;mso-symbol-font-family:Symbol'>¶</span>
</span><span
lang=EN-GB style='mso-fareast-font-family:"Arial Unicode MS";color:black;
mso-ansi-language:EN-GB'>[<span class=GramE>U(</span></span><span lang=EN-GB
style='font-family:Symbol;mso-ascii-font-family:"Times New Roman";mso-fareast-font-family:
"Arial Unicode MS";mso-hansi-font-family:"Times New Roman";color:black;
mso-ansi-language:EN-GB;mso-char-type:symbol;mso-symbol-font-family:Symbol'><span
style='mso-char-type:symbol;mso-symbol-font-family:Symbol'>j</span></span><span
lang=EN-GB style='mso-fareast-font-family:"Arial Unicode MS";color:black;
mso-ansi-language:EN-GB'>).V(x)] </span><span lang=EN-GB style='font-family:
Symbol;mso-ascii-font-family:"Times New Roman";mso-fareast-font-family:"Arial Unicode MS";
mso-hansi-font-family:"Times New Roman";color:black;mso-ansi-language:EN-GB;
mso-char-type:symbol;mso-symbol-font-family:Symbol'><span style='mso-char-type:
symbol;mso-symbol-font-family:Symbol'>®</span></span><span lang=EN-GB
style='mso-fareast-font-family:"Arial Unicode MS";color:black;mso-ansi-language:
EN-GB'> <span class=GramE>U(</span></span><span lang=EN-GB style='font-family:
Symbol;mso-ascii-font-family:"Times New Roman";mso-fareast-font-family:"Arial Unicode MS";
mso-hansi-font-family:"Times New Roman";color:black;mso-ansi-language:EN-GB;
mso-char-type:symbol;mso-symbol-font-family:Symbol'><span style='mso-char-type:
symbol;mso-symbol-font-family:Symbol'>j</span></span><span lang=EN-GB
style='mso-fareast-font-family:"Arial Unicode MS";color:black;mso-ansi-language:
EN-GB'>). </span><span lang=EN-GB style='font-family:Symbol;mso-ascii-font-family:
"Times New Roman";mso-fareast-font-family:"Arial Unicode MS";mso-hansi-font-family:
"Times New Roman";color:black;mso-ansi-language:EN-GB;mso-char-type:symbol;
mso-symbol-font-family:Symbol'><span style='mso-char-type:symbol;mso-symbol-font-family:
Symbol'>¶</span>
</span><span lang=EN-GB style='mso-fareast-font-family:"Arial Unicode MS";
color:black;mso-ansi-language:EN-GB'>V(x),<o></o></span></p>


In the code above, the pilcrow '¶' (inside the Symbol-font spans) was overwritten to display as a partial derivative '∂'. (The rest of the symbols were just as bad.)

This horror should be much closer to what I posted above (I manually recreated this):

Quote:
<p class="math"><span class="math">∂[U(φ).V(x)] → U(φ). ∂V(x),</span></p>
If you are still going to be using Word, I would recommend learning how to use Styles to create cleaner documents.

ALSO, I would avoid using whatever you used to insert Symbols. As you can see above, it seems as if Microsoft Word inserted them using the dreaded "Symbol" font (which seems to only work in Microsoft Word and IE?). It looks to me like it is overriding the letters "a-z" + "A-Z" with the lowercase/uppercase Greek letters. Instead, you want the PROPER Unicode character.

This is also just going to cause tons of headaches for you when trying to move this document ANYWHERE besides Word. (Like me trying to copy/paste info into the forums, or convert to EPUB).
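
To illustrate what "the PROPER Unicode character" means here, below is a minimal Python sketch that remaps Symbol-font text runs to real Unicode. The table covers only the three characters seen in the Word export above, and the targets follow the manual recreation in this post; a real converter would need the full Symbol-font table. The function name and the `(text, is_symbol_font)` run format are made up for the example:

```python
# Only the three Symbol-font characters that appear in the equation above.
SYMBOL_TO_UNICODE = {
    "\u00b6": "\u2202",  # pilcrow -> partial derivative
    "j": "\u03c6",       # letter j -> Greek small letter phi
    "\u00ae": "\u2192",  # registered sign -> rightwards arrow
}

def fix_runs(runs):
    """Join (text, is_symbol_font) runs, remapping only the Symbol-font ones."""
    out = []
    for text, is_symbol_font in runs:
        if is_symbol_font:
            text = "".join(SYMBOL_TO_UNICODE.get(ch, ch) for ch in text)
        out.append(text)
    return "".join(out)

# The text runs of the first equation, in the order they appear in the Word HTML.
runs = [
    ("\u00b6", True), ("[U(", False), ("j", True), (").V(x)] ", False),
    ("\u00ae", True), (" U(", False), ("j", True), ("). ", False),
    ("\u00b6", True), ("V(x),", False),
]
print(fix_runs(runs))
```

The important part is that the remapping is applied only to runs that were set in the Symbol font. A plain "j" in normal text has to stay a "j".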
Attached Files
File Type: zip Tex QJAE Formulas[08.13.2014].zip (337.9 KB, 295 views)

Last edited by Tex2002ans; 08-13-2014 at 06:46 AM.