12-23-2014, 02:46 AM   #8
dmonasse
A regex function to number a (mathematical) ebook

The search and replace tool with regex functions is really fantastic. My small company builds mathematical ebooks from LaTeX sources. One problem when converting such books is that LaTeX auto-numbers chapters, sections, subsections and theorem-like assertions (theorems, propositions, lemmas, definitions, corollaries and so on). I would like to reproduce this numbering in my ebooks.

A solution is the following:

1) When converting from LaTeX, I put chapters, sections, subsections and assertions in a <div> tag with an HTML5 data-type attribute. For example, the LaTeX section
Code:
\section{History of the Fermat-Wiles theorem}
is converted into
Code:
<div class="section" data-type="section">History of the Fermat-Wiles theorem</div>
and
Code:
\begin{theorem}Abracadabra\end{theorem}
is converted into
Code:
<div class="theorem" data-type="theorem">Abracadabra</div>
Note: I can't use the class attribute to denote the type of the div, because Calibre's HTML-to-ePub conversion rewrites class attributes, so class="theorem" may become class="pcalibre25". That is the reason for the data-type attribute.
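
To make the target markup of step 1 concrete, here is a toy sketch of the mapping in Python. It is not the converter used for the books: latex_to_div is a hypothetical name, and the two regular expressions only cope with flat, unnested input (no nested braces, math or macros), which a real LaTeX converter would have to handle.

```python
import re

def latex_to_div(latex):
    """Toy mapping of two LaTeX constructs to data-type divs (illustration only)."""
    # \section{Title} -> <div class="section" data-type="section">Title</div>
    html = re.sub(
        r'\\(chapter|section|subsection)\{([^{}]*)\}',
        r'<div class="\1" data-type="\1">\2</div>',
        latex,
    )
    # \begin{theorem}...\end{theorem} -> <div class="theorem" data-type="theorem">...</div>
    html = re.sub(
        r'\\begin\{(theorem|proposition|lemma|definition|corollary)\}(.*?)\\end\{\1\}',
        r'<div class="\1" data-type="\1">\2</div>',
        html,
        flags=re.DOTALL,
    )
    return html

print(latex_to_div(r'\section{History of the Fermat-Wiles theorem}'))
print(latex_to_div(r'\begin{theorem}Abracadabra\end{theorem}'))
```

The type is written twice on purpose: class is there for styling, data-type is the copy that is still readable after the conversion to ePub.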

2) After converting from LaTeX to HTML (not so easy!) and from HTML to ePub (easy with Calibre), I number the whole book in the Calibre editor, using the search and replace tool with a regex function.
The search pattern I use is:
Code:
<div.*?data-type="(chapter|section|subsection|theorem|proposition|lemma|definition|corollary)"[^>]*>
and the regex function may be:
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    if number == 1:  # first match of the run: initialize the counts
        data['chapter'] = 0
        data['section'] = 0
        data['subsection'] = 0
        data['assertion'] = 0
    the_type = match.group(1)
    if the_type == 'chapter':  # a new chapter resets all the lower-level counts
        data['section'] = 0
        data['subsection'] = 0
        data['assertion'] = 0
        data['chapter'] += 1
        return match.group() + "<span class='chapter_num'>Chapter " + str(data['chapter']) + ".</span> "
    elif the_type == 'section':  # a new section resets the subsection count
        data['subsection'] = 0
        data['section'] += 1
        return match.group() + "<span class='section_num'>Section " + str(data['section']) + ".</span> "
    elif the_type == 'subsection':
        data['subsection'] += 1
        return match.group() + "<span class='subsection_num'>Subsection " + str(data['section']) + "." + str(data['subsection']) + ".</span> "
    else:  # any other type is an assertion, numbered within the chapter
        data['assertion'] += 1
        return match.group() + "<span class='assertion_num'>Assertion " + str(data['chapter']) + "." + str(data['assertion']) + ".</span> "

replace.file_order = 'spine'
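
To experiment with a function of this kind outside the Calibre editor, a "Replace all" run can be imitated with the plain re module. This is only a sketch under stated assumptions: run_replace and demo_replace are hypothetical names, the driver assumes the function is called once per match with one shared data dictionary and matches numbered from 1, and PATTERN is a slightly stricter variant of the search pattern ([^>]*? instead of .*?, so a match cannot start in an earlier tag), not the original one.

```python
import re

# Assumed variant of the search pattern, kept inside a single <div ...> tag.
PATTERN = re.compile(
    r'<div[^>]*?data-type="(chapter|section|subsection|theorem|proposition'
    r'|lemma|definition|corollary)"[^>]*>'
)

def run_replace(replace, files):
    """Imitate a 'Replace all' run over (file_name, html) pairs given in spine order."""
    data = {}             # shared between all calls, like the `data` argument
    counter = {'n': 0}    # running match number, starting at 1
    results = []
    for file_name, html in files:
        def repl(match):
            counter['n'] += 1
            # metadata, dictionaries and functions are not simulated here
            return replace(match, counter['n'], file_name, None, None, data, None)
        results.append((file_name, PATTERN.sub(repl, html)))
    return results

def demo_replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    """Stand-in function: counts each type separately, with no resets."""
    the_type = match.group(1)
    data[the_type] = data.get(the_type, 0) + 1
    return "%s<span class='%s_num'>%s %d.</span> " % (
        match.group(), the_type, the_type.capitalize(), data[the_type])

book = [
    ('ch1.html', '<div class="section" data-type="section">A</div>'),
    ('ch2.html', '<div data-type="section" class="x">B</div><div class="plain">C</div>'),
]
for name, html in run_replace(demo_replace, book):
    print(name, html)
```

Passing the real numbering function instead of demo_replace lets you check the counters on a few hand-written divs before touching the book.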
This is only an example, so adapt the code to your needs. It would be nicer to replace "Assertion" with "Theorem", "Proposition", "Lemma", "Corollary" or "Definition", which is easy to do starting from the "the_type" variable. I obtain the following numbering:
Code:
Chapter 1
     Section 1
         Subsection 1.1
             Assertion 1.1
             Assertion 1.2
         Subsection 1.2
             Assertion 1.3
     Section 2
         Subsection 2.1
             Assertion 1.4
             Assertion 1.5
         Subsection 2.2
             Assertion 1.6
Chapter 2
     Section 1
         Subsection 1.1
             Assertion 2.1
             Assertion 2.2
         Subsection 1.2
             Assertion 2.3
     Section 2
         Subsection 2.1
             Assertion 2.4
             Assertion 2.5
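
As a possible variation, the label can be derived from the matched type, so that the text reads "Theorem 1.1", "Lemma 1.2" and so on instead of "Assertion". The sketch below keeps the same counters and the same reset rules as the function above; only the label changes. It assumes that all theorem-like types share one counter per chapter and keep the assertion_num class, which is a choice, not a requirement.

```python
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    structural = ('chapter', 'section', 'subsection')
    if number == 1:  # first match of the run: initialize the counts
        for key in structural + ('assertion',):
            data[key] = 0
    the_type = match.group(1)
    if the_type == 'chapter':  # a new chapter resets all the lower-level counts
        data['section'] = data['subsection'] = data['assertion'] = 0
        data['chapter'] += 1
        num = str(data['chapter'])
    elif the_type == 'section':  # a new section resets the subsection count
        data['subsection'] = 0
        data['section'] += 1
        num = str(data['section'])
    elif the_type == 'subsection':
        data['subsection'] += 1
        num = '%d.%d' % (data['section'], data['subsection'])
    else:  # theorem, proposition, lemma, definition, corollary
        data['assertion'] += 1
        num = '%d.%d' % (data['chapter'], data['assertion'])
    # Theorem-like types keep the shared 'assertion_num' class for styling.
    css = the_type if the_type in structural else 'assertion'
    # 'theorem' -> 'Theorem', 'lemma' -> 'Lemma', ...
    return "%s<span class='%s_num'>%s %s.</span> " % (
        match.group(), css, the_type.capitalize(), num)

replace.file_order = 'spine'
```

With a shared counter, a lemma that follows the first theorem of chapter 1 becomes "Lemma 1.2". Giving each type its own counter would only need one data entry per type.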
I hope this helps. Any improvements are welcome (including to my poor English).