#1 |
Member
Posts: 23
Karma: 10
Join Date: Apr 2014
Location: Paris
Device: ipad 2, Ubuntu
A regex function to number a mathematical ebook
The search and replace tool with regex functions is really fantastic. My small company builds mathematical ebooks from LaTeX sources. One problem when converting such books is that LaTeX automatically numbers chapters, sections, subsections and theorem-like assertions (theorems, propositions, lemmas, definitions, corollaries and so on), and I would like to reproduce that numbering in my ebooks.
A solution is the following:

1) When converting from LaTeX, I put chapters, sections, subsections and assertions in a <div> tag with an HTML5 data-type attribute. For example, a LaTeX section
Code:
\section{History of the Fermat-Wiles theorem}
becomes
Code:
<div class="section" data-type="section">History of the Fermat-Wiles theorem</div>
and a theorem
Code:
\begin{theorem}Abracadabra\end{theorem}
becomes
Code:
<div class="theorem" data-type="theorem">Abracadabra</div>

2) After conversion from LaTeX to HTML (not so easy!) and from HTML to EPUB (easy with Calibre), I number the whole book in the Calibre editor using the search and replace tool with a regex function. The search pattern I use is:
Code:
<div.*?data-type="(chapter|section|subsection|theorem|proposition|lemma|definition|corollary)"[^>]*>
and the function is:
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    if number == 1:
        # first match of the run: initialization of the counts
        data['chapter'] = 0
        data['section'] = 0
        data['subsection'] = 0
        data['assertion'] = 0
    the_type = match.group(1)
    if the_type == 'chapter':
        # begins a chapter, reinitialize the counts
        data['section'] = 0
        data['subsection'] = 0
        data['assertion'] = 0
        data['chapter'] += 1
        return match.group() + "<span class='chapter_num'>Chapter " + str(data['chapter']) + ".</span> "
    elif the_type == 'section':
        # begins a section, reinitialize the subsection count
        data['subsection'] = 0
        data['section'] += 1
        return match.group() + "<span class='section_num'>Section " + str(data['section']) + ".</span>"
    elif the_type == 'subsection':
        data['subsection'] += 1
        return match.group() + "<span class='subsection_num'>Subsection " + str(data['section']) + "." + str(data['subsection']) + ".</span>"
    else:
        # this is an assertion, numbered within the current chapter
        data['assertion'] += 1
        return match.group() + "<span class='assertion_num'>Assertion " + str(data['chapter']) + "." + str(data['assertion']) + ".</span>"

# process the files in the order in which they appear in the book
replace.file_order = 'spine'
The resulting numbering has this structure:
Code:
Chapter 1
  Section 1
    Subsection 1.1
      Assertion 1.1
      Assertion 1.2
    Subsection 1.2
      Assertion 1.3
  Section 2
    Subsection 2.1
      Assertion 1.4
      Assertion 1.5
    Subsection 2.2
      Assertion 1.6
Chapter 2
  Section 1
    Subsection 1.1
      Assertion 2.1
      Assertion 2.2
    Subsection 1.2
      Assertion 2.3
  Section 2
    Subsection 2.1
      Assertion 2.4
      Assertion 2.5

Last edited by dmonasse; 12-22-2014 at 03:11 PM.
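For anyone who wants to check the counter logic of the function in post #1 without opening the editor, here is a minimal sketch in plain Python. It is not Calibre code: `number_files`, `labels` and `SAMPLE` are hypothetical names made up for this illustration, `replace` is a condensed variant of the posted function (it emits a trailing space after every label, not only after chapters), and the harness only assumes that the editor calls the function once per match, in spine order, with a match count starting at 1 and one shared `data` dict. The pattern is also tightened from `<div.*?` to `<div[^>]*?`, which is a choice of this sketch, so that a match cannot start at an earlier, unrelated `<div>` on the same line.

```python
import itertools
import re

# Pattern from post #1, with `.*?` tightened to `[^>]*?` (assumption of this sketch).
PATTERN = re.compile(
    r'<div[^>]*?data-type="(chapter|section|subsection|theorem|proposition|'
    r'lemma|definition|corollary)"[^>]*>'
)


def replace(match, number, file_name, metadata, dictionaries, data, functions,
            *args, **kwargs):
    """Condensed variant of the posted function; same counters, same labels."""
    if number == 1:
        # First match of the whole run: initialise all counters.
        data.update(chapter=0, section=0, subsection=0, assertion=0)
    kind = match.group(1)
    if kind == 'chapter':
        # A new chapter resets every lower-level counter.
        data.update(section=0, subsection=0, assertion=0)
        data['chapter'] += 1
        css, label = 'chapter_num', 'Chapter %d.' % data['chapter']
    elif kind == 'section':
        data['subsection'] = 0
        data['section'] += 1
        css, label = 'section_num', 'Section %d.' % data['section']
    elif kind == 'subsection':
        data['subsection'] += 1
        css, label = 'subsection_num', 'Subsection %d.%d.' % (
            data['section'], data['subsection'])
    else:
        # Any theorem-like block is an "assertion", numbered per chapter.
        data['assertion'] += 1
        css, label = 'assertion_num', 'Assertion %d.%d.' % (
            data['chapter'], data['assertion'])
    return "%s<span class='%s'>%s</span> " % (match.group(), css, label)


def number_files(files):
    """Hypothetical harness: run replace() over HTML strings given in spine order."""
    data = {}                         # shared across all files, like the editor's dict
    counter = itertools.count(1)      # global match number, starting at 1

    def repl(match):
        return replace(match, next(counter), None, None, None, data, None)

    return [PATTERN.sub(repl, html) for html in files]


def labels(html_files):
    """Collect the inserted labels in document order."""
    return re.findall(r"_num'>([^<]*)</span>", ''.join(html_files))


# Two made-up "files"; the second chapter has no section before its lemma.
SAMPLE = [
    '<div class="chapter" data-type="chapter">Numbers</div>'
    '<div class="section" data-type="section">History</div>'
    '<div class="subsection" data-type="subsection">Origins</div>'
    '<div class="theorem" data-type="theorem">Abracadabra</div>',
    '<div class="chapter" data-type="chapter">Curves</div>'
    '<div class="lemma" data-type="lemma">Hocus pocus</div>',
]

print(labels(number_files(SAMPLE)))
# ['Chapter 1.', 'Section 1.', 'Subsection 1.1.', 'Assertion 1.1.', 'Chapter 2.', 'Assertion 2.1.']
```

The second sample file shows why the counters live in the shared `data` dict rather than in local variables: the chapter count has to survive from one file to the next, while the assertion count restarts at each chapter.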
#2 |
creator of calibre
Posts: 45,355
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Cool
#3 |
Dead account. Bye
Posts: 587
Karma: 668244
Join Date: Mar 2011
Device: none
Great usage example. Maybe you could port and post it in the saved searches sticky thread. It could be, no, it will be, a great addition.

Nevertheless: Are you sure? I haven't tested with an HTML to EPUB conversion, but in an EPUB to EPUB conversion, "calibreXX" classes only appear when the original element has no class in the source file. I've just run a quick test and I see <p class="salto1">, <blockquote class="asangre"> and my own <span class="nw"> preserved.
#4 | |
Member
Posts: 23
Karma: 10
Join Date: Apr 2014
Location: Paris
Device: ipad 2, Ubuntu
I made a copy of this post, as suggested, in the saved searches sticky thread.
Thanks for your encouragement, and many thanks to Kovid for Calibre.