04-25-2011, 09:47 AM   #4
valex
Quote:
Originally Posted by user_none
It would be possible to use a heuristic to detect [number] is a footnote but I have no plans to do this.
I ended up doing exactly that. I use the following script to post-process the FB2 file created by Calibre. It applies three regular expressions to the file in sequence, converting each [number] into a proper link and creating a <body name="notes"> section that holds the footnote texts. The limitation is that it does not support multi-paragraph footnotes or footnotes within footnotes, but those are rare. My knowledge of Python is rather rudimentary. Is it possible to configure Calibre to call the script automatically once the conversion to FB2 is done? Is Calibre's ability to apply regexes limited to the internal XHTML representation?

Code:
#!/usr/bin/python

import os
import re
import sys

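def file_replace(fname, out_fname, regex, repl):
    # Apply one regex substitution line by line. The result goes to a
    # temporary file first, so fname and out_fname may be the same path.
    tmp_fname = fname + ".tmp"

    with open(fname) as src, open(tmp_fname, "w") as out:
        for line in src:
            out.write(re.sub(regex, repl, line))

    # os.rename will not overwrite an existing file on Windows.
    if os.path.exists(out_fname):
        os.remove(out_fname)
    os.rename(tmp_fname, out_fname)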


if len(sys.argv) != 3:
    u = "Usage: file_replace <file_name.fb2> <file_name_fixed.fb2>\n"
    sys.stderr.write(u)
    sys.exit(1)

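# Pass 1: a paragraph starting with [1] marks the start of the notes;
# close the main body there and open <body name="notes">.
file_replace(sys.argv[1], sys.argv[2], r'<p>\[1\]', r'</section></body><body name="notes"><section><p>[1]')
# Pass 2: turn each "<p>[N]text</p>" line into its own <section id="n_N">.
file_replace(sys.argv[2], sys.argv[2], r'<p>\[(\d+)\](.+)</p>', r'</section><section id="n_\1"><title><p>\1</p></title><p>\2</p>')
# Pass 3: link every remaining [N] in the text to the matching note.
file_replace(sys.argv[2], sys.argv[2], r'\[(\d+)\]', r'<a xlink:href="#n_\1" type="note">[\1]</a>')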
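
To make it easier to see what the three passes do, here is a small in-memory sketch that applies the same patterns to a made-up fragment. The SAMPLE lines and the apply_passes helper are only for illustration and are not part of the script. The fragment assumes one paragraph per line, which is what the script relies on; real Calibre output may be laid out differently.

```python
import re

# Hypothetical FB2 fragment, one paragraph per line (made up for illustration).
SAMPLE = [
    '<body><section>',
    '<p>Some text[1] and more text[2].</p>',
    '<p>[1] First note.</p>',
    '<p>[2] Second note.</p>',
    '</section></body>',
]

# The same three (pattern, replacement) pairs as in the script, in order.
PASSES = [
    (r'<p>\[1\]',
     r'</section></body><body name="notes"><section><p>[1]'),
    (r'<p>\[(\d+)\](.+)</p>',
     r'</section><section id="n_\1"><title><p>\1</p></title><p>\2</p>'),
    (r'\[(\d+)\]',
     r'<a xlink:href="#n_\1" type="note">[\1]</a>'),
]


def apply_passes(lines):
    """Run every pass over every line, one pass at a time."""
    for regex, repl in PASSES:
        lines = [re.sub(regex, repl, line) for line in lines]
    return lines


result = apply_passes(SAMPLE)
for line in result:
    print(line)
```

In this sample the note paragraphs lose their [N] markers in pass 2, so pass 3 only touches the references left in the main text. Pass 1 also leaves an empty <section></section> at the start of the notes body, because pass 2 closes the section that pass 1 has just opened.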