MobileRead Forums - View Single Post

valex · 04-25-2011, 09:47 AM

Quote:

Originally Posted by user_none

It would be possible to use a heuristic to detect [number] is a footnote but I have no plans to do this.

I ended up doing exactly that. I use the following script to post-process the FB2 file created by Calibre. It applies three regular expressions to the file in sequence, converting each [number] into a proper link and creating a <body name="notes"> section that holds the footnote texts. The limitation is that it does not support multi-paragraph footnotes or footnotes within footnotes, but those are rare. My knowledge of Python is rather rudimentary.

Is it possible to configure Calibre to call the script automatically after the conversion to FB2 is done? Is Calibre's regex search-and-replace functionality limited to the internal XHTML representation?

Code:

#!/usr/bin/python

import os
import re
import sys

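def file_replace(fname, out_fname, regex, repl):
    # Write the substituted text to a temporary file first, so that
    # fname and out_fname may safely be the same file.
    tmp_fname = fname + ".tmp"

    # Substitution is done line by line, so a match cannot span lines.
    with open(fname) as src, open(tmp_fname, "w") as out:
        for line in src:
            out.write(re.sub(regex, repl, line))

    # os.rename fails on Windows if the target exists, so remove it first.
    if os.path.exists(out_fname):
        os.remove(out_fname)
    os.rename(tmp_fname, out_fname)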


if len(sys.argv) != 3:
    u = "Usage: file_replace <file_name.fb2> <file_name_fixed.fb2>\n"
    sys.stderr.write(u)
    sys.exit(1)

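# Pass 1: close the main body and open the notes body before the first footnote paragraph.
file_replace(sys.argv[1], sys.argv[2], r'<p>\[1\]', r'</section></body><body name="notes"><section><p>[1]')
# Pass 2: wrap each "<p>[N] text</p>" footnote paragraph in its own <section id="n_N">.
file_replace(sys.argv[2], sys.argv[2], r'<p>\[(\d+)\](.+)</p>', r'</section><section id="n_\1"><title><p>\1</p></title><p>\2</p>')
# Pass 3: turn every remaining [N] in the text into a link to section n_N.
file_replace(sys.argv[2], sys.argv[2], r'\[(\d+)\]', r'<a xlink:href="#n_\1" type="note">[\1]</a>')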
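
To see what the three passes do without touching any files, here is a minimal sketch that applies the same regular expressions, in the same order, to a small in-memory sample. The sample markup and the names `sample`, `RULES` and `convert` are made up for illustration; real Calibre output will look different, and the sketch assumes, like the script, that every footnote sits in a single <p> on its own line.

```python
import re

# Hypothetical sample: one body paragraph with two references, followed by
# two footnote paragraphs, each on its own line as the script expects.
sample = "\n".join([
    '<body><section>',
    '<p>First claim[1] and second claim[2].</p>',
    '<p>[1] Source for the first claim.</p>',
    '<p>[2] Source for the second claim.</p>',
    '</section></body>',
])

# The same three (pattern, replacement) pairs as in the script above.
RULES = [
    (r'<p>\[1\]',
     r'</section></body><body name="notes"><section><p>[1]'),
    (r'<p>\[(\d+)\](.+)</p>',
     r'</section><section id="n_\1"><title><p>\1</p></title><p>\2</p>'),
    (r'\[(\d+)\]',
     r'<a xlink:href="#n_\1" type="note">[\1]</a>'),
]

def convert(text):
    # Apply each rule to every line before moving on to the next rule,
    # mirroring the three sequential file_replace calls.
    for regex, repl in RULES:
        lines = text.split("\n")
        text = "\n".join(re.sub(regex, repl, line) for line in lines)
    return text

result = convert(sample)
print(result)
```

In the printed result the references in the body paragraph become <a xlink:href="#n_1" type="note">[1]</a> style links, and each footnote paragraph ends up in its own <section id="n_N"> inside <body name="notes">. The order matters: pass 2 strips the brackets from the footnote numbers, so pass 3 only links the references left in the main text. Pass 1 also leaves an empty <section></section> at the start of the notes body, because pass 2 closes the section that pass 1 has just opened.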