181,000 Footnotes


sebito
04-02-2012, 12:58 PM
Hi all,

I have a serious problem: I am converting a Catholic Bible to EPUB.

The base file I started from is an RTF in which the styles and footnotes have already been created. In the RTF (viewed in Atlantis Word Processor) I can jump from the text to the notes, but I cannot jump back. The same happens when I try it from the HTML.

For example:

In the beginning of everything, God created <a id="a65"> </ a> {<span <a href="notes.html#a21105"> class="t30"> a </ span>} </ a > heaven and earth.

and for the note I have:

<a id="a21105"> </ a> a </ span> <span class="t72"> 1.1 </ span> <span class="t73"> created :: ...

I need to know how to make Atlantis create back-links for the notes that already exist, or how to automate the process from the HTML, given that there are approximately 181,000 footnotes.

PS: Sorry for the English, but I do not know much Latin and English.
Translation: Google translator :)

mmat1
04-02-2012, 01:45 PM
Try removing the blanks after each /.

sebito
04-02-2012, 01:55 PM
OK, but that is not the problem. The problem is creating back-links for 181,000 footnotes. I need an automatic way to do it.

mmat1
04-02-2012, 02:14 PM
First of all: the footnote and the text anchor should have the same number. Your code should read:

In the beginning of everything, God created <a id="a65"> </a> <a href="notes.html#n65"> hyperlinktext</a>

and then note:

<p><a id="n65"></a>Genesis 1.1</p>


In your example, the number of the note is 21105 and the number of the text anchor to jump back to is 65. How should a program guess that the back-reference of 21105 is 65?

If the references were built as shown in my example, building the backlink would just be a matter of a simple regex, but in this case I'm nearly out of ideas...
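
A minimal sketch of that "simple regex", assuming the numbering scheme from my example (note "n65" belongs to text anchor "a65"). The file name text.html and the "^" link text are placeholders that do not come from the book:

```python
import re

# Assumption: the main text lives in "text.html"; the thread does not name this file.
TEXT_FILE = "text.html"

# Matches the empty note anchor from the example above: <a id="n65"></a>
NOTE_ANCHOR = re.compile(r'<a id="n(\d+)"></a>')

def add_backlinks(notes_html: str) -> str:
    """Turn each empty note anchor into a link back to the text anchor with the same number."""
    # "^" is an arbitrary visible link text chosen for this sketch.
    return NOTE_ANCHOR.sub(
        rf'<a id="n\1" href="{TEXT_FILE}#a\1">^</a>',
        notes_html,
    )

note = '<p><a id="n65"></a>Genesis 1.1</p>'
print(add_backlinks(note))
```

This only works because the number after "n" and the number after "a" are identical, which is exactly what the Atlantis export does not give you.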

sebito
04-02-2012, 02:21 PM
What you say is true, but unfortunately Atlantis Word Processor generates these numbers during the RTF-to-EPUB export, and I cannot control them.

mmat1
04-02-2012, 02:25 PM
Are the footnotes in the same order as the text that refers to them?

sebito
04-02-2012, 02:36 PM
They are in order:
a: a = id65 note: 21105
b: b = id66 note: 21106
etc ...

mmat1
04-02-2012, 03:03 PM
Meanwhile I have given it some thought. I do not know a way to correct this without writing some code in a programming environment (which is probably quite easy, since there is a rule).

Maybe someone else does.

If there is no copyright violation, attach the book to your message so that we can have a look at it.
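
As a rough idea of what such code could look like, here is a sketch in Python. It assumes that each text anchor is directly followed by the link to its note, as in the example from the first post, so the pairing can be read from the text file itself. The file name text.html, the "^" link text, and the sample strings are made up for illustration:

```python
import re

# Assumption: each text anchor <a id="aN"> is followed by the link to its note
# before the next <a id=...> appears (stray whitespace or markup may sit between them).
PAIR = re.compile(r'<a id="a(\d+)">(?:(?!<a id=).)*?href="[^"#]*#a(\d+)"', re.DOTALL)

def note_to_anchor(text_html: str) -> dict:
    """Map each note id to the id of the text anchor that precedes its link."""
    return {note_id: anchor_id for anchor_id, note_id in PAIR.findall(text_html)}

def add_backlinks(notes_html: str, mapping: dict, text_file: str = "text.html") -> str:
    """Put a visible back-link in front of every note anchor that has a known partner."""
    def repl(match):
        anchor_id = mapping.get(match.group(1))
        if anchor_id is None:
            return match.group(0)  # unknown note: leave it untouched
        # "^" is an arbitrary link text chosen for this sketch.
        return f'<a href="{text_file}#a{anchor_id}">^</a>{match.group(0)}'
    return re.sub(r'<a id="a(\d+)">', repl, notes_html)

# Hypothetical sample data modelled on the ids quoted in the thread.
text = ('God created <a id="a65"> </a><a href="notes.html#a21105">a</a> heaven '
        'and <a id="a66"> </a><a href="notes.html#a21106">b</a> earth.')
notes = '<a id="a21105"> </a>1.1 created'

mapping = note_to_anchor(text)
print(add_backlinks(notes, mapping))
```

Reading the pairs from the text avoids guessing a fixed offset between the two number series, which two sample ids alone cannot confirm.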

sebito
04-02-2012, 03:12 PM
OK, thanks.

mmat1
04-02-2012, 03:32 PM
I guess I asked for it :) This will take some time...

mmat1
04-02-2012, 04:15 PM
Back again

Here you are....

Some links are still broken; I guess I didn't get the full story (most of the text is missing...).

I did it with Sigil and regex only. Are you familiar with Sigil?

sebito
04-02-2012, 04:34 PM
OMG

You make it sound easy...

I am familiar with Sigil, but I do not see that functionality in it. I have tried it on my EPUB; in fact, both HTML files I sent you are part of the full EPUB. Did you use regular expressions within Sigil? Did you apply them to the HTML?

mmat1
04-02-2012, 04:52 PM
OK, this is the general strategy:
1. I noticed that none of the href values has a filename; that must be corrected first. So I merged the two files and added "../Text/015.html" to every "#a\d+?".

2. I split the two files again, and Sigil corrects the filenames automatically. Some of your links point to an anchor within the same file. Only links that now point to notes.html are treated in the next steps.

3. I added an "id" attribute to every link that points to "notes.html", using the same number as the href, prefixed with "t" (within 015.html only).

4. Because of the weird formatting, it gets a bit tougher in notes.html. First I replaced "<span class="tpublidisa70">&nbsp;</span>" with "&nbsp;", since I see no point in giving a blank a special format, and it makes the following regex easier.

5. Regex (in notes.html only)
Find: <a id="a(\d\d?\d?\d?\d?)(">)</a>(&nbsp;<span class="tpublidisa71">)<a href="../Text/Text.html#a65">(.+?)</a></span>
Replace: <a href="../Text/Text.html#t\1" id="a\1\2\3\4</span></a>
This uses your "<a href="../Text/Text.html#a65">" as the endpoint (well, in most cases it's just "<a href="">") and tosses it out for good.

done
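
For anyone who prefers a script over Sigil's Find & Replace, steps 3 to 5 can be sketched in Python roughly as follows. This is only an approximation of what I did by hand: the function names are invented, the href part of the step 5 pattern is generalized to accept any value, and the sample note markup is reconstructed from the regex rather than copied from the book:

```python
import re

def add_link_ids(text_html: str) -> str:
    """Step 3: give every link that points into notes.html an id of "t" + note number."""
    return re.sub(
        r'<a href="([^"]*notes\.html#a(\d+))">',
        r'<a href="\1" id="t\2">',
        text_html,
    )

def simplify_blanks(notes_html: str) -> str:
    """Step 4: replace the specially formatted blank with a plain &nbsp;."""
    return notes_html.replace('<span class="tpublidisa70">&nbsp;</span>', "&nbsp;")

# Step 5: same groups as the Find expression above; \d{1,5} is equivalent to
# \d\d?\d?\d?\d?, and the inner href is matched as [^"]* (a generalization).
STEP5 = re.compile(
    r'<a id="a(\d{1,5})(">)</a>(&nbsp;<span class="tpublidisa71">)'
    r'<a href="[^"]*">(.+?)</a></span>'
)

def add_backlinks(notes_html: str, text_file: str = "../Text/Text.html") -> str:
    """Step 5: wrap the note marker in a link back to the "t" id created in step 3."""
    return STEP5.sub(
        rf'<a href="{text_file}#t\g<1>" id="a\g<1>\g<2>\g<3>\g<4></span></a>',
        notes_html,
    )

# Hypothetical sample lines, reconstructed from the expressions above.
text = '<a href="../Text/notes.html#a21105">a</a>'
note = ('<a id="a21105"></a><span class="tpublidisa70">&nbsp;</span>'
        '<span class="tpublidisa71"><a href="">a</a></span>')

print(add_link_ids(text))
print(add_backlinks(simplify_blanks(note)))
```

Steps 1 and 2 are left out because they rely on Sigil fixing the filenames when files are merged and split.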

----------------------------------------------------

Edit: There's no special function within Sigil. It's just a matter of dividing the job into small steps and using regex. It is easy with a few hundred links. I guess it's still a tedious job with 181,000...

sebito
04-02-2012, 06:37 PM
Your help is really valuable. Thank you very much!

ProDigit
04-14-2012, 11:32 AM
For HTML edits, I only use Notepad++.
Its search-and-replace functions are far superior to anything else out there: equal to, if not better than, those of MS Office, and definitely faster (though I find MS Office a lot easier to use).

As for linking back to the previous link, most readers normally have a back button.
If not, it's going to be quite some work! It might be easier to link those footnotes back to a chapter or something.