Hi everyone.
I've learned quite a few things on this forum, so I thought I'd give something back by sharing some trivial tricks I know. I hope this is the right section and that this is worthy of a thread of its own. If it is not, I apologize.
EDIT: I just found out two more ways to do this.
1) You get to a point where you have an EPUB with two HTML files: one holds the text, where each number that refers to a footnote is marked by a string you don't find anywhere else (e.g. **1, **2, etc.); the other holds the footnotes, each preceded by its sequential number.
At this point, with two fairly simple regular expressions you can interlink text and footnotes. The cool thing is, you do it in one stroke. Fast and painless. This is especially easy if the printed book had all the notes stacked together at the end of the book or of each chapter. The regexes should be something like this.
Code:
FIND: \*\*([0-9]{1,})
REPLACE: <a href="../Text/footnotes.xhtml#n\1-n" id="n\1"><sup>\1</sup></a>
FIND: <p>([0-9]{1,})\.
REPLACE: <p><a href="../Text/Text.xhtml#n\1" id="n\1-n">\1</a>.
Or something along these lines. There's some fine-tuning to do, but the basic stuff should be there. Please note that the IDs can't start with a number, which is why they are prefixed with "n" above.
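If you'd rather try the two expressions outside the editor first, here is a minimal Python sketch of the same substitutions. The function names are made up for illustration, and the file names are simply the ones from the REPLACE strings above, so adjust them to your own EPUB.

```python
import re

# Same patterns as the FIND lines above; "+" means the same as "{1,}".
REF_PATTERN = re.compile(r"\*\*([0-9]+)")
NOTE_PATTERN = re.compile(r"<p>([0-9]+)\.")


def link_references(text_html):
    """Turn each **N marker into a superscript link to footnote N."""
    return REF_PATTERN.sub(
        r'<a href="../Text/footnotes.xhtml#n\g<1>-n" id="n\g<1>"><sup>\g<1></sup></a>',
        text_html,
    )


def link_notes(notes_html):
    """Turn the leading number of each note into a back link to the text."""
    return NOTE_PATTERN.sub(
        r'<p><a href="../Text/Text.xhtml#n\g<1>" id="n\g<1>-n">\g<1></a>.',
        notes_html,
    )
```

Run `link_references` on the text file's contents and `link_notes` on the footnotes file's contents. Every ID starts with the letter "n", so none of them begins with a digit.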
2) When editing the book in FineReader, you paste each note directly into the text at the point where its number would go, preceded and followed by the usual unique string (e.g. **text of the note**).
EDIT: At this point you can use the ALTSEARCH add-on for OpenOffice (how much time I wasted by not knowing this...)
Code:
Find: \*\*(.*?)\*\*
Replace: \F{\1}
And it inserts the text directly as a footnote (though it seems to lose the styling).
Before I knew about the add-on, I did it by hand: in OpenOffice, the very simple regex \*\*(.*?)\*\* finds every string enclosed between the two pairs of asterisks.
Cut/insert/footnote/paste/repeat.
This works faster than the method I first used because there is only one ODT file involved, and the autoclick software I use (PTFB) likes this much better. Basically, once you have arranged the file, you just need to run the PTFB macro as many times as needed. One click per footnote. Also, this way you don't have to cut/paste all the notes to an external file while working with AFR, and that saves time too.
(This last regex works only if each note fits in a single paragraph. I still haven't found a way to search across multiple lines in OO. If you need to, just replace all the line breaks with another unique string, do the job, and then replace them back.)
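The line-break workaround can be sketched in Python like this. It only illustrates the idea and is not something OO runs: the placeholder `@@BR@@`, the function name and the `[1]`, `[2]` stand-ins for the inserted footnotes are all assumptions. Pick any placeholder that doesn't occur in the book.

```python
import re

PLACEHOLDER = "@@BR@@"  # hypothetical; must not occur anywhere in the book
NOTE = re.compile(r"\*\*(.*?)\*\*")  # lazy match, same regex as above


def extract_notes(text):
    """Pull out every **...** note, even when it spans several lines.

    Returns the text with each note replaced by a [n] stand-in,
    plus the list of note texts in order.
    """
    # 1. Replace the line breaks so every note sits on one "line".
    flat = text.replace("\n", PLACEHOLDER)
    notes = []

    def pull(match):
        # 2. Do the job: store the note (line breaks restored) and
        #    leave a numbered stand-in where it was.
        notes.append(match.group(1).replace(PLACEHOLDER, "\n"))
        return f"[{len(notes)}]"

    body = NOTE.sub(pull, flat)
    # 3. Put the remaining line breaks back.
    return body.replace(PLACEHOLDER, "\n"), notes
```

The lazy `.*?` matters as soon as a paragraph holds two notes: a greedy `.*` would swallow everything from the first pair of asterisks to the last.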
Enjoy
What I'm talking about: Creating ebooks out of printed books, or images, with lots of footnotes.
What file formats do I get: From my experience this works with epub, pdf, mobi, azw3, kepub, doc, odt... you name it.
What is the end result: clickable sequential superscript numbers, each linking to its footnote at the end of the book (or possibly of the chapter), and each note with its "back link" to its number. Pretty standard stuff.
What software do I use: ABBYY FineReader (AFR), OpenOffice (OO) with writer2epub (W2E) and perfect epub, Sigil (nothing fancy here) and PTFB, an auto-click program which has saved me a lot of time. And I mean a lot.
I'm on Windows; no idea how this works elsewhere.
How it's done: When I correct the OCR in AFR, I replace every number that refers to a footnote with a symbol or sequence of symbols not found anywhere else in the book, followed by the number of the note: **01, **02,..., **10. The numbers should all have the same number of digits, because this greatly facilitates the auto-click process (more on this below).
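Since the markers are typed by hand, a quick sanity check can help before going further. The sketch below is only an assumption about how one might do it: it takes the text as a plain string, assumes the numbering starts at 1 and runs without gaps through the whole text, and the function name is invented.

```python
import re

MARKER = re.compile(r"\*\*([0-9]+)")


def check_markers(text):
    """Return a list of human-readable problems found in the **NN markers."""
    numbers = MARKER.findall(text)
    problems = []

    # All markers should have the same number of digits (01, 02, ..., 10).
    widths = {len(n) for n in numbers}
    if len(widths) > 1:
        problems.append(f"mixed digit counts: {sorted(widths)}")

    # Markers should count up by one, starting from 1 (assumption).
    expected = 1
    for n in numbers:
        if int(n) != expected:
            problems.append(f"expected {expected}, found {int(n)}")
        expected = int(n) + 1
    return problems
```

An empty list means the markers look consistent. If the numbering restarts in every chapter, run the check on one chapter at a time.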
At the same time I cut and paste the actual footnotes in a separate ODT file, making sure that at least some of them show the number before the first word, so as to avoid mistakes when pasting them back (for some reason when I C/P from AFR to OO said number usually disappears).
When the correction on AFR is complete and I have a decent ODT file, I copy all the notes at the bottom of that file and I once again correct it, with perfectepub and whatnot. I copy the notes before correcting the file because, of course, they usually show more or less the same errors as the main text.
Then I copy the corrected notes into yet another ODT file, I open the two files in two separate big windows, I open the search function looking for '**' and I place the whole thing like this:
[Image violates guidelines for size - MODERATOR]
I cut the first note from the footnotes file, launch PTFB, click on "new window macro", and click on the main text file. There I add the two digits to the selection with Shift+Right twice (this can be done with a regex too), delete the selection (the two asterisks and the two-digit number), click on Insert/Footnote/OK, type one space, paste the note, and press Ctrl+Home to get to the start of the note. Then I click on the note's number to head back to the text, and click "Find". Finally I stop PTFB's dialog window, let it save the sequence automatically, and assign a hotkey to it.
(After pressing Ctrl+Home, the first line of the note appears at the same height every time only if the note is tall enough. So I usually hit Enter some 7-8 times after pasting the note, to make the clicks land in the right place. These runs of empty paragraphs can easily be erased from the EPUB, and if you want to erase them from the ODT as well you can save the file as HTML, work on the code, and save it back as ODT.)
So now I just have to select one footnote at a time from the footnotes file, press Ctrl+X, hit the hotkey, and repeat until it's over. I still have to pay attention to the numbers, because it's easy to skip a note, but when I'm dealing with hundreds of notes this process, as I said, saves me SO MUCH time.
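Erasing those empty paragraphs from the exported code can be done with a single regex. Here is a minimal Python sketch, assuming the empty paragraphs come out as `<p>` elements containing nothing but whitespace, `&nbsp;` or `<br/>`. That shape is a guess, so check what your own file actually contains first; the function name is made up too.

```python
import re

# A <p> (with any attributes) that holds only whitespace, non-breaking
# space entities or <br/> tags. Assumed markup; verify against your file.
EMPTY_P = re.compile(
    r"<p\b[^>]*>(?:\s|&nbsp;|&#160;|<br\s*/?>)*</p>\s*",
    re.IGNORECASE,
)


def strip_empty_paragraphs(xhtml):
    """Remove every empty paragraph, plus the whitespace that follows it."""
    return EMPTY_P.sub("", xhtml)
```

Keep in mind this removes every empty paragraph in the file, including any you left on purpose as spacing.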
The end result is an ODT file with clickable notes. These can be displayed either at the end of each page or all together at the end of the document (tools/footnotes). They work when exporting to PDF (except for the "back link" on the note, AFAIK), and they work when converted to EPUB with W2E. In the EPUB they end up in a single xhtml file. They are already styled, each note within a div with class "footnote" and a <p> with class "fnparagraph". You can of course style them more: for example, the default width of the div is 60%, which is too narrow imho. The whole thing still works when converting to MOBI/AZW3/KEPUB with calibre.
I have never tried to display the notes at the end of each chapter, but this could be done by creating a separate ODT file for each chapter, saving each file as EPUB, and merging the xhtml files together. You'd have to rename the files and the anchors, not impossible but maybe not worth it either.
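For the anchor-renaming part, a rough Python sketch could look like the one below. I haven't checked it against real W2E output, so take it as an assumption about the markup: it only expects anchors to be plain `id="..."` attributes and links to be `href="file#fragment"` or `href="#fragment"`. The function name and the `ch01-` prefix are made up, and renaming the files themselves (and the file names inside the links) is left out.

```python
import re

# id attributes (but not e.g. data-id).
ID_ATTR = re.compile(r'(?<![\w-])id="([^"]+)"')
# Links with a fragment; absolute URLs such as http://... are skipped.
HREF_FRAGMENT = re.compile(
    r'href="(?![a-zA-Z][a-zA-Z0-9+.-]*:)([^"#]*)#([^"]+)"'
)


def prefix_anchors(xhtml, prefix):
    """Add a per-chapter prefix to every id and every fragment link."""
    xhtml = ID_ATTR.sub(lambda m: f'id="{prefix}{m.group(1)}"', xhtml)
    return HREF_FRAGMENT.sub(
        lambda m: f'href="{m.group(1)}#{prefix}{m.group(2)}"', xhtml
    )
```

Run it on both the text and the notes file of a chapter with the same prefix, so the links and their targets stay in sync. A prefix that starts with a letter keeps the IDs valid.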
Anyway, there you go. I hope this all makes sense. If you have questions or suggestions, let me know.
1v