Quote:
Originally Posted by droopy
Is there an automated way to get the footnotes "properly constructed" (not actually sure what that means)?
|
No, no automated way... but I may be able to make it work.
Could you PM me a link to the PDF+DOCX? I'll see what I can do.
If the file is as you say, and all the footnotes have "smaller superscript numbers" in their proper locations, I have a method which should be able to convert the majority of footnotes.
Quote:
Originally Posted by Hitch
If you have a PDF with footnotes, etc., you are almost always better off going the AbbyyFineReader route.
|
Agreed.
FineReader usually does a pretty good job at recognizing what's a footnote, and actually treating it as such. It has more intelligent algorithms for sensing differences between header/body/footnote/footer.
When it exports to other formats (DOCX, HTML, EPUB), it tries to properly mark the footnotes as actual footnotes.
Quote:
Originally Posted by Hitch
What you have now, you have to go through and recreate the footnotes, in Word, using the automated functionality, one-by-one. OR, you could open it up in HTML and code them--one by one. But those are pretty much your two choices.
|
There's an HTML method I've mentioned in passing over the years:
A superscript number being the first thing in a paragraph "is most likely a footnote".
This allows you to mark up something like:
Code:
<p><sup>123</sup> Example footnote.</p>
as:
Code:
<p class="footnote"><sup>123</sup> Example footnote.</p>
so if this is your original:
Code:
<p>This is a sent-</p>
<p class="footnote"><sup>123</sup> Example footnote.</p>
<p class="footnote"><sup>124</sup> Another example footnote.</p>
<p>ence that gets split across pages.</p>
you can rip out all the "footnote" paragraphs and place them at the end of the HTML:
Code:
<p>This is a sent-</p>
<p>ence that gets split across pages.</p>
[...]
<p class="footnote"><sup>123</sup> Example footnote.</p>
<p class="footnote"><sup>124</sup> Another example footnote.</p>
</body>
</html>
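If you'd rather script those two steps than do them by hand, here is a rough Python sketch. It is only one possible way to do it, not the exact search/replace described above. It assumes each footnote paragraph is a bare `<p>` (no attributes) that starts with `<sup>digits</sup>`, and that there is a single `</body>`. The function names `tag_footnotes` and `move_footnotes_to_end` are made up for this example.

```python
import re

# Assumption: a footnote paragraph opens with a bare <p> followed
# immediately by <sup>digits</sup>. Real exports may need a looser pattern.
FOOTNOTE_START = re.compile(r'<p>(?=\s*<sup>\d+</sup>)')

# Matches a whole footnote paragraph (plus its trailing newline, if any).
FOOTNOTE_PARA = re.compile(r'<p class="footnote">.*?</p>\n?', re.DOTALL)


def tag_footnotes(html: str) -> str:
    """Add class="footnote" to every <p> that starts with a superscript number."""
    return FOOTNOTE_START.sub('<p class="footnote">', html)


def move_footnotes_to_end(html: str) -> str:
    """Pull footnote paragraphs out of the text flow and put them before </body>."""
    notes = FOOTNOTE_PARA.findall(html)
    body = FOOTNOTE_PARA.sub('', html)
    # Make sure each moved footnote sits on its own line.
    block = ''.join(n if n.endswith('\n') else n + '\n' for n in notes)
    if '</body>' in body:
        # Insert before the first </body> only.
        return body.replace('</body>', block + '</body>', 1)
    # No </body> found: fall back to appending at the very end.
    return body + block
```

You would run `move_footnotes_to_end(tag_footnotes(html))` on the exported HTML. A superscript in the middle of a paragraph (an in-text footnote reference) is left alone, because the pattern only fires when the `<sup>` comes right after the opening `<p>`.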
The downside is that it can't handle more complicated multi-paragraph footnotes*, footnotes that run across pages, etc., but it can get you most of the way there.
I've tested this method across tons of books, and it works, but it requires some initial massaging of the HTML.
* Note: If the multi-paragraph footnotes are also in smaller text, and nothing else in the book is, the text size can also be used to mark those paragraphs with the "footnote" class. From what droopy said in Post #7, it seems like this may be the case for this specific book.