Old 04-15-2020, 09:11 PM   #16
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by droopy View Post
Is there an automated way to get the footnotes "properly constructed" (not actually sure what that means)?
No, no automated way... but I may be able to make it work.

Could you PM me a link to the PDF+DOCX? I'll see what I can do.

If the file is as you say, and all the footnotes have "smaller superscript numbers" in their proper locations, I have a method which should be able to convert the majority of footnotes.

Quote:
Originally Posted by Hitch View Post
If you have a PDF with footnotes, etc., you are almost always better off going the AbbyyFineReader route.
Agreed.

FineReader usually does a pretty good job of recognizing what's a footnote and actually treating it as one. It has more intelligent algorithms for telling header, body, footnote, and footer apart.

When it exports to other formats (DOCX, HTML, EPUB), it tries to mark the footnotes as actual footnotes.

Quote:
Originally Posted by Hitch View Post
What you have now, you have to go through and recreate the footnotes, in Word, using the automated functionality, one-by-one. OR, you could open it up in HTML and code them--one by one. But those are pretty much your two choices.
There's an HTML method I've mentioned in passing over the years:

A superscript number being the first thing in a paragraph "is most likely a footnote".

This allows you to mark up something like:

Code:
<p><sup>123</sup> Example footnote.</p>
as:

Code:
<p class="footnote"><sup>123</sup> Example footnote.</p>
so if this is your original:

Code:
<p>This is a sent-</p>

<p class="footnote"><sup>123</sup> Example footnote.</p>

<p class="footnote"><sup>124</sup> Another example footnote.</p>

<p>ence that gets split across pages.</p>
you can rip out all the "footnote" class paragraphs and place them at the end of the HTML:

Code:
<p>This is a sent-</p>

<p>ence that gets split across pages.</p>

[...]

<p class="footnote"><sup>123</sup> Example footnote.</p>
<p class="footnote"><sup>124</sup> Another example footnote.</p>
</body>
</html>
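The tagging step (a superscript number as the first thing in a paragraph) can be roughed out with a regex. This is only a minimal sketch in Python, not the exact method I use: the names mark_footnotes and FOOTNOTE_START are made up for illustration, and it assumes the paragraphs are bare <p> tags with no attributes and that the superscripts are plain <sup>digits</sup>.

```python
import re

# Matches a bare <p> whose first content is <sup>digits</sup>.
# Assumption: no attributes on the <p>, only optional whitespace before the <sup>.
FOOTNOTE_START = re.compile(r'<p>(\s*<sup>\d+</sup>)')

def mark_footnotes(html: str) -> str:
    """Add class="footnote" to paragraphs that open with a superscript number."""
    return FOOTNOTE_START.sub(r'<p class="footnote">\1', html)

sample = (
    '<p>Body text with a reference.<sup>123</sup></p>\n'
    '<p><sup>123</sup> Example footnote.</p>'
)
# Only the second paragraph starts with a <sup>, so only it gets the class.
print(mark_footnotes(sample))
```

A superscript in the middle of a paragraph is left alone, since the pattern only fires when the <sup> comes right after the opening <p>.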
The downside is that it can't handle more complicated cases, such as multi-paragraph footnotes* or footnotes that run across pages, but it can get you most of the way there.

I've tested this method across tons of books, and it works, but it requires some initial massaging of the HTML.
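Pulling the "footnote" class paragraphs out of the text flow and dropping them in just before </body> could look something like the following. Again, this is a hedged sketch: move_footnotes_to_end and FOOTNOTE_PARA are hypothetical names, and it assumes every footnote is a single <p class="footnote">...</p> with no nested paragraphs, which is exactly the kind of thing the initial massaging has to guarantee.

```python
import re

# Assumption: each footnote is one <p class="footnote">...</p> with no nested <p>.
FOOTNOTE_PARA = re.compile(r'<p class="footnote">.*?</p>\s*', re.DOTALL)

def move_footnotes_to_end(html: str) -> str:
    """Remove footnote paragraphs from the flow and re-insert them before </body>."""
    notes = [m.group(0).strip() for m in FOOTNOTE_PARA.finditer(html)]
    if not notes:
        return html
    stripped = FOOTNOTE_PARA.sub('', html)
    block = '\n'.join(notes) + '\n'
    head, sep, tail = stripped.rpartition('</body>')
    if not sep:
        # No </body> found: fall back to appending at the very end.
        return stripped + '\n' + block
    return head + block + sep + tail

page = (
    '<body>\n'
    '<p>This is a sent-</p>\n\n'
    '<p class="footnote"><sup>123</sup> Example footnote.</p>\n\n'
    '<p class="footnote"><sup>124</sup> Another example footnote.</p>\n\n'
    '<p>ence that gets split across pages.</p>\n'
    '</body>\n</html>'
)
print(move_footnotes_to_end(page))
```

Once the footnotes are out of the way, the two halves of the split sentence sit next to each other and can be rejoined by hand or with a separate cleanup pass.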

* Note: If the multi-paragraph footnotes are also in smaller text, and nothing else in the book is, that size difference can also be used to mark those paragraphs with the "footnote" class. From what droopy said in Post #7, it seems like this may be the case for this specific book.

Last edited by Tex2002ans; 04-15-2020 at 09:20 PM.