Old 04-15-2020, 09:11 PM   #16
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by droopy View Post
Is there an automated way to get the footnotes "properly constructed" (not actually sure what that means)?
No, no automated way... but I may be able to make it work.

Could you PM me a link to the PDF+DOCX, and I'll see what I can do.

If the file is as you say, and all the footnotes have "smaller superscript numbers" in their proper locations, I have a method which should be able to convert the majority of footnotes.

Quote:
Originally Posted by Hitch View Post
If you have a PDF with footnotes, etc., you are almost always better off going the AbbyyFineReader route.
Agreed.

Finereader usually does a pretty good job at recognizing what's a footnote, and actually treating it as such. It has more intelligent algorithms for sensing differences between header/body/footnote/footer.

When it exports to other formats (DOCX, HTML, EPUB), it tries to mark the footnotes as actual footnotes.

Quote:
Originally Posted by Hitch View Post
What you have now, you have to go through and recreate the footnotes, in Word, using the automated functionality, one-by-one. OR, you could open it up in HTML and code them--one by one. But those are pretty much your two choices.
There's an HTML method I've mentioned in passing over the years:

A superscript number being the first thing in a paragraph "is most likely a footnote".

This allows you to mark up something like:

Code:
<p><sup>123</sup> Example footnote.</p>
as:

Code:
<p class="footnote"><sup>123</sup> Example footnote.</p>
so if this is your original:

Code:
<p>This is a sent-</p>

<p class="footnote"><sup>123</sup> Example footnote.</p>

<p class="footnote"><sup>124</sup> Another example footnote.</p>

<p>ence that gets split across pages.</p>
you can rip out all the "footnote" paragraphs and place them at the end of the HTML:

Code:
<p>This is a sent-</p>

<p>ence that gets split across pages.</p>

[...]

<p class="footnote"><sup>123</sup> Example footnote.</p>
<p class="footnote"><sup>124</sup> Another example footnote.</p>
</body>
</html>
The downside is that it can't handle more complicated multi-paragraph footnotes*, footnotes that run across pages, etc., but it can get you most of the way there.

I've tested this method across tons of books, and it works, but it requires some initial massaging of the HTML.

* Note: If the multi-paragraph footnotes are also smaller text, and nothing else in the book is, this can also be used to mark paragraphs as "footnote" class. From what droopy said in Post #7, it seems like this may be the case for this specific book.
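
A rough Python sketch of the two steps above (tag, then move). This is only an illustration, not necessarily the tooling Tex2002ans uses. The function names are made up, and it assumes every footnote is a single attribute-free <p> that opens with a purely numeric <sup>, exactly as in the examples. Regex on HTML is fragile, so treat it as a first pass and check the result by hand.

```python
import re

# Assumption: a footnote is one <p> with no attributes whose first child is
# a purely numeric <sup>, as in the examples above.
FOOTNOTE_START = re.compile(r'<p>(?=\s*<sup>\d+</sup>)')
FOOTNOTE_PARA = re.compile(r'[ \t]*<p class="footnote">.*?</p>[ \t]*\n?', re.DOTALL)


def tag_footnotes(html):
    """Add class="footnote" to every <p> that opens with <sup>digits</sup>."""
    return FOOTNOTE_START.sub('<p class="footnote">', html)


def move_footnotes_to_end(html):
    """Cut every footnote paragraph and paste it just before </body>."""
    notes = [m.group(0).strip() for m in FOOTNOTE_PARA.finditer(html)]
    if not notes:
        return html
    remainder = FOOTNOTE_PARA.sub('', html)
    block = '\n'.join(notes) + '\n'
    if '</body>' in remainder:
        return remainder.replace('</body>', block + '</body>', 1)
    # No closing body tag (HTML fragment): append at the very end instead.
    return remainder.rstrip('\n') + '\n' + block
```

Calling move_footnotes_to_end(tag_footnotes(html)) on the "original" snippet leaves the two halves of the split sentence next to each other, ready to be rejoined, with both footnotes collected before </body>. Multi-paragraph footnotes are not caught, for the reason given above.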

Last edited by Tex2002ans; 04-15-2020 at 09:20 PM.
Old 04-16-2020, 12:49 PM   #17
droopy
Guru
Posts: 833
Karma: 2912460
Join Date: Apr 2009
Device: Kobo Forma
Hi Tex,
PM sent.
Old 04-16-2020, 02:46 PM   #18
Hitch
Bookmaker & Cat Slave
Posts: 11,461
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Quote:
Originally Posted by Tex2002ans View Post
No, no automated way... but I may be able to make it work.

Could you PM me a link to the PDF+DOCX, and I'll see what I can do.
Are you BORED, snookums?

Hitch
Old 04-16-2020, 09:31 PM   #19
Tex2002ans
Wizard
Quote:
Originally Posted by Hitch View Post
Are you BORED, snookums?
Just seeing if the theory plays out in other materials.

Could take a teensy weensy break from the madness to write up another beast.

You know I'm stuck over here in my little Economics/History/Non-Fiction bubble, and I love my footnotes! Once I see that word, I begin foaming at the mouth!

Quote:
Originally Posted by droopy View Post
Hi Tex,
PM sent.
Thanks. I quickly scanned through droopy's 3 PDFs.

The PDFs don't actually have superscript footnotes.

The actual text uses the form:

Code:
Example sentence.<sup>1</sup>
but the footnotes at the bottom then use:

Code:
1. Example footnote.
separated by a blank gap between body-text/footnotes.
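
For this "1. Example footnote." layout, the superscript trick needs one extra step first: turning the leading number into a <sup>. A minimal Python sketch, assuming the OCR output already has one <p> per footnote (the helper name is hypothetical, and this is not something from the PM):

```python
import re

# Hypothetical helper: matches a <p> whose text starts with "12. " style numbering.
NUMBERED_NOTE = re.compile(r'<p>\s*(\d+)\.\s+(.*?)</p>', re.DOTALL)


def mark_numbered_footnotes(html):
    """Rewrite <p>N. text</p> as <p class="footnote"><sup>N</sup> text</p>."""
    return NUMBERED_NOTE.sub(r'<p class="footnote"><sup>\1</sup> \2</p>', html)
```

Be warned that this also hits ordinary numbered lists and any body paragraph that happens to start with a number and a period, so it is only safe on the chunk of text below the gap, or after reviewing every match.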

And like I said earlier, Finereader does an okay job at detecting differences between body-text/footnotes. In this specific case, it detected most footnotes okay (definitely looks better than Word's PDF Import in that regard).

* * *

And here is roughly the rest of the PM I sent droopy:

I generated 3 types of files:

1. [Finereader] - This is a DOCX generated straight from Finereader.

2. [Toxaris] - This is the [Finereader] DOCX, which I ran through Toxaris's fantastic "EPUB Tools".

Note: It tries its best to clean up a bunch of Finereader's hidden junk and does some basic cleanup, like combining broken paragraphs.

Text highlighted in red marks paragraphs that may have been broken/merged incorrectly, so you can look at them more closely and fix them manually if needed.

3. EPUB - This was generated straight from EPUB Tools using the [Toxaris] DOCX.

Because this was all OCRed (and PDF sucks + the source files weren't the greatest), there ARE going to be the usual OCR issues creeping in there:
  • Text may be wrong (OCR is "99.9% accurate")
    • Some of these scans weren't the greatest either (crooked, still see page edges, etc.), so this introduces more error.
  • Formatting may be wrong
    • Italics missing, headings may not be headings, etc.
  • While many footnotes were detected properly, many weren't.
    • On top of that, the problem with PDF->DOCX "automated footnotes" is... the numbers may now be thrown way off. If 1-4 + 6-10 were detected fine... Word will only think there are "9 actual footnotes". 5 will be floating in the text, and 6-10 will now be off by 1.
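
The numbering drift in that last point can be worked through in a few lines of Python. This is just the arithmetic from the example, under the assumption stated there that Word renumbers its automatic footnotes consecutively from 1; the function name is made up for illustration.

```python
def renumbering_report(detected):
    """detected: original footnote numbers that were recognized, in document order."""
    if not detected:
        return {"missed": [], "shifted": {}}
    found = set(detected)
    # Numbers that never became real footnotes (left floating in the text).
    missed = [n for n in range(1, max(detected) + 1) if n not in found]
    # Assumption: Word numbers automatic footnotes 1, 2, 3... in order of appearance.
    shifted = {old: new for new, old in enumerate(detected, start=1) if new != old}
    return {"missed": missed, "shifted": shifted}
```

With 1-4 and 6-10 detected, renumbering_report([1, 2, 3, 4, 6, 7, 8, 9, 10]) reports 5 as missed and maps 6 through 10 onto 5 through 9, which is the off-by-1 described above.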

So it's up to you... you could:
  • Do your cleanup in the EPUB
  • or work through that [Toxaris] DOCX and try to do your cleanup directly in Word.

But as has been discussed on MobileRead many, many times... PDFs are awful as input formats.

If you want perfectly clean ebooks, you would have to get in there and do all the manual corrections, there just ain't no way around it.

Last edited by Tex2002ans; 04-16-2020 at 09:54 PM.
Old 04-27-2020, 04:15 PM   #20
Notjohn
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
I now see "wanna" used so frequently that I am beginning to suspect that some people actually believe this is standard English usage. Please tell me that it doesn't appear in Webster's Collegiate 11th edition? Must I pay $18.33 to find that the language has degraded so much since the 10th edition?
Old 04-27-2020, 05:10 PM   #21
droopy
Guru
Quote:
Originally Posted by Notjohn View Post
I now see "wanna" used so frequently that I am beginning to suspect that some people actually believe this is standard English usage. Please tell me that it doesn't appear in Webster's Collegiate 11th edition? Must I pay $18.33 to find that the language has degraded so much since the 10th edition?
I do it simply to save on title space. There's a character limit with some forums.

wanna = 5 characters
want to = 7 characters
Old 04-27-2020, 05:16 PM   #22
JSWolf
Resident Curmudgeon
Posts: 73,931
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by droopy View Post
I do it simply to save on title space. There's a character limit with some forums.

wanna = 5 characters
want to = 7 characters
You type in the topic as you want it to be and only if it doesn't fit do you shorten it. But please when you do shorten a topic, don't do it grammatically incorrectly.
Old 04-27-2020, 06:09 PM   #23
BetterRed
null operator (he/him)
Posts: 20,565
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
And if someone doesn't, what're you gunna do about it Jon

Quote:
Originally Posted by JSWolf View Post
You type in the topic as you want it to be and only if it doesn't fit do you shorten it. But please when you do shorten a topic, don't do it grammatically incorrectly.
I see nothing wrong with adding a bit of levity by using inoffensive cant in topic titles, as I am sometimes wont to do, especially in trying times.

Spelling errors are a different matter, especially missing apostrophes

BR
Old 04-27-2020, 06:22 PM   #24
droopy
Guru
Hi BR. LEMME fix YER title FER YA:

Quote:
And if someone doesn't, WHATCHA gunna do about it Jon

Last edited by droopy; 04-27-2020 at 06:26 PM.
Old 04-27-2020, 07:17 PM   #25
Hitch
Bookmaker & Cat Slave
Quote:
Originally Posted by JSWolf View Post
You type in the topic as you want it to be and only if it doesn't fit do you shorten it. But please when you do shorten a topic, don't do it grammatically incorrectly.
Oh, hell, Jon:

you've seen me use it dozens of times, when I'm typing obvious slang. Everybody here that has ever read more than 2 posts from me knows full well I can deploy the Queen's English at will. Using Wanna, woulda, coulda...nobody will die and nobody's bits and pieces will fall off of their you-knows.

Good God, man, you act as though we're using Textspeak. Rnt U gld that wRt not?

Hitch
Old 04-29-2020, 04:45 PM   #26
Notjohn
mostly an observer
Yer rite, Hitch. Less all spel the way we wanna.
Old 04-30-2020, 10:21 AM   #27
theducks
Well trained by Cats
Posts: 29,791
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
There is a difference between deliberate (dialect?) spelling and careless spelling / word use errors.

Would Jon correct all those great music lyrics?
Code:
Whacha gonna do when the man comes for you?
Old 04-30-2020, 10:57 AM   #28
Hitch
Bookmaker & Cat Slave
Quote:
Originally Posted by theducks View Post
There is a difference between deliberate (dialect?) spelling and careless spelling / word use errors.

Would Jon correct all those great music lyrics?
Code:
Whacha gonna do when the man comes for you?
You know, it's funny you bring this up...dialect is one of my hot buttons, in fiction. Drives me NUTS, especially (and I think this was the original trigger for me), Scottish. So help me, if I read one more "och aye, lassie" or "dinna ken," I'll scream.

I mean, they're fine, sprinkled throughout for "flavor," right? But when a character's entire dialogue is phonetically rendered, line after line after line...AGGGGH!!

It doesn't seem to matter if it's Low Country, "Texan" (don't get me started), Irish, Scots, Russian...it's just grinding. Yes, I know, there are good-selling books that have this in them; but there are bestsellers with idiotic crap like sparkly vampires and "heroines" with the character depth of a piece of paper, too. No accounting for tastes. But once a character has been drawn, and we've "heard" his voice in our heads, OMG, give it a REST! To me, doing every single line of dialogue phonetically is the writing equivalent of exposition--telling, not showing.

If your writing is so pathetic that you have to remind me, line after line, of how your character sounds, then maybe you need to go back to the drawing board and create a character that comes alive for us, ya know?

Hitch
Old 04-30-2020, 01:24 PM   #29
najgori
Klak
Posts: 174
Karma: 150374
Join Date: Sep 2011
Location: Belgrade, Serbia
Device: many
Quote:
Originally Posted by Hitch View Post
You know, it's funny you bring this up...dialect is one of my hot buttons, in fiction. Drives me NUTS, especially (and I think this was the original trigger for me), Scottish. So help me, if I read one more "och aye, lassie" or "dinna ken," I'll scream.

I mean, they're fine, sprinkled throughout for "flavor," right? But when a character's entire dialogue is phonetically rendered, line after line after line...AGGGGH!!
so, no trainspotting for ye?
Old 04-30-2020, 01:43 PM   #30
j.p.s
Grand Sorcerer
Posts: 5,278
Karma: 98804578
Join Date: Apr 2011
Device: pb360
Quote:
Originally Posted by JSWolf View Post
You type in the topic as you want it to be and only if it doesn't fit do you shorten it. But please when you do shorten a topic, don't do it grammatically incorrectly.
That's hilarious, since you continue to consistently use "then" when "than" would be correct.