02-07-2020, 07:53 AM | #16 | |||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
* * *
What's the name of this book? Perhaps there's a better-quality file floating around. Many times it's better to work from a much better source than to try to clean up a poorly converted mess.
Quote:
Look closely at the Footnote code from Example #1. (I'm going to rip out the class="" crap so we can see it clearly.)
Code:
<a href="../Text/index_split_000.xhtml#anchor22">14</a>
[...]
<p><span id="anchor129"></span><a href="../Text/index_split_000.xhtml#anchor60">14</a>
I'm assuming the entire book's IDs are completely mangled as well. (They don't match up in any of the examples you gave.)
Quote:
It may be easier to rename all the files to human-readable names. In Sigil, on the left side, you'll see the Book Browser (where it lists all the files). You can Right-Click files, then Rename. So then you could rename:
index_split_003.xhtml -> Preface.xhtml
index_split_005.xhtml -> Part1.xhtml
[...]
That will at least make the spaghetti of links more readable. Then you would be able to more easily tell: "Whoops, this Chapter 1 footnote actually points to titlepage.xhtml... no wonder it's broken."
And while this ebook may be recoverable... perhaps this cleanup project may be too advanced for someone who doesn't know the technical innards yet.
Last edited by Tex2002ans; 02-07-2020 at 07:56 AM. |
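To see how badly the IDs are mangled before deciding whether a cleanup is worth it, something like the following could list every link whose target ID doesn't exist. This is only a rough sketch: `find_broken_links` is a made-up helper name, it takes the file contents as a plain dictionary (file name -> XHTML text) instead of reading the EPUB itself, and it assumes file names are unique within the book.

```python
import re


def find_broken_links(files: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source_file, href) pairs whose '#fragment' has no matching id.

    `files` maps a file name (e.g. "Chapter1.xhtml") to its XHTML text.
    Hypothetical helper for illustration; it uses simple regexes, not a real parser.
    """
    # Collect every id="..." value per file.
    ids = {name: set(re.findall(r'\bid="([^"]+)"', text))
           for name, text in files.items()}

    broken = []
    for name, text in files.items():
        # Only look at hrefs that carry a fragment (the part after '#').
        for href in re.findall(r'\bhref="([^"]*#[^"]+)"', text):
            target, _, fragment = href.partition("#")
            # "../Text/foo.xhtml" -> "foo.xhtml"; an empty target means "this file".
            target_name = target.rsplit("/", 1)[-1] or name
            if fragment not in ids.get(target_name, set()):
                broken.append((name, href))
    return broken


if __name__ == "__main__":
    sample = {
        "Chapter1.xhtml": '<a href="../Text/titlepage.xhtml#anchor22">14</a>',
        "titlepage.xhtml": '<p id="anchor60">Title</p>',
    }
    for source, href in find_broken_links(sample):
        print(f"{source}: {href} points at a missing id")
```

Renaming the files first (as suggested above) makes this kind of report much easier to read.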
|||
02-07-2020, 04:24 PM | #17 |
Resident Curmudgeon
Posts: 73,983
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
This mess (to me) is not worth the effort. It would be a heck of a lot easier to get a tablet and read the PDF that way.
|
02-11-2020, 07:44 PM | #18 | |
Enthusiast
Posts: 30
Karma: 2077248
Join Date: Feb 2020
Device: Forma
|
Sorry for my delay in getting back to you. I thought I was getting instant email notifications when there are new responses to this thread.
Quote:
* * * Reformed Dogmatics by Herman Bavinck. (This is the four-volume work, not the "Concise Reformed Dogmatics", which is a one-volume "summary"). |
|
02-11-2020, 07:53 PM | #19 | |
Enthusiast
Posts: 30
Karma: 2077248
Join Date: Feb 2020
Device: Forma
|
Quote:
Good call. I've uploaded chapter one (aka index_split_006.xhtml). I had to add an acceptable suffix of "zip" for the forum to allow the upload. When you download it, simply remove the ".zip" part; the file really ends in ".xhtml". Again, please know this isn't really a zipped/archive file. |
|
02-11-2020, 08:11 PM | #20 | |
Enthusiast
Posts: 30
Karma: 2077248
Join Date: Feb 2020
Device: Forma
|
Quote:
If it's going to be too hard to fix this mangled epub, I guess I can just ignore the superscripts and just read the notes at the end of the chapter. I'll try to remember to not tap on the links! |
|
02-11-2020, 08:59 PM | #21 |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
So the main problem is that in the footnote link "anchor9" should read "anchor116", "anchor10" should be "anchor117" etc. We need to locate each footnote reference and add 107 to the numeric part.
Sounds trivial. But, unfortunately, RegEx doesn't do arithmetic, at least not in any straightforward way. An interesting challenge! |
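For what it's worth, the "add 107" step is straightforward once the replacement can be a function and not just a fixed string. Below is a minimal sketch using Python's standard `re` module. The names `OFFSET`, `shift_anchor` and `shift_footnote_hrefs` are invented for illustration, and it assumes the offset of 107 holds for every footnote link in the file, which would need checking per chapter.

```python
import re

# Assumption taken from the examples above: anchor9 -> anchor116, anchor10 -> anchor117.
OFFSET = 107


def shift_anchor(match: re.Match) -> str:
    """Rebuild the href prefix with the numeric part increased by OFFSET."""
    prefix, number = match.group(1), int(match.group(2))
    return f"{prefix}{number + OFFSET}"


def shift_footnote_hrefs(xhtml: str) -> str:
    """Add OFFSET to every '#anchorN' that appears inside an href attribute.

    id="anchorN" attributes are left alone, because only the links are wrong.
    """
    return re.sub(r'(href="[^"#]*#anchor)(\d+)', shift_anchor, xhtml)


if __name__ == "__main__":
    before = '<a href="../Text/index_split_000.xhtml#anchor9">1</a>'
    print(shift_footnote_hrefs(before))
```

This only fixes the number; the file part of the href is left untouched.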
02-11-2020, 10:43 PM | #22 |
Enthusiast
Posts: 30
Karma: 2077248
Join Date: Feb 2020
Device: Forma
|
Hi wombat,
When you say "main problem", am I understanding you correctly to believe that solving this "main problem" doesn't completely fix the links? |
02-12-2020, 04:10 AM | #23 | |
null operator (he/him)
Posts: 20,570
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
BR |
|
02-12-2020, 07:00 AM | #24 |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
Well, the reference to split_000 would have to go too. But that's easy. I was looking for PROBLEMS!
It would be neat to just do the arithmetic. And @BetterRed's suggestion looks very promising. But there's another approach. The footnote numbering - the 1, 2 etc. visible to the reader - is correct, and is consistently styled. It should be quite possible to use RegEx to strip all the existing href stuff then to create new links deriving anchor names from the footnote numbers. Last edited by exaltedwombat; 02-12-2020 at 07:03 AM. |
02-12-2020, 07:33 AM | #25 |
Unicycle Daredevil
Posts: 13,923
Karma: 185041098
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
|
In the time everybody here has invested in thinking about possible solutions, the OP could have fixed all the links manually. Just sayin' (he said and headed over to the unendurable phrases thread...)
|
02-12-2020, 07:41 AM | #26 |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
Not really. Partly because there seem to be a LOT of them, partly because he didn't know how. Though I take your point in general: no point in developing automation for a one-off problem that COULD be fixed manually. Unless you have an enquiring mind and enjoy problem-solving. (And if you don't, I pity you!)
|
02-12-2020, 08:21 AM | #27 |
Unicycle Daredevil
Posts: 13,923
Karma: 185041098
Join Date: Jan 2011
Location: Planet of the Pudding Brains
Device: Aura HD (R.I.P. After six years the USB socket died.) tolino shine 3
|
You're absolutely right. I wasn't being completely serious. But in this case, I had a look at the mess, and when I saw that the anchors didn't even match, I thought it was a lost case for automation.
But kudos if you manage to come up with a solution! |
02-12-2020, 04:02 PM | #28 | |
Enthusiast
Posts: 30
Karma: 2077248
Join Date: Feb 2020
Device: Forma
|
Quote:
Thanks for your reply. After turning on Function Mode, what do I type in the Search Field and the Replace field? Last edited by thymesnewroman; 02-12-2020 at 04:32 PM. |
|
02-12-2020, 04:47 PM | #29 | |
Enthusiast
Posts: 30
Karma: 2077248
Join Date: Feb 2020
Device: Forma
|
Quote:
Are you saying that using the Function mode in the Calibre editor isn't enough to solve this problem? |
|
02-12-2020, 06:02 PM | #30 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Thanks for the attachment.
I say go full nuclear. Wipe out all the crap links, substitute with good ones. (I tested this on your "index_split_006.xhtml" and it worked perfectly on all footnotes in that chapter.)
Full Nuclear Option (For This Book)
1. In Sigil, press Ctrl+F to pop up the Find/Replace. Make sure you set Mode: Regex.
2. Once you have it set to Regex...
These will wipe out all the busted links and substitute them with the number in brackets:
Search: <a class="[^"]+" href="[^"]+"><span class="[^"]+">(\d+)</span></a>
Replace: <a href="#fn\1" id="ft\1">[\1]</a>
and then these 3 will wipe out the broken anchors and fix the Footnotes:
Search: (<p class="[^"]+">)<span class="[^"]+" id="[^"]+"></span>
Replace: \1
Search: (<p class="[^"]+">)<a class="[^"]+" href="[^"]+"><span class="[^"]+"></span></a>
Replace: \1
Search: (<p class="[^"]+">)<a href="#fn[0-9]+" id="ft[0-9]+">\[([0-9]+)\]</a>
Replace: \1<a href="#ft\2" id="fn\2">[\2]</a>
(Completely Optional) If you want to throw away the other anchor code:
Search: <a class="[^"]+" href="[^"]+"><span class="[^"]+"></span></a>
Replace: ***[Blank] PUT NOTHING***
3. After you're done, make sure to go back to Case Sensitive Mode.
Last edited by Tex2002ans; 02-12-2020 at 06:17 PM. |
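If anyone would rather try the search/replace passes outside Sigil first, here is a rough Python sketch that applies the same four required patterns in the same order (the optional cleanup is left out). The `calibre…` class names and the sample paragraphs in the demo are invented stand-ins, not the book's real markup, and `apply_passes` is just a name picked for this example.

```python
import re

# The four required Search/Replace pairs from the post above, in order.
PASSES = [
    # Replace each busted link with a clean bracketed link to the footnote.
    (r'<a class="[^"]+" href="[^"]+"><span class="[^"]+">(\d+)</span></a>',
     r'<a href="#fn\1" id="ft\1">[\1]</a>'),
    # Remove the broken empty anchor span at the start of a paragraph.
    (r'(<p class="[^"]+">)<span class="[^"]+" id="[^"]+"></span>',
     r'\1'),
    # Remove an empty link at the start of a paragraph.
    (r'(<p class="[^"]+">)<a class="[^"]+" href="[^"]+"><span class="[^"]+"></span></a>',
     r'\1'),
    # A link that opens a paragraph is the footnote itself: make it point back.
    (r'(<p class="[^"]+">)<a href="#fn[0-9]+" id="ft[0-9]+">\[([0-9]+)\]</a>',
     r'\1<a href="#ft\2" id="fn\2">[\2]</a>'),
]


def apply_passes(xhtml: str) -> str:
    """Run every Search/Replace pair over the text, one after another."""
    for pattern, replacement in PASSES:
        xhtml = re.sub(pattern, replacement, xhtml)
    return xhtml


if __name__ == "__main__":
    # Invented sample markup; the real class names will differ.
    body = ('<p class="calibre1">Some text'
            '<a class="calibre5" href="../Text/index_split_000.xhtml#anchor9">'
            '<span class="calibre6">1</span></a> and more.</p>')
    note = ('<p class="calibre7"><span class="calibre8" id="anchor116"></span>'
            '<a class="calibre5" href="../Text/index_split_000.xhtml#anchor22">'
            '<span class="calibre6">1</span></a> The note text.</p>')
    print(apply_passes(body))
    print(apply_passes(note))
```

Work on a copy of the file, since the passes throw away the original hrefs and IDs for good.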
|