Replacing TOC containing only page #s ("Edit TOC")
After using the auto+manual technique above in Calibre to find the page-structured ePubs in my library whose TOC contained only page #s, I generally could not find an alternative ePub version with the more typical continuous-flow (not page-structured) text, with only paragraph and chapter breaks and a corresponding TOC.
So I left the page structure as-is (I've removed page structures before, and it can be a tedious process) and focused on removing the page#-only TOC and replacing it with a content-meaningful TOC.
Surprisingly, the following "Edit TOC" XPath expression was a useful starting point in almost every case where the ePub was page-structured; after running it, I manually deleted the TOC entries that did not belong and manually inserted others, as identified by the ePub's own internal TOC:
//h:td[re:test(., "(^\s*[0-9]{1,2}\s*\n\s*[A-Za-z0-9].{1,80}[a-z]\n*)|(^\s*[0-9]{1,2}\s*$)|(^.{1,80}[a-z]\s*$)|(^\s*[IVX]{1,6}\s*$)|(^\s*prologue)|(^\s*epilogue)|(^\s*chapter)|(^\s*book\s)|(^\s*part\s)|(^\s*map)|(^\s*index)|(^\s*introduction)|(^\s*notes)", "i")]
|
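To see which table-cell texts the h:td expression above would pick up, the regular expression can be tried outside Calibre. The following is only a rough sketch: it assumes that re:test with the "i" flag behaves like Python's re.search with re.IGNORECASE, and that the keywords are meant to read "chapter" and "index". The names ALTERNATIVES, TOC_PATTERN and looks_like_toc_entry are made up for this illustration and are not part of Calibre.

```python
import re

# Alternatives taken from the h:td expression, one per line.
# Assumption: "chapter" and "index" are the intended keywords.
ALTERNATIVES = [
    r"(^\s*[0-9]{1,2}\s*\n\s*[A-Za-z0-9].{1,80}[a-z]\n*)",  # number, newline, short title
    r"(^\s*[0-9]{1,2}\s*$)",   # a bare one- or two-digit number
    r"(^.{1,80}[a-z]\s*$)",    # a short single line that ends in a letter
    r"(^\s*[IVX]{1,6}\s*$)",   # a roman numeral built from I, V, X
    r"(^\s*prologue)",
    r"(^\s*epilogue)",
    r"(^\s*chapter)",
    r"(^\s*book\s)",
    r"(^\s*part\s)",
    r"(^\s*map)",
    r"(^\s*index)",
    r"(^\s*introduction)",
    r"(^\s*notes)",
]

# Assumption: the "i" flag of re:test corresponds to re.IGNORECASE.
TOC_PATTERN = re.compile("|".join(ALTERNATIVES), re.IGNORECASE)


def looks_like_toc_entry(cell_text: str) -> bool:
    """Return True if the text of a table cell matches any alternative."""
    return TOC_PATTERN.search(cell_text) is not None


if __name__ == "__main__":
    samples = [
        "12",
        "Chapter 3",
        "3\nThe Beginning",
        "IV",
        "123",
        "It was a dark and stormy night.",
    ]
    for text in samples:
        print(repr(text), looks_like_toc_entry(text))
```

Short headings such as "Chapter 3", "IV", or a one- or two-digit number match, while a three-digit number or a sentence ending in a period does not. The third alternative is deliberately broad (any short line ending in a letter), which is why the resulting TOC still needs manual pruning.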