Old 10-12-2020, 05:37 PM   #1
Rob557
Zealot
Posts: 108
Karma: 810
Join Date: Jul 2012
Device: Kobo
Quality Check: TOC Contains 'Any' Page #s

I'm looking for suggestions on how to set up a Calibre "Quality Check" library search that finds every book with ANY page #s in its table of contents. I'm trying to isolate poorly converted (typically old) books whose TOC consists essentially of page #s only, so that I can look for alternative versions that are structured without page #s and show actual chapter sections in the TOC. I tried searching the NCX TOC for a single occurrence of "page_[0-9]|page [0-9]|page[0-9]", but that doesn't work: on a sub-sample run I get too many hits, none of which seem to show page #s in the TOC.

Thanks in advance for any help on this.
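
One possible explanation for the extra hits, offered as an assumption rather than something confirmed in this thread: the unanchored pattern also matches inside attribute values such as file names, not only in the visible TOC labels. The Python sketch below uses a made-up NCX fragment (the file names page_12.xhtml and page_30.xhtml are hypothetical) to show the pattern hitting the src attributes even though neither label contains a page #.

```python
import re

# Hypothetical NCX fragment; ids and file names are invented for illustration.
NCX = """<navMap>
  <navPoint id="np1" playOrder="1">
    <navLabel><text>Chapter 1</text></navLabel>
    <content src="page_12.xhtml"/>
  </navPoint>
  <navPoint id="np2" playOrder="2">
    <navLabel><text>Chapter 2</text></navLabel>
    <content src="page_30.xhtml"/>
  </navPoint>
</navMap>"""

# The unanchored search pattern from the post above.
UNANCHORED = re.compile(r"page_[0-9]|page [0-9]|page[0-9]")

# Every place the pattern matches anywhere in the raw NCX text.
hits = [m.group(0) for m in UNANCHORED.finditer(NCX)]

# The visible TOC labels, i.e. what "Edit ToC" would show.
labels = re.findall(r"<text>(.*?)</text>", NCX)

print("raw hits:", hits)
print("labels:", labels)
print("labels with a page #:", [t for t in labels if UNANCHORED.search(t)])
```

If this is what is happening, a book gets flagged because of how its internal files are named, which says nothing about what its TOC displays.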
Old 10-12-2020, 07:50 PM   #2
Rob557
approach that seems to work

I tried again, and the following modified "Quality Check" search setting DOES seem to work fairly well:
\>page_[0-9]|\>page [0-9]|\>page[0-9]
There are significantly fewer hits, most of which do in fact show at least one page # in the TOC. I can then use "Edit ToC" for each of those ePubs to look for books where the TOC contains just page #s and no chapter identification.
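
For anyone who wants to pre-sort the hits before opening "Edit ToC", here is a rough Python sketch that works outside Calibre. It is not what the Quality Check plugin does internally; it is a standalone variation that reads each .ncx file inside an ePub and counts how many navMap labels start with a page #. The function name page_label_counts and the PAGE_LABEL pattern are my own, and unlike the \> search above, this pattern is case-insensitive and tolerates leading whitespace.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Standard NCX namespace used by EPUB 2 tables of contents.
NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}

# Assumption: a "page # label" is label text that begins with page_N, page N or pageN.
PAGE_LABEL = re.compile(r"^\s*page[_ ]?[0-9]", re.IGNORECASE)


def page_label_counts(epub):
    """Return (page_like, total) counts of navMap labels over all NCX files.

    `epub` may be a file path or a binary file-like object.
    """
    page_like = 0
    total = 0
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".ncx"):
                continue
            root = ET.fromstring(zf.read(name))
            # Only look at navMap labels, so pageList entries are ignored.
            path = ".//ncx:navMap//ncx:navLabel/ncx:text"
            for node in root.iterfind(path, NCX_NS):
                total += 1
                if PAGE_LABEL.match(node.text or ""):
                    page_like += 1
    return page_like, total
```

Calling page_label_counts("some_book.epub") (a placeholder path) returns a pair such as (page_like, total); when the two numbers are equal or nearly so, the TOC is probably page #s only.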
Old 10-16-2020, 03:45 PM   #3
Rob557
Replacing TOC containing only page #s ("Edit TOC")

After using the auto+manual technique above in Calibre to find the page-structured ePubs in my library whose TOC contained only page #s, I generally was not able to find an alternative ePub version with the more typical continuous-flow (not page-structured) text, i.e. only paragraph and chapter breaks and a corresponding TOC.

So instead I left the page structure as-is (I've removed page structures before, and it can be a tedious process) and focused on removing the page#-only TOC and replacing it with a content-meaningful TOC.

Surprisingly, the following "Edit TOC" XPath expression was a useful starting point in almost every case where the ePub was page-structured. I then manually deleted TOC elements that did not belong and manually inserted others as identified by the ePub's own internal TOC:

//h:td[re:test(., "(^\s*[0-9]{1,2}\s*\n\s*[A-Za-z0-9].{1,80}[a-z]\n*)|(^\s*[0-9]{1,2}\s*$)|(^.{1,80}[a-z]\s*$)|(^\s*[IVX]{1,6}\s*$)|(^\s*prologue)|(^\s*epilogue)|(^\s*chapter)|(^\s*book\s)|(^\s*part\s)|(^\s*map)|(^\s*index)|(^\s*introduction)|(^\s*notes)", "i")]
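
To see which table cells the regular expression inside that XPath would accept, it can be tried on sample text with Python's re module. This is only an approximation: I am assuming that re:test with the "i" flag behaves like re.search with re.IGNORECASE, and the sample strings are invented. The keyword alternatives are written here without stray spaces (chapter, index).

```python
import re

# The regex from the XPath expression, split across lines for readability.
PATTERN = re.compile(
    r"(^\s*[0-9]{1,2}\s*\n\s*[A-Za-z0-9].{1,80}[a-z]\n*)"  # number, newline, title
    r"|(^\s*[0-9]{1,2}\s*$)"                                # bare 1-2 digit number
    r"|(^.{1,80}[a-z]\s*$)"                                 # short line ending in a letter
    r"|(^\s*[IVX]{1,6}\s*$)"                                # roman numeral
    r"|(^\s*prologue)|(^\s*epilogue)|(^\s*chapter)"
    r"|(^\s*book\s)|(^\s*part\s)|(^\s*map)|(^\s*index)"
    r"|(^\s*introduction)|(^\s*notes)",
    re.IGNORECASE,  # stands in for the "i" flag passed to re:test
)


def is_toc_candidate(cell_text):
    """True if the cell text would be picked up as a TOC entry candidate."""
    return PATTERN.search(cell_text) is not None


# Invented sample cell texts.
for sample in ["12", "IV", "Chapter One", "3\nThe Long Road", "1842", "Page 123"]:
    print(repr(sample), is_toc_candidate(sample))
```

Note that with the case-insensitive flag, [a-z] also matches capitals, so the third alternative accepts almost any short single-line cell that ends in a letter. That looseness is probably why manual pruning is still needed afterwards.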


All times are GMT -4.