Quote:
Originally Posted by compurandom
I am a programmer and this has been on my todo list for a number of years.
Too many projects right now to get to it, but here's my thoughts...
Calibre has an internal pdf library that it should be easy to leverage to do this.
Not all pdf files with a TOC actually have a TOC. Far more than I like have a "TOC" with a single entry in them, so knowing if it has a TOC is not enough, you need to check if it has more than some threshold number of entries. (Which is still trivial, but...)
Yes, I agree. I am currently using a script outside of Calibre, run from Thunar, that marks PDF files according to whether they have a TOC. When I upload them to the Calibre library, they are tagged as "TOC" or "bezTOC". As you mentioned, not every TOC is useful: sometimes it's just a single entry, sometimes it's just a mess. But even that mess can sometimes help determine whether a PDF document is a single book or several booklets combined.
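For anyone who wants to try the threshold idea, here is a rough sketch in Python. It is not my actual script, just an illustration. The function names and the default threshold of 3 entries are made up for the example; only the "TOC" / "bezTOC" tags come from my setup. It assumes the outline is a nested list where each sub-list holds the children of the previous entry, which as far as I know is the shape that pypdf's PdfReader.outline returns. The tag_pdf helper needs pypdf installed; the other two functions are plain Python.

```python
def count_outline_entries(outline):
    """Count all bookmark entries in a nested outline.

    Assumed shape: a list of entries, where a nested list holds
    the child entries of the entry before it.
    """
    total = 0
    for item in outline:
        if isinstance(item, list):
            # A nested list is a sub-level; count its entries recursively.
            total += count_outline_entries(item)
        else:
            total += 1
    return total


def toc_tag(outline, min_entries=3):
    """Return "TOC" if the outline has at least min_entries entries,
    otherwise "bezTOC". The default of 3 is an arbitrary example value."""
    if count_outline_entries(outline) >= min_entries:
        return "TOC"
    return "bezTOC"


def tag_pdf(path, min_entries=3):
    """Tag a PDF file by its outline size.

    Assumption: the third-party pypdf package is installed.
    """
    from pypdf import PdfReader  # imported here so the rest runs without pypdf

    return toc_tag(PdfReader(path).outline, min_entries)
```

With this, a PDF whose outline contains only one bookmark is tagged "bezTOC" just like a PDF with no outline at all. Raising or lowering min_entries changes how strict the check is.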
I'm not a programmer; I write my scripts with the help of AI, and it also translates this text for me. 🙂