09-30-2009, 07:51 AM   #1
magphil
Connoisseur
Device: Gen3, Kobo glow
How to force TOC generation out of scanned PDF

Hi,

I am trying to convert a scanned PDF document to MOBI.

I cannot get any TOC, even though there are 133 chapters labeled

CHAPTER NNN

The default XPath expression used for chapter detection is:

//*[((name()='h1' or name()='h2') and re:test(.,'chapter|book|section|part\s+', 'i')) or @class = 'chapter']

I guess it only matches h1 or h2 tags whose text contains one of those keywords, or any element with class "chapter".
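
To illustrate my reading of that expression, here is a rough Python emulation. It is only a sketch and not the converter's actual code: the function name is_chapter_mark and the sample markup are made up for this example, and I am assuming the scanned PDF ends up as plain paragraphs rather than headings.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as in the default expression; the 'i' flag maps to re.IGNORECASE.
CHAPTER_RE = re.compile(r'chapter|book|section|part\s+', re.IGNORECASE)

def is_chapter_mark(elem):
    """Rough emulation of the default chapter test for one element (hypothetical helper)."""
    # In XPath, '.' is the element's full string value, including nested text.
    text = ''.join(elem.itertext())
    is_heading = elem.tag in ('h1', 'h2') and CHAPTER_RE.search(text) is not None
    return is_heading or elem.get('class') == 'chapter'

# Hypothetical sample markup: the scanned chapter label is a plain <p>.
SAMPLE = """<body>
<h2>Chapter 1</h2>
<p>CHAPTER 001</p>
<p class="chapter">Prologue</p>
<h1>Introduction</h1>
</body>"""

root = ET.fromstring(SAMPLE)
matched = [(e.tag, ''.join(e.itertext())) for e in root.iter() if is_chapter_mark(e)]
print(matched)  # prints [('h2', 'Chapter 1'), ('p', 'Prologue')]
```

If this reading is right, the line "CHAPTER 001" is skipped because it sits in a plain paragraph, which would explain the empty TOC.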

How can we get a TOC when there are no such tags, but the chapter keyword is part of the text?

Thanks for any hints.

magphil