MobileRead Forums - View Single Post - How to force TOC generation out of scanned PDF

magphil · 09-30-2009, 07:51 AM

Hi,

I am trying to convert a scanned PDF document to MOBI.

I cannot get any TOC, even though there are 133 chapters labeled

CHAPTER NNN

The default expression used for chapter detection is:

//*[((name()='h1' or name()='h2') and re:test(.,'chapter|book|section|part\s+', 'i')) or @class = 'chapter']

I guess it expects h1 or h2 tags whose text contains one of those keywords, or any element with class "chapter".
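
To show what I mean, here is a rough Python sketch that mimics that test using only the standard library. It is not the converter's actual code: the `is_detected` helper and the sample markup are made up for illustration, and I am assuming the scanned text ends up in plain paragraph tags.

```python
import re
import xml.etree.ElementTree as ET

# Keyword pattern copied from the default expression; the 'i' flag maps to IGNORECASE.
KEYWORDS = re.compile(r'chapter|book|section|part\s+', re.IGNORECASE)

def is_detected(elem):
    """Rough emulation of the default chapter test for one element (hypothetical helper)."""
    text = ''.join(elem.itertext())
    # First branch: an h1/h2 tag whose text contains one of the keywords.
    heading_match = elem.tag in ('h1', 'h2') and KEYWORDS.search(text) is not None
    # Second branch: any element whose class attribute is exactly "chapter".
    return heading_match or elem.get('class') == 'chapter'

# Made-up sample: one real heading, one classed paragraph, one plain paragraph.
sample = ET.fromstring(
    '<body>'
    '<h2>Chapter 1</h2>'
    '<p class="chapter">Prologue</p>'
    '<p>CHAPTER 12</p>'
    '</body>'
)

detected = [''.join(e.itertext()) for e in sample.iter() if is_detected(e)]
print(detected)  # prints ['Chapter 1', 'Prologue']
```

The plain paragraph containing "CHAPTER 12" is skipped, which looks like what happens with my document.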

How can I get a TOC when there are no such tags, but the chapter keyword is part of the text?

Thanks for any hints,

magphil
