MobileRead Forums - View Single Post

ldolse · 10-16-2010, 01:22 PM

I'm terrible with xpath, but I have a hunch you're screwed trying to search for text just free floating throughout the book in the body tag.

Your best bet is to take the html from the debug info and do a find/replace in a text editor with regex search/replace support.

Search for this:

Code:

(Chapter\s+\d+)

and replace it with this:

Code:

<h2>\1</h2>

Depending on the editor you use it might be $(1), or $1, or whatever instead of '\1' as I used above - check the documentation for your editor.
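
If you'd rather script it than use an editor, here's a rough Python sketch of the same search/replace. The function name and the sample string are made up for illustration; the only real parts are the pattern and replacement from above, and Python's re module happens to use '\1' for the backreference.

```python
import re

def wrap_chapter_headings(html: str) -> str:
    """Wrap every 'Chapter <number>' occurrence in <h2> tags.

    Uses the same pattern and replacement as the editor search above;
    \\1 refers to the text captured by the parentheses.
    """
    return re.sub(r"(Chapter\s+\d+)", r"<h2>\1</h2>", html)

# Hypothetical sample input, standing in for the html from debug info.
sample = "<body>Chapter 1 It was a dark night.<br>Chapter 2 Morning came.</body>"
print(wrap_chapter_headings(sample))
```

Note that this, like the editor version, will also wrap any "Chapter 3" that shows up in the middle of a sentence, so skim the result before importing it.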

Then import the edited html file into Calibre, and have Calibre convert using the zipped html source instead of the pdf. Calibre's default chapter detection xpath will automatically pick the chapters up, provided your search and replace properly wrapped the chapter titles in <h2> tags.
