Quote:
Originally Posted by shawn
Actually, after taking a look at the page, you might want to break up the book so that you have one query per section. web2book doesn't currently allow content extraction patterns to apply to followed links (AFAIK; geekraver, please correct me if I'm wrong).
All the chapters that belong to a particular section are on the same page anyway, so setting a content extraction pattern for the TOC and following links to a depth of 2 would result in a lot of duplicated content.
The prefaces/introduction are all on one page, each part/section is on one page, the conclusion is on one page, and all the appendices are on one page, so you'll end up with 11 entries, each with a link depth of 1 and a fairly simple regex. This worked for part 1 and will probably work for the other chapters:
Code:
(<h2>.*<!--endofchap-->)
I tried publishing this for you to just subscribe to, but publishing doesn't seem to be working ATM.
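If you want to sanity-check the pattern above before plugging it into a query, here's a rough Python sketch of what it should grab. The sample markup is made up (I'm only assuming each chapter starts with an <h2> and ends with an <!--endofchap--> comment), and I don't know which regex options web2book applies, so the re.DOTALL flag is my assumption about "." needing to match across line breaks.

```python
import re

# Hypothetical section page: the markup below is invented for illustration
# and only mimics the <h2> ... <!--endofchap--> structure described above.
SAMPLE_PAGE = """<html><body>
<div id="nav">Home | Contents</div>
<h2>Chapter 1</h2>
<p>First chapter text.</p>
<!--endofchap-->
<h2>Chapter 2</h2>
<p>Second chapter text.</p>
<!--endofchap-->
<div id="footer">Copyright notice</div>
</body></html>"""

# Same pattern as in the post. re.DOTALL is an assumption: without it,
# "." stops at newlines in Python and the pattern would not span a chapter.
PATTERN = re.compile(r"(<h2>.*<!--endofchap-->)", re.DOTALL)


def extract_content(html):
    """Return the text captured by the pattern, or None if nothing matches."""
    match = PATTERN.search(html)
    return match.group(1) if match else None


content = extract_content(SAMPLE_PAGE)
```

Because ".*" is greedy, the match runs from the first <h2> to the last <!--endofchap--> on the page, which is what you want here since all the chapters of a section sit on one page. The navigation and footer around them get dropped.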