MobileRead Forums - View Single Post

cybmole · 01-15-2011, 03:31 AM

We've been around that loop in other threads: structure detection doesn't stand a chance if the chapter starts have the same HTML tags as the rest of the book. I can pick them out manually by reasoning that, say, a line with only a roman numeral is a chapter start, or a line that begins with CHAPTER in caps, but I can't easily add those constructs to structure detection.

We could test: I should be able to find a non-patched-up book in my collection which has chapters of the form CHAPTER... inside simple <p> tags.
Give me the XPath expression for structure detection of that on the EPUB source, please.
When I try making them with the wizard and tweaking them, they either find far too much or nothing at all. I already figured out that the "i" bit makes the match case-insensitive, so I remove that. And I add class = bold if that's the chapter header style, but still no joy.
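
To pin down what I mean by picking chapter starts out "manually", here is a rough sketch of the two rules in plain Python (standard library only). It is not how structure detection works internally and it is not an XPath expression; the function name, the regexes and the sample markup are all made up for illustration, and it assumes the book's XHTML is well-formed enough to parse.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Case-sensitive on purpose: only "CHAPTER" in caps counts as a chapter start.
CHAPTER_RE = re.compile(r"^\s*CHAPTER\b")

# A line that holds nothing but a roman numeral (the lookahead rejects empty lines).
ROMAN_RE = re.compile(
    r"^\s*(?=[IVXLCDM])M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\s*$"
)


def find_chapter_starts(xhtml):
    """Return the text of every <p> that looks like a chapter start (hypothetical helper)."""
    root = ET.fromstring(xhtml)
    hits = []
    for p in root.iter("{%s}p" % XHTML_NS):
        # Join all text inside the paragraph, including nested <b>, <span>, etc.
        text = "".join(p.itertext())
        if CHAPTER_RE.search(text) or ROMAN_RE.match(text):
            hits.append(text.strip())
    return hits


# Made-up sample: chapter starts use the same <p> tag as the body text.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p class="bold">CHAPTER ONE</p>
<p>It was a dark night. The chapter of accidents began.</p>
<p>XIV</p>
<p>I</p>
<p>Chapter and verse were quoted.</p>
</body></html>"""

if __name__ == "__main__":
    for line in find_chapter_starts(SAMPLE):
        print(line)
```

Note the roman numeral rule will also catch a paragraph consisting of the single word "I", which is the sort of false positive I'd expect to weed out by hand. In structure detection terms the CHAPTER rule would presumably be something along the lines of //h:p[re:test(., '^\s*CHAPTER')] with no "i" flag, but that is my untested guess, not a confirmed expression.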

Last edited by cybmole; 01-15-2011 at 03:37 AM.