Quote:
Originally Posted by ldolse
Heuristics can't catch all scenarios; there are infinite possibilities for the way chapters can be formatted. In your case the word 'chapter' is explicitly covered, so my guess is it might be related to the underlying HTML code.
Tangential to Agama's solution, you could try to cleanse the HTML by converting to text with Markdown or Textile, and then converting back from text to ePub/Mobi with heuristics enabled. In some cases this will eliminate whatever HTML formatting tripped up heuristics, while basic formatting like italics/bold would be retained.
I think you might be misunderstanding me. The file that I am converting is plain vanilla text, with no HTML.
And while I take your point about heuristics being difficult, I would have said that a line starting with the word 'Chapter' followed by a number (admittedly in Roman numerals) was a good candidate for a chapter break.
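
To show the kind of rule I have in mind, here is a rough Python sketch. It is only my assumption of how such a check could look, not what the heuristics code actually does, and the names CHAPTER_RE and find_chapter_breaks are made up for the example.

```python
import re

# Hypothetical pattern, not the converter's real rule: a line that starts
# with 'Chapter' followed by an Arabic number or a run of Roman-numeral letters.
# Caveat: the letter class is loose, so a word like 'mix' would also match.
CHAPTER_RE = re.compile(
    r"^\s*chapter\s+(\d+|[ivxlcdm]+)\b",
    re.IGNORECASE,
)

def find_chapter_breaks(text):
    """Return (line_number, line) pairs for lines that look like chapter headings."""
    return [
        (n, line.strip())
        for n, line in enumerate(text.splitlines(), start=1)
        if CHAPTER_RE.match(line)
    ]

sample = """Chapter I
It was a dark night.

Chapter IV
The chapter of accidents continued.
"""

# Only the two heading lines are reported; the sentence that merely
# contains the word 'chapter' is ignored because it does not start with it.
print(find_chapter_breaks(sample))
```

Even this simple version has a false-positive risk (see the caveat in the comment), which I suppose is ldolse's point about heuristics, but it does cover the plain-text case I described.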