Agree with theducks, you shouldn't expect perfect conversions from Calibre when using heuristics. While there is certainly lots of room for improvement with heuristics, the primary goal of the feature is not to perfectly guess the document structure for any poorly formatted document - that's not really an achievable goal.
Before heuristics was added, Calibre would split documents, seemingly arbitrarily, every 260KB to guarantee that the document was compatible with Adobe-based readers and other memory-limited devices. Fixing these arbitrary split points then involved a ton of work in Sigil or some other app - many files needed to be manually merged and then re-split. The primary goal of heuristics therefore wasn't to perfectly detect every chapter, but to eliminate all that manual labor by guessing 'good' split points so that the post-conversion cleanup effort is minimal. For many files heuristics will guess every split point correctly, but for many others it will make a few mistakes - this should be expected.
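To make the size-based splitting concrete, here's a toy sketch in Python. It is not how Calibre actually implements it - the function name and the byte-level cutting are my own simplification - but it shows why a pure size limit puts the breaks wherever the counter runs out rather than at chapter boundaries.

```python
MAX_BYTES = 260 * 1024  # the 260KB limit mentioned above

def split_by_size(text, max_bytes=MAX_BYTES):
    """Cut text into pieces of at most max_bytes, ignoring document structure.

    Hypothetical helper for illustration only; a real converter would at
    least avoid cutting in the middle of a tag or a multi-byte character.
    """
    data = text.encode("utf-8")
    return [data[i:i + max_bytes] for i in range(0, len(data), max_bytes)]
```

With a small limit, split_by_size("a" * 25, max_bytes=10) gives three pieces of 10, 10 and 5 bytes, no matter where the chapters are.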
If you really want more control from the very beginning then learning markdown as Agama suggested earlier is your best bet.
btw, everything you see in that most recent conversion is an expected limitation of the current heuristics functionality. 'foreword' isn't in the dictionary that heuristics uses to recognize headings, and inline text TOCs trip heuristics up - the last item will often get detected as a chapter.
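To show why a dictionary-based approach misses something like 'Foreword', here's a rough sketch in Python. This is not Calibre's actual code: the word list, the regex and the function name are all made up for illustration, and the real heuristics use their own patterns.

```python
import re

# Hypothetical word list; note that "foreword" is not in it.
HEADING_WORDS = ("chapter", "prologue", "epilogue", "part", "book")

# A line counts as a heading if it starts with one of the words,
# optionally followed by a number or a short title.
HEADING_RE = re.compile(
    r"^\s*(?:%s)\b[\s\w.:-]{0,40}$" % "|".join(HEADING_WORDS),
    re.IGNORECASE,
)

def guess_split_points(lines):
    """Return the indexes of lines that look like chapter headings."""
    return [i for i, line in enumerate(lines) if HEADING_RE.match(line)]
```

Fed the lines "Foreword", "Some text.", "Chapter 1", "More text.", "Epilogue", this picks out "Chapter 1" and "Epilogue" but skips "Foreword", simply because the word isn't in the list. A short body line that happens to start with "Part" or "Book" would also be flagged, which is the kind of mistake you should expect from any guesser like this.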
Last edited by ldolse; 12-30-2011 at 02:55 PM.