Old 10-05-2010, 01:21 PM   #6
ldolse
Wizard
If you're working from ASCII text you should try some of the different text input options. Preprocessing won't work unless you first choose the right text input option. By default, Text Input assumes hard line breaks within paragraphs and a blank line between paragraphs.

It sounds like you need to enable the "Treat each line as a paragraph" option under text input.
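To see why the choice matters, here is a rough sketch of the two assumptions. This is not Calibre's code; the function names and the sample text are made up for illustration, and the real input plugin does more than this.

```python
import re

def paragraphs_blank_line_separated(text):
    """Assume hard-wrapped lines, with a blank line between paragraphs."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Lines inside one block are re-joined into a single paragraph.
    return [" ".join(line.strip() for line in block.splitlines()) for block in blocks]

def paragraphs_one_per_line(text):
    """Assume every non-empty line is already a complete paragraph."""
    return [line.strip() for line in text.splitlines() if line.strip()]

# Hypothetical sample: one paragraph per line, no blank lines in between.
sample = "Chapter 1\nIt was a dark night.\nThe rain fell."

print(paragraphs_blank_line_separated(sample))
print(paragraphs_one_per_line(sample))
```

With a file like the sample, the first assumption glues the chapter heading and the body into one paragraph, so nothing later in the pipeline can pick the heading out. The second keeps each line separate.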

There are multiple stages in Calibre's conversion pipeline. Preprocessing is a very early stage, and it just does some additional reformatting of the document. Chapter detection happens at a later stage in the pipeline, where Calibre has created, and expects, well-formed XHTML, and it's at this point that you need to use XPath.

XPath works great without any preprocessing when you have well-formed HTML content, and there is plenty of well-formed content out there. ASCII text is not an example of well-formed content...
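A small illustration of that difference, using only Python's standard library. The markup, the "chapter" class and the function names are invented for the example, and ElementTree only supports a subset of XPath, so treat this as a sketch of the idea rather than what Calibre runs internally.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical well-formed XHTML with chapter headings marked up.
well_formed = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h2 class="chapter">Chapter 1</h2>
    <p>It was a dark night.</p>
    <h2 class="chapter">Chapter 2</h2>
    <p>The rain fell.</p>
  </body>
</html>"""

# The same content as plain ASCII text: no elements to select.
plain_text = "Chapter 1\nIt was a dark night.\nChapter 2\nThe rain fell."

def find_chapter_titles(markup):
    """Parse markup and select h2 elements with class 'chapter'."""
    root = ET.fromstring(markup)
    headings = root.findall(".//{%s}h2[@class='chapter']" % XHTML_NS)
    return [h.text for h in headings]

def try_find_chapter_titles(markup):
    """Return chapter titles, or None if the input is not well-formed XML."""
    try:
        return find_chapter_titles(markup)
    except ET.ParseError:
        return None

print(try_find_chapter_titles(well_formed))
print(try_find_chapter_titles(plain_text))
```

The XPath expression has something to match in the first input because the headings are real elements. The plain text cannot even be parsed, which is why the text has to be converted to proper markup before chapter detection has anything to work with.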

Last edited by ldolse; 10-05-2010 at 01:26 PM.