MobileRead Forums - View Single Post

mike_bike_kite · 07-28-2010, 12:26 PM

I can't understand why there aren't simple post-processors to clean up the text. Take the text output and join the lines together unless they end in a full stop, a question mark or a double quote.
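For what it's worth, the joining rule described above could be sketched in Python roughly like this. The function name `join_lines` is made up for illustration, and treating a blank line as a paragraph break is an assumption not stated in the post.

```python
def join_lines(text):
    """Join hard-wrapped lines unless a line ends in '.', '?' or '"'."""
    enders = (".", "?", '"')
    paragraphs = []
    current = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # Assumption: a blank line also ends the current paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(line)
        if line.endswith(enders):
            # The line looks like the end of a sentence, so keep the break.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        # Flush any trailing text that never reached a sentence ender.
        paragraphs.append(" ".join(current))
    return "\n".join(paragraphs)
```

The rule is deliberately crude: a wrapped line that happens to end mid-paragraph on a full stop will still cause a break, and lines ending in an exclamation mark or a closing single quote get joined to the next one.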

You may also need to remove page numbers, if present, and any chapter titles repeated at the top of each page. I managed to get this far, but then found various funny characters in the text standing in for double l's and the like, and these need to be converted.
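A rough Python sketch of that clean-up step follows. Everything named here is hypothetical: `clean_lines`, the `running_titles` argument, and the `REPLACEMENTS` table. The post doesn't say which characters turned up, so the table uses a few Unicode ligature characters purely as a stand-in. Keeping the first occurrence of each chapter title, so it can still be used for a TOC later, is also an assumption.

```python
import re

# Illustrative only: the real odd characters depend on the source text.
REPLACEMENTS = {"\ufb00": "ff", "\ufb01": "fi", "\ufb02": "fl"}


def clean_lines(lines, running_titles=()):
    """Drop bare page numbers and repeated running titles, fix odd characters."""
    titles = {t.strip().lower() for t in running_titles}
    seen = set()
    cleaned = []
    for line in lines:
        stripped = line.strip()
        if re.fullmatch(r"\d+", stripped):
            # A line holding nothing but digits is treated as a page number.
            continue
        key = stripped.lower()
        if key in titles:
            if key in seen:
                # Repeated title at the top of a later page: drop it.
                continue
            # First occurrence is assumed to be the real chapter heading.
            seen.add(key)
        for bad, good in REPLACEMENTS.items():
            line = line.replace(bad, good)
        cleaned.append(line)
    return cleaned
```

Note that the digits-only test would also throw away a genuine line of text that consists of just a number, so it is worth eyeballing the output.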

My aim was to finally generate HTML and then use the chapter titles to create a TOC. I got halfway there, but considering how clever tools like Calibre are, it surprised me that this wasn't done automatically.
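As a sketch of the HTML and TOC idea, assuming the text has already been split into chapters, something like the following Python would do. The function `build_html`, the `(title, paragraphs)` input shape and the `ch1`, `ch2` anchor names are all invented for illustration, and this has nothing to do with how Calibre itself works.

```python
from html import escape


def build_html(chapters):
    """Build one HTML page with a linked TOC from (title, paragraphs) pairs."""
    toc = []
    body = []
    for i, (title, paragraphs) in enumerate(chapters, start=1):
        anchor = f"ch{i}"  # hypothetical anchor naming scheme
        toc.append(f'<li><a href="#{anchor}">{escape(title)}</a></li>')
        body.append(f'<h2 id="{anchor}">{escape(title)}</h2>')
        # Escape the text so stray &, < and > don't break the markup.
        body.extend(f"<p>{escape(p)}</p>" for p in paragraphs)
    parts = ["<html><body>", "<ul>", *toc, "</ul>", *body, "</body></html>"]
    return "\n".join(parts)
```

Calling `build_html([("Chapter One", ["First paragraph."])])` gives a page whose TOC entry links to the matching `<h2>` heading.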
