05-08-2009, 03:39 AM   #25
tlc
Zealot
Posts: 142
Karma: 50288
Join Date: Feb 2009
Device: KK 3G, iPad
My interest is just in getting better reflowable paragraphs in fiction. I tried cxpdfhtml.py on a novel and was surprised at how well the "break on short lines" approach worked, although I haven't read closely enough yet to find the paragraph endings it missed because the last line wasn't short enough.
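
For anyone following along, here is roughly how I understand a "break on short lines" heuristic. This is only my own minimal sketch, not the actual code in cxpdfhtml.py; the function name `split_on_short_lines` and the `ratio` cutoff are made up for illustration.

```python
def split_on_short_lines(lines, ratio=0.8):
    """Join lines into paragraphs, ending a paragraph after each short line.

    A line counts as "short" when it is under `ratio` times the longest
    line in the input. Both the name and the 0.8 default are assumptions
    for illustration, not taken from cxpdfhtml.py.
    """
    stripped = [ln.rstrip() for ln in lines]
    lengths = [len(ln) for ln in stripped if ln.strip()]
    if not lengths:
        return []
    threshold = ratio * max(lengths)

    paragraphs, current = [], []
    for ln in stripped:
        if not ln.strip():
            # A blank line always closes the paragraph in progress.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(ln.strip())
        if len(ln) < threshold:
            # Short line: treat it as the last line of the paragraph.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

The weak spot is a paragraph whose last line happens to run nearly the full width: it never falls under the threshold, so it gets merged with the paragraph that follows.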

I was wondering whether you are considering detecting paragraphs based on indentation, or whether anyone else has already implemented that?
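
To make the question concrete, this is a rough sketch of the kind of indentation test I have in mind. It assumes the extracted text still carries the first-line indent as leading whitespace, which may not be true of every PDF conversion; `split_on_indent` and `min_indent` are hypothetical names, not part of any existing tool.

```python
def split_on_indent(lines, min_indent=2):
    """Start a new paragraph at every line indented by at least `min_indent` columns.

    Assumes first-line indents survive extraction as leading spaces or tabs.
    The function name and the default of 2 columns are illustrative only.
    """
    paragraphs, current = [], []
    for raw in lines:
        ln = raw.expandtabs(4)  # count a tab as up to 4 columns
        text = ln.strip()
        if not text:
            continue  # blank lines carry no indentation information
        indent = len(ln) - len(ln.lstrip())
        if indent >= min_indent and current:
            # An indented line opens a new paragraph, so close the old one.
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Something like this could be combined with the short-line check, so that a break is made when either signal fires. It would catch paragraphs whose last line runs full width, as long as the next paragraph is indented.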