08-29-2009, 11:28 AM   #3
ahi
Thanks, nrapallo!

I'll take a look at your crash report and see what's going on. I'm getting semi-regular crashes (always on certain files) myself--crashes that I suspect could be fixed by a bit more thorough/clever preprocessing of files.

And yes, I'll make an .exe of it for the next upload.

This script is a step short of my crazy (and formerly aired) idea of turning text files into databases with walkable nodes representing all words, sentences, paragraphs, etc.

The next things I am going to try to get working are 1) part/chapter/section title detection and 2) poetry/quotation detection.

I see both of those working in either an overzealous automatic mode (which assumes anything that might be a title or a quotation *is* one, leaving the user/bookmaker to restore the formatting if it isn't) or an interactive one, where Python informs the user of the match it thinks it found and lets the user instruct it how to handle said potential match.

e.g.:

Code:
Potential title match:

-2:        and so she left.
-1:        
 0:        III. On the way to Istanbul
 1:        
 2:        The friar did not hesitate to purchase a ticket on the next ship, perhaps because

   Encode line 0 as [P]art/H1, [C]hapter/H2, [S]ection/H3 or [I]gnore?
   Enter choice: _
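For anyone curious how the interactive mode might hang together, here is a rough Python sketch, not the actual script. Everything in it is an assumption for illustration: the heuristic (a short line between blank lines that starts with a Roman numeral or a word like "Chapter"), the names `find_title_candidates`, `format_context`, `encode_line` and `review`, the 60-character cutoff, and the choice of plain HTML heading tags as the output encoding.

```python
import re

# Hypothetical heuristic: a title starts with a Roman numeral plus a dot
# ("III.") or with a keyword such as "Chapter 4".
TITLE_RE = re.compile(r"^(?:[IVXLC]+\.|(?:[Cc]hapter|[Pp]art|[Ss]ection)\s+\w+)")

# Mapping from the prompt's choices to heading tags; [I]gnore has no entry.
TAGS = {"p": "h1", "c": "h2", "s": "h3"}


def find_title_candidates(lines, max_len=60):
    """Return indexes of short lines, set off by blank lines, that look like titles."""
    hits = []
    for i, line in enumerate(lines):
        text = line.strip()
        if not text or len(text) > max_len:
            continue
        before_blank = i == 0 or not lines[i - 1].strip()
        after_blank = i == len(lines) - 1 or not lines[i + 1].strip()
        if before_blank and after_blank and TITLE_RE.match(text):
            hits.append(i)
    return hits


def format_context(lines, index, radius=2):
    """Render the lines around a candidate, numbered relative to it."""
    out = []
    for offset in range(-radius, radius + 1):
        j = index + offset
        if 0 <= j < len(lines):
            out.append(f"{offset:>2}:        {lines[j]}")
    return "\n".join(out)


def encode_line(line, choice):
    """Wrap the line in the chosen heading tag; any other answer leaves it alone."""
    tag = TAGS.get(choice.strip().lower()[:1])
    if tag is None:
        return line
    return f"<{tag}>{line.strip()}</{tag}>"


def review(lines, ask=input):
    """Show each candidate to the user and apply the answer to a copy of the text."""
    result = list(lines)
    for i in find_title_candidates(lines):
        print("Potential title match:\n")
        print(format_context(lines, i))
        choice = ask(
            "\n   Encode line 0 as [P]art/H1, [C]hapter/H2, "
            "[S]ection/H3 or [I]gnore?\n   Enter choice: "
        )
        result[i] = encode_line(lines[i], choice)
    return result
```

Passing the prompt function in as `ask` means the same loop could serve the overzealous automatic mode too, for example `review(lines, ask=lambda prompt: "c")` to treat every candidate as a chapter title without stopping to ask.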
I'm also hoping I'll get around to tidying up the code a bit at some point...

- Ahi