Putting the issue of hyphenation aside, doesn't GutenMark pretty much do what you're looking for? (http://www.sandroid.org/Gutenmark) GutenMark will take a Gutenberg text file and give you LaTeX output. From my own experience, the text file has to follow Gutenberg's format (I'm not even sure what that means, since it's not an encoding issue and a text file is a text file...), otherwise the output may not work.