The Gutenberg converter


rcs1000
11-01-2006, 05:11 PM
Hi,

I've knocked up a little .NET application (written in IronPython, in case anyone cares) to convert Gutenberg text files into useful RTFs. You can choose the font and justification, and easily set the author and title.
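For anyone curious what such a conversion involves, here is a minimal sketch in plain Python. It is not the converter's actual source (which isn't posted in this thread); the function names, the default font and the paragraph handling are all assumptions made for illustration. It assumes the usual Gutenberg layout of hard-wrapped lines with a blank line between paragraphs.

```python
import re

# RTF paragraph-alignment control words.
ALIGN = {"left": "\\ql", "right": "\\qr", "center": "\\qc", "justify": "\\qj"}

def rtf_escape(text):
    """Escape backslashes and braces; write non-ASCII as RTF \\uN escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes signed 16-bit UTF-16 code units, each followed by
            # a fallback character ("?") for readers without Unicode support.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                unit = units[i] | (units[i + 1] << 8)
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)

def text_to_rtf(text, title, author, font="Times New Roman", align="justify"):
    """Hypothetical helper: turn hard-wrapped plain text into an RTF string."""
    text = text.replace("\r\n", "\n")
    # Blank lines separate paragraphs; re-join the wrapped lines inside each.
    paragraphs = [" ".join(p.split())
                  for p in re.split(r"\n\s*\n", text) if p.strip()]
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0\\fnil %s;}}" % rtf_escape(font)
    info = "{\\info{\\title %s}{\\author %s}}" % (rtf_escape(title),
                                                  rtf_escape(author))
    body = "".join("\\pard%s %s\\par\n" % (ALIGN[align], rtf_escape(p))
                   for p in paragraphs)
    return header + info + "\n\\f0\\fs24\n" + body + "}"
```

Writing the result out is then just `open("emma.rtf", "w").write(text_to_rtf(text, "Emma", "Jane Austen"))`; the output is pure ASCII, so the file encoding does not matter.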

It's probably buggy as h*ll; if people could test it, and tell me if it is useful, then I'm happy to sort out any problems.

(Oh yes, you may need to install Microsoft .NET 2.0, although in most cases it will already be on your computer.)

Cheers,

Robert

heavyB
11-02-2006, 12:04 AM
Robert,

I like how fast & small this is. I've only tested a few books, and the justification is pretty slick. Going to RTF is pretty cool too, if only because it lets you set the Title & Author.

Of course, anytime you put something out there, someone's going to pipe up with requests or suggestions, so here I go: if the app could take a guess at the author & title from the file name and pre-populate the text fields, that'd be neat. Even if it wasn't very accurate, it'd be an easy cut & paste for the user.

More fonts would be slick too.

I've found no bugs as of yet... :)

Thanks for the tool!

rcs1000
11-02-2006, 02:45 AM
Nice idea re auto-populate. It shouldn't be too difficult (and once we have that working, then we can bulk convert books).

Re fonts, my only question is: what fonts does the Reader come with? I've only noticed one serif (Roman) and one sans (Swiss).

Cheers, Robert

igorsk
11-02-2006, 03:38 AM
The Reader uses the same fonts as the CONNECT software: Swiss721 BT Roman, Dutch801 Rm BT Roman and Courier10 BT Roman. You can find them in "C:\Program Files\Sony\CONNECT Reader\Data\fonts".

rcs1000
11-03-2006, 06:04 AM
Wow; this is turning out to be a harder task than I thought.

I've been playing with the Google APIs: passing a search on the name of the text file (e.g. "sense30.txt") and trying to interpret the first result. But after a few hours of this, I've realised that it is a spectacularly stupid way of achieving the goal. I'm sure there is a better way...

rcs1000
11-03-2006, 01:40 PM
OK. The converter will now attempt to "auto populate" the Title and Author fields. It's not perfect (and probably never will be), but it'll save you a bunch of time.
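The thread doesn't say how the guess is made, so the following is only one plausible approach, sketched in plain Python: read the top of the file itself rather than the file name. It assumes the text carries either "Title:" / "Author:" header lines or an opening line of the form "The Project Gutenberg Etext of X, by Y"; files that have neither simply come back with empty strings. The function name and the 60-line limit are made up for the example.

```python
import re

# Older-style opening line, optionally wrapped in asterisks.
FIRST_LINE = re.compile(
    r"\s*\**\s*(?:The )?Project Gutenberg(?:'s)? (?:Etext|EBook) of "
    r"(.+?),? by (.+?)\s*\**\s*$",
    re.IGNORECASE)

def guess_title_author(text, max_lines=60):
    """Hypothetical helper: best-effort (title, author) from a file's header."""
    title = author = None
    head = text.splitlines()[:max_lines]

    # First choice: explicit "Title:" and "Author:" header lines.
    for line in head:
        m = re.match(r"\s*(Title|Author):\s*(.+?)\s*$", line)
        if not m:
            continue
        if m.group(1) == "Title" and title is None:
            title = m.group(2)
        elif m.group(1) == "Author" and author is None:
            author = m.group(2)

    # Fallback: the "... Etext of X, by Y" opening line.
    if title is None or author is None:
        for line in head:
            m = FIRST_LINE.match(line)
            if m:
                title = title or m.group(1).strip()
                author = author or m.group(2).strip()
                break

    # Empty strings mean "no guess"; the user fills the field in by hand.
    return title or "", author or ""
```

Because the result only pre-populates editable fields, a wrong or missing guess costs the user nothing more than typing the value in as before.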

Next up: support for HTML versions, with italics, bold, etc.

coolblue
11-03-2006, 10:57 PM
Another thing I may add is that I'm able to get far more information on the Sony Reader from this site than I ever got from the Sony site.

OskiBear
11-24-2006, 05:54 PM
Awesome Job, Robert!

Much thanks!!!

Fugubot
11-25-2006, 10:40 AM
Just another thanks for making this useful tool available!

If you do add HTML conversion capabilities, would you consider turning it into a Firefox extension? It would be great to be able to save a web page in a format that is easy to drop into the Reader.

Thanks again.

rcs1000
11-29-2006, 02:25 AM
Hello all: HTML conversion capabilities are coming along nicely. (Well, nearly nicely: I haven't worked out how to deal with tables and/or CSS yet, but we're getting there.)
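To give a flavour of the easy part of this (italics, bold and paragraph breaks), here is a rough sketch in plain Python using the standard library's html.parser. It is an illustration only, not the converter's code: the class and function names are invented, tables and CSS are simply ignored, and non-ASCII characters are passed through unescaped for brevity.

```python
import re
from html.parser import HTMLParser

class HtmlToRtf(HTMLParser):
    """Hypothetical sketch: map a few inline HTML tags to RTF control words."""

    # HTML tag -> RTF control word (\i = italic, \b = bold).
    INLINE = {"i": "i", "em": "i", "b": "b", "strong": "b"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip = 0  # depth inside <style>/<script>, whose text is dropped

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self.skip += 1
        elif tag in self.INLINE:
            self.parts.append("\\%s " % self.INLINE[tag])

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self.skip = max(0, self.skip - 1)
        elif tag in self.INLINE:
            # A trailing 0 switches the attribute off again.
            self.parts.append("\\%s0 " % self.INLINE[tag])
        elif tag == "p":
            self.parts.append("\\par\n")

    def handle_data(self, data):
        if self.skip:
            return
        # Collapse runs of whitespace as a browser would, then escape the
        # three characters that are special in RTF.
        text = re.sub(r"\s+", " ", data)
        self.parts.append(re.sub(r"([\\{}])", r"\\\1", text))

def html_to_rtf_body(html):
    """Return RTF body text (without the {\\rtf1 ...} wrapper) for an HTML string."""
    parser = HtmlToRtf()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

The output still needs wrapping in an RTF header and closing brace before it is a valid file. Tables and stylesheet-driven formatting are exactly what this approach cannot see, which fits the difficulty described above.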

Expect a new release tomorrow. Or at the worst on Friday!