Project Gutenberg books ALL available in LRF


coolbooks
02-21-2009, 03:00 PM
Hi,

I just bought a Sony PRS505 and absolutely love it. The screen quality is outstanding and it has really put back enjoyment into my reading.

I wanted to get more eBooks onto my Sony so I developed some software to convert from Project Gutenberg books into Sony LRF format.

You can try it out at www.coolfreebooks.com (http://www.coolfreebooks.com)

Please remember that this is provided free of charge. I will of course respond to emails, but I cannot guarantee that every requested enhancement will be implemented.

Regards,

Trevor

FizzyWater
02-21-2009, 03:37 PM
Hi, Trevor! Nice of you to post your site!

There are lots of PG books already converted for Sony here in the eBook Uploads and on Feedbooks and Manybooks, too. If you're looking for new books to convert, you might want to check there first. And if you're converting books that aren't available here, you might want to upload them here, too...

Elsi
02-21-2009, 04:03 PM
Hello coolbooks -- thanks for posting this message. Your contribution may prove very valuable to the Sony owners here.

I've moved your post from our uploads forum -- since that forum should contain messages with book files attached -- to the LRF formatting forum. I've also deleted the duplicate messages you posted.

We're glad you've joined MobileRead and plunged right in offering converted books. Please join the various conversations and share what you've learned about formatting for the Sony.

coolbooks
02-21-2009, 04:21 PM
Hi, Trevor! Nice of you to post your site!

There are lots of PG books already converted for Sony here in the eBook Uploads and on Feedbooks and Manybooks, too. If you're looking for new books to convert, you might want to check there first. And if you're converting books that aren't available here, you might want to upload them here, too...

What my program does is to allow you to search Gutenberg. Then, once you have selected a book it creates a brand new LRF file which is then offered as a download.

This means that I do not have to choose which books to convert since they are all done dynamically and on demand.

Regards,

Trevor

RWood
02-21-2009, 04:25 PM
Interesting site. It does seem though that they are mechanical (automated) LRF creations of the original Project Gutenberg texts. PG conventions like "_italic text_" are shown rather than "italic text" and the few I examined had a listing for a Table of Contents with no entries in the TOC.

While I applaud the effort, how do the books on your site differ from the books on Manybooks?

JSWolf
02-21-2009, 04:26 PM
What my program does is to allow you to search Gutenberg. Then, once you have selected a book it creates a brand new LRF file which is then offered as a download.

This means that I do not have to choose which books to convert since they are all done dynamically and on demand.

Regards,

Trevor
Does your program put in a proper ToC, convert _text_ to italic and -- to em dash? Will it use the HTML as the source if one exists?

This does sound good.

coolbooks
02-21-2009, 04:40 PM
Hi,

Sorry, I don't know what ManyBooks is. If you could please give me a link, I will take a look. The idea behind my site is to provide access to all the PG books as and when people require them, not to store all books in LRF format.

As for the italics, I have to admit that I had not done my homework on this. I will add italics and emphasis tonight.

For the table of contents I will have to check how consistent the PG books actually are, once I have established this then I can create TOC.

In the process of creating a LRF book I actually do some work on the .txt file and convert it into html so I could use the html versions and cut out a step.

It is a work in progress but I am a very competent programmer and I am confident that I can do almost anything required to create perfect LRF books.

Regards,

Trevor

BruceB
02-21-2009, 05:09 PM
How about adding some options to let me choose font size, margins, etc. in the likely event that I have different desires for those than the last person to format the book.

nrapallo
02-21-2009, 05:20 PM
Hi,

Sorry, I don't know what ManyBooks is. If you could please give me a link, I will take a look. The idea behind my site is to provide access to all the PG books as and when people require them, not to store all books in LRF format.

Here is a link http://manybooks.net/about/ . They offer almost all the PG books in various download formats.

As for the italics, I have to admit that I had not done my homework on this. I will add italics and emphasis tonight.

That would be a nice addition!

For the table of contents I will have to check how consistent the PG books actually are, once I have established this then I can create TOC.

Automatic TOC generation is tricky (Mobipocket Creator/Calibre do it using RegEx and HTML tags; BookDesigner does this also very well). It would improve the usefulness of your conversions.
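As a rough illustration of the regex approach mentioned above, here is a minimal sketch that scans plain text for lines that look like chapter headings. The pattern and the function name `find_chapters` are assumptions made up for this example; they are not how Calibre, Mobipocket Creator, or BookDesigner actually do it, and real Project Gutenberg texts vary far more than this pattern allows.

```python
import re

# Hypothetical heading pattern: "CHAPTER I", "Chapter 12." or
# "CHAPTER IV. THE STORM". The \b stops words such as "Index"
# from being read as a Roman numeral.
HEADING = re.compile(r"^\s*(CHAPTER|Chapter)\s+([IVXLC]+|\d+)\b\.?\s*(.*)$")

def find_chapters(text):
    """Return (line_number, heading) pairs for lines that look like chapter headings."""
    chapters = []
    for number, line in enumerate(text.splitlines(), start=1):
        if HEADING.match(line):
            chapters.append((number, line.strip()))
    return chapters

sample = (
    "CHAPTER I\n\nIt was a dark night.\n\n"
    "CHAPTER II. THE STORM\n\nRain fell.\n"
)
toc = find_chapters(sample)
for line_number, heading in toc:
    print(line_number, heading)
```

The line numbers could then be used as anchors for TOC entries. Books that mark chapters differently (bare Roman numerals, "BOOK THE FIRST", and so on) would need extra patterns.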

In the process of creating a LRF book I actually do some work on the .txt file and convert it into html so I could use the html versions and cut out a step.

Could you allow the "intermediary" .html to be offered for download as well. That could be used to convert to other formats, like .prc/.mobi using mobigen or .epub using Calibre.

It is a work in progress but I am a very competent programmer and I am confident that I can do almost anything required to create perfect LRF books.

Regards,

Trevor

Thanks for sharing this!

How do you convert .html to .lrf? In PDFRead, I used the python pylrs package by Mike Higgins (Falstaff) to produce .lrf from .html containing primarily images and not a lot of text.

Have you seen this similar (but over-the-top) effort before: Converting Project Gutenberg books to SONY Reader (http://1-800-magic.blogspot.com/2008/01/converting-project-gutenberg-books-to.html)?

Here's the direct link to Project Gutenberg for the SONY Reader devices (http://www.solyanik.org:9000/). :cool:

RWood
02-21-2009, 05:25 PM
ManyBooks.net is the URL. From their About page:

All of the eBooks from manybooks.net are free, however donations toward the maintenance of the site are welcome.

Many of the etexts are from the November, 2003 Project Gutenberg DVD, which contains the entire Project Gutenberg archives except for the Human Genome Project and audio eBooks, due to size limitations, and the Project Gutenberg of Australia eBooks, due to copyright. As of July 2004 most current PG texts are available here, usually within the week of release. There are also public domain and creative commons works from other sources.

Zip archives are stored in the same directory structure as on the DVD, with Author, Title, and related information stored in a MySQL database. Pages are built with PHP and served using the Apache webserver. The server is a 1.66GHz Intel Core Duo Mac mini running Mac OS X 10.4, located at macminicolo.net.

eBooks are generated on demand using a variety of tools, and cached for future readers - which means that the first time anyone requests an eBook in a particular format it will take a bit longer to deliver, but the next time that eBook is requested it will be sent immediately.
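The generate-on-demand-and-cache idea in the paragraph above can be sketched roughly as follows. This is only an illustration: `get_ebook`, `fake_convert`, the file naming scheme, and the book identifier are all hypothetical, and nothing here reflects the actual ManyBooks code.

```python
import os
import tempfile

def get_ebook(book_id, fmt, convert, cache_dir):
    """Return the path of a converted book, converting only on a cache miss."""
    path = os.path.join(cache_dir, f"{book_id}.{fmt}")
    if not os.path.exists(path):
        # Slow path: the first request for this book and format.
        data = convert(book_id, fmt)
        with open(path, "wb") as handle:
            handle.write(data)
    # Fast path: later requests reuse the stored file.
    return path

# Stand-in converter (hypothetical) that records how often it is called.
calls = []
def fake_convert(book_id, fmt):
    calls.append((book_id, fmt))
    return b"placeholder bytes"

cache = tempfile.mkdtemp()
first = get_ebook("example-book", "lrf", fake_convert, cache)
second = get_ebook("example-book", "lrf", fake_convert, cache)
print(first == second, len(calls))
```

The second request returns the same file without calling the converter again, which is why only the first reader of a given format has to wait.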

Interesting Statistics


Popular formats - a running count of the relative popularity of each ebook format.
Most Popular eBooks of 2008
Most Popular eBooks of 2007
Most Popular eBooks of 2005
Most Popular eBooks of 2004

Conversion tools:


* gut or txt2html for text -> html.
* PHP-PDB for Palm Doc.
* Pyrite Publisher. (might be gone - see Freshmeat project page for some information.)
* Plucker.
* iSilo386.
* makedoc.
* rbmake for RocketBook format.
* makeztxt for ztxt format.
* htmldoc for HTML -> PDF conversion.
* HTMLtidy for fixing up the automatic HTML generated by "gut".
* TCR.
* makelrf 0.3
* makebook for eReader format.
* methods originally developed by Daniel Duris for iPod Notes conversion.
* Apple's textutil command for RTF.
* Desktop Connection Library for Newton Paperback format.
* Mobigen.exe for Mobipocket format.
* Wine to run mobigen.exe on OS X.
* Components of mjBook for cellphone ".jar" files
* HAWHAW toolkit for the WAP site http://mnybks.net.
* Calibre for ePUB format.


You can contribute to the digitization of public domain books by participating in Distributed Proofreading.

JenPen1
02-21-2009, 06:25 PM
Hi Coolbooks

I am a newbie, but wanted to let you know that I had just tried out your site. Great job. :thumbsup:
FYI there is another website besides www.manybooks.net that you can look at to download books for your PRS505. It makes several different formats available, including LRF. They are always bringing in new books. The site is www.feedbooks.com

Jen

mtravellerh
02-21-2009, 06:46 PM
There are no LRFs available on Feedbooks, only custom sized PDF files and epub (concerning the Sony 505)

JenPen1
02-21-2009, 07:24 PM
There are no LRFs available on Feedbooks, only custom sized PDF files and epub (concerning the Sony 505)

My error, I was thinking of ePub for Sony and typed in LRF. Thanks for bringing that to our attention, mtravellerh. :)

coolbooks
02-21-2009, 09:33 PM
Great info, thanks.

Where I think I can really add value is in the ability to customize your LRF files. You can set almost any parameter such as font size and margins.

I have had a look at manybooks.net and I can see that they offer books in LRF as well. I don't want to recreate what they have done so your input is very much appreciated.

I have fixed italics, but I am confused by emphasis: a "--" does not always have a closing "--" to indicate where the emphasis should stop. In the HTML version of Emma by Jane Austen, for instance, the HTML does not contain any emphasis. I would like some advice on how to implement emphasis; I cannot see anything about it in the Project Gutenberg documentation.

Regards,

Trevor

coolbooks
02-22-2009, 09:24 PM
Hi everyone,

I have added the ability to set margins and text size now.

Hope you like it, I will be doing some more work on allowing you to set more parameters (such as font name).

www.coolfreebooks.com (http://www.coolfreebooks.com)

Regards,

Trevor

DDHarriman
02-24-2009, 11:22 AM
Hi Trevor

Great work!

Best,

BruceB
02-24-2009, 04:58 PM
Agreed, it's *GREAT*. What are the units for the margins? Percent of screen? pixels? inches?

BruceB
02-24-2009, 05:04 PM
Another question best illustrated with an example.

Go to the website. Enter "verne" (no quotes) for the author and "island" (also no quotes) for the title. Select the first one ("The Secret of the Island").

What's going on with pages 4-9???

coolbooks
02-25-2009, 07:27 PM
Another question best illustrated with an example.

Go to the website. Enter "verne" (no quotes) for the author and "island" (also no quotes) for the title. Select the first one ("The Secret of the Island").

What's going on with pages 4-9???

Project Gutenberg texts denote italics as _Italics_ . This book contained a whole line of _'s, so when I replaced them with <I></I> I ran into problems. I have now fixed this, so please download the book again.
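For anyone curious, one way to avoid the line-of-underscores problem is to insist that there is real text between the two markers. This is only a sketch of the idea, not the code running on the site; the pattern and the name `convert_italics` are made up for the example.

```python
import re

# Match _text_ only when the text starts and ends with a character that is
# neither an underscore nor whitespace, and stays on one line. A ruler line
# made purely of underscores has nothing between the markers, so it is skipped.
# The lookarounds also leave identifiers like snake_case_name alone.
ITALIC = re.compile(
    r"(?<![A-Za-z0-9_])_([^_\s](?:[^_\n]*[^_\s])?)_(?![A-Za-z0-9_])"
)

def convert_italics(line):
    """Replace _text_ with <I>text</I>, leaving bare underscore runs untouched."""
    return ITALIC.sub(r"<I>\1</I>", line)

print(convert_italics("He was _very_ tired."))
print(convert_italics("_" * 20))
```

The first line gains the <I></I> tags and the second comes back unchanged. Italics that span a line break are not handled by this sketch.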

There will no doubt be more cases where the output is not correct. Please notify me when you come upon such a book and I will correct the problem.

Regards,

Trevor

coolbooks
02-25-2009, 08:00 PM
Hi Everyone,

em-dash is now implemented.
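A minimal sketch of what such a replacement might look like, assuming the rule is simply "exactly two hyphens become an em dash". The function name `convert_dashes` is hypothetical, and this is not necessarily what the site does.

```python
import re

# Exactly two hyphens become an em dash (U+2014). Longer runs, which are
# often separator lines or a typed triple dash, are left as they are.
DOUBLE_HYPHEN = re.compile(r"(?<!-)--(?!-)")

def convert_dashes(text):
    """Replace each isolated "--" with an em dash."""
    return DOUBLE_HYPHEN.sub("\u2014", text)

print(convert_dashes("Stop--wait a moment"))
print(convert_dashes("-----"))
```

The lookbehind and lookahead keep separator lines of hyphens intact, the same kind of edge case as the line of underscores in the italics conversion.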

Regards,

Trevor

coolbooks
02-25-2009, 08:46 PM
Agreed, it's *GREAT*. What are the units for the margins? Percent of screen? pixels? inches?

The units for margins are pixels.

I am less certain exactly what the font size relates to, but 100 is quite small and 150 is great for children.

Regards,

Trevor

coolbooks
02-27-2009, 06:28 PM
I have also now posted a Sudoku player on the site (my main line of business is in puzzles).

Regards,

Trevor

igorsk
02-27-2009, 06:49 PM
Please check the site in Opera, I get cut off tables. You might want to look into GutenMark (http://www.sandroid.org/GutenMark/) for better results. Also, your books are uncompressed which makes them unnecessarily large. Why not use Calibre for conversion?

coolbooks
02-27-2009, 07:40 PM
I will check regarding the cut off tables.

My rationale behind the site is that I wanted to make the process of getting books onto the Sony easier. I aim to make the whole process as simple as possible - so simple that my dad can put books onto his reader without having to ask me for assistance.
I am now looking at taking html books as the preferred source, this will give me better results. Also I will be honing the conversion so that even the .txt files are converted to produce very neat books.
Even as the conversions are at the moment, I am happy with the results. It provides me with books that I can read with the minimum of effort. Maybe my expectations are not that high but I am confident that the results I achieve will continually improve.

Regards,

Trevor

igorsk
02-28-2009, 12:00 PM
I mean you could use Calibre's LRF converter as back-end for your site. This would give you properly compressed files. It also already handles HTML input.

jimad
03-19-2009, 06:35 PM
Project Gutenberg is starting to offer their books directly in Sony Reader compatible EPUB format, although the efforts still somewhat show their "experimental" nature.

Kindle users can convert to Amazon format via Stanza or Calibre.

gregcd
04-24-2009, 01:00 AM
Great job! I tried this with Moby Dick and custom settings.

The line spacing looks good, it appears a little bit more spaced than calibre books.

This is great for loading books on to SD card while on holiday (I'm not a big fan of the epub font)

teemee
12-23-2009, 04:50 PM
How can I get a membership for www.coolfreebooks.com?

ficbot
12-23-2009, 06:40 PM
I have tried the LRF books from Manybooks.net and find they have excessive line breaks for me. I found I get the best results with downloading the HTML from PG and converting it with Calibre. I am still experimenting---most of the books I want are available here and I have been going to PG for French stuff. In one case, the HTML looked beautiful but the accents didn't work when I converted it.