09-17-2006, 01:18 PM | #1 |
Recovering Gadget Addict
Posts: 5,381
Karma: 676161
Join Date: May 2004
Location: Pittsburgh, PA
Device: iPad
|
Tutorial: Convert Gutenberg texts for the iLiad
Project Gutenberg is the primary online repository for public domain book texts. The size and quality of the collection are staggering, so it's no surprise that the availability of these books is often one of the driving motivations for purchasing an e-ink reading device. You can get many of these texts already formatted for your device at ManyBooks, which is the natural first place to look for iLiad and Sony Librie owners. Outside the e-ink domain, there are also iPod Notes and PDA formats.
But suppose the book is not already packaged in pretty form for the iLiad? You could certainly just read the plain text, but you might want to create an iLiad version yourself. If so, be sure to check out the very thorough conversion tutorial over at TeleRead Blog. In this step-by-step tutorial on converting Gutenberg texts to iLiad-formatted PDFs, Branko Collins guides you through the requirements (it looks to me like you don't have to be on Windows), acquiring the texts, reformatting them, adding page numbers and producing the final result. Branko points out that the process is also flexible enough to support creating other e-book formats, or even paper books, from the PDFs.
This is not the sort of thing you will want to do if you just want to read immediately. But if you want to produce a quality document for a better reading experience, this might be just the ticket. It may also help you understand why a publisher might charge customers for a nicely produced public domain e-book: an e-book is more than the words contained within it. As people start to invest the time to create well-produced public domain (or privately authored) e-books, we'll also have to start discussing how these e-books can best be shared with the world. I'm not sure what ManyBooks' policies are with respect to submissions, but I do know that if this work is invested into creation of e-books, the results deserve to be shared with all! |
09-18-2006, 09:07 AM | #2 |
Pac-Man caught my iLiad.
Posts: 807
Karma: 3595
Join Date: Apr 2006
Location: Germany; next to Baltic Sea
Device: Boox Max Lumi, iRex iLiad (RIP)
|
Branko Collins' method is, for my taste, much too long, and here is the crux: it is a manual method. It can be done much better. Gutenmark was discussed in this forum, and it works great. Gutenmark does all the conversion you want in an automated way. ;-)
Code:
gutenmark --latex in.txt out.tex
Code:
gutenmark in.txt out.html |
09-18-2006, 02:55 PM | #3 |
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
|
I am not sure why you would use Gutenmark to produce HTML files from books for which HTML files already exist. It is those books that are the subject of my tutorial.
Please keep in mind that many features of the original printed books are either not preserved or not adequately preserved in the TXT file, such as illustrations, all-caps, bold print, italics, non-ASCII characters such as wide dashes and curly quotes, et cetera. My tutorial was written for people who like lots of control, and who wish to have as good an output as possible. I have already written a follow-up article that explores working from TXT files, and that will mention the various automatic conversion tools, of which Gutenmark is only one. |
09-18-2006, 03:02 PM | #4 |
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
|
"I'm not sure what ManyBooks' policies are with respect to submissions, but I do know that if this work is invested into creation of e-books, the results deserve to be shared with all!"
ManyBooks is actively on the lookout for submissions, but I am not sure they accept what PG would call "". ManyBooks uses automatic conversion to produce many formats, and I am not sure manual conversions fit into that process. I have already told Matthew McClintock that this is a shame, because his HTML versions of PG's e-texts are conversions from the TXT format, whereas PG often has beautiful and rich HTML versions. The question you raise is a good one, though. The Internet Archive will accept submissions in its Open Source Books section. Perhaps that would be a good spot to post beautified public domain e-books. What do you think: would some kind of editing process be beneficial, or would TIA's rating system be good enough to separate the good from the bad? Perhaps the MobileRead wiki could be used to keep track of which versions are available at TIA? |
09-18-2006, 03:37 PM | #5 |
Recovering Gadget Addict
Posts: 5,381
Karma: 676161
Join Date: May 2004
Location: Pittsburgh, PA
Device: iPad
|
This is a great topic. As it's sort of a fork in the thread from the main point of the conversion tutorial and alternate conversion methods, I've created another thread for the portion of discussion related to sharing and making available the best converted public domain documents for e-book readers. Please feel free to join in the conversation and add your thoughts.
With respect to this ongoing thread, I'd also like it if someone could find and share a link to the follow-up article Branko mentioned, with more information about conversions. Or is it written, but not yet published? At any rate, let's continue the discussion about conversions here, but please make the jump to the other thread to talk about how the converted documents might best be hosted and indexed. Thanks! |
09-18-2006, 05:48 PM | #6 | |
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
|
Quote:
|
|
09-18-2006, 06:23 PM | #7 | |
Recovering Gadget Addict
Posts: 5,381
Karma: 676161
Join Date: May 2004
Location: Pittsburgh, PA
Device: iPad
|
Quote:
|
|
09-19-2006, 10:28 AM | #8 | |
Pac-Man caught my iLiad.
Posts: 807
Karma: 3595
Join Date: Apr 2006
Location: Germany; next to Baltic Sea
Device: Boox Max Lumi, iRex iLiad (RIP)
|
Quote:
Branko, I never wanted to criticize you in any harsh way. I just start with Gutenberg's TXT files and end with LaTeX's PDF output. This method differs from yours, and that's OK. There are many roads to Rome.
Last edited by yokos; 09-19-2006 at 10:35 AM. |
|
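For anyone who wants to script the TXT-to-LaTeX-to-PDF route described above, here is a minimal Python sketch. The gutenmark invocation is the one posted earlier in this thread. The second step is an assumption: the thread does not say which LaTeX compiler is used, so pdflatex is only a guess, and the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(txt_file):
    """Return the two command lines for TXT -> LaTeX -> PDF."""
    txt = Path(txt_file)
    tex = txt.with_suffix(".tex")
    return [
        # Invocation as posted earlier in the thread.
        ["gutenmark", "--latex", str(txt), str(tex)],
        # Assumption: the thread does not name the LaTeX compiler.
        ["pdflatex", str(tex)],
    ]


def convert(txt_file):
    """Run each step in order, stopping at the first one that fails."""
    for command in build_commands(txt_file):
        subprocess.run(command, check=True)


# Show the commands without running them.
for command in build_commands("in.txt"):
    print(" ".join(command))
```

Calling convert("in.txt") would actually run both tools, so it needs gutenmark and a LaTeX installation on the path. Printing the commands first is a cheap way to check them before a long conversion.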
09-19-2006, 07:32 PM | #9 | |||
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
|
Quote:
Quote:
Code:
"That is a BOLD STATEMENT you make there."
PG's recent HTML files however maintain most if not all of these details. Using Gutenmark to counter the shortcomings of plain TXT files is silly if you already have a rich format available that does not have these shortcomings in the first place.
Quote:
|
|||
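The sample sentence above is a good test case for anyone who does work from TXT files: capitals may be all that is left of the original emphasis, and a converter cannot tell from the text alone whether they stood for bold, italics or small caps in print. The following is a minimal Python sketch, not part of either tutorial, that only lists all-caps runs so they can be checked by hand against the HTML edition. The regular expression and function name are illustrative assumptions.

```python
import re

# One or more consecutive all-caps words of at least two letters each,
# so the pronoun "I" and the article "A" are skipped.
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b")


def find_caps_runs(text):
    """Return (line_number, phrase) pairs for each all-caps run in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in CAPS_RUN.finditer(line):
            hits.append((number, match.group(0)))
    return hits


sample = (
    '"That is a BOLD STATEMENT you make there."\n'
    "PG is an acronym, not emphasis."
)
for number, phrase in find_caps_runs(sample):
    print(number, phrase)
```

The second sample line shows why this stays a review list and not an automatic fix: an acronym such as PG is reported in the same way as emphasized words, and only a human (or the richer HTML file) can tell them apart.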
09-26-2006, 06:56 PM | #10 | |
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
|
Quote:
|
|
09-26-2006, 07:04 PM | #11 |
Recovering Gadget Addict
Posts: 5,381
Karma: 676161
Join Date: May 2004
Location: Pittsburgh, PA
Device: iPad
|
No problem. Am really looking forward to it, but do it on a time table that's good for you!
|
09-27-2006, 06:27 PM | #12 |
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
|
The first part is a bit short, and was published at http://www.teleread.org/blog/?p=5566
|
|