Old 09-17-2006, 02:18 PM   #1
Bob Russell
Recovering Gadget Addict
Posts: 5,327
Karma: 590871
Join Date: May 2004
Location: Pittsburgh, PA
Device: Note3, MacBook Air
Tutorial: Convert Gutenberg texts for the iLiad

Project Gutenberg is the primary online repository for public domain book texts. The size and quality of the collection are staggering, so it's no surprise that the availability of these books is often one of the driving motivations for purchasing an e-ink reading device. You can get many of these texts already formatted for your device at ManyBooks, which is the natural first place to look for iLiad or Sony Librie owners. Outside the e-ink domain, there are also iPod Notes and PDA formats.

But suppose the book is not available already packaged up in pretty form for the iLiad? You could certainly just read the text form, but you might just want to create an iLiad version yourself. If that's what you want to do, then be sure to check out the very thorough conversion tutorial over at TeleRead Blog.

In this step-by-step tutorial on converting Gutenberg texts to iLiad-formatted PDFs, Branko Collins guides you through the requirements (it looks to me like you don't have to be on Windows), acquiring the texts, reformatting them, adding page numbers and producing the final result.
Branko points out that the process is also flexible enough to support creating other e-book formats, or even paper books, from the PDFs.

This is not the sort of thing that you will want to do if you just want to read immediately. But if you want to produce a quality document for a better reading experience, this might just be the ticket for you. It may also help you to understand why a publisher might charge customers to offer a nicely created public domain e-book. An e-book is more than the words contained within it.

As we find people starting to invest the time to create well-produced public domain (or privately authored) e-books, we'll also have to start discussing how these e-books can best be shared with the world. I'm not sure what ManyBooks' policies are with respect to submissions, but I do know that if this work is invested in the creation of e-books, the results deserve to be shared with all!
Old 09-18-2006, 10:07 AM   #2
yokos
Pac-Man catched my iLiad.
Posts: 720
Karma: 2571
Join Date: Apr 2006
Location: Germany; next to Baltic Sea
Device: 1st gen iRex iLiad with 2nd ed. battery/case
Branko Collins' method is, for my taste, much too long, and this is the crux: it is a manual method. It can be done much better. Gutenmark was discussed in this forum. It works great. Gutenmark does all the conversion you want in an automated way. ;-)
Code:
gutenmark --latex in.txt out.tex
or
Code:
gutenmark in.txt out.html
[Edit]: There are versions of gutenmark for Windows, Mac, & Linux.
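For a whole folder of texts, the two commands above could be wrapped in a small script. What follows is only a rough Python sketch, not part of Gutenmark itself: it assumes `gutenmark` is on your PATH, it uses only the two invocation forms quoted above, and the function names and the output file naming are hypothetical choices made for illustration.

Code:
```python
from pathlib import Path
import subprocess


def gutenmark_command(src: str, fmt: str = "html") -> list[str]:
    """Build the argument list for one conversion.

    Only the two invocation forms quoted in the post are used:
    `gutenmark --latex in.txt out.tex` and `gutenmark in.txt out.html`.
    """
    source = Path(src)
    if fmt == "latex":
        return ["gutenmark", "--latex", str(source), str(source.with_suffix(".tex"))]
    if fmt == "html":
        return ["gutenmark", str(source), str(source.with_suffix(".html"))]
    raise ValueError(f"unsupported format: {fmt!r}")


def convert_all(folder: str, fmt: str = "html") -> None:
    """Run one conversion per .txt file in `folder` (hypothetical helper)."""
    for txt in sorted(Path(folder).glob("*.txt")):
        # check=True stops the batch at the first file that fails to convert.
        subprocess.run(gutenmark_command(str(txt), fmt), check=True)
```

Calling `convert_all("books", "latex")` would then write a `.tex` file next to every `.txt` file in a folder named `books`; the folder name is, again, just an example.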
Old 09-18-2006, 03:55 PM   #3
branko
Connoisseur
Posts: 93
Karma: 549
Join Date: Jul 2006
Location: Amsterdam
Device: Palm Zire
I am not sure why you would use Gutenmark to produce HTML files from books for which HTML files already exist. It is those books that are the subject of my tutorial.

Please keep in mind that there are many features of the original printed books that are not preserved, or not adequately preserved, in the TXT file, such as illustrations, all-caps, bold print, italics, non-ASCII characters such as wide dashes and curly quotes, et cetera.

My tutorial was written for people who like lots of control, and who wish to have as good an output as possible. I have already written a follow-up article that explores working from TXT files, and that will mention the various automatic conversion tools, of which Gutenmark is only one.
Old 09-18-2006, 04:02 PM   #4
branko
"I'm not sure what ManyBook's policies are with respect to submissions, but I do know that if this work is invested into creation of e-books, the results deserve to be shared with all!"

ManyBooks is actively on the lookout for submissions, but I am not sure they accept what PG would call "". ManyBooks uses automatic conversion to produce many formats, and I am not sure manual conversions somehow fit into that process. I have already told Matthew McClintock that this is a shame, because his HTML versions of PG's e-texts are conversions from the TXT format, whereas PG often has beautiful and rich HTML versions.

The question you raise is a good one, though.

The Internet Archive will accept submissions in its Open Source Books section. Perhaps that would be a good spot to post beautified public domain e-books.

What do you think: would some kind of editing process be beneficial, or would TIA's rating system be good enough to separate the good from the bad?

Perhaps the Mobileread wiki could be used to keep track of which versions are available at TIA?
Old 09-18-2006, 04:37 PM   #5
Bob Russell
This is a great topic. As it's sort of a fork in the thread from the main point of the conversion tutorial and alternate conversion methods, I've created another thread for the portion of discussion related to sharing and making available the best converted public domain documents for e-book readers. Please feel free to join in the conversation and add your thoughts.

With respect to this ongoing thread, I'd also like it if someone could find and share a link to the follow-up article that Branko mentioned, with more information about conversions. Or is it written but not yet published?

At any rate, let's continue the discussion about conversions here, but please make the jump to talk about how the converted documents might best be hosted and indexed.

Thanks!
Old 09-18-2006, 06:48 PM   #6
branko
Quote:
Originally Posted by Bob Russell
With respect to this ongoing thread, I'd also like it if someone could find and share a link to the follow-up article that Branko mentioned, with more information about conversions. Or is it written but not yet published?
The latter. These things take a lot of time to write (the first article was approx. ten hours), so it's not like I can crank one out every day. I need to watch House too, you know. Although the second article is largely written, I still need to do some editing and revision on it. Expect it somewhere around the end of this week.
Old 09-18-2006, 07:23 PM   #7
Bob Russell
Quote:
Originally Posted by branko
Expect it somewhere around the end of this week.
Thanks Branko. We'll be looking for it!
Old 09-19-2006, 11:28 AM   #8
yokos
Quote:
Originally Posted by branko
Please keep in mind that there are many features of the original printed books that are not preserved, or not adequately preserved, in the TXT file, such as illustrations, all-caps, bold print, italics, non-ASCII characters such as wide dashes and curly quotes, et cetera.
Please try to understand me. This is what Gutenmark does: it makes up for what "naked" Gutenberg TXT files lack. LaTeX is an almighty language for producing great-looking e-books.

Branko, I never wanted to criticize you in any harsh way.

I just start with Gutenberg's TXT files and end with LaTeX's PDF output. This method differs from yours, and that's OK. There are many roads to Rome.

Last edited by yokos; 09-19-2006 at 11:35 AM.
Old 09-19-2006, 08:32 PM   #9
branko
Quote:
Originally Posted by yokos
Branko, I never wanted to criticize you in any harsh way.
No offense taken. I just do not necessarily agree with everything you say. Makes the world that bit more interesting, don't you think?

Quote:
Originally Posted by yokos
Please try to understand me. This is what Gutenmark does: it makes up for what "naked" Gutenberg TXT files lack.
It can only do this as far as the TXT format allows. For example, the following sentence:

Code:
"That is a BOLD STATEMENT you make there."
could be rendered in a myriad of ways. The original printed sentence could have had "BOLD STATEMENT" in italics, all-caps, small-caps, bold, or any mix of these. All we, and therefore Gutenmark, can infer from the sentence in the TXT file is that some form of emphasis was used in the original.
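To make that concrete, here is a minimal Python sketch of the guess any converter has to make when it works from TXT. It is not how Gutenmark works internally; the regular expression, the function name and the choice of a generic `<em>` tag are all assumptions made for illustration.

Code:
```python
import re

# A run of one or more consecutive all-caps words, each at least two letters
# long. This is an assumed heuristic: it also catches acronyms such as "PDF",
# which is exactly the kind of ambiguity the TXT format cannot resolve.
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b")


def mark_emphasis(line: str) -> str:
    """Wrap every all-caps run in a generic <em> tag.

    The TXT file only tells us that *some* emphasis was present, so the
    sketch cannot choose between italics, bold, small-caps and so on.
    """
    return CAPS_RUN.sub(lambda match: f"<em>{match.group(0)}</em>", line)
```

Running `mark_emphasis` on the sentence above wraps "BOLD STATEMENT" in `<em>` tags and leaves the rest untouched. Whether the printed book used italics, bold or small-caps there is information the sketch simply does not have.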

PG's recent HTML files, however, preserve most if not all of these details. Using Gutenmark to counter the shortcomings of plain TXT files is silly if you already have a rich format available that does not have these shortcomings in the first place.

Quote:
Originally Posted by yokos
LaTeX is an almighty language for producing great-looking e-books.
TeX is a great typesetting tool, and I wish more people used it. However, you are suggesting that Gutenmark+TeX is much faster than the method I describe. I am willing to bet that for 90% of the people who will only convert a few books, installing and learning how to use TeX on their Windows machines is going to take far more time than the method I outlined.
Old 09-26-2006, 07:56 PM   #10
branko
Quote:
Originally Posted by branko
The latter. These things take a lot of time to write (the first article was approx. ten hours), so it's not like I can crank one out every day. I need to watch House too, you know. Although the second article is largely written, I still need to do some editing and revision on it. Expect it somewhere around the end of this week.
I am taking a bit longer than I initially expected. The article kept mushrooming into something bigger, so I decided to split it into three parts, the first of which should appear tomorrow.
Old 09-26-2006, 08:04 PM   #11
Bob Russell
No problem. I'm really looking forward to it, but do it on a timetable that's good for you!
Old 09-27-2006, 07:27 PM   #12
branko
The first part is a bit short, and was published at http://www.teleread.org/blog/?p=5566
All times are GMT -4.