Old 07-31-2012, 12:07 AM   #1
derangedhermit
Addict
Posts: 239
Karma: 1280000
Join Date: Oct 2010
Location: USA
Device: None
Converting Project Gutenberg books to ePub

I know that for many books they have computer-generated ePubs. I find them unsatisfactory.

I would like people who put effort into converting Project Gutenberg books to high-quality ePub to describe their methods, tools, etc.

- What format do you start with? Text (with the PG-unique markup)? HTML? ePub?
- Do you have a standard CSS file that you use? Mind sharing it?
- What tools do you use? Notepad++? Sigil?
- How hard do you work on collecting, editing, enhancing images?
- Do you use images of an original to work from?

What are the problems you run into, how do you deal with them, and how long does it take you to convert a book into a decent quality ePub with clean modern markup - a book you are satisfied with?
Old 07-31-2012, 03:34 AM   #2
StoryEnthusiast
K. C. Lee
Posts: 584
Karma: 3652522
Join Date: Jun 2012
Location: New Zealand
Device: Android phone
I'd like to know too. I have chosen and released many Gutenberg stories on my portal site and my next step is to turn them into ePub books. This will benefit my learning experience.
Old 07-31-2012, 04:28 AM   #3
Jellby
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by derangedhermit View Post
- What format do you start with? Text (with the PG-unique markup)? HTML? ePub?
If there is a hand-made HTML, I generally use that. Otherwise I choose the plain text version (in utf8 or latin1 encoding if available).
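A minimal sketch of the plain-text route, assuming the file is either UTF-8 or Latin-1, that paragraphs are separated by blank lines, and that italics are marked with _underscores_. The function names are made up for illustration, and real PG files vary enough that the output still needs checking by hand:

```python
import html
import re


def decode_pg_text(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Latin-1 if that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def text_to_xhtml_paragraphs(text: str) -> list[str]:
    """Turn blank-line-separated paragraphs into <p> elements.

    Assumes hard-wrapped lines inside a paragraph and _underscores_
    for italics (an assumption; not every PG text follows this).
    """
    paragraphs = []
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n")):
        # Re-join the hard-wrapped lines of one paragraph.
        joined = " ".join(line.strip() for line in block.split("\n") if line.strip())
        if not joined:
            continue
        # Escape &, < and > before adding any markup of our own.
        escaped = html.escape(joined, quote=False)
        marked = re.sub(r"_([^_]+)_", r"<em>\1</em>", escaped)
        paragraphs.append(f"<p>{marked}</p>")
    return paragraphs
```

Chapter headings, poetry, and footnotes are not handled here and would need their own rules.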

Quote:
- Do you have a standard CSS file that you use? Mind sharing it?
Sort of. I copy a CSS file from one of my most recent projects (or another book from the same "series") and then add or remove things as needed. The CSS files I use can be found in any of the books I've uploaded (e.g. The Adventures of Tom Sawyer).

Quote:
- What tools do you use? Notepad++? Sigil?
vim, but I'm a nitpicker.

Quote:
- How hard do you work on collecting, editing, enhancing images?
I try to find a good scan in The Internet Archive, download the raw image files, rotate and crop the illustrations, remove the speckles and make sure the background is pure white (for black and white illustrations). Then I resize all illustrations by the same factor.
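To make those clean-up steps concrete, here is a toy sketch that works on a grayscale image held as nested lists of 0-255 values. The thresholds and function names are assumptions for illustration only; in practice this kind of work is done in an image editor or with an imaging library, and the post does not say which tool is used:

```python
def whiten_background(pixels, threshold=200):
    """Set every pixel at or above `threshold` to pure white (255)."""
    return [[255 if p >= threshold else p for p in row] for row in pixels]


def remove_speckles(pixels, dark=128):
    """Whiten dark pixels that have no dark neighbour (isolated specks)."""
    height = len(pixels)
    cleaned = [row[:] for row in pixels]
    for y in range(height):
        for x in range(len(pixels[y])):
            if pixels[y][x] >= dark:
                continue  # already light, nothing to remove
            neighbours = [
                pixels[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(len(pixels[ny]), x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(n >= dark for n in neighbours):
                cleaned[y][x] = 255
    return cleaned


def scaled_size(width, height, factor):
    """Return the new (width, height) when one shared factor is applied."""
    return round(width * factor), round(height * factor)
```

Using the same `factor` for every illustration keeps their relative sizes as they were in the printed book.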

Quote:
- Do you use images of an original to work from?
Sometimes. If I don't find a good scan online and I have an original I can scan.
Old 07-31-2012, 04:45 AM   #4
AlexBell
Wizard
Posts: 3,413
Karma: 13369310
Join Date: May 2008
Location: Launceston, Tasmania
Device: Sony PRS T3, Kobo Glo, Kindle Touch, iPad, Samsung SB 2 tablet
Quote:
Originally Posted by derangedhermit View Post
I know that for many books they have computer-generated ePubs. I find them unsatisfactory.

I would like people who put effort into converting Project Gutenberg books to high-quality ePub to describe their methods, tools, etc.

- What format do you start with? Text (with the PG-unique markup)? HTML? ePub?
- Do you have a standard CSS file that you use? Mind sharing it?
- What tools do you use? Notepad++? Sigil?
- How hard do you work on collecting, editing, enhancing images?
- Do you use images of an original to work from?

What are the problems you run into, how do you deal with them, and how long does it take you to convert a book into a decent quality ePub with clean modern markup - a book you are satisfied with?
For what it is worth, most of the ePubs I have done for the MobileRead library started from Project Gutenberg HTML files. I could not agree more with your dislike of their ePub files.

MobileRead won't let me upload ebook.css; if you send me a private message with your email address I'll email it to you.

I use the Coffee Cup HTML editor, mainly because I've dabbled in web design in the past.

How does one measure 'How hard' one works with images? I certainly spend some time searching for images, and sometimes use the images in the PG ePub file. If the images are poor quality I spend some time trying to improve them, but I'm certainly not an expert. The images in the last ebook I did (The Story of Francis Cludde by Stanley J. Weyman) I think were too dark and had poor contrast, and I think I've improved them.

I have found original images and used them - cf Robinson Crusoe by Daniel Defoe. I certainly spend time searching for cover images. Or do you mean do I produce original images? Only if I have to, and can't find an image to use as a cover - but even then I can usually find an image to put text on for a cover - cf Civil Disobedience and Other Essays by Henry David Thoreau.

I hope this helps.

PS zipped ebook.css attached. Thank you Harry.
Attached Files
File Type: zip ebook.zip (1.5 KB, 205 views)

Last edited by AlexBell; 08-01-2012 at 02:54 AM.
Old 07-31-2012, 10:58 AM   #5
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by AlexBell View Post
MobileRead won't let me upload ebook.css; if you send me a private message with your email address I'll email it to you.
ZIP it and upload the ZIP file.
Old 07-31-2012, 11:00 AM   #6
HarryT
eBook Enthusiast
Quote:
Originally Posted by Jellby View Post
I try to find a good scan in The Internet Archive, download the raw image files, rotate and crop the illustrations, remove the speckles and make sure the background is pure white (for black and white illustrations). Then I resize all illustrations by the same factor.
Yes, "archive.org" is by far the best source for scanned books with images. That's where I get most of mine from, too. Its scans are also, of course, a good source of printed editions to proofread against. I certainly wouldn't trust a PG book without proofing it - especially an older one.
Old 08-01-2012, 11:33 PM   #7
derangedhermit
Addict
Thanks for the replies. I respect y'all's work. Your comments mirror my beginning attempts: the PG books require proofing, and that takes time. Doing a proper ePub markup takes time - cleaning out the junk, mainly. Images add a lot, and are worth including, but often involve a fair bit of work to get to "publication quality".

So far it's about 40 hours of work for me to take an "average" PG book, clean it up and proofread it against another text, insert markup, clean up and insert images, and check the whole thing over. It's tough for me to think of doing many books at that rate, although I would like to.

How long, on average, does a book take for you?
Old 08-02-2012, 01:39 AM   #8
HarryT
eBook Enthusiast
It depends on the length of the book, of course. I generally spend about 1-1.5 hours a day proofreading my books (I do it in bed at night), during which time I'll typically get through perhaps 20-30 pages. So a 400 page book would take me maybe 15-20 days to proofread.
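The arithmetic behind that estimate, as a small sketch. The function name is made up, and rounding up to whole days is an assumption:

```python
import math


def proofreading_days(total_pages, pages_per_day_low, pages_per_day_high):
    """Return (fastest, slowest) whole days needed to proofread a book."""
    fastest = math.ceil(total_pages / pages_per_day_high)
    slowest = math.ceil(total_pages / pages_per_day_low)
    return fastest, slowest


print(proofreading_days(400, 20, 30))  # (14, 20)
```

At 20-30 pages a day, 400 pages works out to 14-20 days, which matches the rough 15-20 day figure.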
Old 08-02-2012, 06:56 AM   #9
mrmikel
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
At archive.org, there are often multiple copies of books. Their text quality is often similar, but the image quality varies considerably depending on whether it was scanned as black and white or grayscale. So if you don't like the first one you downloaded, check and see if there is another in grayscale or color.
Old 08-05-2012, 11:33 AM   #10
derangedhermit
Addict
Does anyone feed the proofread corrected copies back to PG? It seems like that would be a very good and low-effort thing to do.
Old 08-05-2012, 01:15 PM   #11
HarryT
eBook Enthusiast
Quote:
Originally Posted by derangedhermit View Post
Does anyone feed the proofread corrected copies back to PG? It seems like that would be a very good and low-effort thing to do.
I've offered them some of my stuff, but they don't seem interested. Perhaps for legal reasons, because my sources are often not American editions.
Old 08-06-2012, 04:19 AM   #12
Jellby
frumious Bandersnatch
Quote:
Originally Posted by derangedhermit View Post
Does anyone feed the proofread corrected copies back to PG? It seems like that would be a very good and low-effort thing to do.
I submit the list of errors I find to PG. They sometimes apply the corrections, but it may take some time. They must check all corrections manually, ideally make sure they correspond to the same edition that was originally used, apply the changes to all the formats, etc. I guess they are short of manpower.
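One possible way to build such a list, sketched with Python's standard difflib. It assumes you kept the original PG text alongside your corrected copy and that both have the same line breaks; the report format and the function name are made up for illustration and are not a format PG asks for:

```python
import difflib


def errata_list(original: str, corrected: str) -> list[str]:
    """List changed lines as 'line N: old -> new' for an errata report."""
    old_lines = original.splitlines()
    new_lines = corrected.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines are not errata
        old = " / ".join(old_lines[i1:i2]) or "(nothing)"
        new = " / ".join(new_lines[j1:j2]) or "(nothing)"
        # Line numbers refer to the original file, counted from 1.
        report.append(f"line {i1 + 1}: {old!r} -> {new!r}")
    return report
```

Reporting line numbers from the original file makes each correction easier for someone else to check against the posted text.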

A much more effective way of helping is Distributed Proofreaders, where the changes are made before the books are submitted to PG. There's a "smoothreading" stage for those who just want to read a book.