Old 01-07-2009, 12:00 PM   #1
levi_john
Junior Member
levi_john began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jan 2009
Device: Sony PRS505
Project Gutenberg

When I download books from Project Gutenberg onto my Sony PRS505, the text often does not fill the full width of the page. For example, one line may be the full width, whilst the next line down might only have one word on it, and then the next line down will again be the full width. I can read it like this but clearly it is not right. Can anyone help?

Last edited by levi_john; 01-07-2009 at 12:02 PM. Reason: typing error
Old 01-07-2009, 12:15 PM   #2
pdurrant
The Grand Mouse 高貴的老鼠
pdurrant ought to be getting tired of karma fortunes by now.
Posts: 72,676
Karma: 310349502
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
The Project Gutenberg texts are formatted with fixed line lengths, rather than as logical paragraphs. You'll need to do a little reformatting to get them to read well on your Sony.

First - check the ebook forums here to see if someone has already done a good conversion.

If not - check to see if there's an HTML version at Project Gutenberg - this will almost certainly work better than plain text.

If you must - open the text up in a text editor, and do a few global search and replaces:

(1) search for two new line characters in a row, and replace with "[paraend]"
(2) search for any new line character, and replace with a space
(3) search for [paraend] and replace with a new line character
(4) search for two spaces in a row and replace with a single space.

That should make your reading experience a lot better.
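
The four search-and-replace steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `unwrap_paragraphs` is made up for this example, and the line-ending normalisation at the top is an extra step that the post does not mention (some Project Gutenberg files use Windows line endings).

```python
def unwrap_paragraphs(text: str, marker: str = "[paraend]") -> str:
    """Join hard-wrapped lines into paragraphs using the four replace steps."""
    # Extra step (assumption, not in the post): normalise CRLF and CR line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # (1) Two newlines in a row mark a real paragraph break: protect them.
    text = text.replace("\n\n", marker)
    # (2) Any remaining newline is just a hard wrap: turn it into a space.
    text = text.replace("\n", " ")
    # (3) Restore the protected paragraph breaks as single newlines.
    text = text.replace(marker, "\n")
    # (4) Collapse double spaces, repeating until none are left.
    while "  " in text:
        text = text.replace("  ", " ")
    return text


sample = "It was the best\nof times, it was\nthe worst.\n\nSecond para\nhere."
print(unwrap_paragraphs(sample))
```

The marker must be a string that does not already occur in the book; if it does, pick a different one.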


Old 01-07-2009, 02:57 PM   #3
BookishDreamer
Cultural Artist
BookishDreamer can solve quadratic equations while standing on his or her head reciting poetry in iambic pentameter
Posts: 1,128
Karma: 12829
Join Date: May 2008
Location: Georgia
Device: Sony 505, Kindle 2
Welcome to MobileRead, levi_john! Bad line formatting can make reading a miserable experience. As pdurrant mentioned, be sure to check out the Ebooks here. Our members are very conscientious about uploading properly formatted books for the reading pleasure of others.

Dreamer
Old 01-07-2009, 03:21 PM   #4
Patricia
Reader
Patricia ought to be getting tired of karma fortunes by now.
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
There are several ways round this problem.
1. Paste the text file into a doc and run stingo's Word macro, which you can find here:
https://www.mobileread.com/forums/showthread.php?t=8793
2. Try putting the text into BookCreator and tidying it there very quickly. It is here:
https://www.mobileread.com/forums/showthread.php?t=28313
3. Run it through Gutenmark (google to find it).

4. Or do a search and replace as pdurrant describes. I do a similar process:
I paste the text file into a Word doc, then click on "Edit" and open "Find and Replace."

1. Search for ^p^p. (Click on "More" and "Special" to find the paragraph symbols.) Replace with ## or @@, or any other marker that does not occur in the text. (This is a placeholder for the paragraph breaks. You want to strip out the single linebreaks but leave the genuine paragraphs.)
2. Search for ^p. Replace with a single space.
3. Search for ##. Replace with paragraph mark.
This should sort the text out. Save as an rtf, or just leave as a doc. The reader will convert it automatically.
Old 01-07-2009, 05:02 PM   #5
Elsi
Wizard
Elsi is a glorious beacon of light
Posts: 2,366
Karma: 12000
Join Date: Jan 2008
Location: Texas, USA
Device: Kindle; Sony PRS 505; Blackberry 8700C
I use a clever program that I downloaded which will take care of most of the PG text problems. It's called "GutenMark" and comes from http://www.sandroid.org/GutenMark

GutenMark outputs HTML or LaTeX. After converting the text file to HTML with GutenMark, I then follow HarryT's directions for using BookDesigner and then MobiPocket creator to create MobiPocket books. Next, I use Calibre to create the Sony BBeB format and ePub. Lastly, I feed the MobiPocket book into Mobi2IMP and create the .imp formats.

If I'm going to be converting a book for my own purposes, I go ahead and create all the formats and upload here. I figure someone else might be interested.

Update: I missed Patricia's mention of GutenMark. Sorry.

Last edited by Elsi; 01-07-2009 at 05:05 PM.
Old 01-07-2009, 08:44 PM   #6
MickeyC
Grand Sorcerer
MickeyC ought to be getting tired of karma fortunes by now.
Posts: 16,731
Karma: 12185114
Join Date: Nov 2007
Location: Florida
Device: iPhone 6 plus, Sony T1, iPad 3
Hi levi_john, and welcome to the Forum.
Old 01-09-2009, 04:02 AM   #7
levi_john
Junior Member
levi_john began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jan 2009
Device: Sony PRS505
Thanks for all of the responses on sorting out the text on my Sony Reader.

I have just tried the HTML approach and it works well and will shortly be trying out the others. Thanks. John
Old 02-25-2009, 09:48 PM   #8
coolbooks
Member
coolbooks has a complete set of Star Wars action figures.
 
Posts: 14
Karma: 268
Join Date: Feb 2009
Device: Sony PRS505
I have created a site that takes books from Gutenberg and creates lrf format books on the fly. You can also set margins and font size.

This is provided free of charge at www.coolfreebooks.com

Regards,

Trevor
Old 02-26-2009, 12:02 AM   #9
Andybaby
Wizard
Andybaby ought to be getting tired of karma fortunes by now.
Posts: 1,279
Karma: 1002683
Join Date: Nov 2008
Location: New York
Device: PRS-700
Quote:
Originally Posted by Elsi View Post
I use a clever program that I downloaded which will take care of most of the PG text problems. It's called "GutenMark" and comes from http://www.sandroid.org/GutenMark

GutenMark outputs HTML or LaTeX. After converting the text file to HTML with GutenMark, I then follow HarryT's directions for using BookDesigner and then MobiPocket creator to create MobiPocket books. Next, I use Calibre to create the Sony BBeB format and ePub. Lastly, I feed the MobiPocket book into Mobi2IMP and create the .imp formats.

If I'm going to be converting a book for my own purposes, I go ahead and create all the formats and upload here. I figure someone else might be interested.

Update: I missed Patricia's mention of GutenMark. Sorry.
For anyone having similar problems, this is probably the easiest way.
Old 02-26-2009, 02:31 AM   #10
phenomshel
ZCD BombShel
phenomshel ought to be getting tired of karma fortunes by now.
Posts: 4,793
Karma: 8293322
Join Date: Jan 2009
Location: The Frozen North (aka Illinois, USA)
Device: iPad, STB Kindle Oasis
Quote:
Originally Posted by Patricia View Post
There are several ways round this problem.
1. Paste the text file into a doc and run stingo's Word macro, which you can find here:
https://www.mobileread.com/forums/showthread.php?t=8793
2. Try putting the text into BookCreator and tidying it there very quickly. It is here:
https://www.mobileread.com/forums/showthread.php?t=28313
3. Run it through Gutenmark (google to find it).

4. Or do a search and replace as pdurrant describes. I do a similar process:
I paste the text file into a Word doc, then click on "Edit" and open "Find and Replace."

1. Search for ^p^p. (Click on "More" and "Special" to find the paragraph symbols.) Replace with ## or @@, or any other marker that does not occur in the text. (This is a placeholder for the paragraph breaks. You want to strip out the single linebreaks but leave the genuine paragraphs.)
2. Search for ^p. Replace with a single space.
3. Search for ##. Replace with paragraph mark.
This should sort the text out. Save as an rtf, or just leave as a doc. The reader will convert it automatically.
Thank you! This is what I've been trying to ask about and wasn't smart enough to phrase it so y'all knew what I was talking about. I'm 20 pages into re-formatting a book in Word now - this should make the remaining 277 pages a LOT easier.
Old 02-26-2009, 05:11 PM   #11
Patricia
Reader
Patricia ought to be getting tired of karma fortunes by now.
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
I hope that it helps, shel.
Don't hesitate to ask about this sort of problem: that's precisely what the workshop section of the forum is for.
Old 03-31-2009, 04:30 PM   #12
jimad
Connoisseur
jimad is on a distinguished road
 
Posts: 53
Karma: 52
Join Date: Apr 2008
Device: Kindle
Examples of the "GutenMark" approach abound at http://www.freekindlebooks.org -- which also has the start of a "mirror" of Project Gutenberg in MOBI format at http://www.freekindlebooks.org/MobiM...obimirror.html

The approach I would recommend for the "casual" user -- i.e. the personal reader of Project Gutenberg texts rather than someone who wants to republish those PG texts -- is to get the EPUB version of the book from PG and then run the Calibre "any2mobi" command-line tool on that file. It takes me literally about two minutes to pick out a book from PG and do this conversion.

For Kindle I wrote a little command line batch file to call any2mobi with Kindle-centric parameters:

"c:\program files\calibre\any2mobi.exe" --no-justification --cover ..\pg.jpg --dest-profile Kindle %1

where "pg.jpg" is a little dummy cover I insert because I do not like how any2mobi tries to gin up a dummy cover.
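
For converting a whole folder of downloaded EPUBs in one go, the same command could be driven from a short Python script. This is a sketch only: the executable path, the cover path and all flags are copied verbatim from the batch line above and have not been checked against any particular Calibre release, while the function names `build_command` and `convert_all` are invented for this example. As far as I know, later Calibre versions replaced the any2* tools with `ebook-convert`, so the command would need adapting there.

```python
import subprocess
from pathlib import Path

# Both paths are taken from the batch file in the post; adjust for your machine.
ANY2MOBI = r"c:\program files\calibre\any2mobi.exe"
COVER = r"..\pg.jpg"


def build_command(epub_path: str) -> list[str]:
    """Return the any2mobi argument list for one book, using the post's flags."""
    return [
        ANY2MOBI,
        "--no-justification",
        "--cover", COVER,
        "--dest-profile", "Kindle",
        epub_path,
    ]


def convert_all(folder: str) -> None:
    """Run the converter on every .epub file found directly in `folder`."""
    for epub in sorted(Path(folder).glob("*.epub")):
        # check=True stops at the first book that fails to convert.
        subprocess.run(build_command(str(epub)), check=True)
```

Passing the arguments as a list means file names containing spaces need no extra quoting.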
Old 03-31-2009, 07:38 PM   #13
Patricia
Reader
Patricia ought to be getting tired of karma fortunes by now.
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
I just use a handful of macros and a lot of find+replace. Then I proofread. If the source is The Internet Archive or Google Books then a close proofreading is essential. With new PG texts, this can be abbreviated considerably.

Currently, I'm doing some Laura Ingalls Wilder. In my source file for "On the Banks of Plum Creek", "Creek" usually appears as "Greek." This is weird, but quickly replaced.
Old 04-01-2009, 05:44 AM   #14
roger64
Wizard
roger64 ought to be getting tired of karma fortunes by now.
 
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Originally Posted by pdurrant View Post
The Project Gutenberg texts are formatted with fixed line lengths, rather than as logical paragraphs. You'll need to do a little reformatting to get them to read well on your Sony.

First - check the ebook forums here to see if someone has already done a good conversion.

If not - check to see if there's an HTML version at Project Gutenberg - this will almost certainly work better than plain text.

If you must - open the text up in a text editor, and do a few global search and replaces:

(1) search for two new line characters in a row, and replace with "[paraend]"
(2) search for any new line character, and replace with a space
(3) search for [paraend] and replace with a new line character
(4) search for two spaces in a row and replace with a single space.

That should make your reading experience a lot better.
Thanks for a crystal-clear explanation (OK, I am a little late).

May I add that it is easy enough to wrap all four of the steps above into a plain macro, and that's it.
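
As one possible shape for such a macro, here is a small stand-alone Python script. It is a sketch, not a transcription of the four steps: it swaps in a slightly different technique, splitting the text on blank lines and re-joining the words of each paragraph, so runs of several blank lines count as one paragraph break and tabs are collapsed along with spaces. The name `reflow` and the two-argument command line are assumptions made for this example.

```python
import re
import sys


def reflow(text: str) -> str:
    """Return the text with one line per paragraph."""
    # Normalise Windows line endings before looking for blank lines.
    text = text.replace("\r\n", "\n")
    # A run of two or more newlines is treated as a paragraph break.
    paragraphs = re.split(r"\n{2,}", text.strip())
    # Inside a paragraph, split() drops every hard wrap and repeated space.
    return "\n".join(" ".join(p.split()) for p in paragraphs)


# Hypothetical usage: python reflow.py input.txt output.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as src:
        result = reflow(src.read())
    with open(sys.argv[2], "w", encoding="utf-8") as dst:
        dst.write(result + "\n")
```

Note that this also flattens poetry, tables and other text whose line breaks are meant to stay, so the output still needs a quick proofread.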
Old 04-07-2009, 07:39 PM   #15
jimad
Connoisseur
jimad is on a distinguished road
 
Posts: 53
Karma: 52
Join Date: Apr 2008
Device: Kindle
Project Gutenberg is now directly supporting (on an "Experimental", meaning sometimes buggy, basis) two e-book formats: EPUB and MOBI (and HTML, if you want to include that one too).

You can also use Calibre, especially its command-line tools, to turn these three formats into other e-book formats if need be.
All times are GMT -4.