Old 06-02-2009, 12:29 AM   #2
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200, EBW1150, T1, NSTG, iLiad_v2, NC, Asus_TF, Next1, WPDN
FAQ

Reserved for FAQ/Tutorial.

1. Do I have to use the Windows GUI?
No, that's optional, but it is easier than remembering and manually typing the required switches. The Perl script came first and is perfectly usable on its own. The 'samples' directory in the GuteBook Install directory shows some 'command line' examples which can be used with the Perl script, e.g.
Code:
call ..\bin\do-ge 28700 --1200 --lrf  "--LRmargins 2px" --keepzip --keephtm --pbfirsth1 --smaller

i.e. gutebook.pl 28700 --1200 --lrf --LRmargins 2px --keepzip --keephtm --pbfirsth1 --smaller   >gutebook.log
where 'do-ge' is a bat/shell script that calls gutebook.pl. This sample converts PG EText-No. 28700 with 2px (L/R) margins and smaller text, and keeps both the downloaded .zip and the original PG .htm. It produces a REB1200 .imp and a Sony PRS .lrf, with a pagebreak on the first <h1> heading used within the .htm.
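If you script many conversions, the same command line can be assembled programmatically. The following is only a rough sketch in Python (GuteBook itself is Perl) that rebuilds the argument list from the example above; it does not run GuteBook. The helper name build_gutebook_args is made up for illustration, and passing --LRmargins and 2px as two separate arguments is an assumption based on the unquoted form shown above.

```python
def build_gutebook_args(etext_no, options):
    """Build an argument list for gutebook.pl (hypothetical helper).

    `options` keeps the order given; a plain string becomes a bare switch,
    a (name, value) tuple becomes a switch followed by its value.
    """
    args = ["gutebook.pl", str(etext_no)]
    for opt in options:
        if isinstance(opt, tuple):
            name, value = opt
            # Assumption: the value is passed as its own argument.
            args += ["--" + name, value]
        else:
            args.append("--" + opt)
    return args


# Switches copied from the sample in this post.
sample_args = build_gutebook_args(
    28700,
    ["1200", "lrf", ("LRmargins", "2px"), "keepzip", "keephtm", "pbfirsth1", "smaller"],
)
print(" ".join(sample_args))
```

The list form could be handed to something like subprocess.run if Perl and gutebook.pl are on your path, but that part is left out here since it depends on your install.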
2. Do I have to use/select an ebook output format?
No, but then GuteBook will only preprocess the .htm. It will not create any ebooks, nor set up the batch file used to re-generate the ebooks after you re-edit the modified .htm. However, you can use the resulting .opf with, say, Mobipocket Creator to manually generate a .mobi and then feed that to calibre or any other mobi2... program (like Mobi2IMP). The choice is yours!
3. How do I download a Project Gutenberg Australia book?
Quick HOW-TO example:
  1. on the main GUI screen click the blue text at the bottom right '^PGA List' and your browser will open with the PG Australia GUTINDEX_AUS.htm.
    (when using the Perl script, refer to the GUTINDEX_AUS.htm file in the doc directory in your Install directory).

  2. find your book of interest therein and then note its EText-No. and HTML URL link, e.g., The Robe, by Lloyd C Douglas (0364A & http://gutenberg.net.au/ebooks04/0400561h.html ).

  3. copy those two items into the GUI main screen boxes for ETEXT-No. and Input File respectively. Don't leave out the 'A' suffix on the EText-No. as that identifies the file as a PG Australia ebook for processing within GuteBook.
    (when using the Perl script add: --PGnum 0364A "http://gutenberg.net.au/ebooks04/0400561h.html" ).

  4. fill out the GUI main and options screens. See this post for screenshots.

  5. click convert

  6. enjoy, but you may need to re-edit this file, because any font size reduction has no effect while its <p>'s are fixed at 14pt.

Output results:
Code:
Command Line
============

"C:\Program Files\GuteBook\bin\gutebook" --PGnum 0364A "http://gutenberg.net.au/ebooks04/0400561h.html" --epub --lrf --1200 --1150 --smallerfont --search "<h1>(<a name=.*?)</h1>" --replace "<h2>$1</h2>" --modi --modg

GuteBook (version 0.4) Copyright (C) 2009 Nick Rapallo (nrapallo)
Getting "0364A" HTML file from Project Gutenberg (Australia) Website
Please Wait... Downloading
.
http://gutenberg.net.au/ebooks04/0400561h.html saved to C:\Program Files\GuteBook\0364A\0400561h.xhtml
Renamed .xhtml to .htm
.

Book Title : The Robe (1942)
Author     : Lloyd C. Douglas
eBook No.  : 0400561h.html
Language   : English
Released   : July 2004

Cleaning "0364A" HTML...
Wrote cleaned HTML "C:\Program Files\GuteBook\0364A\0400561h.htm"
Press any key to continue . . .
p.s. For comparison, there are already .lrf/.prc versions of this ebook done in 2007 by BenG.
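A note on the --search/--replace pair in the command line above: it turns any <h1> that wraps a named anchor into an <h2>. Assuming GuteBook applies the pattern as an ordinary Perl-style regular expression substitution on every match (I haven't shown that here, so treat it as an assumption), you can preview the effect with a few lines of Python. Perl's $1 is written \1 in Python, and the sample HTML below is invented for the demo.

```python
import re

# Pattern and replacement copied from the command line above;
# Perl's "$1" backreference is "\1" in Python.
SEARCH = r"<h1>(<a name=.*?)</h1>"
REPLACE = r"<h2>\1</h2>"


def demote_anchored_h1(html):
    """Turn <h1><a name=...>...</a></h1> into <h2>...</h2>; leave other <h1>'s alone."""
    return re.sub(SEARCH, REPLACE, html)


# Invented sample: one anchored heading, one plain heading.
sample_html = '<h1><a name="ch1">Chapter 1</a></h1>\n<h1>The Robe</h1>'
print(demote_anchored_h1(sample_html))
```

Only the heading containing the named anchor is changed; a plain <h1> does not match the pattern and stays as it was.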
4. Troubleshooting: Why doesn't GuteBook find or download my requested book?
Etext-No.'s below 10,000 are sometimes problematic, as many of the earlier ones don't follow the current/normal filenaming pattern of http://www.gutenberg.org/files/ETEXTNO/ETEXTNO-h.zip.

Let's say you've entered the number 7471 as the book listed on PG (it's a collection of short stories by P. G. Wodehouse).

But GuteBook fails to find, download and convert the file. It's nothing you've done wrong; it's just that this ebook doesn't follow the normal filename pattern, so the download location needs to be overridden by placing the following link in the Input File box: http://www.gutenberg.org/dirs/etext05/2left10.zip . Just so that you know, I got that link by going to the Gutenberg ebook page for Etext-No. 7471 and copying the link to the .zip text (or html) version.

Also, since that ebook is only available as text, you will need GutenMark (GUItenMark) installed and selected on the first page (see GUI screenshot). The text needs to be converted internally to html using GutenMark so that GuteBook can produce an ebook version.

Then try it again as before, but with the Input File overridden in this case.
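To make the override idea concrete, here is a small sketch in Python of how the download location could be chosen. This is not GuteBook's actual code, just an illustration of the rule described above: use the normal filename pattern unless an Input File is supplied. The names default_zip_url and resolve_input are made up.

```python
# Normal/current PG filename pattern quoted in this post.
PG_PATTERN = "http://www.gutenberg.org/files/{n}/{n}-h.zip"


def default_zip_url(etext_no):
    """URL GuteBook would be expected to try for a normally named etext."""
    return PG_PATTERN.format(n=etext_no)


def resolve_input(etext_no, input_file=None):
    """An explicit Input File wins; otherwise fall back to the normal pattern."""
    return input_file if input_file else default_zip_url(etext_no)


# A recent etext follows the pattern...
print(resolve_input(28700))
# ...while 7471 needs the link copied from its PG ebook page.
print(resolve_input(7471, "http://www.gutenberg.org/dirs/etext05/2left10.zip"))
```

For 7471 the pattern-based URL would point at a file that isn't there, which is why the Input File box has to carry the real link.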

Last edited by nrapallo; 02-25-2010 at 04:01 PM.