12-14-2006, 12:01 PM | #16 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
More updates: Small fix for the BODY tag not being picked up correctly and causing crashes.
Thanks silkpag for the bug report. |
12-18-2006, 04:02 PM | #17 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
Update: Added option to specify LRF output filename
|
12-18-2006, 04:40 PM | #18 |
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
|
Say, would it be too hard to adapt it for Baen's HTML books?
Last edited by igorsk; 12-18-2006 at 04:46 PM. |
12-19-2006, 08:23 AM | #19 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
That would be hard to do: the Baen books aren't formatted very well in HTML, and gutlrf only really works with single HTML files (theirs are split). The RTF versions look to have the best formatting, though.
|
12-19-2006, 09:00 AM | #20 |
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
|
For Baen's offerings, I've been quite happy with taking the RTFs and bumping the font size up. I also change the font face, but that's just preference.
|
12-19-2006, 09:05 AM | #21 |
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
|
Well, their RTFs are definitely readable, but it would be nice to have a TOC...
|
12-20-2006, 05:32 AM | #22 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Not to mention that many of them have useful pictures (maps, etc.), but I guess that's out of the question at the moment if we're generating RTF.
|
12-23-2006, 01:57 PM | #23 |
Junior Member
Posts: 5
Karma: 10
Join Date: Dec 2006
Device: Sony PRS-500
|
Added a script...
Here is a script adapted from FangornUK's nice scripts (OK, shamelessly copied). It is for HTML files that are already split up and need no additional processing.

In order to use it, you must create a file list (dir > filelist.txt, then edit the text file to clean it up into a plain list of files). The script reads the file list and calls HTML2LRF.exe with the list of files. It's easier than manually typing a super-long list of files on the command line (30, 40, 50 chapters...). Each HTML file is listed in a working TOC as well (HTML2LRF does this automagically).

You can also specify multiple tags for chapter titles. For example, I had a book that used an H3 tag for something along the lines of "Chapter 1" and then an H4 tag with the actual name of the chapter, something like "Fastidious Incompetence."

Anyway, hope it helps. Feel free to modify.

A few notes:
* Needs to be run in the directory of HTML2LRF.EXE.
* You can either specify a base directory with the -d option or specify complete path names for the files in filelist.txt. HTML2LRF needs to get complete path names or it doesn't work.
* If no chapter tags are specified, the <title> </title> tag will be used by HTML2LRF for the TOC entry (which is fine sometimes). Otherwise, the script will replace the contents of <title> </title> with the contents of the specified tags, unless it can't find them, in which case it will leave the title unchanged.

To do:
* Allow chapter headings by regular expression instead of HTML tags (i.e. "<p>Chapter something</p>" or ALL CAPS).
* Pull in text files, split on regexp, remove arbitrary line breaks, and convert to rudimentary HTML before combining into a BBeB (just to get that nice text flow and TOC).

-G |
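For anyone who wants to see the idea without the original script, here is a rough Python sketch of the two steps described above: turning a file list into complete path names, and copying chapter-tag text into the title. It is not the attached script. The function names, the " - " separator between tag texts, and the bare "executable followed by file names" command line are all assumptions made for illustration, and the sketch only builds the argument list rather than running HTML2LRF.

```python
import re
from pathlib import Path


def read_file_list(list_path, base_dir=None):
    """Return a complete path for every non-blank line in the file list.

    base_dir plays the role of the script's -d option (hypothetical name):
    relative entries are joined onto it before being made absolute.
    """
    paths = []
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if not name:
            continue
        path = Path(name)
        if base_dir is not None and not path.is_absolute():
            path = Path(base_dir) / path
        paths.append(str(path.resolve()))
    return paths


def title_from_tags(html, tags):
    """Replace the <title> contents with text taken from the given tags.

    The first occurrence of each tag is used, in the order given. If none
    of the tags is found, the HTML is returned unchanged.
    """
    parts = []
    for tag in tags:
        pattern = rf"<{re.escape(tag)}\b[^>]*>(.*?)</{re.escape(tag)}>"
        match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
        if match:
            # Strip any nested markup from the heading text.
            text = re.sub(r"<[^>]+>", "", match.group(1)).strip()
            if text:
                parts.append(text)
    if not parts:
        return html
    new_title = " - ".join(parts)  # separator is an assumption
    return re.sub(
        r"(<title\b[^>]*>).*?(</title>)",
        lambda m: m.group(1) + new_title + m.group(2),
        html,
        count=1,
        flags=re.IGNORECASE | re.DOTALL,
    )


def build_command(exe, files):
    """Build an argument list: the executable followed by every file.

    Assumption: the converter takes the input files as plain arguments.
    Any extra options it may need are not shown here.
    """
    return [exe] + list(files)
```

With a book that uses H3 for "Chapter 1" and H4 for the chapter name, `title_from_tags(html, ["h3", "h4"])` would give a title of "Chapter 1 - Fastidious Incompetence". The list from `build_command` could then be handed to `subprocess.run` from the HTML2LRF directory.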
03-28-2007, 08:05 AM | #24 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
New fix "Handle Gutenberg ZIPs with missing subdirectories".
Even though Book Designer is a much better solution, I still use gutlrf to prepare a Gutenberg HTML book for it. gutlrf.pl will retrieve, extract, and clean up the HTML book (for example, by removing page numbers), which can then be loaded into Book Designer to generate an LRF. |
03-30-2007, 11:04 AM | #25 |
books & doughnuts
Posts: 882
Karma: 37857
Join Date: Jan 2007
Location: usa
Device: sony reader, kindle2
|
Great stuff but you're forcing me to learn Perl!
|
03-30-2007, 01:44 PM | #26 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
You don't need to learn it, just run it!
|
03-30-2007, 05:55 PM | #27 |
Junior Member
Posts: 2
Karma: 10
Join Date: Feb 2007
Device: PRS-500/EBR1000
|
Huge BOOKS
FangornUK, this tool is awesome, I love it.
My only hope is that there is a way to send huge files into HTML2LRF. I want to put a dictionary on my Reader even if I need to use it the old-fashioned way (or add bookmarks to it). Project Gutenberg offers the unabridged Webster's, but it is over 1 MB per volume, or 15 MB in total. This crashes HTML2LRF every time. Any ideas would be great! |
03-31-2007, 04:48 AM | #28 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
The Webster's appears to be in some strange XML format, not HTML.
|
05-15-2007, 07:32 AM | #29 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
Update:
|
05-15-2007, 10:20 AM | #30 |
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Nice, you should see about getting manybooks.net to use this for the LRFs.
|