#241 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2007
Device: Sony Reader
|
That's three or more files in a zip archive. I will change the code for my own use, since I'd rather keep everything in one file than juggle several files in an archive. I may also add .gz pipe support, if I can be bothered, and a ~/.html2lrf file to store defaults, since it's a pain to retype them all the time.
Besides, html2lrf doesn't accept such a zip file either, last I checked, so that doesn't really help.
That said, I want to thank you for making html2lrf. It's saved me a lot of time and is an excellent tool. If it weren't, I wouldn't be bugging you about it.
txt2lrf, however, doesn't work for me. It never finishes. =(
|
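The ~/.html2lrf idea above could be prototyped in a small wrapper. This is only a sketch: nothing in this thread says html2lrf reads such a file, so the file name, its one-option-per-line format, and the helper names `load_defaults` and `build_argv` are all assumptions made for illustration.

```python
import shlex
from pathlib import Path

# Hypothetical location for stored defaults (an assumption, not an html2lrf feature).
DEFAULTS_FILE = Path.home() / ".html2lrf"


def load_defaults(text):
    """Split the contents of a defaults file into command-line arguments.

    Blank lines and lines starting with '#' are ignored.
    """
    args = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        args.extend(shlex.split(line))
    return args


def build_argv(user_args, defaults_text=None):
    """Return the stored defaults followed by the arguments the user typed."""
    if defaults_text is None:
        defaults_text = DEFAULTS_FILE.read_text() if DEFAULTS_FILE.exists() else ""
    return load_defaults(defaults_text) + list(user_args)
```

Putting the defaults first means that, with most option parsers, a value typed on the command line is seen later and so takes precedence over the stored one.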
#242 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Well to each his own. If you're having problems with any of the tools, please open bug reports so I can fix them.
|
#243 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2007
Device: Sony Reader
|
Yes, to each his own, and it's nice to have tools that accept data in all manner of forms.
Not sure what to write in a bug report. txt2lrf has never worked for me; it just runs forever at 99% CPU. lit2lrf and html2lrf work very well, though. I haven't found a manual, so I'm not sure what I'm doing wrong or what might be wrong with the input files. I just invoke txt2lrf with no parameters except the txt file, and it sits there running.
|
#244 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Well, attach the txt file.
|
#245 |
Zealot
Posts: 131
Karma: 24870
Join Date: Oct 2006
Device: Sony PRS/505
|
I was just reading one of the files I had converted using html2lrf, and I noticed that it leaves a blank line between paragraphs.
This is OK for some books, but the book I converted used only an indent to signify a new paragraph, and I'd prefer it that way (yours uses an indent and a blank line). Is there some way to change this behaviour? (Note: I'd still want an empty paragraph to appear as a blank line, but currently it appears as two blank lines.)
Now you see why I'm interested in some way to store per-book settings? Some books need a font embedded, others don't; some need autorotate off, some need this new paragraph format, some do better with a different border width, etc. As the number of options and ways to convert increases, a way to remember the options used on a particular book will become more important. If there's some way for me to add this to the OPF, I'd be happy to go that route. It works really well for the normal metadata, so we could essentially extend the <x-metadata> structure to include html2lrf settings.
|
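One way the per-book settings idea could look in practice is sketched below. It assumes an OEB-style OPF in which <x-metadata> holds <meta name="..." content="..."/> entries; the "html2lrf:" name prefix and the setting names in the sample are invented for illustration and are not something html2lrf is known to read.

```python
import xml.etree.ElementTree as ET

# Hypothetical naming convention for conversion settings stored in the OPF.
PREFIX = "html2lrf:"


def local_name(tag):
    """Drop an XML namespace qualifier, e.g. '{uri}meta' -> 'meta'."""
    return tag.rsplit("}", 1)[-1]


def read_book_settings(opf_text):
    """Return {setting: value} for prefixed <meta> entries under <x-metadata>."""
    root = ET.fromstring(opf_text)
    settings = {}
    for element in root.iter():
        if local_name(element.tag) != "x-metadata":
            continue
        for meta in element:
            name = meta.get("name", "")
            if local_name(meta.tag) == "meta" and name.startswith(PREFIX):
                settings[name[len(PREFIX):]] = meta.get("content", "")
    return settings


# Sample package file; the setting names are made up for this example.
SAMPLE_OPF = """<package>
  <metadata>
    <dc-metadata/>
    <x-metadata>
      <meta name="html2lrf:blank-line-between-paragraphs" content="no"/>
      <meta name="html2lrf:border-width" content="2"/>
      <meta name="other-tool:setting" content="ignored"/>
    </x-metadata>
  </metadata>
</package>"""

print(read_book_settings(SAMPLE_OPF))
```

Entries from other tools are left alone, so the same OPF can keep carrying its normal metadata.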
#246 |
Zealot
Posts: 131
Karma: 24870
Join Date: Oct 2006
Device: Sony PRS/505
|
Also, I noticed that while you read the OPF, you ignore the <spine> directive, which tells you in which order to render the HTML files. Now this is probably a good thing for books that use links to link to other files, but in the case of Baen books, it would be a good idea to use it, since then I wouldn't have to extract the LIT, remove the _top htm (because, seriously, who needs a rendered table of contents when the reader provides one for you) and then convert. I could just feed it the LIT. (Or, in my perfect case, extract the LIT, add the correct metadata to the HTML or OPF file, and then repack the LIT.)
Note that this is definitely not a high priority for me. I'm perfectly willing to go on unpacking and modifying; it's just something I noticed when I was looking at the OPF format.
|
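For reference, reading the rendering order out of an OPF can be sketched as follows. It relies on the standard OPF layout, where <manifest> maps item ids to hrefs and <spine> lists <itemref idref="..."/> in reading order. The sample file names are invented, and this is not how html2lrf itself handles the OPF.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace qualifier, e.g. '{uri}item' -> 'item'."""
    return tag.rsplit("}", 1)[-1]


def spine_order(opf_text):
    """Return manifest hrefs in the reading order given by <spine>."""
    root = ET.fromstring(opf_text)
    hrefs = {}   # manifest item id -> href
    order = []   # idref values, in spine order
    for element in root.iter():
        name = local_name(element.tag)
        if name == "item" and element.get("id"):
            hrefs[element.get("id")] = element.get("href")
        elif name == "itemref":
            order.append(element.get("idref"))
    # Skip spine entries that point at ids missing from the manifest.
    return [hrefs[idref] for idref in order if idref in hrefs]


# Invented sample: the manifest lists three files, the spine only two.
SAMPLE_OPF = """<package>
  <manifest>
    <item id="ch2" href="chapter2.htm" media-type="text/x-oeb1-document"/>
    <item id="top" href="book_top.htm" media-type="text/x-oeb1-document"/>
    <item id="ch1" href="chapter1.htm" media-type="text/x-oeb1-document"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""

print(spine_order(SAMPLE_OPF))
```

In this sample the _top file is in the manifest but not in the spine, so it drops out of the result without having to be removed by hand.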
#247 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It shouldn't leave a blank line between paragraphs unless there's an extra <br /> or </p> tag. For example, there won't be a blank line when converting the following HTML:
<p>para one</p> <p>para two</p>
Adding per-book settings to the OPF file seems like a good way to go. Open a bug report and I'll get around to it eventually. It will have to wait until I write a proper OPF parser, which in turn probably won't happen until after 0.4.0.
Also open a bug report for the <spine> and I'll add a --use-spine option. And finally, open a bug report for the definition lists so I don't forget.
|
#248 | ||
Zealot
Posts: 131
Karma: 24870
Join Date: Oct 2006
Device: Sony PRS/505
|
Quote:
Code:
<p> <a id="p9" name="p9"> </a> </p>
<p onmouseover="PNo(9)">"Don't just stand there like a whore at a wedding, Master Holderman! Trim that foresheet! It's slacker than those idlers you call seamen!" </p>
<p> <a id="p10" name="p10"> </a> </p>
Quote:
|
||
#249 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, it won't. It basically removes some extra page breaks by running a couple of regexps over the HTML before processing it. If you can come up with a regexp that matches this case and doesn't affect anything else, I could add it.
|
#250 | |
curmudgeon
Posts: 1,487
Karma: 5748190
Join Date: Jun 2006
Location: Redwood City, CA USA
Device: Kobo Aura HD, (ex)nook, (ex)PRS-700, (ex)PRS-500
|
Quote:
What I've done is write a few shell scripts, which I run in a terminal window on my Mac OS X box, although they should work fine (with small edits) on any Unix-ish system.
The first script takes a single input directory holding one Baen HTML eBook and converts it to LRF, placing the output in the specified output directory. This script knows how to find the cover JPG, how to find the _toc file (which is what I use as the master input for the conversion), and also knows my favorite settings.
The second script takes a directory containing N subdirs (each as above) plus a single 'mapping' file. The mapping file holds one line per book specifying the mapping from input-dir to output-dir (this is how I manage storing books in directories by author, rather than by Baen's release date). It invokes the first script for each book.
The final script simply takes a list of directories suitable for input to the second, and does the obvious invocation.
The upshot of all this is that when Kovid releases a new html2lrf that has a feature I care about, it takes a single command line to re-convert all my Baen eBooks. I've been quick-and-dirty with the script building, so my file paths are built in rather than read from a config file (or whatever), and there's minimal error checking. I'm happy to share if anyone is interested.
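The mapping-file step described above could be sketched like this. It is not the poster's script, which is shell and was not posted. The tab-separated line format, the '#' comment lines, and the helper names `parse_mapping` and `plan_jobs` are assumptions for illustration.

```python
from pathlib import Path


def parse_mapping(text):
    """Parse 'input-dir<TAB>output-dir' lines into (input, output) pairs.

    Blank lines and lines starting with '#' are skipped.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, _, target = line.partition("\t")
        if not target.strip():
            raise ValueError(f"no output directory on line: {line!r}")
        pairs.append((source.strip(), target.strip()))
    return pairs


def plan_jobs(books_root, library_root, mapping_text):
    """Return (input directory, output directory) paths for every mapped book."""
    return [
        (Path(books_root) / source, Path(library_root) / target)
        for source, target in parse_mapping(mapping_text)
    ]
```

Each planned job would then be handed to the per-book conversion step (for example through `subprocess.run`), which is where the cover, the _toc file, and the preferred options get filled in.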
|
#251 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Released 0.3.67 with support for definition lists and a fix for the handling of zip files.
|
#252 | |
Zealot
Posts: 131
Karma: 24870
Join Date: Oct 2006
Device: Sony PRS/505
|
Quote:
As for the Python regular expression, one I've found that matches only paragraphs containing nothing but an <a id...></a> seems to work on the Baen books I've tried it on:
<p>\s*<a id.*?>\s*</a>\s*</p>
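A quick way to try a pattern like this against sample markup is sketched below. Note one deliberate change from the pattern as posted: `.*?` is swapped for `[^>]*` so the match cannot run past the end of the <a> tag into a later paragraph on the same line. The function name is made up, and this is not the code html2lrf uses.

```python
import re

# Tightened variant of the posted pattern: [^>]* stays inside the <a ...> tag.
EMPTY_ANCHOR_PARAGRAPH = re.compile(r"<p>\s*<a id[^>]*>\s*</a>\s*</p>")


def strip_empty_anchor_paragraphs(html):
    """Remove paragraphs that contain nothing but an empty <a id=...></a>."""
    return EMPTY_ANCHOR_PARAGRAPH.sub("", html)


sample = (
    '<p> <a id="p9" name="p9"> </a> </p>'
    '<p onmouseover="PNo(9)">"Trim that foresheet!" </p>'
    '<p> <a id="p10" name="p10"> </a> </p>'
)
print(strip_empty_anchor_paragraphs(sample))
```

Paragraphs whose anchor wraps real text, such as `<p><a id="x">Chapter 1</a></p>`, are left untouched because the pattern allows only whitespace between the opening and closing <a> tags.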
|
#253 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's coz I use libprs500 a lot myself and I want it to be as bug free as possible :-) I essentially use all you guys as free bug hunters as bug-hunting is something I'm extremely lazy about. And those two fixes were about 10 lines of code.
But aren't those id elements referred to by some links in the rest of the file? |
#254 | |
Zealot
Posts: 131
Karma: 24870
Join Date: Oct 2006
Device: Sony PRS/505
|
Quote:
Edit: Oh, you're asking if the paragraph indicators (which is what these are) are used by anything else? No. They're used in the "web" reading version to update the silly "index" box. (Check http://www.webscription.net/10.1125/...0671318470.htm and move your mouse down the page. You can type a number into the box and it'll jump to that paragraph.) The HTML in the LIT doesn't have the JavaScript to enable this.
Note that the pure HTML versions don't have <p> surrounding the <a> elements, so they don't render. It's really only an issue with the files they include in their LIT versions (I suspect the OEB DTD requires the surrounding <p>).
Last edited by bkilian; 07-10-2007 at 06:19 PM.
|
#255 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I meant, aren't there <a href> elements that refer to that id? Removing the id would make those links stop working. Though I suppose I could just remove the <p> and keep the <a>.
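The "remove the <p> and keep the <a>" option can be sketched with a capture group, so any link that targets the id still has something to point at. This is an illustration under the same assumptions as the regexp discussed in this thread, not the change that went into html2lrf; the function name is made up.

```python
import re

# Capture the empty anchor so it can be kept while its <p> wrapper is dropped.
WRAPPED_ANCHOR = re.compile(r"<p>\s*(<a id[^>]*>\s*</a>)\s*</p>")


def unwrap_empty_anchors(html):
    """Replace '<p> <a id=...> </a> </p>' with just the <a> element."""
    return WRAPPED_ANCHOR.sub(r"\1", html)


sample = '<p> <a id="p9" name="p9"> </a> </p><p>Text with <a href="#p9">a link</a></p>'
print(unwrap_empty_anchors(sample))
```

Since the anchor is empty, it should no longer contribute a paragraph of its own once the wrapper is gone, while `href="#p9"` keeps a valid target.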
|
Tags |
html2lrf, libprs500 |