Thanks to both of you.
Another question: I bought a book scanner and have been using it to convert my paper books into ebooks in HTML format (because I consider it the best with regard to current and future functionality). I noticed several strange things when converting them to LRF with HTML2LRF on Windows and using the resulting LRF on my Sony Reader PRS-505. Please note: I do not use the GUI. I convert from the command line and then copy the LRF to the Reader with a file management utility.
1) "author-sort" doesn't seem to have any effect. I use a command line such as
Code:
--author="Steve Perry" --author-sort="PERRY STEVE"
but in the books-by-author view the book gets sorted under "S", not under "P".
2) I just can't understand chapter detection and TOC generation: I use the <h2> tag to mark chapters, as in
Code:
<h2 id="contents">Table of Contents</h2>
<h2 id="chapter-10">The Attack</h2>
(Note: The id="contents" in the example refers to a hand-crafted TOC for the HTML file, which I will call html-toc further on. My problem relates to the TOC as displayed by the reader, which I will call lrf-toc.)
The command line is:
(this is a real ^; I had to prefix it with another ^ for use in batch files)
I took it to mean that ANY h[1-6] tag would be considered a new chapter. Curiously enough, in my example above <h2 id="chapter-10"> gets detected as a chapter but <h2 id="contents"> does not. I thought maybe the regexp didn't get used so as an experiment, I renamed that chapter-10 to xxxpter-10, expecting it not to appear in lrf-toc. Strangely enough, it DID get detected. Only that <h2 id="contents"> seems to be ignored.
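To spell out what I expected, here is a small Python sketch of my reading of the option. It assumes that only the tag name is tested against h[1-6] and that the id plays no role, which is my interpretation and not something I have confirmed about HTML2LRF. The `headings` list, the `expected_chapters` helper and the `some-paragraph` entry are made up for illustration.

```python
import re

# Assumption: only the tag name is tested, against the pattern h[1-6].
HEADING = re.compile(r"h[1-6]")

# (tag name, id) pairs from my tests, including the renamed id.
# The last entry is a hypothetical non-heading for contrast.
headings = [
    ("h2", "contents"),
    ("h2", "chapter-10"),
    ("h2", "xxxpter-10"),
    ("p", "some-paragraph"),
]


def expected_chapters(items):
    """Return the ids of the items whose tag name is h1 through h6."""
    return [ident for name, ident in items if HEADING.fullmatch(name)]


print(expected_chapters(headings))
```

Under this reading all three <h2> headings, including the one with id="contents", should end up in lrf-toc, which is not what I see.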
3) Another problem with chapter detection: I have a book which has 10 chapters and a whole lot of footnotes. I used an <ol> list at the end of the document to store all notes:
Code:
<ol id="notes">
<li>
<p id="note-1">Footnote 1</p>
</li>
<li>
<p id="note-2">Footnote 2</p>
</li>
</ol>
and the command line:
Code:
--force-page-break-before-tag="h2|p id="
(because if I don't use page breaks, the links just won't work correctly in LRF; in case you wonder why I used <p id="..."> instead of the semantically better <li id="...">, it's because in the latter case the links won't work correctly even with page breaks).
Two strange things happen:
(i) All footnotes get recognized as chapters (!), so I get some 90 chapters instead of 10 in the lrf-toc.
(ii) Despite the force-page-break, there are as many footnotes per page as can fit (!) and still the links work correctly in the LRF (!!!). I don't complain about it, this result is actually very useful, but I find it strange that with <h2> chapters I need to keep each at the start of its own page to make it work but with <li><p> I can have many on the same page and still they work.
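To make sure I understand my own pattern from point 3, here is a small Python sketch of which opening tags it would select. It assumes the option value is treated as a regular expression searched against the text of each opening tag, which is only my guess at how HTML2LRF works. The `tags` list and the `matching_tags` helper are made up for illustration.

```python
import re

# Option value copied from the command line in point 3.
PATTERN = re.compile(r"h2|p id=")

# Opening tags from the examples in this post, written without angle brackets.
# Assumption: the pattern is searched against this kind of string.
tags = [
    'h2 id="contents"',
    'h2 id="chapter-10"',
    'ol id="notes"',
    'li',
    'p id="note-1"',
    'p id="note-2"',
    'p',
]


def matching_tags(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere."""
    return [tag for tag in candidates if pattern.search(tag)]


matched = matching_tags(PATTERN, tags)
for tag in matched:
    print(tag)
```

Under that assumption the two <h2> tags and both footnote paragraphs match, while <ol id="notes">, <li> and a plain <p> do not. So the pattern selects every footnote for a page break, which makes (ii) all the more surprising to me.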
Are these expected behaviors due to some property of LRF which I am not familiar with or are these bugs and I should create a new ticket for them? (In that case, is it possible to send the demo file privately? I do not want to infringe on someone's copyright by posting a book into a public section)