Old 08-15-2008, 11:25 AM   #721
kovidgoyal
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
html2lrf does read metadata from HTML files. IIRC the current code is optimized to recognize the metadata generated by the ereader2html script.
Old 08-15-2008, 01:57 PM   #722
pepak
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
Quote:
Originally Posted by llasram
The OPF spec itself includes a fair number of examples.
Thanks. For some reason my searches only returned other OPFs, not this one.

Still, the specification seems to have a whole lot of content which is not relevant to html2lrf, and others that seem to be missing (e.g. author-sort). Can anyone (kovidgoyal?) shed light on what is used and what is not?

Quote:
Originally Posted by kovidgoyal
html2lrf does read metadata from HTML files.
Heh, it was that dismissed ticket of mine about reading metadata from HTML files which caused my mistake.

Quote:
Originally Posted by kovidgoyal
IIRC the current code is optimized to recognize the metadata generated by the ereader2html script.
Is there a newer version of ereader2html than 0.03? That one doesn't seem to save any metadata.

Anyway, do you think that maybe specific html2lrf metadata might be useful? Metadata in the form <meta name="lrf-prefix:commandline-parameter-name" value="commandline-parameter-value"> should be easy enough to implement - it would simply reuse the code from commandline parser.
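
A minimal sketch of how such tags could be turned into command-line style arguments is shown here. It is only an illustration of the proposal, not anything html2lrf does: the `lrf:` prefix, the class name `LrfMetaCollector` and the function `meta_to_args` are all hypothetical, and the `value` attribute simply mirrors the form suggested above.

```python
from html.parser import HTMLParser


class LrfMetaCollector(HTMLParser):
    """Collect <meta name="PREFIXoption" value="..."> tags as --option=value strings."""

    def __init__(self, prefix='lrf:'):
        super().__init__()
        self.prefix = prefix  # hypothetical prefix; the post leaves the real one open
        self.args = []

    def handle_starttag(self, tag, attrs):
        if tag != 'meta':
            return
        attrs = dict(attrs)
        name = attrs.get('name') or ''
        value = attrs.get('value')
        if name.startswith(self.prefix) and value is not None:
            option = name[len(self.prefix):]
            self.args.append('--%s=%s' % (option, value))


def meta_to_args(html, prefix='lrf:'):
    """Return the argument list that could be handed to an existing option parser."""
    collector = LrfMetaCollector(prefix)
    collector.feed(html)
    return collector.args
```

The resulting list has the same shape as arguments typed on the command line, which is what would let the existing parser be reused.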
Old 08-15-2008, 02:19 PM   #723
kovidgoyal
creator of calibre
Here's the current code to extract metadata from HTML files (it looks for metadata in comment sections):


Code:
import re

from calibre.ebooks.metadata import MetaInformation

def get_metadata(stream):
    src = stream.read()
    
    # Title
    title = None
    pat = re.compile(r'<!--.*?TITLE=(?P<q>[\'"])(.+?)(?P=q).*?-->', re.DOTALL)
    match = pat.search(src)
    if match:
        title = match.group(2)
    else:
        pat = re.compile('<title>([^<>]+?)</title>', re.IGNORECASE)
        match = pat.search(src)
        if match:
            title = match.group(1)
        
    # Author
    author = None
    pat = re.compile(r'<!--.*?AUTHOR=(?P<q>[\'"])(.+?)(?P=q).*?-->', re.DOTALL)
    match = pat.search(src)
    if match:
        author = match.group(2).replace(',', ';')
        
    mi = MetaInformation(title, [author] if author else None)
    
    # Publisher
    pat = re.compile(r'<!--.*?PUBLISHER=(?P<q>[\'"])(.+?)(?P=q).*?-->', re.DOTALL)
    match = pat.search(src)
    if match:
        mi.publisher = match.group(2)
        
    # ISBN
    pat = re.compile(r'<!--.*?ISBN=[\'"]([^"\']+)[\'"].*?-->', re.DOTALL)
    match = pat.search(src)
    if match:
        isbn = match.group(1)
        mi.isbn = re.sub(r'[^0-9xX]', '', isbn)
        
    return mi
I don't think adding support for LRF-specific metadata is worthwhile, but adding support for reading more generic kinds of metadata (basically extending the above code) is easy enough to do.
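
To see what kind of input the comment patterns respond to, here is a small standalone sketch. The sample HTML is made up for illustration, and the helper `comment_field` is a hypothetical name, not part of calibre. Note that the sketch uses a non-greedy `(.+?)` group so that each value stops at its own closing quote even when several fields share one comment.

```python
import re

# Hypothetical sample input; the field names mirror those looked for by get_metadata.
SAMPLE = '''<html><head><title>Fallback Title</title>
<!-- TITLE="The Attack" AUTHOR="Perry, Steve" -->
<!-- PUBLISHER="Example Press" ISBN="0-123-45678-9" -->
</head><body></body></html>'''


def comment_field(name, src):
    """Return the quoted value of NAME=... found inside an HTML comment, or None."""
    pat = re.compile(r'<!--.*?%s=(?P<q>[\'"])(.+?)(?P=q).*?-->' % name, re.DOTALL)
    match = pat.search(src)
    return match.group(2) if match else None


title = comment_field('TITLE', SAMPLE)
author = comment_field('AUTHOR', SAMPLE)
publisher = comment_field('PUBLISHER', SAMPLE)
# Same clean-up as in get_metadata: keep only digits and the check character X.
isbn = re.sub(r'[^0-9xX]', '', comment_field('ISBN', SAMPLE))
```

With this sample, the comment value wins over the `<title>` element, which is only consulted when no TITLE comment is present.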

You can get a good idea of what kinds of metadata from OPF calibre supports by using the GUI to save an ebook. The GUI will create an OPF file with entries for all the metadata it knows about.
Old 08-15-2008, 02:52 PM   #724
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by pepak View Post
Still, the specification seems to have a whole lot of content which is not relevant to html2lrf, and others that seem to be missing (e.g. author-sort). Can anyone (kovidgoyal?) shed light on what is used and what is not?
The "author-sort" is taken from the OPF "file-as" attribute on the Dublin Core <creator/> of OPF "role" "aut". Obvious, innit?
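
For anyone who wants to see where that attribute lives, the following sketch reads it from a small OPF 2.0 style fragment with Python's standard library. The sample package and the function name `author_sort` are made up for illustration; this is not calibre's own reader.

```python
import xml.etree.ElementTree as ET

OPF_NS = 'http://www.idpf.org/2007/opf'
DC_NS = 'http://purl.org/dc/elements/1.1/'

# Hypothetical minimal OPF metadata block.
SAMPLE_OPF = '''<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>The Attack</dc:title>
    <dc:creator opf:role="aut" opf:file-as="Perry, Steve">Steve Perry</dc:creator>
  </metadata>
</package>'''


def author_sort(opf_text):
    """Return the file-as value of the first creator whose role is 'aut', or None."""
    root = ET.fromstring(opf_text)
    for creator in root.iter('{%s}creator' % DC_NS):
        if creator.get('{%s}role' % OPF_NS) == 'aut':
            return creator.get('{%s}file-as' % OPF_NS)
    return None
```

The element text ("Steve Perry") is the display name, while `file-as` carries the sort form.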
Old 08-16-2008, 03:27 AM   #725
pepak
Guru
Thanks to both of you.

Another question: I bought a book scanner and have been using it to convert my paper books into ebooks in HTML format (because I consider it the best in terms of current and future functionality). I noticed several strange things when converting them into LRF using HTML2LRF on Windows and using that LRF on my Sony Reader PRS-505. Please note: I do not use the GUI - I convert from the command line and then copy the LRF to the Reader using a file management utility.

1) "author-sort" doesn't seem to have any effect. I use command line such as
Code:
--author="Steve Perry" --author-sort="PERRY STEVE"
but in the books-by-author the book gets sorted among "S", not among "P".

2) I just can't understand chapter detection and TOC generation: I use <h2> tag for marking chapters, as in
Code:
<h2 id="contents">Table of Contents</h2>
<h2 id="chapter-10">The Attack</h2>
(Note: The id="contents" in the example refers to a hand-crafted TOC for the HTML file, which I will call html-toc further on. My problem relates to the TOC as displayed by the reader, which I will call lrf-toc.)

The command line is:
Code:
--chapter-regex=^
(this is real ^; I had to prepend it by another ^ for use in batch files)

I took it to mean that ANY h[1-6] tag would be considered a new chapter. Curiously enough, in my example above <h2 id="chapter-10"> gets detected as a chapter but <h2 id="contents"> does not. I thought maybe the regexp didn't get used so as an experiment, I renamed that chapter-10 to xxxpter-10, expecting it not to appear in lrf-toc. Strangely enough, it DID get detected. Only that <h2 id="contents"> seems to be ignored.

3) Another problem with chapter detection: I have a book which has 10 chapters and a whole lot of footnotes. I used a <ol> list at the end of the document to store all notes:
Code:
<ol id="notes">
  <li>
    <p id="note-1">Footnote 1</p>
  </li>
  <li>
    <p id="note-2">Footnote 2</p>
  </li>
</ol>
and the command line:
Code:
--force-page-break-before-tag="h2|p id="
(because if I don't use page breaks, the links just won't work correctly in LRF; in case you wonder why I used <p id="..." instead of the semantically better <li id="...">, it's because in the latter case the links won't work correctly even with page breaks).

Two strange things happen:
(i) All footnotes get recognized as chapters (!), so I get some 90 chapters instead of 10 in the lrf-toc.
(ii) Despite the force-page-break, there are as many footnotes per page as can fit (!) and still the links work correctly in the LRF (!!!). I don't complain about it, this result is actually very useful, but I find it strange that with <h2> chapters I need to keep each at the start of its own page to make it work but with <li><p> I can have many on the same page and still they work.

Are these expected behaviors due to some property of LRF which I am not familiar with or are these bugs and I should create a new ticket for them? (In that case, is it possible to send the demo file privately? I do not want to infringe on someone's copyright by posting a book into a public section)
Old 08-16-2008, 03:50 AM   #726
pepak
Guru
Forgot:

4) Paragraphs in <blockquote> have a much larger padding between them than normal paragraphs.

5) Paragraphs in <blockquote> can't be centered using class styles.
Old 08-16-2008, 08:15 AM   #727
llasram
Reticulator of Tharn
Quote:
Originally Posted by pepak View Post
Another question: I bought a book scanner and have been using it to convert my paper books into ebooks in HTML format (because I consider it the best in terms of current and future functionality).
HTML – excellent choice. I would actually recommend going the extra mile and saving your books as full EPUB books. Even if you don’t like the way ADE on the Reader renders EPUB, the additional metadata, external TOC, etc. in EPUB is an arguably better-in-the-first-place work-around for some of the issues below.

Quote:
Originally Posted by pepak View Post
1) "author-sort" doesn't seem to have any effect. I use command line such as
Code:
--author="Steve Perry" --author-sort="PERRY STEVE"
but in the books-by-author the book gets sorted among "S", not among "P".
Hmm. That’s weird. I just tested with my firmware-updated 505 and it totally ignores the ‘Author.reading’ metadata. I vaguely remember it working before I updated the firmware, but I use the “Sort by Author” view so infrequently that I can’t be sure. This one looks like an upstream problem with Sony, although if purchased DRMed BBeB books sort correctly it may mean a community miscomprehension of the file format.

Quote:
Originally Posted by pepak View Post
2) I just can't understand chapter detection and TOC generation: I use <h2> tag for marking chapters, as in
Code:
<h2 id="contents">Table of Contents</h2>
<h2 id="chapter-10">The Attack</h2>
(Note: The id="contents" in the example refers to a hand-crafted TOC for the HTML file, which I will call html-toc further on. My problem relates to the TOC as displayed by the reader, which I will call lrf-toc.)

The command line is:
Code:
--chapter-regex=^
(this is real ^; I had to prepend it by another ^ for use in batch files)

I took it to mean that ANY h[1-6] tag would be considered a new chapter. Curiously enough, in my example above <h2 id="chapter-10"> gets detected as a chapter but <h2 id="contents"> does not. I thought maybe the regexp didn't get used so as an experiment, I renamed that chapter-10 to xxxpter-10, expecting it not to appear in lrf-toc. Strangely enough, it DID get detected. Only that <h2 id="contents"> seems to be ignored.
I’m not able to reproduce this one with a minimal example. Could you open a ticket with a file reproducing the error?

Quote:
Originally Posted by pepak View Post
3) Another problem with chapter detection: I have a book which has 10 chapters and a whole lot of footnotes. I used a <ol> list at the end of the document to store all notes:
Code:
<ol id="notes">
  <li>
    <p id="note-1">Footnote 1</p>
  </li>
  <li>
    <p id="note-2">Footnote 2</p>
  </li>
</ol>
and the command line:
Code:
--force-page-break-before-tag="h2|p id="
That regexp is only applied to the tag name, so the ‘p id=’ portion will never match.
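
A quick way to see the consequence is to try the pattern against bare tag names. This is only a sketch of the idea: whether calibre anchors the pattern with `match` or uses `search`, and whether it ignores case, are assumptions here, and `forces_page_break` is a hypothetical helper.

```python
import re

# The option value from the post above, treated as a pattern over tag names only.
PAGE_BREAK_PATTERN = re.compile(r'h2|p id=', re.IGNORECASE)


def forces_page_break(tag_name):
    """Return True if the pattern matches the bare tag name (attributes are never seen)."""
    return PAGE_BREAK_PATTERN.match(tag_name) is not None


results = {name: forces_page_break(name) for name in ('h2', 'p', 'li')}
```

In this sketch only `h2` matches. The `p id=` alternative can never succeed because the string being tested is just `p`.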

Quote:
Originally Posted by pepak View Post
(because if I don't use page breaks, the links just won't work correctly in LRF; in case you wonder why I used <p id="..." instead of the semantically better <li id="...">, it's because in the latter case the links won't work correctly even with page breaks).
That sounds like a bug. If you can create a fairly minimal file reproducing the error, could you submit a ticket for that one too?

Quote:
Originally Posted by pepak View Post
Two strange things happen:
(i) All footnotes get recognized as chapters (!), so I get some 90 chapters instead of 10 in the lrf-toc.
Default behavior is to add all link-targets to the lrf-toc – see the option ‘--no-links-in-toc’.

Quote:
Originally Posted by pepak View Post
(ii) Despite the force-page-break, there are as many footnotes per page as can fit (!) and still the links work correctly in the LRF (!!!). I don't complain about it, this result is actually very useful, but I find it strange that with <h2> chapters I need to keep each at the start of its own page to make it work but with <li><p> I can have many on the same page and still they work.
If I understand this correctly, there are two issues going on here. First, that calibre’s chapter-detection co-joins “add this to the lrf-toc as a chapter” and “put a page-break at this point.” As an alternative to this, you can create an OPF file specifying an external NCX TOC (or HTML TOC). Calibre will generate an lrf-toc from that without inserting page-breaks. The second issue is the inconsistent way calibre finds link-targets, only paying attention to the ‘id’ attribute on a handful of tags – much obliged if you could open a ticket there too.
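
For reference, an external NCX TOC is a small XML file. The sketch below generates one from a list of chapter titles and link targets using only the standard library. The function name `build_ncx` and the sample hrefs are hypothetical, the output is deliberately minimal (an empty `head`, no `dtb:uid`), and it has not been tested against html2lrf.

```python
import xml.etree.ElementTree as ET

NCX_NS = 'http://www.daisy.org/z3986/2005/ncx/'


def build_ncx(book_title, chapters):
    """Build NCX text from (label, href) pairs such as ('Chapter 1', 'book.html#chapter-1')."""
    ET.register_namespace('', NCX_NS)
    ncx = ET.Element('{%s}ncx' % NCX_NS, {'version': '2005-1'})
    ET.SubElement(ncx, '{%s}head' % NCX_NS)
    doc_title = ET.SubElement(ncx, '{%s}docTitle' % NCX_NS)
    ET.SubElement(doc_title, '{%s}text' % NCX_NS).text = book_title
    nav_map = ET.SubElement(ncx, '{%s}navMap' % NCX_NS)
    for order, (label, href) in enumerate(chapters, 1):
        point = ET.SubElement(
            nav_map, '{%s}navPoint' % NCX_NS,
            {'id': 'navpoint-%d' % order, 'playOrder': str(order)})
        nav_label = ET.SubElement(point, '{%s}navLabel' % NCX_NS)
        ET.SubElement(nav_label, '{%s}text' % NCX_NS).text = label
        ET.SubElement(point, '{%s}content' % NCX_NS, {'src': href})
    return ET.tostring(ncx, encoding='unicode')
```

In OPF 2.0 the NCX file is listed in the manifest and referenced from the `toc` attribute of the `spine` element, which is how a converter can find it.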

Quote:
Originally Posted by pepak View Post
Are these expected behaviors due to some property of LRF which I am not familiar with or are these bugs and I should create a new ticket for them? (In that case, is it possible to send the demo file privately? I do not want to infringe on someone's copyright by posting a book into a public section)
Well ideally for each ticket you would create a minimal HTML input file which re-creates the described error. Failing that, could you (perhaps with a script) replace all the text in your HTML file with “lorem ipsum” text? If not, then... Actually, if you e-mail me the file at llasram@gmail.com I’ll do the “lorem ipsum” replacement and send you back the resulting file for you to directly attach to the ticket(s)
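
A rough sketch of such a replacement script follows. It is a simplification rather than a robust HTML rewriter: it treats everything between `>` and `<` as text, so it would also rewrite the bodies of `<script>` or `<style>` elements. The function name `scrub_text` is hypothetical.

```python
import itertools
import re

LOREM = 'lorem ipsum dolor sit amet consectetur adipiscing elit'.split()


def scrub_text(html):
    """Replace every word between tags with filler words, leaving markup untouched."""
    words = itertools.cycle(LOREM)

    def swap(match):
        token = match.group(0)
        # Leave character entities such as &amp; or &#160; alone.
        return token if token.startswith('&') else next(words)

    def scrub(match):
        # Whitespace and punctuation are kept; only entities and words are visited.
        return re.sub(r'&#?\w+;|\w+', swap, match.group(0))

    # Text nodes are approximated as the runs between '>' and '<'.
    return re.sub(r'(?<=>)[^<>]+(?=<)', scrub, html)
```

Because tags and attributes are never touched, ids such as `chapter-10` and link targets survive, so structure-related bugs should still reproduce with the scrubbed file.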

Quote:
Originally Posted by pepak View Post
4) Paragraphs in <blockquote> have a much larger padding between them than normal paragraphs.

5) Paragraphs in <blockquote> can't be centered using class styles.
Those are known-but-annoying issues with calibre’s ad-hoc CSS parsing and rendering. With the Reader getting EPUB support LRF formatting issues are downgraded a bit, but that one bugs me too and if you open a ticket I’ll see if I can’t at least improve the situation.
Old 08-16-2008, 09:06 AM   #728
pepak
Guru
Quote:
Originally Posted by llasram View Post
HTML – excellent choice. I would actually recommend going the extra mile and saving your books as full EPUB books.
I chose HTML because it is device-independent and easy to modify (I spend a lot of time formatting and spell-checking my e-books). I am not so sure whether I want to move to a format which is primarily intended for e-books. I'll do some research about EPUB format and see.

Quote:
Hmm. That’s weird. I just tested with my firmware-updated 505 and it totally ignores the ‘Author.reading’ metadata. I vaguely remember it working before I updated the firmware,
Me too. So it is apparently a bug in the firmware. No problem, I was just wondering if it is a bug in Calibre.

Quote:
I’m not able to reproduce this one with a minimal example. Could you open a ticket with a file reproducing the error?
Will open a ticket for all issues. I just wanted to make sure opening a ticket is the right thing to do with these questions - my previous tickets were mostly discarded for various reasons.

Quote:
That regexp is only applied to the tag name, so the ‘p id=’ portion will never match.
I see. I think this would be a nice feature. Will create a ticket for it and see what happens.

Quote:
Default behavior is to add all link-targets to the lrf-toc – see the option ‘--no-links-in-toc’.
Thanks.

So maybe my TOC-creating issues are actually not a result of one specific header not getting detected but a result of NO header getting detected and instead creating TOC from the links - those links that I put in my HTML-TOC:
Code:
<h2 id="contents">
<ol>
  <li><a href="#chapter-1">Chapter 1</a></li>
  ...
</ol>
I will check this. It seems plausible to me.


Quote:
If I understand this correctly, there are two issues going on here. [...]
Actually, it was meant as an observation of a strange inconsistency:

A) I have a chapter in my e-book:
Code:
<h2 id="chapter-10">Chapter 10</h2>
<p>Something or whatever.</p>
This chapter appears in the LRF-TOC (either due to chapter detection or due to links being added, see above). But the behavior in the Reader differs depending on
--force-page-break-before-tag=h2
: If the option is used, LRF-TOC item works as expected. If the option is not used, LRF-TOC item actually links to one page before the chapter (if the chapter starts at page 123, link from LRF-TOC takes me to page 122). It is seemingly impossible to get two chapters on one page.

B) I have a footnote in my e-book in the semantically correct form:
Code:
<ol><li id="note-1"><p>Text for footnote 1</p></li></ol>
I couldn't get the links in the text to work correctly at all, no matter what options I tried. Maybe it got fixed in the newer versions of Calibre - when I found the workaround, I never bothered to try again.

C) I have a footnote in the form:
Code:
<ol><li><p id="note-1">Text for footnote 1</p></li></ol>
Multiple footnotes can be on one page and all links to them function correctly (take me to the page with the footnote) - compare it to A) where I need to put each chapter on a separate page.

I understand now why I never got a page break before the footnote, so that's not an issue anymore.

Quote:
Well ideally for each ticket you would create a minimal HTML input file which re-creates the described error.
The problem with this approach is that with earlier releases of Calibre I found that some of the errors only appear with the full book, not with a minimal example. I will try to do it, but I am afraid I might need to upload the whole book - maybe even with the original texts, to be certain.

The good news is that I can demonstrate all of these issues with one book :-)

I'll see about opening those tickets. Thanks for the answer.
Old 08-16-2008, 12:06 PM   #729
kovidgoyal
creator of calibre
Note that the LRF format doesn't support inline links, so it's typically a good idea to force either paragraph or page breaks before an inline link.

The handling of blockquote is deliberate. It gives the best results for "typical" usage of <blockquote>
Old 08-17-2008, 07:48 AM   #730
pepak
Guru
Quote:
Originally Posted by pepak View Post
So maybe my TOC-creating issues are actually not a result of one specific header not getting detected but a result of NO header getting detected and instead creating TOC from the links.
Indeed. When I disabled table generation from links (--no-links-in-toc), no items appeared in LRF-TOC. The problem is that I just can't seem to generate ANY LRF-TOC when --no-links-in-toc is used.
Code:
<h2 id="chapter-2">Something or another</h2>
Code:
any2lrf.exe --force-page-break-before-tag=h2 demo.htm
I guess this is not worthy of a ticket - it's probably not a bug, just my misunderstanding of how chapter detection works. I would appreciate some pointers.

I have generated a demo HTML file for the rest of the issues and will create a new ticket shortly.
Old 08-17-2008, 10:37 AM   #731
kovidgoyal
creator of calibre
You need --add-chapters-to-toc
Old 08-18-2008, 02:35 PM   #732
llasram
Reticulator of Tharn
Quote:
Originally Posted by pepak View Post
I chose HTML because it is device-independent and easy to modify (I spend a lot of time formatting and spell-checking my e-books). I am not so sure whether I want to move to a format which is primarily intended for e-books. I'll do some research about EPUB format and see.
EPUB is basically just XHTML with separate XML metadata (OPF for metadata & multi-file content ordering and NCX for table of contents) all bundled up in a ZIP file. It lets you use HTML for the content without sacrificing consistent metadata while still bundling everything nicely into one file.
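
The bundling step can be sketched with Python's `zipfile` module. This is an illustration of the container layout only: the file names `content.opf` and `toc.ncx`, the function `pack_epub`, and the placeholder contents used to call it are assumptions, and the result is a valid book only if the OPF, NCX and XHTML passed in are themselves valid.

```python
import io
import zipfile

# Tells a reader where the OPF package file lives inside the archive.
CONTAINER_XML = '''<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>'''


def pack_epub(opf, ncx, xhtml_files):
    """Bundle the pieces into an in-memory EPUB-style ZIP and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        # The container format expects 'mimetype' first and stored uncompressed.
        zf.writestr('mimetype', 'application/epub+zip',
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr('META-INF/container.xml', CONTAINER_XML)
        zf.writestr('content.opf', opf)
        zf.writestr('toc.ncx', ncx)
        for name, text in xhtml_files.items():
            zf.writestr(name, text)
    return buf.getvalue()
```

Since the content files stay ordinary XHTML, they can still be edited and spell-checked as before and simply re-packed afterwards.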

Best of luck with your scanning!
Old 08-23-2008, 03:39 AM   #733
pepak
Guru
Quote:
Originally Posted by kovidgoyal View Post
You need --add-chapters-to-toc
I am afraid I still can't get it to work:
Code:
...
<h2 id="chapter-1">Beginning</h2>
<p>Something or anything</p>
...
Code:
any2lrf --no-links-in-toc --force-page-break-before-tag="h2" --add-chapters-to-toc book.htm
No chapters appear in Table of Contents.
Old 08-23-2008, 08:54 AM   #734
kovidgoyal
creator of calibre
Add
Code:
--chapter-attr h2,id,chapter
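
The three comma-separated parts are read here as a tag-name pattern, an attribute name and a pattern for the attribute's value. The sketch below illustrates that reading. It is not calibre's implementation: the helper names are hypothetical, and the choice of `match` for the tag and `search` for the value is an assumption.

```python
import re


def parse_chapter_attr(spec):
    """Split 'tag regexp,attribute name,value regexp' into its three parts."""
    tag_re, attr, value_re = spec.split(',')
    return (re.compile(tag_re, re.IGNORECASE), attr,
            re.compile(value_re, re.IGNORECASE))


def is_chapter(tag, attrs, spec):
    """Return True if an element with this tag name and attribute dict counts as a chapter."""
    tag_pat, attr, value_pat = parse_chapter_attr(spec)
    value = attrs.get(attr)
    if value is None:
        return False
    return bool(tag_pat.match(tag)) and bool(value_pat.search(value))
```

Under this reading, `h2,id,chapter` selects `<h2 id="chapter-1">` but not `<h2 id="contents">`.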
Old 08-24-2008, 09:57 AM   #735
pepak
Guru
Quote:
Originally Posted by kovidgoyal View Post
Add
Code:
--chapter-attr h2,id,chapter
That lists each chapter twice for some reason.