Old 12-31-2009, 06:12 PM   #1
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
(X)HTML Metadata

What metadata does Calibre support for (X)HTML input? Is there a setting somewhere? I couldn't find anything in the User Manual, but I may have missed it. My experiment with importing XHTML wasn't too effective.

It seems to be able to output such data well, though.

We're having a discussion of Sigil's support for Dublin Core metadata in the <head> of (X)HTML docs used as input sources here. Near the bottom is a list of possible DCTERMS that cover most useful metadata as they relate to book collections, and some outliers. There is a specific list of what Sigil currently supports here.
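
For readers unfamiliar with the idea, here is a minimal Python sketch of reading Dublin Core values out of an XHTML <head>. It assumes the DCMI convention of `<meta name="DCTERMS.xxx" content="...">` elements plus a `schema.DCTERMS` link; the thread does not say which exact form Sigil or Calibre expects, so treat the naming as an assumption. The function name `read_dc_metadata` and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace once the file is parsed as XML.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical sample document; the values are illustrative only.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>The Man of Bronze</title>
  <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
  <meta name="DCTERMS.title" content="The Man of Bronze" />
  <meta name="DCTERMS.creator" content="Kenneth Robeson" />
  <meta name="DCTERMS.subject" content="Mystery" />
  <meta name="DCTERMS.subject" content="Suspense" />
</head>
<body><p>...</p></body>
</html>"""


def read_dc_metadata(xhtml_text):
    """Collect DC/DCTERMS meta values from well-formed XHTML.

    Returns a dict mapping the lowercased term (e.g. "title") to a list
    of values, since a term such as "subject" may appear more than once.
    """
    root = ET.fromstring(xhtml_text)
    found = {}
    for meta in root.iter(XHTML_NS + "meta"):
        name = meta.get("name", "")
        prefix, _, term = name.partition(".")
        if prefix.upper() in ("DC", "DCTERMS") and term:
            found.setdefault(term.lower(), []).append(meta.get("content", ""))
    return found


if __name__ == "__main__":
    for term, values in read_dc_metadata(SAMPLE).items():
        print(term, values)
```

This only works on well-formed XHTML; tag-soup HTML would need a more forgiving parser.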

Zipped, single-file XHTML is my storage format of choice, as it should convert easily to pretty much anything. I also edit a lot of files into decently marked-up XHTML. It'd be nice to have Calibre recognize the info.

Thanks!

m a r
Old 12-31-2009, 06:37 PM   #2
kovidgoyal
creator of calibre
Posts: 26,468
Karma: 5383257
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
IIRC the HTML metadata reader is optimized for the output of the ereader2html script as that is the most common use case.
Old 12-31-2009, 07:57 PM   #3
rogue_ronin
Really?

I just imported 180 or so Doc Savage books from Blackmask, using the "Add books from directories, including sub-directories" option of Calibre. Did a fine job, but the title could only be grabbed from the filename. I deleted them, then wrote a bash script to rename all the files from the folder names (which had the full titles) and re-imported. Then a bulk metadata edit took care of most of the rest.
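
The original bash script is not shown in the post. As a rough illustration of the same idea, here is a Python sketch that renames each HTML file after the folder that contains it. It assumes one HTML file per book folder, directly under a common root; the function name `rename_from_folders` and the `dry_run` parameter are hypothetical.

```python
from pathlib import Path


def rename_from_folders(root, dry_run=True):
    """Rename root/<Full Title>/<anything>.htm to root/<Full Title>/<Full Title>.htm.

    Returns the list of (old_path, new_path) pairs. With dry_run=True
    nothing on disk is changed, so the plan can be reviewed first.
    """
    planned = []
    claimed = set()
    for html_file in sorted(Path(root).glob("*/*.htm*")):
        target = html_file.with_name(html_file.parent.name + html_file.suffix)
        # Skip files already named correctly, and never overwrite or
        # plan two renames onto the same target name.
        if target.exists() or target in claimed:
            continue
        claimed.add(target)
        planned.append((html_file, target))
        if not dry_run:
            html_file.rename(target)
    return planned
```

Running it once with `dry_run=True` and reading the returned pairs before renaming for real is the safer order of operations.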

So, not terrible. And these books don't have the best choice of metadata fields or particularly strict values, e.g.:

Code:
<TITLE>THE MAN OF BRONZE</TITLE>
<META NAME="Author" CONTENT="A Doc Savage Adventure by Kenneth Robeson">
<META NAME="Description" CONTENT="Mystery, Suspense, History, Gothic, Literature, Books, Arts">
and the original filename was manbronze.htm. Still, it might be useful, especially in the future as things might get better tagged -- if apps support it.
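
To show what "recognizing the info" could look like for tags like the ones above, here is a small sketch using Python's standard `html.parser`. It is not how Calibre's HTML metadata reader works (the thread does not describe that); the class and function names are invented for illustration.

```python
from html.parser import HTMLParser

# The three header lines quoted in the post above.
SAMPLE = """<TITLE>THE MAN OF BRONZE</TITLE>
<META NAME="Author" CONTENT="A Doc Savage Adventure by Kenneth Robeson">
<META NAME="Description" CONTENT="Mystery, Suspense, History, Gothic, Literature, Books, Arts">
"""


class HeadMetadataParser(HTMLParser):
    """Collects the <title> text and every <meta name=... content=...> pair."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = attr.get("content")
            if name and content is not None:
                self.metadata[name] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            self.metadata["title"] = "".join(self._title_parts).strip()

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def read_head_metadata(html_text):
    """Return a dict of lowercased metadata names to their raw values."""
    parser = HeadMetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


if __name__ == "__main__":
    print(read_head_metadata(SAMPLE))
```

Even with the tags read correctly, the values would still need cleanup: the "Author" field here carries a series blurb, and "Description" is really a list of subjects.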

Sort of a chicken-and-egg thing.

Thanks,

m a r
Old 12-31-2009, 10:24 PM   #4
kovidgoyal
Hmmm, maybe HTML metadata reading is broken. Open a ticket and I'll take a look at it when I have some time.
Old 12-31-2009, 11:01 PM   #5
rogue_ronin
Sure thing.

Thanks,

m a r

EDIT: Done!

Last edited by rogue_ronin; 12-31-2009 at 11:31 PM.
Tags
dublin core, html, metadata, metadata import, xhtml

All times are GMT -4.