05-19-2009, 12:31 PM | #1 |
Member
Posts: 10
Karma: 10
Join Date: Nov 2007
Device: N800
|
Scripting with epub-meta
In the dark days before Calibre, I used Tellico to manage my ebook collection. Tellico is great for some things, but it wasn't really meant to be an ebook manager, and I would like to convert my massive epub collection over to Calibre. The problem is that I have spent many hours carefully organizing my collection, and the thought of redoing everything in Calibre is rather daunting. I have exported my Tellico collection as an XML file, and it has occurred to me that I might be able to use some kind of bash script together with epub-meta to write the book title, author, and description to the actual file, which Calibre can then read. The problem is I know little about bash scripting, even less about XML, and I'm not sure if there's something else I would need to tie them all together. Does anyone by chance know of a script out there already capable of doing this, or something similar? If not, is there at least someone who wouldn't mind giving me a general idea of where to start?
|
05-20-2009, 01:20 AM | #2 |
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
|
Ok, I definitely am not the best person to answer...
But off the top of my head, you will need to know which file(s) make up the book database. You will also need to know the file structure (how the data is listed) in each of the affected files. Since XML is generally human-readable, you might be able to simply append the necessary data to the existing file(s) and save. Somehow, I doubt it is going to be this easy. |
05-20-2009, 06:55 AM | #3 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Bash scripting doesn't have the best support for XML parsing...
Depending on how the data is written to the file, it could be either very difficult or very easy. Say the data looks like this:
Code:
<book>
  <path>/home/you/books/author/title.ext</path>
  <title>Test Title</title>
  <author>Test Author</author>
</book>
<book>
  <path>/home/you/books/author/title2.ext</path>
  <title>Test Title</title>
  <author>Test Author</author>
</book>
|
05-20-2009, 10:37 AM | #4 |
Member
Posts: 10
Karma: 10
Join Date: Nov 2007
Device: N800
|
That is, in fact, pretty much how the file is arranged...here's an example:
Code:
<entry id="21" >
  <title>The Shadow of the Lion</title>
  <authors>
    <author>Lackey, Mercedes</author>
    <author>Flint, Eric</author>
    <author>Freer, Dave</author>
  </authors>
  <binding>Paperback</binding>
  <publisher>Baen</publisher>
  <pub_year>2003</pub_year>
  <isbn>0-7434-7147-4</isbn>
  <pages>936</pages>
  <languages>
    <language>English</language>
  </languages>
  <keywords>
    <keyword>Historical fiction</keyword>
    <keyword>Fantasy fiction</keyword>
    <keyword>16th century</keyword>
    <keyword>Science fiction</keyword>
  </keywords>
  <cover>2721667105d9edee31ffa87663fa0863.jpeg</cover>
  <comments>Description of the book</comments>
  <file>file:///share/fiction/Lackey,%20Mercedes/The%20Shadow%20of%20the%20Lion.txt</file>
|
05-20-2009, 11:17 AM | #5 |
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
|
A scriptable text editor could do the job, too, if you're already familiar with one.
m a r |
05-20-2009, 11:42 AM | #6 |
Wizzard
Posts: 1,402
Karma: 2000000
Join Date: Nov 2007
Location: UK
Device: iPad 2, iPhone 6s, Kindle Voyage & Kindle PaperWhite
|
For the XML in Python, I've found BeautifulSoup to be easy to get started with.
('>>>' is the Python command prompt when it's run interactively), e.g.
Code:
>>> from BeautifulSoup import BeautifulStoneSoup
>>> file = open('example.xml')  # your example above with a </entry> appended
>>> soup = BeautifulStoneSoup(file)
>>> soup.title.string
u'The Shadow of the Lion'
>>> soup.findAll('author')
[<author>Lackey, Mercedes</author>, <author>Flint, Eric</author>, <author>Freer, Dave</author>]
>>> soup.findAll('author')[0].string
u'Lackey, Mercedes'
>>> keywords = soup.keywords
>>> keywords.findAll('keyword')
[<keyword>Historical fiction</keyword>, <keyword>Fantasy fiction</keyword>, <keyword>16th century</keyword>, <keyword>Science fiction</keyword>]
>>> [k.string for k in keywords.findAll('keyword')]
[u'Historical fiction', u'Fantasy fiction', u'16th century', u'Science fiction']
>>>
|
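For anyone without BeautifulSoup installed, roughly the same lookups can be sketched with Python's standard-library xml.etree.ElementTree. This is a swapped-in alternative, not what the session above uses. SAMPLE is a trimmed copy of the entry from post #4 with the closing </entry> added, and read_entry is a name made up for illustration. A full Tellico export may declare an XML namespace on its root element, in which case the tag names would need that namespace prefix.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the entry from post #4, with </entry> appended so the
# fragment is well-formed XML.
SAMPLE = """
<entry id="21">
  <title>The Shadow of the Lion</title>
  <authors>
    <author>Lackey, Mercedes</author>
    <author>Flint, Eric</author>
    <author>Freer, Dave</author>
  </authors>
  <keywords>
    <keyword>Historical fiction</keyword>
    <keyword>Fantasy fiction</keyword>
  </keywords>
  <comments>Description of the book</comments>
</entry>
"""


def read_entry(entry):
    """Collect the fields of interest from one <entry> element into a dict."""
    return {
        "title": entry.findtext("title", default=""),
        "authors": [a.text for a in entry.iter("author")],
        "keywords": [k.text for k in entry.iter("keyword")],
        "comments": entry.findtext("comments", default=""),
    }


book = read_entry(ET.fromstring(SAMPLE))
print(book["title"])
print(book["authors"])
print(book["keywords"])
```

For a whole collection, ET.parse(filename).getroot().iter("entry") would yield each entry in turn, assuming the export carries no namespace.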
05-20-2009, 12:15 PM | #7 |
Guru
Posts: 753
Karma: 1496807
Join Date: Jul 2008
Location: The Third World
Device: iLiad + PRS-505 + Kindle 3
|
Beautiful Soup is fantastic!
I'd couple it with SQLite to enter the data directly in calibre's metadata.db file. In simple steps:
1. Import all the files into calibre (without metadata, just the title, maybe in the filename)
2. Export all that data to CSV from SQLite
3. Enrich the CSV with Python/BeautifulSoup
4. Load the data into the database with SQLite
|
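Step 3 of that plan (enriching the exported CSV) might look something like the sketch below. Everything here is an assumption for illustration: the column names id, title, authors and comments are hypothetical and are not calibre's actual metadata.db schema, and the sample data is held in strings so the example runs on its own.

```python
import csv
import io

# Hypothetical export from step 2: just an id and a title per book.
EXPORTED = "id,title\n1,The Shadow of the Lion\n2,Unknown Book\n"

# Metadata recovered from the Tellico XML, keyed by title (illustrative values).
TELLICO = {
    "The Shadow of the Lion": {
        "authors": "Mercedes Lackey & Eric Flint & Dave Freer",
        "comments": "Description of the book",
    },
}


def enrich(exported_csv, metadata_by_title):
    """Return a new CSV string with authors and comments columns filled in."""
    reader = csv.DictReader(io.StringIO(exported_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["id", "title", "authors", "comments"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Titles with no match in the Tellico data keep empty columns.
        extra = metadata_by_title.get(row["title"], {})
        writer.writerow({
            "id": row["id"],
            "title": row["title"],
            "authors": extra.get("authors", ""),
            "comments": extra.get("comments", ""),
        })
    return out.getvalue()


enriched = enrich(EXPORTED, TELLICO)
print(enriched)
```

Matching on title alone is fragile if two books share a title, so a real run would probably want a second key such as the file name.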
05-20-2009, 12:33 PM | #8 |
creator of calibre
Posts: 44,336
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I would recommend using the calibredb command to add books with their metadata to the calibre database.
Use
calibredb add
calibredb set_metadata
calibredb add_format
|
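As a rough sketch of how a script could drive calibredb add, the snippet below only builds the argument list and prints it; it does not run calibre. The --title, --authors and --tags options are taken from the current calibre manual and may not match the 2009 release, and joining authors with " & " is likewise an assumption based on calibre's usual convention. calibredb_add_args is a made-up helper name.

```python
import shlex


def calibredb_add_args(path, title, authors, tags=()):
    """Build the argument list for one `calibredb add` call (not executed here)."""
    args = ["calibredb", "add", "--title", title,
            "--authors", " & ".join(authors)]
    if tags:
        args += ["--tags", ",".join(tags)]
    args.append(path)
    return args


args = calibredb_add_args(
    "/share/fiction/Lackey, Mercedes/The Shadow of the Lion.txt",
    "The Shadow of the Lion",
    ["Mercedes Lackey", "Eric Flint", "Dave Freer"],
    tags=["Unread", "Tellico"],
)
# Show the command as it would be typed in a shell (needs Python 3.8+).
print(shlex.join(args))
```

Passing the list to subprocess.run(args, check=True) would execute it without any shell quoting to worry about.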
05-20-2009, 05:45 PM | #9 |
Member
Posts: 10
Karma: 10
Join Date: Nov 2007
Device: N800
|
Thanks everyone for your help! I've actually gotten surprisingly far given that this morning I knew absolutely nothing about Python. Beautiful Soup has made it really easy, but now that I've gotten as far as writing the data to the file, I've run into a bit of an issue. When I tried to write the author to the file using epub-meta, the result was "Author: Lackey & Mercedes". I was completely confused as to how this was happening until I tried to do it manually and realized it does the same thing. Apparently it takes the input "Lackey, Mercedes" as specifying two different authors. Is there any way to change this, or should I take each author in the format "lastname, firstname" and convert it to "firstname lastname" before writing the metadata?
|
05-20-2009, 08:05 PM | #10 |
creator of calibre
Posts: 44,336
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Use firstname lastname.
|
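A minimal sketch of that conversion, assuming every name has at most one comma separating the last name from the first (first_last is a name made up for illustration):

```python
def first_last(name):
    """Turn 'Lastname, Firstname' into 'Firstname Lastname'; leave other names alone."""
    last, sep, first = name.partition(",")
    if not sep:
        # No comma, so the name is assumed to be in display order already.
        return name.strip()
    return f"{first.strip()} {last.strip()}"


print(first_last("Lackey, Mercedes"))
```

Names with a suffix after a second comma (for example a trailing "Jr.") would come out wrong with this approach and would need special handling.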
05-27-2009, 08:07 PM | #11 |
Member
Posts: 10
Karma: 10
Join Date: Nov 2007
Device: N800
|
Thank you all for your help! After a few hours of studying I have the following (which, I know, is horrible, but it works):
Code:
#import os for executing commands
import os
#import BeautifulSoup
from BeautifulSoup import BeautifulStoneSoup
#define file to use
file = open('/home/michelle/ebookcollection.xml')
#soup it
soup = BeautifulStoneSoup(file)
#read all entries into an array books
for book in soup.fetch('entry'):
    #set book title
    bookTitle = book.title.string
    #set book file
    bookFile = book.file.string
    bookFile = bookFile.replace('%20', ' ')
    bookFile = bookFile.replace('%27','\'')
    #determine if book is part of series and set requisite values
    bookSeries=""
    bookSeriesIndex=""
    if book.series is not None:
        bookSeries=book.series.string
    if book.series_num is not None:
        bookSeriesIndex=book.series_num.string
    #if book has description, set that as well
    bookComment = ""
    if book.comments is not None:
        bookComment=book.comments.string
    bookAuthors = ""
    bookTags = "Unread, Tellico"
    for a in book.fetch('author'):
        thisAuthor = a.string
        if thisAuthor.find(",") > -1:
            firstName = thisAuthor[(thisAuthor.find(",") + 2):]
            lastName = thisAuthor[:thisAuthor.find(",")]
            thisAuthor = firstName + " " + lastName
        bookAuthors = bookAuthors + thisAuthor +", "
    bookAuthors = bookAuthors.strip(", ")
    print "Title: " + bookTitle
    print "Authors: " + bookAuthors
    print "File: " + bookFile
    print "Series: " + bookSeries
    print "Series Number: " + bookSeriesIndex
    print "Description: " + bookComment
    print "Tags: " + bookTags
    #write it out
    command = "epub-meta -t \"" + bookTitle + "\" -a \"" + bookAuthors + "\" --comment=\"" + bookComment + "\" --series=\"" + bookSeries + "\" --series-index=\"" + bookSeriesIndex +"\" --tags=\"" + bookTags + "\" \"" + bookFile + "\""
    os.system(command)
|
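Two parts of the script above are easy to trip over: the hand-rolled %20/%27 replacements only cover two escape sequences, and a title or description containing a double quote would break the shell command string. A hedged Python 3 sketch of one way around both is below. It decodes the file URL with urllib.parse and builds an argument list instead of a command string. The epub-meta flags are copied from the script above, file_url_to_path and epub_meta_args are made-up names, and nothing is executed.

```python
from urllib.parse import unquote, urlparse


def file_url_to_path(url):
    """Decode a file:// URL (as stored by Tellico) into a plain filesystem path."""
    return unquote(urlparse(url).path)


def epub_meta_args(path, title, authors, comment="", tags=""):
    """Argument list for an epub-meta call, using the flags from the script above."""
    return ["epub-meta", "-t", title, "-a", authors,
            "--comment=" + comment, "--tags=" + tags, path]


path = file_url_to_path(
    "file:///share/fiction/Lackey,%20Mercedes/The%20Shadow%20of%20the%20Lion.txt"
)
# A title with embedded quotes stays intact because no shell parses it.
args = epub_meta_args(path, 'A "Quoted" Title', "Mercedes Lackey",
                      tags="Unread, Tellico")
print(path)
print(args)
```

Handing the list to subprocess.run(args) passes each value to the program as-is, so no escaping is needed.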
05-27-2009, 09:57 PM | #12 |
creator of calibre
Posts: 44,336
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Just wait a couple of weeks; calibre 0.6 will have an ebook-meta utility that will allow you to set the ISBN and the cover.
|
05-27-2009, 09:58 PM | #13 |
Member
Posts: 10
Karma: 10
Join Date: Nov 2007
Device: N800
|
perfect! thank you!
|
05-27-2009, 11:11 PM | #14 |
hopeless n00b
Posts: 5,110
Karma: 19597086
Join Date: Jan 2009
Location: in the middle of nowhere
Device: PW4, PW3, Libra H2O, iPad 10.5, iPad 11, iPad 12.9
|
|
05-27-2009, 11:49 PM | #15 |
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
|
Oh my... weeks.
|
|