Old 12-05-2010, 06:11 PM   #1
mornington
Connoisseur
Posts: 63
Karma: 732
Join Date: Nov 2010
Device: Sony PRS-650
Creating Kindle Collections from Calibre's Data

Attached is a configurable Python script which creates/updates Kindle collections based on the data in Calibre. The simplest way to use it is:

0) Download and install Python 2.6 or 2.7 on your computer. Then extract the script from the attached zipfile and copy the .py file to the root of your kindle

1) Set up your tags/series/author metadata in Calibre as normal. Any tags which you want to become kindle collections should start with a dash (-) character

2) Set Calibre's "metadata management" setting to "automatic management" in Preferences->Send books to device

3) Attach your kindle and allow calibre to send its metadata to it

4) Run the script (with Python installed, you can run it, in Windows, just by double-clicking on the copy of the script in your Kindle drive's root folder)

5) Restart your kindle once it's finished (using Home->Menu->Settings->Menu->Restart)

Once it's restarted, the kindle will take a few minutes (depending on how many books/collections are on it) to sort out what goes where but once it's done all your new collections should be created and populated.
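To make step 1 a bit more concrete, here is a minimal sketch of how dash-prefixed tags could be turned into collections. It is not the code from the attached script. It assumes the calibre metadata file on the Kindle is a JSON list of book records with "tags" and "lpath" fields; the function name and the sample data are made up for illustration.

```python
import json

def collections_from_tags(books, prefix="-"):
    """Map each tag starting with `prefix` to the paths of the books carrying it."""
    collections = {}
    for book in books:
        for tag in book.get("tags") or []:
            if tag.startswith(prefix):
                collections.setdefault(tag, []).append(book["lpath"])
    return collections

# Hypothetical sample standing in for the metadata calibre writes to the device
sample = json.loads("""[
  {"lpath": "documents/a.mobi", "tags": ["- Thrillers", "Anthology"]},
  {"lpath": "documents/b.mobi", "tags": ["- Thrillers"]},
  {"lpath": "documents/c.mobi", "tags": []}
]""")
tag_collections = collections_from_tags(sample)
```

Tags without the leading dash ("Anthology" in the sample) are simply ignored, which is what lets you keep ordinary tags in Calibre without them all turning into collections.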


The script has a lot of configuration options, including ones which limit the number of collections created. They are detailed in the enormous comment at the start of the .py file itself (you can open it in Notepad to read it).

For example, by default an author will only get his/her own collection if he/she has at least 4 books on your kindle. Values like that can be changed, and the comments in the script explain what can be changed and how.
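The author threshold can be sketched roughly like this. Again, this is an illustration under assumptions rather than the script's own code: the constant name, the function name and the "authors" and "lpath" fields of each book record are assumed.

```python
from collections import defaultdict

MIN_BOOKS_PER_AUTHOR = 4  # the default mentioned above; the constant name is made up

def author_collections(books, min_books=MIN_BOOKS_PER_AUTHOR):
    """Return {"- Author": [paths]} for authors with at least min_books books."""
    paths_by_author = defaultdict(list)
    for book in books:
        for author in book.get("authors") or []:
            paths_by_author[author].append(book["lpath"])
    # Keep only authors who reach the threshold
    return dict(("- " + author, paths)
                for author, paths in paths_by_author.items()
                if len(paths) >= min_books)

# Hypothetical sample: four books by one author, one book by another
sample_books = [{"lpath": "documents/erikson%d.mobi" % i, "authors": ["Steven Erikson"]}
                for i in range(4)]
sample_books.append({"lpath": "documents/akunin.mobi", "authors": ["Boris Akunin"]})
author_result = author_collections(sample_books)
```

With the default of 4, only the first author gets a collection; lowering min_books to 1 would give every author one.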


Note on the thread:

This topic has been discussed in several other threads (and other websites), most notably https://www.mobileread.com/forums/sho...06#post1253906 but I thought it should be made a bit more visible, hence this thread.

The attached script is based on various people's prior work, so thanks muchly to them. And apologies to Python developers for the probably poor quality of the code but I've never written anything in Python before (so if you want somebody to turn this into a Calibre plugin don't look at me).
Attached Files
File Type: zip CalibreKindleCollections.zip (9.4 KB, 891 views)
Old 12-07-2010, 01:55 AM   #2
Mixx
Zealot
Posts: 143
Karma: 387
Join Date: Sep 2010
Device: Kindle 3
Great, thanks a million, much appreciated!

Regards, Mixx
Old 12-10-2010, 03:49 PM   #3
mornington
Connoisseur
Singing and Dancing version of the script

Attached is the latest incarnation of the script, which is a lot more powerful (though mostly configurable via variables in the script itself, rather than command-line flags). Some things can be set from the command line, though: for example, it can now just sort the existing collections into alphabetical order (by setting lastAccess) or just produce a report of the collections on the kindle.
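The lastAccess trick can be sketched as follows. This is a simplified illustration, not the script's code. It assumes the collections are held as a dictionary mapping each collection name to a record with a numeric "lastAccess" timestamp in milliseconds, and that the Kindle lists collections newest-first by that value; the function name is made up.

```python
import time

def sort_collections_alphabetically(collections, now_ms=None):
    """Set lastAccess so that a newest-first listing comes out alphabetical."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    names = sorted(collections, key=lambda name: name.lower())
    for offset, name in enumerate(names):
        # Names earlier in the alphabet get a later timestamp, so they list first
        collections[name]["lastAccess"] = now_ms - offset * 1000
    return collections

# Hypothetical sample data
demo = {
    "- Zelazny": {"items": [], "lastAccess": 0},
    "- Asimov": {"items": [], "lastAccess": 0},
}
sort_collections_alphabetically(demo, now_ms=1000000)
```

Note that opening a collection on the Kindle afterwards presumably updates its lastAccess, so the alphabetical order would only hold until the sort is re-run.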

The by-author collection creation is more flexible now as well. It can be more selective about which authors get their own collections by only counting books which aren't part of a collaborative series and/or which aren't anthologies of stories by multiple authors, and bits and pieces like that. This is set via variables in the script, not command-line options, since this expansion was for my own use. The details are in the script.

It can also auto-create by-genre "miscellany" collections by "intelligently" working out from tags and other metadata what to put in each one so they don't get too enormous (for example, this lets me label Perry Rhodan books as "Space Opera" in Calibre along with all the other space opera books/stories but to automatically omit Perry Rhodan books from the "Space Opera miscellany" collection when that gets created so that other such books are easier to find in that collection).

Anyways, not sure how much of that new code would be useful to other people but it's all in the attached script and pretty well commented (though set up for my tags, obviously), so I've attached it in case somebody finds it useful.

In the zipfile are also my standard MS-DOS batch files, to give some idea of how command-line flags can be combined usefully, along with my notes on the script.

Last edited by mornington; 12-17-2010 at 07:30 AM. Reason: New version of "singing and dancing" version of script is attached further down the thread
Old 12-10-2010, 04:37 PM   #4
mornington
Connoisseur
Just a quick note on the singing-and-dancing script's miscellany-by-genre collections. Each one is defined in the script with rules which look like this:

createMiscellanyCollection("- Thrillers Miscellany",
                           "Thrillers",
                           "- Robert Ludlum|- Dexter (Jeff Lindsay)",
                           True, True, False)

This example is an instruction to create a collection called "- Thrillers Miscellany" which is to contain all books which are tagged "Thrillers" and which have not already been placed into either of the collections "- Robert Ludlum" or "- Dexter (Jeff Lindsay)" AND which have not previously been put into a by-author collection (the first "True") OR a by-tag collection (the second "True"). The "False" indicates that I don't care if the book's been put into a by-series collection (a True there would exclude any such books from this new collection as well).

Just thought I'd mention it, as there's no real explanatory comment in the script about these, particularly about the meaning of the three True/False values at the end of each definition (like I said, this expansion was something I wrote essentially as a bespoke script for my own use).
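Since there's no explanatory comment in the script, here is a stand-alone sketch of the selection rule described above. It is not the script's createMiscellanyCollection; the function name, its parameters and the `placed` bookkeeping structure are all assumptions made for illustration.

```python
def miscellany_members(books, placed, tag, excluded_collections,
                       skip_author, skip_tag, skip_series):
    """Return paths of books tagged `tag` that the exclusion rules let through.

    `placed` maps a book path to (kind, collection_name) pairs recording where
    the book has already been filed; kind is "author", "tag" or "series".
    """
    excluded = set(excluded_collections.split("|"))
    skip_kinds = set(kind for kind, skip in (("author", skip_author),
                                             ("tag", skip_tag),
                                             ("series", skip_series)) if skip)
    members = []
    for book in books:
        if tag not in (book.get("tags") or []):
            continue
        filed = placed.get(book["lpath"], ())
        # Skip books already in a named collection or in a skipped kind of collection
        if any(name in excluded or kind in skip_kinds for kind, name in filed):
            continue
        members.append(book["lpath"])
    return members

# Hypothetical sample data
books = [
    {"lpath": "documents/ludlum.mobi", "tags": ["Thrillers"]},
    {"lpath": "documents/series.mobi", "tags": ["Thrillers"]},
    {"lpath": "documents/loose.mobi", "tags": ["Thrillers"]},
    {"lpath": "documents/opera.mobi", "tags": ["Space Opera"]},
]
placed = {
    "documents/ludlum.mobi": [("author", "- Robert Ludlum")],
    "documents/series.mobi": [("series", "- Some Series")],
}
thrillers = miscellany_members(books, placed, "Thrillers",
                               "- Robert Ludlum|- Dexter (Jeff Lindsay)",
                               True, True, False)
```

In the sample, the Ludlum book is left out, the book in a by-series collection stays in (the third value is False), and the untagged-as-thriller book never qualifies.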
Old 12-11-2010, 06:58 PM   #5
Gary_M_Mugford
Groupie
Posts: 180
Karma: 299
Join Date: Jul 2010
Location: Brampton ON
Device: Kobo, Kindle3
Mornington,

A few questions before I really dive into your script, if I may. [1] I have about 200-300 books on my Kindle3, while calibre has considerably more. Does the script know to ONLY handle books "On the Device"? [2] I have a very limited number of tags, rarely more than one tag per book. AND my biggest chunk of books are tagless, each being assumed to be Science Fiction. Does that 'management' style prevent your script from being useful? [3] Would it be possible to allow me to choose WHICH column the collections would be extracted from? i.e. I could create a column called CollSets and populate it with my collection sub-sets. [4] Along the same vein, there already exists in my collection, a column that I call Anticipation, which I fill with 0-9. A book I want to read, and read today, ranks a 0. Books I will eventually get around to when nothing else is available, ranks a 9. I would certainly be interested in the ability to create a collection based on Anticipation.

I'm not the world's greatest programmer, and the python language will be new to me, but if you give me some encouragement on the four questions, I will probably take the time to take a shot at it.

Thanks for all your efforts, GM
Old 12-11-2010, 09:54 PM   #6
mornington
Connoisseur
Quote:
Originally Posted by Gary_M_Mugford
Mornington,

A few questions before I really dive into your script, if I may. [1] I have about 200-300 books on my Kindle3, while calibre has considerably more. Does the script know to ONLY handle books "On the Device"?
It works only with the metadata which calibre sends to the kindle, not with the main calibre database, so it can only "see" what's actually on the kindle.

Quote:
Originally Posted by Gary_M_Mugford
[2] I have a very limited number of tags, rarely more than one tag per book. AND my biggest chunk of books are tagless, each being assumed to be Science Fiction. Does that 'management' style prevent your script from being useful?
The script can create collections based on tags, series and/or authors. Personally, I use maybe a dozen tags, with most books having just one or two (usually just a genre and maybe "Anthology" or "Short Story"), and I create collections based just on author and tags.

Simplest way to see would be to run the script on your Kindle and see what collections it comes up with, then maybe tweak your tags/series in calibre accordingly.

Tip: In the script, find this bit of code:

def saveCollections():
    cf = open(COLLECTIONS,'wb')
    json.dump(kindleC,cf)
    cf.close()

and change it to:

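def saveCollections():
    pass
    # cf = open(COLLECTIONS,'wb')
    # json.dump(kindleC,cf)
    # cf.close()

(i.e., add a # symbol at the front of the three lines below "def saveCollections", and add a "pass" line so that the function still has a body; without it Python will refuse to run the script)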

If you do the above then you can run your script and it will output a list of collections (and their contents) which it *would have* created based on your current settings and metadata but it won't actually make any changes to the collections on your kindle. Once you're ready to make changes, just remove those three # symbols and re-run the script.
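An alternative to commenting code in and out would be a dry-run switch. The sketch below is only an illustration of that idea, not something the attached script provides: the function signature, the dry_run parameter and the demo path are made up.

```python
import json
import os
import tempfile

def save_collections(collections, path, dry_run=False):
    """Write collections to path as JSON unless dry_run is set; return True if written."""
    if dry_run:
        return False  # leave the file on the kindle untouched
    with open(path, "w") as cf:
        json.dump(collections, cf)
    return True

# Demo against a temporary folder rather than a real kindle
demo_path = os.path.join(tempfile.mkdtemp(), "collections.json")
demo_collections = {"- Thrillers": {"items": [], "lastAccess": 0}}
wrote_dry = save_collections(demo_collections, demo_path, dry_run=True)
exists_after_dry = os.path.exists(demo_path)
wrote_real = save_collections(demo_collections, demo_path)
```

The report would still be printed in both cases; only the final write is skipped on a dry run.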

Quote:
Originally Posted by Gary_M_Mugford
[3] Would it be possible to allow me to choose WHICH column the collections would be extracted from? i.e. I could create a column called CollSets and populate it with my collection sub-sets.
I did consider this at one point myself. It's doable, but I ended up just going with the tags as it seemed simpler to manage.

Quote:
Originally Posted by Gary_M_Mugford
[4] Along the same vein, there already exists in my collection, a column that I call Anticipation, which I fill with 0-9. A book I want to read, and read today, ranks a 0. Books I will eventually get around to when nothing else is available, ranks a 9. I would certainly be interested in the ability to create a collection based on Anticipation.

You can always write the code. If you have a look at the calibre metadata file on your kindle (open it in a text editor) you'll see it's in JSON format and contains more metadata than this script currently uses, including custom columns. So the metadata's accessible; you'd just need to read and process it in the loadCalibre function in the script.
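As a starting point, here is a rough sketch of grouping books by a custom column. It rests on assumptions you should check against your own metadata file: that each book record has a "user_metadata" dictionary keyed by the column's lookup name (such as "#anticipation") and that the value sits under a "#value#" key. The function names and the collection name format are made up.

```python
def custom_column_value(book, column):
    """Return the value of a custom column such as "#anticipation", or None."""
    field = (book.get("user_metadata") or {}).get(column) or {}
    return field.get("#value#")

def collections_by_column(books, column, name_format="- Anticipation %s"):
    """Group book paths into one collection per distinct column value."""
    collections = {}
    for book in books:
        value = custom_column_value(book, column)
        if value is not None:
            collections.setdefault(name_format % value, []).append(book["lpath"])
    return collections

# Hypothetical sample records
sample = [
    {"lpath": "documents/a.mobi",
     "user_metadata": {"#anticipation": {"#value#": 0}}},
    {"lpath": "documents/b.mobi", "user_metadata": {}},
]
anticipation = collections_by_column(sample, "#anticipation")
```

Books with no value in the column are left out rather than lumped into a catch-all collection.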
Old 12-12-2010, 04:14 AM   #7
ebookrights
Banned
Posts: 63
Karma: 12
Join Date: Nov 2010
Device: none
So to only make collections based on series, and nothing else, would I run something like this:

C:\Python27\python.exe "K:\CalibreKindleCollections.py" --notags --noauthors > CalibreKindleCollections_log.txt

Last edited by ebookrights; 12-12-2010 at 04:17 AM.
Old 12-12-2010, 01:09 PM   #8
mornington
Connoisseur
Quote:
Originally Posted by ebookrights
So to only make collections based on series, and nothing else, would I run something like this:

C:\Python27\python.exe "K:\CalibreKindleCollections.py" --notags --noauthors > CalibreKindleCollections_log.txt
Yup. The " > CalibreKindleCollections_log.txt" at the end isn't compulsory, though. That just sends the output to a text file for reading later, rather than displaying it on the screen.
Old 12-15-2010, 06:59 PM   #9
Ashby
Junior Member
Posts: 2
Karma: 10
Join Date: Dec 2010
Device: kindle 3
I ran the python script as outlined above on my kindle 3. It generated collections and sorted my books as it was supposed to, but now when I sort by "Most Recent First" the sort order appears in reverse order. The most recently read book shows up last. Did the script do this or is it something with my kindle? Thanks in advance!
Old 12-16-2010, 12:06 AM   #10
mornington
Connoisseur
Quote:
Originally Posted by Ashby
I ran the python script as outlined above on my kindle 3. It generated collections and sorted my books as it was supposed to, but now when I sort by "Most Recent First" the sort order appears in reverse order. The most recently read book shows up last. Did the script do this or is it something with my kindle? Thanks in advance!
The script only sorts collections, not books. However, I have noticed that sometimes the kindle loses track of what the current date is, which can result in it putting the wrong date on books when you read them, with the effect that you describe (i.e. most recently accessed books appear last in the list, rather than first).

I've raised this with Amazon, and apparently the date/time isn't properly maintained by the kindle but instead is synced via "whispernet". Which is insane if you ask me: I've seen toothbrushes with built-in clocks which can happily track the date/time without having to connect with a website, so I see no reason why the Kindle should need to.

So: Nothing to do with the script, but a bug/"design flaw" in the Kindle's own code.

Maybe if enough people shout at Amazon about this then they might fix it in a future firmware update.
Old 12-16-2010, 04:16 PM   #11
CWatkinsNash
IOC Chief Archivist
Posts: 3,950
Karma: 53868218
Join Date: Dec 2010
Location: Fruitland Park, FL, USA
Device: Meebook M7, Paperwhite 2021, Fire HD 8+, Fire HD 10+, Lenovo Tab P12
Quote:
I've raised this with Amazon, and apparently the date/time isn't properly maintained by the kindle but instead is synced via "whispernet".
I can confirm this, and it doesn't always get it right with Whispernet either. In fact, at one point my Kindle's clock went 6 hours into the future. It was only like that for a few minutes, but I discovered it because I was working with files on it at the time and saw the timestamps on them were not what they should be.
Old 12-16-2010, 10:40 PM   #12
Ashby
Junior Member
Thanks, I corrected the time and the sort order works properly again.
Old 12-17-2010, 03:52 AM   #13
ch4os
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2010
Device: K3
Your app looks great, but there is a problem with non-ASCII characters (for example Polish ones). When I run CalibreKindleCollections without any options, I get something like this:

Code:
ADDED 2 items to collection "- Fandorin - Boris Akunin"
ADDED 10 items to collection "- Steven Erikson"
ADDED 4 items to collection "- Unknown"
ADDED 16 items to collection "- Orson Scott Card"
Traceback (most recent call last):
  File "CalibreKindleCollections.py", line 902, in <module>
    createCollectionsFromCalibre()
  File "CalibreKindleCollections.py", line 354, in createCollectionsFromCalibre
    endAddBooks()
  File "CalibreKindleCollections.py", line 466, in endAddBooks
    print collDesc.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 38: ordinal not in range(128)
I've attached my metadata.calibre file (just change extension from txt)

And a second error when I try to clean up collections:

Code:
python CalibreKindleCollections.py --noupdatecollections --nomiscellanycollection --sortcollections
Code:
Retrieving list of all books on kindle. This may take a little while.

Checking for any books which are in collections but are no longer on the kindle.

Traceback (most recent call last):
  File "CalibreKindleCollections.py", line 910, in <module>
    cleanupCollections()
  File "CalibreKindleCollections.py", line 533, in cleanupCollections
    if asin != UserGuideAsin:
NameError: global name 'UserGuideAsin' is not defined
Attached Files
File Type: txt metadata.txt (89.5 KB, 469 views)
Old 12-17-2010, 04:33 AM   #14
mornington
Connoisseur
Quote:
Originally Posted by ch4os
Your app looks great, but there is a problem with non-ASCII characters (for example Polish ones). When I run CalibreKindleCollections without any options, I get something like this:

Code:
ADDED 2 items to collection "- Fandorin - Boris Akunin"
ADDED 10 items to collection "- Steven Erikson"
ADDED 4 items to collection "- Unknown"
ADDED 16 items to collection "- Orson Scott Card"
Traceback (most recent call last):
  File "CalibreKindleCollections.py", line 902, in <module>
    createCollectionsFromCalibre()
  File "CalibreKindleCollections.py", line 354, in createCollectionsFromCalibre
    endAddBooks()
  File "CalibreKindleCollections.py", line 466, in endAddBooks
    print collDesc.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 38: ordinal not in range(128)
I've attached my metadata.calibre file (just change extension from txt)

I've no idea how to fix this one, I'm afraid - it's clearly a character set encoding issue but I just don't know Python (as I mentioned, this script is the first time I've ever seen Python code, much less written any), so I don't know why it can't convert your unicode symbols to utf-8.

Maybe someone who knows Python can address this one?
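For anyone picking this up: a UnicodeDecodeError raised from an .encode('utf-8') call is typical of Python 2 when the value is already a byte string. Python 2 first decodes such a value implicitly as ASCII, and that hidden step fails on bytes like 0xc5. Whether collDesc really is a byte string at that point is a guess based on the traceback. A minimal sketch of a helper that avoids the double conversion (the helper name is made up) could look like this:

```python
def to_utf8_bytes(value):
    """Return UTF-8 bytes for either a text string or an existing byte string."""
    if isinstance(value, bytes):
        # Already encoded (a plain str on Python 2): return it unchanged rather
        # than calling .encode(), which would trigger an implicit ASCII decode
        return value
    return value.encode("utf-8")

# Hypothetical collection description containing a Polish letter
text_desc = u"- Stanis\u0142aw Lem"
byte_desc = text_desc.encode("utf-8")  # the letter becomes the bytes 0xc5 0x82

from_text = to_utf8_bytes(text_desc)
from_bytes = to_utf8_bytes(byte_desc)
```

Both calls give the same bytes, so a line like the failing print could pass its value through such a helper instead of calling .encode('utf-8') directly.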

Quote:
Originally Posted by ch4os

And a second error when I try to clean up collections:

Code:
python CalibreKindleCollections.py --noupdatecollections --nomiscellanycollection --sortcollections
Code:
Retrieving list of all books on kindle. This may take a little while.

Checking for any books which are in collections but are no longer on the kindle.

Traceback (most recent call last):
  File "CalibreKindleCollections.py", line 910, in <module>
    cleanupCollections()
  File "CalibreKindleCollections.py", line 533, in cleanupCollections
    if asin != UserGuideAsin:
NameError: global name 'UserGuideAsin' is not defined
This is an easy one to resolve: Just add the line:

Code:
UserGuideAsin = ""
Somewhere near the top of the file (and outside any function definitions). The problem is that it's not found a Kindle user guide and there's no blank default set in the script for that value. My bad. But that'll fix it.


As a complete aside: I had a glance at your metadata file and noticed that you have a lot of azw files. As you're doubtless aware from the notes accompanying the script, it can't automatically put those files into collections for you.

At the very least, you should run the script with the "--nocleanupdeadfiles" command-line option, as otherwise it will remove any unidentified files from collections, which includes azw files (note - the books themselves will not be removed from the kindle, they just will not appear in collections).
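To illustrate what that clean-up amounts to, here is a simplified sketch. It is not the script's cleanupCollections: the function name, the `enabled` switch (standing in for the "--nocleanupdeadfiles" option) and the item keys in the sample are made up, and it assumes each collection record holds an "items" list.

```python
def remove_dead_items(collections, known_items, enabled=True):
    """Drop items not in known_items from every collection; return how many were dropped."""
    removed = 0
    if not enabled:
        return removed  # clean-up switched off: collections are left as they are
    for info in collections.values():
        kept = [item for item in info["items"] if item in known_items]
        removed += len(info["items"]) - len(kept)
        info["items"] = kept
    return removed

# Hypothetical sample: one identified book and one the script couldn't identify
demo = {"- Thrillers": {"items": ["known-book", "unidentified-azw"]}}
known = set(["known-book"])

skipped = remove_dead_items(demo, known, enabled=False)
items_after_skip = list(demo["- Thrillers"]["items"])
dropped = remove_dead_items(demo, known)
```

With the clean-up disabled, the unidentified item stays in its collection; with it enabled, the item is dropped from the collection but nothing is deleted from the device.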
Old 12-17-2010, 05:09 AM   #15
mornington
Connoisseur
Script Update

I've attached a slightly modified version of the script.

This version does NOT perform the clean up of "dead" files and/or empty collections by default (but you can switch them back on with "--c" and "--e" command-line options, respectively), and it also allows you to switch on/off the report generation ("--nr" options). I've also attempted to make the error handling on character encoding problems more robust: If it encounters one then it'll show an error message but carry on processing the collections if it can.

These changes should make it a bit friendlier if you have a lot of azw files, or other files with embedded metadata, and/or some "unusual" characters in author names, series, tags or book titles.

Unfortunately, I can't test this script at the moment as my Kindle is busy indexing and I don't want to interrupt it. So can somebody run this and make sure that it works (i.e. that I've not made any typos anywhere in the code) and let me know?
Attached Files
File Type: zip CalibreKindleCollections.zip (17.5 KB, 518 views)
Tags
calibre, collections, kindle, kindle 3

