Old 03-17-2016, 03:41 AM   #17
chaley
Grand Sorcerer
Quote:
Originally Posted by DavidTC View Post
I am not sure that you have internalized that books in libraries synchronized by a cloud provider are almost certainly not "calibre books". Their internal metadata is not updated.

I do not understand what you mean by 'Their internal metadata is not updated.'. Any book that has been in Calibre has its internal metadata 'updated' at least once, when it was added. (Assuming Calibre supports the metadata, but, if not, all this is moot.) When it was added, at minimum, a UUID was put in, allowing an easy re-match to that book if discovered elsewhere.
No. Calibre does not update metadata in a book when that book is added to the calibre library. Nor does calibre update metadata in a book when the book's metadata is edited. So the usual case is that the book in calibre's library has the metadata it came with.

Quote:
That is what I meant when I said 'I would stop there'....if that UUID doesn't exist, the file has never been in Calibre. (A UUID that doesn't match anything also shouldn't be farther matched, especially since a likely setup is that it's from *another* Calibre library the user has.)
You are ignoring the use case where a person has multiple calibre libraries, perhaps on different computers, perhaps on the same computer used by different family members. These libraries are not synced with each other. Such a case was brought up a week or two ago in our beta group when discussing the advisability of using calibre db ids in default file name templates. In this case a book's UUID will be different depending on which library one connects to. The situation can arise as well when a person has "staging libraries", a common technique used to ensure that the metadata in the calibre library for a new book is correct.
Quote:
Anyway, stepping back from book matching a second, my point is that books can, indeed, have their correct metadata, and it is possible to know when that is.
The keyword is "can". The reality: there is no guarantee that a UUID is there or that, if it is there, it is the "right" one.
Quote:
It is also worth noting that many (most?) pirated book collections are generated using calibre. Sometimes the books *do* contain calibre metadata, but it is metadata for the pirate's library not the user's library.

Heh. I literally deleted a section on pirates in my last two posts addressing exactly that.

To start with, this situation is hard to get to. I suspect that pirates use 'Save to Disk' instead of 'Send to Device', and that's why I *didn't* suggest having 'Save to Disk' add that flag, despite that metadata being correct also.
In calibre, Save to Disk and Send to Device use the same code to update metadata in the book. Also, I have seen indications that some libraries are copies of the calibre library with the metadata.* files deleted. Thus the book can have a) no calibre metadata, b) metadata from where the pirate pirated the book, c) the pirate's metadata, or d) the final user's metadata.

---

I am considering adding to CC the (explicit) ability to identify all books that are in a cloud library but not on the device and add those books to CC. That is a use case that is both supportable (non-probabilistic) and has been requested by users other than you. The "update my books" case is handled by the wireless device connection.

Note that you can already "fetch all books not on device" in CC's cloud connection by tapping "Newest" and then "Download All". CC queues all the books, skips books that are already on the device (that is, books with a matching UUID), and complains if a book doesn't have a usable format. This method has two problems. 1) It isn't obvious that it can be used for this purpose. I don't think you have twigged to it and I didn't remember to mention it. 2) It queues all the books even if the books cannot be downloaded (no acceptable format) or are already downloaded. For books that are already downloaded, the processing to determine whether a download is necessary happens while the queue is being processed.

To fix both 1 and 2 I need both to make the operation explicit and to do the processing in advance to determine which books to download. The best way to do this is some set math that I described earlier. I need to verify that the set math will run on a phone when the user has a huge library (at least 20,000 books). If it does then CC already contains most of the rest of what is needed, specifically the cloud download queue.
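
As a rough illustration of the kind of set math involved, here is a minimal Python sketch. It is not CC's code. The names CloudBook and books_to_download, and the idea of passing in a list of acceptable formats, are made up for the example. It assumes each cloud book carries a UUID and a set of formats, and that the UUIDs of the books on the device are already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudBook:
    """Hypothetical record for one book in the cloud library."""
    uuid: str
    title: str
    formats: frozenset  # e.g. frozenset({"EPUB", "PDF"})


def books_to_download(cloud_books, device_uuids, acceptable_formats):
    """Decide in advance which books to queue.

    Returns (to_download, no_format):
      to_download - in the cloud, not on the device, with a usable format
      no_format   - in the cloud, not on the device, with no usable format
    """
    acceptable = {f.upper() for f in acceptable_formats}
    by_uuid = {book.uuid: book for book in cloud_books}

    # The set math: UUIDs in the cloud library minus UUIDs on the device.
    missing = set(by_uuid) - set(device_uuids)

    to_download, no_format = [], []
    for uuid in sorted(missing):  # sorted only to make the order repeatable
        book = by_uuid[uuid]
        if {f.upper() for f in book.formats} & acceptable:
            to_download.append(book)
        else:
            no_format.append(book)
    return to_download, no_format


cloud = [
    CloudBook("uuid-a", "Already on device", frozenset({"EPUB"})),
    CloudBook("uuid-b", "Needs download", frozenset({"PDF"})),
    CloudBook("uuid-c", "No usable format", frozenset({"LIT"})),
]
to_download, no_format = books_to_download(cloud, {"uuid-a"}, {"epub", "pdf"})
print([b.title for b in to_download])
print([b.title for b in no_format])
```

Only the books in the first list would go into the download queue. The second list could be reported once up front instead of one complaint per book while the queue is being processed. Building the dictionary and taking the set difference are roughly linear in the number of books, which is why the approach looks plausible for a 20,000-book library, but that still has to be measured on a phone.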