Old 01-03-2010, 11:03 AM   #1
hakan42
Zealot
Posts: 136
Karma: 60
Join Date: Jul 2009
Location: Munich, Germany
Device: Nook Classic rooted; Galaxy S IV with Aldiko, other older devices
Adding OPDS client to calibre (Bug #1665, Download from Bookshelf/Stanza servers)

Hi,


I need to fetch EPUBs from multiple online shops on a weekly basis.

After hacking around for a short while on a Perl client, I decided to give bug #1665 a try and integrate an OPDS client into calibre itself.

The way I envision it, the code would check the remote sites, compare the UUIDs from the Stanza feed URLs against the local database, and fetch any remote book that is not found locally.
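
To make the idea concrete, here is a minimal sketch of the "compare and fetch" step. It assumes the shop feed is a plain Atom document; the function name new_entries, the sample feed, and the second ID in it are made up for illustration and are not calibre code.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by OPDS/Stanza catalog feeds.
ATOM = "{http://www.w3.org/2005/Atom}"


def new_entries(feed_xml, known_ids):
    """Return (id, title) pairs for feed entries whose id is not in known_ids."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id", default="").strip()
        title = entry.findtext(ATOM + "title", default="").strip()
        if entry_id and entry_id not in known_ids:
            found.append((entry_id, title))
    return found


# Hypothetical feed, only for demonstration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>urn:uuid:ef2b09d6-3ebe-4bcb-9e4d-596617edbdf1</id>
    <title>Already here</title>
  </entry>
  <entry>
    <id>urn:uuid:00000000-0000-0000-0000-000000000001</id>
    <title>Not yet fetched</title>
  </entry>
</feed>"""

# IDs that the local library already contains (assumed to be loaded elsewhere).
known = {"urn:uuid:ef2b09d6-3ebe-4bcb-9e4d-596617edbdf1"}
for entry_id, title in new_entries(SAMPLE_FEED, known):
    print("would download:", entry_id, title)
```

The actual download and the lookup of known IDs in the calibre database are left out on purpose; only the comparison is shown.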

Kovid: Would something break if I prefix the uuid column in the database with the shop name? For example, "webscription:ef2b09d6-3ebe-4bcb-9e4d-596617edbdf1" instead of "ef2b09d6-3ebe-4bcb-9e4d-596617edbdf1" alone if the book is fetched from webscription.net?

There are quite a few things to work out (e.g. where to store the per-shop authentication details, the URLs to check, and so on), so it might take a few weeks until something useful comes from this thread...


Regards,
Hakan
Old 01-03-2010, 11:24 AM   #2
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Since UUIDs should be globally unique, why would you need to prefix them?

Are you planning to implement a browser for the remote collections, or just a client to download newly purchased books?
Old 01-03-2010, 01:11 PM   #3
hakan42
Quote:
Originally Posted by kovidgoyal View Post
Since UUIDs should be globally unique, why would you need to prefix them?
Yes, they should be globally unique. But then again, calibre itself used book IDs as URNs until you fixed that a short while ago.

Webscription (Baen ebooks) uses some kind of internal ID. The book 1632 uses the ID <id>urn:webscription:0671578499</id> (it seems to be based on their SKU).
Beam ebooks uses <id>urn:beam-ebooks:titelnr:999936881</id> for the book Ein gerissener Kerl.

Therefore, to avoid possible collisions, I wanted to use the part after urn: as the uuid. I would store "webscription:0671578499" as the uuid of the first book and "beam-ebooks:titelnr:999936881" as the uuid of the second one.
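
As a rough sketch of that mapping (the helper name remote_id_from_urn is hypothetical, not an existing calibre function), stripping the urn: prefix could look like this:

```python
def remote_id_from_urn(urn):
    """Return the part after a leading 'urn:' (case-insensitive).

    Values without that prefix are returned unchanged.
    """
    prefix = "urn:"
    if urn[:len(prefix)].lower() == prefix:
        return urn[len(prefix):]
    return urn


# The two feed IDs quoted above.
for feed_id in ("urn:webscription:0671578499",
                "urn:beam-ebooks:titelnr:999936881"):
    remote_id = remote_id_from_urn(feed_id)
    # The shop name is whatever precedes the first colon of the remainder.
    shop = remote_id.split(":", 1)[0]
    print(shop, "->", remote_id)
```

Because the shop name stays at the front of the stored value, IDs from different shops cannot collide even if their numeric parts happen to match.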

This would have the advantage that stanza clients would recognize the books as identical, whether they come from the "original" shop or from my local library.

Maybe I am abusing the uuid field and should rather be adding a "remote-id" field to the books table, but I would rather avoid changing the schema if possible.

Quote:
Originally Posted by kovidgoyal View Post
Are you planning to implement a browser for the remote collections, or just a client to download newly purchased books?
For starters, just a client to download the new books.

I am not yet sure how I would avoid downloading all of the available books from a store without implementing a browser, but as a first step, I'd like to automate my weekly download task.

Regards,
Hakan
Old 01-03-2010, 01:59 PM   #4
kovidgoyal
Hmm, it's a tough choice. Currently, as far as I know, UUIDs are used in only two places (the OPDS feeds generated by the calibre content server and to set the identifier for a book when converting it to EPUB). This would break the second, but not catastrophically. On the other hand, altering the schema to add a remote_id column won't break anything catastrophically, since the only code that would depend on its existence would be the downloading code.

Looking at it from another angle, do you really need to put this info into the database? If instead it were stored in a config file as a list of previously downloaded URNs, it would serve the purpose of avoiding re-downloads. You can use the XMLConfig class to achieve this conveniently; it will automatically serialize a Python dict.
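
To illustrate the config-file idea without depending on calibre internals, here is a sketch that uses the standard library's json module as a stand-in for XMLConfig. The file name and the helpers load_seen and mark_seen are assumptions made up for this example.

```python
import json
import os
import tempfile


def load_seen(path):
    """Return the set of URNs recorded in the file, or an empty set if it is missing."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return set(json.load(f))


def mark_seen(path, urn):
    """Record a URN as downloaded and return the updated set."""
    seen = load_seen(path)
    seen.add(urn)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(seen), f, indent=2)
    return seen


# Demonstration in a throwaway directory; a real client would use a fixed config location.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "downloaded_urns.json")
    mark_seen(path, "urn:webscription:0671578499")
    seen = load_seen(path)
    print("urn:webscription:0671578499" in seen)
```

The downloader would then skip any feed entry whose URN is already in the loaded set.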

If you do intend to do a browser eventually, let me know, as I'd like to abstract its design to support sources of books that don't export OPDS feeds, like MobileRead.
Old 01-03-2010, 05:54 PM   #5
hakan42
Quote:
Originally Posted by kovidgoyal View Post
Hmm, it's a tough choice. Currently, as far as I know, UUIDs are used in only two places (the OPDS feeds generated by the calibre content server and to set the identifier for a book when converting it to EPUB). This would break the second, but not catastrophically. On the other hand, altering the schema to add a remote_id column won't break anything catastrophically, since the only code that would depend on its existence would be the downloading code.

Looking at it from another angle, do you really need to put this info into the database? If instead it were stored in a config file as a list of previously downloaded URNs, it would serve the purpose of avoiding re-downloads. You can use the XMLConfig class to achieve this conveniently; it will automatically serialize a Python dict.
I really would like to identify a previously downloaded book with only the information in the database. Keeping another file somewhere is a sure recipe to have end users somehow lose it and afterwards to complain that calibre is broken because it re-downloads everything again and again... yadda yadda yadda... better keep everything in one place.

I think I saw somewhere in the code (aaah, in database2.py) how you update schemas. If you don't mind, I would add a remote_id column. This should minimize the impact on other code...

Quote:
Originally Posted by kovidgoyal View Post
If you do intend to do a browser eventually, let me know, as I'd like to abstract its design to support sources of books that don't export OPDS feeds, like MobileRead.
Actually, I would very much like a browser, but my Python skills are not up to date yet. The last time I built anything big in Python was two years ago.

I think a browser would be most useful for Webscription too. They expose only the EPUB files via OPDS, but you can download all the other formats from the HTML web pages. Call me a digital packrat, but I'd like to mirror all possible formats if available.

Regards,
Hakan
Old 01-03-2010, 06:05 PM   #6
kovidgoyal
I'm cool with a remote_id column. Don't add it to the META view though, as it is held in RAM and so will raise memory consumption.
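
For reference, a minimal sketch of what adding and querying such a column could look like with plain sqlite3. The toy books table below is an assumption for illustration only; it is not calibre's real schema, and calibre's own upgrade mechanism in database2.py is not shown. The helper names are hypothetical.

```python
import sqlite3


def add_remote_id_column(conn):
    """Add a nullable remote_id column to books unless it already exists."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(books)")]
    if "remote_id" not in columns:
        conn.execute("ALTER TABLE books ADD COLUMN remote_id TEXT")
        conn.commit()


def find_by_remote_id(conn, remote_id):
    """Return the id of the book stored under remote_id, or None."""
    row = conn.execute(
        "SELECT id FROM books WHERE remote_id = ?", (remote_id,)
    ).fetchone()
    return row[0] if row else None


# Toy database standing in for the library.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO books (title) VALUES (?)", ("1632",))

add_remote_id_column(conn)
add_remote_id_column(conn)  # second call finds the column and does nothing

conn.execute(
    "UPDATE books SET remote_id = ? WHERE title = ?",
    ("webscription:0671578499", "1632"),
)
print(find_by_remote_id(conn, "webscription:0671578499"))
```

Since only the downloader queries remote_id, the column can stay out of the in-memory meta view, which matches the memory concern above.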
Old 12-04-2010, 09:51 PM   #7
Korny
Junior Member
Posts: 2
Karma: 10
Join Date: Sep 2010
Device: Kindle 3
This would be very useful - I'm trying to find a way to fetch the PragPub magazine: http://www.pragprog.com/magazines
- It is available in .mobi format, but I can't see a way to download it automatically, unless calibre gets some way to fetch from the OPDS feed (which is at http://pragprog.com/magazines.opds, btw)
Old 12-05-2010, 06:52 AM   #8
chaley
Grand Sorcerer
Posts: 11,742
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kovidgoyal View Post
Hmm, it's a tough choice. Currently, as far as I know, UUIDs are used in only two places (the OPDS feeds generated by the calibre content server and to set the identifier for a book when converting it to EPUB). This would break the second, but not catastrophically. On the other hand, altering the schema to add a remote_id column won't break anything catastrophically, since the only code that would depend on its existence would be the downloading code.
UUIDs are used for device/book matching. Changing them would not be a good thing.
Old 12-05-2010, 07:08 AM   #9
chaley
Quote:
Originally Posted by hakan42 View Post
I think I saw somewhere in the code (aaah, in database2.py) how you update schemas. If you don't mind, I would add a remote_id column. This should minimize the impact on other code...
Another option that avoids a schema upgrade is to add a hidden custom column where you store the ID. There might be some programmatic interfaces to be added to make this convenient (add the col, set it hidden, and have the tag browser ignore it). Perhaps a better approach is to create a non-displayed text column type or to add a flag that tells the GUI to ignore the column, which would have the advantage that any number of such columns can be added. One concern is that this 'remote_id' column is just the tip of the iceberg, which using a general solution would avoid.

If you add a column to the books table, you also should add the get_ and set_ methods to database2. You might also want to think about whether the column is multi-valued.

Finally, think about how this field will appear in the EPUB if the metadata in the book is updated. I don't know how to connect the field with OPF generation, but that is probably what you would want to do.
Old 12-05-2010, 08:21 AM   #10
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by chaley View Post
UUIDs are used for device/book matching. Changing them would not be a good thing.
Don't you hate it when you get snookered into answering a year old thread?
Old 12-05-2010, 09:24 AM   #11
chaley
Quote:
Originally Posted by dwanthny View Post
Don't you hate it when you get snookered into answering a year old thread?
Sigh...

I should learn to look at the dates. However, the old thread did bring up an interesting problem that I will take over to the plugin thread.
Old 12-05-2010, 12:19 PM   #12
hakan42
Quote:
Originally Posted by chaley View Post
Sigh...

I should learn to look at the dates. However, the old thread did bring up an interesting problem that I will take over to the plugin thread.


Actually, thank you for bringing new input to a thread I had started a year ago... The problem I had at that time might be solved in the meantime with custom columns, but I did not have enough free time at hand to really attack the OPDS downloader code... Maybe at Christmas though, when I won't be at the office 16 hours every day.

Regards,
Hakan
All times are GMT -4.