03-18-2013, 04:07 PM | #1 |
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
|
Making P.Clark library database
I believe the cardinal virtues of a programmer are held to be laziness, impatience, and hubris. I suppose that by those criteria I may consider myself qualified for that epithet: I cannot be bothered to search through the Patricia Clark Library for books because of the limited search criteria, I don't want to wait till the green-lettered masters improve things, and how hard can it be to make a half-decent database anyway?
So I've downloaded the current status of the library, and set to work (re)learning SQL. To make it at all probable that I will achieve my goal, I've limited myself to mobi and epub, and am trying to extract the following information about the books from the posting head:
It seemed fitting to start with Patricia Clark's opus: the metadata have been cleansed and given rudimentary QA, and are being imported and linked in the database. I use MySQL.
------
Update: Just to prove I'm not bluffing, I attach a snapshot of the database. Still very pre-alpha and messy. Also attached: A forlorn hope that somebody who actually knows about databases will give it a glance.
Most important lesson learnt so far: Stuffing information into a database is much easier than prying it out again.
------
20.03.13: New snapshot. Shaping up. Included the output of the half-dozen most productive contributors. So all you have to do now to get a list of titles in German by authors whose name includes "Doyle" is to write
Code:
select name from titles where id in
  (select title_id from books
   inner join book_author
     on book_author.book_id=books.id
     and book_author.author_id in (select id from authors where surname like '%Doyle%')
     and books.language_id=(select id from languages where name='german'));
------
24.03: New snapshot. All entries as of a week ago are included. I attach the text file I use as a starting point; it's spreadsheet-friendly.
Table books is the main table, one row per book; link_id refers to the id on the MobileRead web site.
Titles are stored in the titles table, linked through books.title_id <-> titles.id.
Authors and books can have many-to-many relations (ditto languages), therefore the table book_author lists book_ids with corresponding author_ids, both fields non-unique.
Last edited by SBT; 03-24-2013 at 06:26 PM. Reason: attached new db snapshot |
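For readers who want to try the layout described in the post above, here is a minimal sketch of it, not the actual snapshot. It makes several assumptions: SQLite (via Python's sqlite3) stands in for MySQL so it runs without a server, the column types and the helper `titles_by` are hypothetical, the sample rows and link_id values are made up, and language is kept as a single books.language_id column because that is what the posted query uses. The same search is written with plain joins, with a wildcard on both sides of the surname so that names which include "Doyle" match.

```python
import sqlite3

# SQLite stands in for MySQL here; types and constraints are guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE titles    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE authors   (id INTEGER PRIMARY KEY, surname TEXT NOT NULL);
CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Main table: one row per book; link_id is the id on the web site.
CREATE TABLE books (
    id          INTEGER PRIMARY KEY,
    link_id     INTEGER,
    title_id    INTEGER REFERENCES titles(id),
    language_id INTEGER REFERENCES languages(id)
);

-- Many-to-many link between books and authors; neither column is unique.
CREATE TABLE book_author (
    book_id   INTEGER REFERENCES books(id),
    author_id INTEGER REFERENCES authors(id)
);

-- Made-up sample rows, for illustration only.
INSERT INTO titles VALUES (1, 'Der Hund von Baskerville'),
                          (2, 'The Lost World'),
                          (3, 'Reise um die Erde in 80 Tagen');
INSERT INTO authors VALUES (1, 'Conan Doyle'), (2, 'Verne');
INSERT INTO languages VALUES (1, 'german'), (2, 'english');
INSERT INTO books VALUES (1, 1001, 1, 1), (2, 1002, 2, 2), (3, 1003, 3, 1);
INSERT INTO book_author VALUES (1, 1), (2, 1), (3, 2);
""")

# Same search as the posted query, expressed with joins and parameters.
QUERY = """
SELECT DISTINCT t.name
FROM titles t
JOIN books b        ON b.title_id = t.id
JOIN book_author ba ON ba.book_id = b.id
JOIN authors a      ON a.id = ba.author_id
JOIN languages l    ON l.id = b.language_id
WHERE a.surname LIKE ? AND l.name = ?
ORDER BY t.name
"""


def titles_by(conn, surname_pattern, language):
    """Return titles in `language` whose author surname matches the pattern."""
    return [row[0] for row in conn.execute(QUERY, (surname_pattern, language))]


print(titles_by(conn, "%Doyle%", "german"))
```

Passing the pattern and the language as parameters keeps the query text fixed, so the same statement can serve any author/language combination.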
03-19-2013, 02:00 AM | #2 |
Obsessively Dedicated...
Posts: 3,200
Karma: 34977896
Join Date: May 2011
Location: JAPAN (US expatriate)
Device: Sony PRS-T2, ADE on PC
|
What an excellent idea! Karma to you!
If I knew anything about databases, I would volunteer to help, but I only speak Excel. I hope your project goes well. |
03-19-2013, 07:23 AM | #3 |
Wizard
Posts: 3,388
Karma: 14190103
Join Date: Jun 2009
Location: Berlin
Device: Cybook, iRex, PB, Onyx
|
I really don't want to stop your enthusiasm, but I'm wondering a little bit. I know the search function here is said to be bad (by the way, not my opinion), but having the information you've posted above (author, title, language...) I never ever had a problem finding any book here.
But please, go ahead, I really don't wanna spread negativism. |
03-19-2013, 10:00 AM | #4 |
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
|
Well, learning how to make a relational database is hopefully a reward in itself...
I agree that finding a specific book isn't that frustrating. My main gripe is the awkwardness of compiling a list of books according to specific criteria (e.g. illustrated epubs in German). But I see other potential benefits, too: easy creation of OPDS catalogs, making the naming of authors more consistent (how many different versions of Sir Arthur Conan Doyle are in the library now?), finding which books are only available in epub/mobi versions, making some kind of popularity metric... Progress on the database is satisfactory: I've decided on the data relationships, so now I just have to get the script for database insertion working properly. |
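The author-naming question in the post above (how many versions of one author are in the library) is the kind of thing a grouping query can answer. A rough sketch follows, under loud assumptions: the `forenames` column is hypothetical (the thread only shows a `surname` column), the rows are invented for illustration, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- `forenames` is an assumed column; only `surname` appears in the thread.
CREATE TABLE authors (
    id        INTEGER PRIMARY KEY,
    forenames TEXT,
    surname   TEXT NOT NULL
);
-- Invented rows showing one author entered three different ways.
INSERT INTO authors (forenames, surname) VALUES
    ('Arthur Conan', 'Doyle'),
    ('Sir Arthur Conan', 'Doyle'),
    ('A. Conan', 'Doyle'),
    ('Jules', 'Verne');
""")

# List every surname that occurs with more than one author row,
# together with the spellings found, most-duplicated first.
VARIANTS = """
SELECT surname,
       COUNT(*) AS n,
       GROUP_CONCAT(forenames, ' / ') AS spellings
FROM authors
GROUP BY surname
HAVING COUNT(*) > 1
ORDER BY n DESC
"""

rows = conn.execute(VARIANTS).fetchall()
for surname, n, spellings in rows:
    print(surname, n, spellings)
```

In MySQL the separator is written differently, as `GROUP_CONCAT(forenames SEPARATOR ' / ')`. A shared surname does not prove two rows are the same person, so the output is a list of candidates to check by hand rather than a list of duplicates.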
03-19-2013, 03:20 PM | #5 |
Wizard
Posts: 3,388
Karma: 14190103
Join Date: Jun 2009
Location: Berlin
Device: Cybook, iRex, PB, Onyx
|
Ah, I see. Thanks for explaining more deeply. And good luck with your work!
|
04-04-2013, 11:03 AM | #6 |
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
|
Well, that seems to have been a minor waste of time...
Having fiddled with umpteen entries to ensure some reasonable quality in the database, I learn that the library is to be expurgated of all books that are not life+70yrs compliant... I'm putting the project on hold until that process is done with. I'm not really complaining; I have no more liking for being sued to smithereens than the next man, but I do insist on my right to be slightly self-centeredly grumpy and depressed about it... |
04-04-2013, 11:43 AM | #7 |
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Even with only life+70 compliant books, I guess we'll have quite a few of them. And we need the database anyway. You may be contacted by Alex at some point so that, at least, the work you've already done is not lost.
|