#1 |
Member
Posts: 13
Karma: 10
Join Date: Sep 2013
Device: none
|
Database Fork
Hi,
I am in the early stages of planning a fork of the Calibre e-book database. The first part is a redesign of the database storage/organization. The way I envision it, each book record, instead of living in "<author>\<Title>\", will be just an almost random, numerically named zip archive, <some number>.zip. Inside the archive I will keep the ebooks and the other data files associated with them.
The second part is the data itself. I am thinking of moving it to an almost HTML-style formatting, like this:
<book>
  <file>filename</file>
  <format>
    <format:1>pdf</format:1>
    <format:2>djvu</format:2>
  </format>
  <title>Some Title</title>
  <author>
    <author:1>first middle last</author:1>
    <author:2>first middle last</author:2>
  </author>
</book>
That way it will be easier to add functionality in the future, and easy to stay backward and forward compatible just by ignoring unknown parts. This will also allow for nesting tags and nesting other titles, and for future expansion of functionality. |
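A minimal sketch of the per-book archive described above, in Python, to make the "ignore unknown parts" idea concrete. Everything here is an assumption for illustration: the member name `metadata.xml`, the `KNOWN_TAGS` set, and the helper names are hypothetical. Note also that tags such as `<format:1>` would be rejected by a standard XML parser (the colon is read as an undeclared namespace prefix), so the sketch uses repeated `<format>` and `<author>` elements instead.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical set of tags this version of the reader understands.
KNOWN_TAGS = {"file", "title", "format", "author"}


def write_book_archive(buf, metadata_xml, files):
    """Store the metadata XML and the book files in one zip archive."""
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("metadata.xml", metadata_xml)
        for name, data in files.items():
            zf.writestr(name, data)


def read_book_metadata(buf):
    """Read metadata.xml from the archive, skipping tags we do not know."""
    with zipfile.ZipFile(buf) as zf:
        root = ET.fromstring(zf.read("metadata.xml"))
    record = {}
    for child in root:
        if child.tag not in KNOWN_TAGS:
            continue  # unknown tag from a newer version: ignore it
        record.setdefault(child.tag, []).append((child.text or "").strip())
    return record


# <rating> stands in for a tag added by some later version of the format.
SAMPLE = """<book>
  <file>some_title.pdf</file>
  <format>pdf</format>
  <format>djvu</format>
  <title>Some Title</title>
  <author>first middle last</author>
  <rating>5</rating>
</book>"""

buf = io.BytesIO()
write_book_archive(buf, SAMPLE, {"some_title.pdf": b"placeholder bytes"})
buf.seek(0)
record = read_book_metadata(buf)
print(record)
```

The reader keeps working when a newer writer adds tags, which is the compatibility property the post is after. It says nothing yet about how fast a whole library of such archives can be searched.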
#2 |
Addict
Posts: 250
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Bookeen Diva, Kobo Clara BW
|
Hm. I don't quite understand.
Are you trying to develop an alternate, drop-in replacement for the metadata.db + filesystem that Calibre uses for storage? |
#3 |
Member
Posts: 13
Karma: 10
Join Date: Sep 2013
Device: none
|
To some extent, yes. The reason is that with a proper redesign of the filesystem layout, Calibre will be able to organize almost everything, so it will need a more robust metadata.db. In addition, archiving gives an easy way to transfer items between libraries without worrying about importing them wrong and having to edit the metadata again, as everything will be inside that archive.
|
#4 |
Addict
Posts: 250
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Bookeen Diva, Kobo Clara BW
|
Ah.
You do realize that with a segmented XML database (what you call "almost html style formatting") hidden away in .zip files, performance will take a pummelling not seen since Wile E. Coyote was still trying to get himself a side serving of roasted roadrunner? See, changing the filesystem hierarchy is one thing. In the end, it's just strings. But getting away from an RDBMS? That is not, I repeat, _not_, something you want to do. |
#5 | |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
|
|
#6 |
Addict
Posts: 250
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Bookeen Diva, Kobo Clara BW
|
I think he wants to make it the main (and only) database. Now I'm no DBA, but it has me very scared.
Because you see, devils_add, if your DB is scattered across thousands of little XML files inside thousands of .zip archives, then you'll have to open all of those archives and read all of those XML files every time you want to do anything, like, say, list titles. If you want to _search_, it's even worse, because then you'll have to open it all up again, _then_ build cross-references for basically every single field of every single XML file. That's pure insanity. There's a reason DBMSs have been around since the '70s: they _work_. Now, XML/OPF files have their uses, but database queries ain't one of them. As someone who once had to convert an old, OLD flat-file-based DB to Access (which is still not a real database, but less wrong), I beg you: spare yourself the pain. |
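The cost described above can be sketched in Python. This is an illustration under stated assumptions, not a benchmark: the archive layout (`<id>.zip` containing a `metadata.xml`), the `books` table, and the function names are all hypothetical. It shows that listing titles from scattered archives means opening every archive, while an SQLite index answers the same question with one query.

```python
import sqlite3
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def make_library(root, n_books):
    """Create one small zip per book, each holding only a metadata.xml."""
    for i in range(n_books):
        with zipfile.ZipFile(root / f"{i}.zip", "w") as zf:
            zf.writestr("metadata.xml", f"<book><title>Book {i:03d}</title></book>")


def list_titles_by_scanning(root):
    """Open every archive and parse every XML file: work grows with library size."""
    titles = []
    for path in root.glob("*.zip"):
        with zipfile.ZipFile(path) as zf:
            titles.append(ET.fromstring(zf.read("metadata.xml")).findtext("title"))
    return sorted(titles)


def build_index(root, conn):
    """Pay the scanning cost once and keep the result in a relational table."""
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
    rows = [(title,) for title in list_titles_by_scanning(root)]
    conn.executemany("INSERT INTO books (title) VALUES (?)", rows)
    conn.commit()


def list_titles_from_index(conn):
    """One query, no archive is opened."""
    return [row[0] for row in conn.execute("SELECT title FROM books ORDER BY title")]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_library(root, 50)
    conn = sqlite3.connect(":memory:")
    build_index(root, conn)
    scanned = list_titles_by_scanning(root)
    indexed = list_titles_from_index(conn)
    conn.close()

print(len(scanned), len(indexed))
```

Both paths return the same titles. The difference is that the scanning path touches every file on every call, whereas the indexed path only touches the archives when the index is built or refreshed.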
#7 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
We already have XML backups saved with each book. As backups, which is where metadata XML belongs. Why on earth should the database be replaced to use this instead, purely for the purpose of fixing an imaginary problem?
What do you think databases were invented for anyway? |
#8 |
Member
Posts: 13
Karma: 10
Join Date: Sep 2013
Device: none
|
Guys, you are forgetting about Apple's DMG files, where everything the program needs to run is inside that one file (which is an archive). So what I am proposing is to keep just the metadata file associated with a record, plus the record itself, inside the archive, and to add that archive to the main database. The only time the archive is written to is when files are added or metadata is changed; every other time it is opened only to extract a file, either to read it or to send it to the device.
Therefore, the main database file will stay outside the archives, as it is right now. The main database could also link to the databases of virtual libraries for different, incompatible formats. In addition, this will allow a single database for everything, with different iterations of the front-end, so that all your collections can be managed by just one database. Sorry for the wordiness.
Also, with the archive architecture you can keep some PDF books split by chapter and combine them on the fly as requested (resources permitting), so you don't have to download the full book, just the chapters you need.
Last edited by devils_add; 12-17-2013 at 05:48 PM. |
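One way to read the proposal above is "archive per book, plus an external index that is updated whenever an archive is written". The Python sketch below assumes that reading; the file names, the `books` table, and the helpers `update_metadata` and `set_title` are hypothetical. As far as I know the standard `zipfile` module has no call to replace a member in place, so the sketch rewrites the whole archive on every metadata change, which is itself a cost of the design.

```python
import os
import sqlite3
import tempfile
import zipfile
from pathlib import Path
from xml.sax.saxutils import escape


def update_metadata(archive, new_xml):
    """Rewrite the archive with a new metadata.xml, copying all other members."""
    tmp_path = archive.with_suffix(".tmp")
    with zipfile.ZipFile(archive) as src, \
            zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename != "metadata.xml":
                dst.writestr(item, src.read(item.filename))
        dst.writestr("metadata.xml", new_xml)
    os.replace(tmp_path, archive)  # swap the new archive in for the old one


def set_title(conn, archive, book_id, title):
    """Change the title in the archive first, then mirror it in the index."""
    update_metadata(archive, f"<book><title>{escape(title)}</title></book>")
    conn.execute(
        "INSERT OR REPLACE INTO books (id, title) VALUES (?, ?)", (book_id, title)
    )
    conn.commit()


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "1042.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("metadata.xml", "<book><title>Old Title</title></book>")
        zf.writestr("book.pdf", b"placeholder bytes")

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
    set_title(conn, archive, 1042, "New Title")

    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
        stored_xml = zf.read("metadata.xml").decode()
    indexed_title = conn.execute(
        "SELECT title FROM books WHERE id = 1042"
    ).fetchone()[0]
    conn.close()

print(names, indexed_title)
```

In this reading the archive is the portable copy of the record and the external database is what queries actually run against, so the two have to be kept in step on every write.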
#9 | |
null operator (he/him)
Posts: 21,645
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Calibre could be more general purpose in a user-friendly sense if the user could relabel the Author/Book Title entities to whatever suits their purpose, e.g. Architect/Building; Software Package/Program; Director/Movie Name; Producer/Game etc. If one could also add a third entity to the hierarchy, that would probably be enough to cover 90% of potential uses.
==================
I'm puzzled by what you mean by a 'more robust metadata.db'. I've been using calibre for 2-3 years, and I use it one way or another on most days of the week. It has never crashed, I've never had to rebuild a database, nor have I ever had to reinstall calibre. I wish I could say the same for some other programs that I use, e.g. web browsers, editors, IDEs, photo and music library managers. Even the file manager I use crashes at least once a week.
The only time performance has been an issue was with a custom column based on a union of 4 other custom columns, each of which was a list of names. I intuited at the time that I was pushing the edge of the envelope, so I had a Plan B for what to do when the envelope tore.
====================
I'm also intrigued as to how you would envisage implementing a multi-user, server-based implementation of your schema on different server platforms.
BR
Addenda: @devils_add - I missed seeing your most recent post before I posted this, vagaries of phone interruptus.
Last edited by BetterRed; 12-17-2013 at 11:36 PM. Reason: addenda |
|
#10 | |||||
Addict
Posts: 250
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Bookeen Diva, Kobo Clara BW
|
Quote:
Quote:
Quote:
Quote:
That's not a database, that's a .tar backup. If that's not it, you really, really need to do some kind of ASCII art of your file hierarchy, because right now I'm in the dark. I'm really sorry, but the problem here is not the words, it's the concept.
Quote:
Also, the resources you may (I say _may_) save with that system are utterly dwarfed by the resources you'll use just to read your database. There are no two ways of doing database-driven file management on consumer hardware. There's only one: an RDBMS on one side and a filesystem on the other. Some of those files, if they're text files (as opposed to binary files), can be compressed, but that's as far as it can go. I know that because I've already tried it all, ever since I first discovered databases twenty years ago.
Your system? I made one like that, more or less, when I was 16. At the time it was a VB4-based management system for the fanfiction I downloaded from R.A.A.C. (ah, those were the times...), and I was trying to put Eyrie Productions' Undocumented Features into some sort of reading order with bookmarks, because even back then it was _huge_. It took me a few weeks before I scrapped the idea of a text-file-based DB and turned to Access (there was no SQLite in those far-away times...). I had much better results.
So, learn from the mistakes of an old (well, 36-year-old) pro and use an RDBMS. That's what they're made for. They're good at it. |
|||||
#11 | ||
US Navy, Retired
Posts: 9,892
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
|
Quote:
Quote:
Good Luck with your fork. |
||
#12 |
Addict
Posts: 265
Karma: 724240
Join Date: Aug 2013
Device: KyBook
|
I can see where the idea comes from. Using one big container file with its own internal 'filesystem' was, and still is, common for a lot of games, and most of these games were also released on several platforms. So in that respect the idea is not that strange. The only difference here is the dynamic nature of a library compared to the static environment of game resources. You'd have to go in the direction of virtual HD files or something similar, and let the current RDBMS write to and read from the container file instead of trying to recreate the RDBMS. But, like physical HDs, virtual file systems tend to get fragmented the same way, with the same side effects. That means reorganization is needed, which in turn means needing at least as much free disk space as the size of the container file, preferably double that.
It may look like a good idea, but it has one helluva drawback. If something, however small, breaks in the container file, it's bye-bye library. At least in the current situation, all the books stay intact. So you would probably want to maintain some kind of parity system for repairs if worst comes to worst. In the end, the risk of some mishap occurring to a virtual file system is much higher than to a physical one. Files get damaged much more often than HDs.
Last edited by At_Libitum; 12-28-2013 at 10:19 PM. |
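The damage scenario above can at least be detected with what zip archives already carry: each member stores a CRC-32, and Python's `zipfile.ZipFile.testzip()` re-reads every member and reports the first one whose checksum fails. The sketch below assumes one archive per book; the `check_library` helper and the file names are hypothetical. It covers detection only. Actual repair would need redundant data (the parity system mentioned above), which this does not provide.

```python
import tempfile
import zipfile
from pathlib import Path


def check_library(root):
    """Return {archive name: problem} for every archive that fails its check."""
    problems = {}
    for path in sorted(root.glob("*.zip")):
        try:
            with zipfile.ZipFile(path) as zf:
                bad_member = zf.testzip()  # None when every CRC matches
        except zipfile.BadZipFile as exc:
            # The archive structure itself is damaged beyond reading.
            problems[path.name] = f"unreadable archive: {exc}"
            continue
        if bad_member is not None:
            problems[path.name] = f"corrupt member: {bad_member}"
    return problems


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for book_id in (1, 2):
        # ZIP_STORED keeps the payload uncompressed so it is easy to locate.
        with zipfile.ZipFile(root / f"{book_id}.zip", "w", zipfile.ZIP_STORED) as zf:
            zf.writestr("book.txt", b"A" * 1000)

    # Simulate disk damage: flip one byte inside the payload of book 2.
    damaged = root / "2.zip"
    raw = bytearray(damaged.read_bytes())
    raw[raw.find(b"AAAA") + 10] ^= 0xFF
    damaged.write_bytes(bytes(raw))

    problems = check_library(root)

print(problems)
```

With one archive per book, a flipped byte shows up as a problem in that one archive and the others still verify. With a single container file for the whole library, the same damage would sit inside the only file there is.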
#13 | |
Member
Posts: 13
Karma: 10
Join Date: Sep 2013
Device: none
|
Quote:
The only thing I will have in one big file will be the general database, so that I don't have to re-scan everything again. However, even that might not be true, as I am planning for the database to be locatable on different hard drives, not just in different folders. Therefore, each location will have a local backup database, which the main database will load from and refer to. |
|
#14 |
Addict
Posts: 250
Karma: 20386
Join Date: Sep 2010
Location: France
Device: Bookeen Diva, Kobo Clara BW
|
So, if I understand correctly, you have:
- One directory with as many .zip files as you have books,
- One .zip with a "general database".
What's in the latter? |