02-22-2012, 02:08 PM | #1 |
Going Viral
Posts: 17,212
Karma: 18210809
Join Date: Feb 2012
Location: Central Texas
Device: No K1, PW2, KV, KOA
|
Kindle Source Catalog
I am creating a catalog of all the Amazon source bundles released as of 2/18/2012.
Nearly a decade after the prototype of this script was published in the ABS guide, I decided to publish it for general use. Those who wish to read a large Bash script can find the source here: http://hg.minimodding.com/repos/cats/shacat.hg/ ("hg clone" will get you your own copy.) I will post the actual catalog file(s) as soon as the script finishes fondling the Amazon gigabytes. I may even describe the record format, not that anyone ever RTFM. |
02-22-2012, 04:28 PM | #2 |
Going Viral
|
Catalog Record layout
I find it hard to believe that I haven't written this up somewhere on the web already. Oh well, here goes.
The raw records are one single-line record per file. The first field is the sha1 sum of the file; the second field is the volume ID (in this case: "Kindle"); that is followed by one or more fields delimited by "|"; the last field is the path and filename of the file that was checksummed in the first field. The (variable number of) fields between the volume ID and the last field are "containers" (usually archives).
Here is a (short) example, with the single line broken for posting at "|", found with the "poor man's query tool" for this database, grep: Code:
knoppix:cat$ grep 'udev/rules.d/60' Amazon_2012.02.18_sha1.cat | sort
Code:
7748689817fdd946e73e31b703a40f421249646a Kindle|
The sha1 sum and volume ID are followed by the containers, a compressed archive which in turn contains another compressed archive: Code:
/kdx/Kindle_src_2.1.1_351050064.tar.gz|/gplrelease/udev-112.tar.bz2|
which in turn contains this file (the one that was sha1sum'd): Code:
/udev-112/etc/udev/rules.d/60-persistent-input.rules
All anyone with a question needs to do is invent a grep expression and ask. ;-)
Note: My dream was to import these catalogs into a MySQL database, but I have been running this script for nearly ten years now and have not 'gotten around to it' yet. The record format is such that it can be imported into OpenOffice and searched there as a spreadsheet-based database (and from there, OpenOffice could populate a for-real database).
As I write, the script is approaching the 1 1/2 million record mark, still running. |
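The record layout described above can be taken apart with standard tools. What follows is only a minimal sketch, not code from the shacat script: it assumes the sha1 sum and the volume ID are separated by whitespace (as in the example record) and that "|" never occurs inside a path. The function name split_record is made up for illustration.

```shell
# split_record RECORD: print the parts of one catalog record, one per line.
# Hypothetical helper. Assumes whitespace between sha1 and volume ID,
# and that "|" never appears inside a path.
split_record() {
    printf '%s\n' "$1" | awk '{
        sha1 = $1
        rest = $0
        sub(/^[^ \t]+[ \t]+/, "", rest)   # drop the sha1 field and the whitespace after it
        n = split(rest, f, "|")           # f[1] = volume, f[2..n-1] = containers, f[n] = file
        print "sha1:      " sha1
        print "volume:    " f[1]
        for (i = 2; i < n; i++) print "container: " f[i]
        print "file:      " f[n]
    }'
}

# The example record from this post, rejoined into its single line.
split_record '7748689817fdd946e73e31b703a40f421249646a Kindle|/kdx/Kindle_src_2.1.1_351050064.tar.gz|/gplrelease/udev-112.tar.bz2|/udev-112/etc/udev/rules.d/60-persistent-input.rules'
```

Because the number of containers varies from record to record, the sketch treats the last "|"-separated field as the file and everything between it and the volume ID as containers.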
02-22-2012, 09:31 PM | #3 | ||
Going Viral
|
Catalog Run Results
After:
Quote:
Quote:
There were five archives that did not get included in the catalog: Code:
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/taglib-1.5.XWK/gplresults /taglib-1.5.tar.bz2 /taglib-1.5.XWK Kindle\|/kkbrd/Kindle_src_3.1_558700031.tar.gz\|/gplrelease/taglib-1.5.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/taglib-1.5.XWK/gplresults/taglib-1.5.tar.bz2
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
/bin/tar: Unexpected EOF in archive
/bin/tar: Unexpected EOF in archive
/bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/taglib-1.5.XWK/gplresults/taglib-1.5.tar.bz2: bzip2 compressed data, block size = 900k
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults /libltdl.tar.bz2 /libltdl.XWK Kindle\|/kkbrd/Kindle_src_3.1_558700031.tar.gz\|/gplrelease/libltdl.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
/bin/tar: Child returned status 2
/bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2: empty
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/DirectFB-1.2.0.XWK/gplresults /DirectFB-1.2.0.tar.bz2 /DirectFB-1.2.0.XWK Kindle\|/kkbrd/Kindle_src_3.1_558700031.tar.gz\|/gplrelease/DirectFB-1.2.0.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/DirectFB-1.2.0.XWK/gplresults/DirectFB-1.2.0.tar.bz2
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
/bin/tar: Unexpected EOF in archive
/bin/tar: Unexpected EOF in archive
/bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/DirectFB-1.2.0.XWK/gplresults/DirectFB-1.2.0.tar.bz2: bzip2 compressed data, block size = 900k
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults /libltdl.tar.bz2 /libltdl.XWK Kindle\|/kkbrd/Kindle_src_3.2_572340009.tar.gz\|/gplrelease/libltdl.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
/bin/tar: Child returned status 2
/bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2: empty
expand_tar_bz2 /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults /libltdl.tar.bz2 /libltdl.XWK Kindle\|/kkbrd/Kindle_src_3.2.1_576290015.tar.gz\|/gplrelease/libltdl.tar.bz2\|/gplresults
Unable to extract compressed tar: /mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
/bin/tar: Child returned status 2
/bin/tar: Error is not recoverable: exiting now
/mnt/md4/Builds/Kindle/work/libltdl.XWK/gplresults/libltdl.tar.bz2: empty
Five errors in 2 1/2 million files - I can live with that. At least it qualifies as better than a WAFG as to what was released. |
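The log above shows two different failure modes: files that `file` reports as "empty", and files that are real bzip2 data but end early. A rough sketch of how the two could be told apart before attempting extraction follows. The helper name check_bz2 is hypothetical and is not part of the script; the only bzip2 feature it relies on is the -t (test integrity) option that the bzip2 message itself mentions.

```shell
# check_bz2 FILE: say why a .bz2 file might fail to extract.
# Hypothetical helper, not taken from the shacat script.
check_bz2() {
    if [ ! -s "$1" ]; then
        echo "empty"        # zero bytes (or missing): there is nothing to recover
    elif bzip2 -t "$1" 2>/dev/null; then
        echo "ok"           # bzip2 -t checks integrity without writing any output
    else
        echo "damaged"      # truncated or corrupt; bzip2recover may salvage some blocks
    fi
}

# Demo on a zero-byte file, like the libltdl.tar.bz2 entries in the log.
tmp=$(mktemp)
check_bz2 "$tmp"
rm -f "$tmp"
```

Checking the size first keeps the empty files, for which no recovery is possible, apart from the truncated ones, for which `bzip2recover` is at least worth a try.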
||
02-22-2012, 10:46 PM | #4 |
Going Viral
|
Raw Catalog
Here is where to grab a copy of the original, raw catalog:
http://drpbox.knetconnect.com/cats/A..._sha1.cat.lzma
If there is a MySQL addict in the group who would like to help put these records into a decent, normalized database - PM me. |
02-23-2012, 09:51 AM | #5 |
Going Viral
|
Actual, worked example query and result
Here is a for-real query made to answer a development question; the results are in the post attachment, showing for-real record examples:
https://www.mobileread.com/forums/sho...0&postcount=10 |
02-23-2012, 10:30 AM | #6 | |
Going Viral
|
OpenOffice Catalog Examples
Quote:
Pick any entry on that index page for an example of the mirrored source bundles. Click any *.ods entry and it should just open in your spreadsheet application. Most of those bundles have both a short summary catalog and the full catalog. All of them are a lot smaller than the 2 1/2 million record Amazon catalog.
Last edited by knc1; 02-23-2012 at 10:46 AM. |
|
02-23-2012, 01:09 PM | #7 | |
hub
Posts: 715
Karma: 2151032
Join Date: Jan 2012
Location: Iranian in Canada
Device: K3G, DXG, Kobo mini
|
Quote:
First non-knc1 poster here! Can you please explain what this is all about, and what its purpose and use are (like a very hands-on intro)? All I perceive is that it seems interesting, but in all honesty I have close-to-0 clue what it is at all!!! Thanks dude. |
|
02-23-2012, 01:32 PM | #8 | |
Going Viral
|
Quote:
You ask a good question - the "what is it good for" is not exactly obvious. I will continue to post links to worked examples so people can get a feel for what/when/why to use it.
When trying to develop something that works across all models/system-versions, one of the first things a developer needs to know is whether support for that feature was present in the original source code used by the vendor. (Which does not mean it was included in any particular build, but if it isn't there to start with...)
The example uses currently linked in these posts show that there were two versions of udev (the hotplug notification system used in the builds) used in the system. Now the developer knows (or can check) whether what they plan to do was supported by both versions.
When working with the display (across models/system-versions) it can become important to know whether the same eink driver was used in all the machines. (It wasn't.) Now the developer knows what to check to see whether whatever they are doing will work with all versions of the driver.
It has a lot of other uses, but like a baby, it is hard to see what it might be good for in the long run. |
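The "was the same thing used everywhere" kind of question boils down to listing which source bundles contain a file and whether its checksum differs between them. Here is a small sketch of that idea. It makes some assumptions: the function name bundles_for is invented for illustration, the outermost container is taken to be the release bundle, and the sample input is the three libltdl records that appear elsewhere in this thread.

```shell
# bundles_for PATTERN: read catalog records on stdin and print, for every
# record whose last field matches PATTERN, the sha1 sum and the outermost
# container. Hypothetical helper, not part of the shacat script.
bundles_for() {
    awk -v pat="$1" '{
        sha1 = $1
        rest = $0
        sub(/^[^ \t]+[ \t]+/, "", rest)   # drop the sha1 field and the whitespace after it
        n = split(rest, f, "|")           # f[1] = volume, f[n] = path of the file
        if (f[n] ~ pat) print sha1, (n > 2 ? f[2] : "-")
    }' | sort -u
}

bundles_for 'libltdl' <<'EOF'
da39a3ee5e6b4b0d3255bfef95601890afd80709 Kindle|/kkbrd/Kindle_src_3.1_558700031.tar.gz|/gplrelease/libltdl.tar.bz2|/gplresults/libltdl.tar.bz2
da39a3ee5e6b4b0d3255bfef95601890afd80709 Kindle|/kkbrd/Kindle_src_3.2_572340009.tar.gz|/gplrelease/libltdl.tar.bz2|/gplresults/libltdl.tar.bz2
da39a3ee5e6b4b0d3255bfef95601890afd80709 Kindle|/kkbrd/Kindle_src_3.2.1_576290015.tar.gz|/gplrelease/libltdl.tar.bz2|/gplresults/libltdl.tar.bz2
EOF
```

If the first column is the same on every output line, the file is byte-for-byte identical in all of the listed bundles; different sums mean the bundles shipped different versions. Against the full catalog the same function would be fed with `< Amazon_2012.02.18_sha1.cat`.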
|
02-23-2012, 01:41 PM | #9 | |
Going Viral
|
Quote:
Ever want to do "Declarative Programming"? Ever want to do it in Bash? Look at the script: it has a "Declarative Programming" engine in it, and this utility uses that to "solve" the problem of opening any combination of compressed files and archives. Like that baby, it "learns as it goes" (and also remembers to record how to clean up after itself).
Don't be side-tracked by all the supporting functions. The entire program is three (3!) lines; just ignore the first 1200 lines or so of supporting functions and look at the bottom of the file. To those I added a couple of lines to give a nice dump of all the problems encountered.
That is how it got into the ABS Guide back in 2003 (possibly dropped in newer versions). It was a "Declarative Programming" extension to my chapter on using Bash arrays.
Last edited by knc1; 02-23-2012 at 01:51 PM. |
|
02-23-2012, 04:35 PM | #10 | |
Going Viral
|
Quote:
Code:
knoppix:cat$ grep '/gplresults/libltdl.tar.bz2' Amazon_2012.02.18_sha1.cat
da39a3ee5e6b4b0d3255bfef95601890afd80709 Kindle|/kkbrd/Kindle_src_3.1_558700031.tar.gz|/gplrelease/libltdl.tar.bz2|/gplresults/libltdl.tar.bz2
da39a3ee5e6b4b0d3255bfef95601890afd80709 Kindle|/kkbrd/Kindle_src_3.2_572340009.tar.gz|/gplrelease/libltdl.tar.bz2|/gplresults/libltdl.tar.bz2
da39a3ee5e6b4b0d3255bfef95601890afd80709 Kindle|/kkbrd/Kindle_src_3.2.1_576290015.tar.gz|/gplrelease/libltdl.tar.bz2|/gplresults/libltdl.tar.bz2
Last edited by knc1; 02-23-2012 at 04:45 PM. |
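The checksum shared by all three records, da39a3ee5e6b4b0d3255bfef95601890afd80709, is the SHA-1 of zero bytes of input, so the catalog on its own already says these files are empty. A quick way to confirm that, assuming a `sha1sum` command (GNU coreutils) is on the path:

```shell
# SHA-1 of empty input; every zero-length file hashes to this same value.
empty_sha1=$(printf '' | sha1sum | cut -d' ' -f1)
echo "$empty_sha1"
```

With that value in hand, something like `grep -c "^$empty_sha1" Amazon_2012.02.18_sha1.cat` should count every zero-length file in the catalog, not just these three.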
|
02-24-2012, 10:44 AM | #11 | |
Going Viral
|
Quote:
Code:
root:gplresults$ ls -l
total 0
-rw-r--r-- 1 61967 502 0 Feb 9 2011 libltdl.tar.bz2 |
|
02-24-2012, 11:37 AM | #12 |
Going Viral
Posts: 17,212
Karma: 18210809
Join Date: Feb 2012
Location: Central Texas
Device: No K1, PW2, KV, KOA
|
Recovered archive records
There were five archives not included in the main catalog due to errors.
Of those five, three were empty files (say "thanks, Amazon Q.A."). Of the other two, I was able to recover 2 of 3 bzip2 blocks from one and 5 of 6 bzip2 blocks from the other. This supplement to the main catalog is posted alongside the main catalog, here: http://drpbox.knetconnect.com/cats/
Keep in mind that if you don't find a file listed in these additional records, it does not mean it was never there; it may have been left behind on the floor of the Amazon Q.A. department. (In the main catalog, if you don't find it, it was never there, with the exception of these five archives.) |
|