Getting around DRM, encoding?

kyzcreig · 06-19-2015, 11:03 PM

DRM is a real nuisance for us paying customers. I like to curate my notes, and usually do so after reading a great book. So you can imagine my surprise when I realized 90% of my annotations had been ignored.

Fortunately the annotations are still visible on the Kindle and the location data is intact in my clippings.txt file. This gave me the idea of taking the location information for each annotation and then extracting the corresponding text from the original mobi file with a script. My understanding is that each location corresponds to 128 bytes of data, so it should be straightforward to put all this information into a file. But I'm not sure how the text is encoded, and when I decode it as something like UTF-8 I get a half-garbled mess.

I'm a novice programmer, though, so I'm wondering:

A) if this is actually feasible
B) how hard it will be to decode mid-book excerpts

As for the DRM itself, I've found tools for stripping it but I'm not sure if that will corrupt the location information. From what I can tell it doesn't.
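A minimal sketch of the location idea described in this post, under stated assumptions. It takes the poster's figure of 128 bytes per location at face value (this is not verified here), assumes location 1 starts at byte 0, and assumes `raw` is the book's already decompressed, DRM-free text rather than the bytes of the .mobi file itself. Book text inside a mobi file is normally stored compressed, which may be one reason a direct decode of the file looks garbled. The function name and the `pad` parameter are hypothetical.

```python
BYTES_PER_LOCATION = 128  # the poster's figure; an assumption, not verified


def slice_locations(raw: bytes, start_loc: int, end_loc: int,
                    pad: int = 1, encoding: str = "utf-8") -> str:
    """Return the text roughly covering locations start_loc..end_loc.

    `raw` is assumed to be decompressed book text. `pad` widens the
    slice by that many locations on each side, because the mapping
    from location to byte offset is only approximate.
    """
    first = max(start_loc - 1 - pad, 0) * BYTES_PER_LOCATION
    last = (end_loc + pad) * BYTES_PER_LOCATION
    # errors="replace" keeps decoding even if the slice cuts a
    # multi-byte character in half at either end.
    return raw[first:last].decode(encoding, errors="replace")
```

If the output is still garbled, the `encoding` argument is the first thing to vary; some books may use a Windows code page such as `cp1252` instead of UTF-8.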

Notjohn · 06-20-2015, 06:45 AM

When I want to retrieve my highlights (and comments, though I rarely comment), I just Google kindle highlights amazon and am wafted to the appropriate website. Then I go copy > paste in my choice of word processors, often Note Tab Pro.

Those highlights seem to stay there forever, even for library books whose loan period has expired. There's a lot of excess verbiage (Delete This Highlight / Read More At Location N) but that's easily ignored or stripped out.

susan_cassidy · 06-23-2015, 04:57 PM

You don't have to extract the text from the mobi file; the text of an annotation is already in the "My Clippings" file. You can look at the code in the DRM-stripping script to find out how to decrypt the DRM'ed file.

Notjohn is talking about kindle.amazon.com.

kyzcreig · 06-25-2015, 03:17 AM

Quote:

Originally Posted by susan_cassidy

You don't have to extract the text from the mobi file; the text of an annotation is already in the "My Clippings" file. You can look at the code in the DRM-stripping script to find out how to decrypt the DRM'ed file.

Notjohn is talking about kindle.amazon.com.

1) This is what I have in my clippings file:

Quote:

<You have reached the clipping limit for this item>
Baltasar Gracian, A Pocket Mirror for Heroes, pg. 124, loc. 1481-1487

2) I've already decrypted the book. Is there a way to decrypt the mbp1 file that holds all the annotations? I believe I can use tools to scrape that file once it's decrypted into a normal mbp file. But I wasn't aware anyone had figured this out already.

@NotJohn, this is a different issue.
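Even when the clipping text has been replaced by the limit notice, the header line quoted in this post still carries the location range. A small sketch of pulling that range out, assuming every header follows the `loc. START-END` shape shown in the quote (other Kindle firmware versions may format it differently); the function name is hypothetical.

```python
import re

# Matches "loc. 1481-1487" and also a single location such as "loc. 1481".
LOC_RE = re.compile(r"loc\.\s*(\d+)(?:-(\d+))?")


def parse_location_range(header: str):
    """Return (start, end) locations from a clipping header, or None."""
    match = LOC_RE.search(header)
    if match is None:
        return None
    start = int(match.group(1))
    # A header with a single location is treated as a one-location range.
    end = int(match.group(2)) if match.group(2) else start
    return start, end
```

For the header quoted above, this returns `(1481, 1487)`.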

kyzcreig · 06-26-2015, 01:31 PM

Update: Great success!

I've leveraged DRM Decryption for Calibre to create a raw HTML version. Then I used the backed up Locations data to extract imprecise chunks and used BeautifulSoup4 to clean it up and remove redundancy.
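The cleanup step might look roughly like the sketch below. This is not the poster's script: to stay dependency-free it swaps in the standard library's `html.parser` for BeautifulSoup4, and it covers only the tag stripping, not the removal of overlapping chunks. It assumes each chunk is a slice of raw HTML that may begin or end in the middle of a tag. All names are hypothetical.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects only the text nodes of the HTML it is fed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def trim_partial_tags(chunk: str) -> str:
    """Drop a tag fragment cut off at the start or end of the chunk."""
    lt, gt = chunk.find("<"), chunk.find(">")
    if gt != -1 and (lt == -1 or gt < lt):
        chunk = chunk[gt + 1:]  # the chunk began inside a tag
    lt, gt = chunk.rfind("<"), chunk.rfind(">")
    if lt != -1 and lt > gt:
        chunk = chunk[:lt]  # the chunk ended inside a tag
    return chunk


def chunk_to_text(chunk: str) -> str:
    """Strip markup from an imprecise HTML chunk and collapse whitespace."""
    parser = _TextCollector()
    parser.feed(trim_partial_tags(chunk))
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Because the location-based slice is imprecise, the resulting text will usually include a few extra words on either side of the highlight.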



