#1 |
Enthusiast
Posts: 33
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
|
Getting around DRM, encoding?
DRM is a real nuisance for us paying customers. I like to curate my notes, and usually do so after reading a great book. So you can imagine my surprise when I realized 90% of my annotations had been ignored.
Fortunately the annotations are still visible on the Kindle, and the location data is intact in my clippings.txt file. This gave me the idea of taking the location information for each annotation and then extracting the corresponding text from the original mobi file with a script. My understanding is that each location corresponds to 128 bytes of data, so it should be straightforward to put all this information into a file. But I'm not sure how the text is encoded, and when I try something like UTF I get a half-garbled mess.
I'm a novice programmer though, so I'm wondering:
A) whether this is actually feasible
B) how hard it will be to decode mid-book excerpts
As for the DRM itself, I've found tools for stripping it, but I'm not sure whether that will corrupt the location information. From what I can tell, it doesn't. |
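For what it's worth, the location arithmetic described in this post can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the 128-bytes-per-location figure is the poster's understanding, not a confirmed constant; the header format (`- Your Highlight on Location 1234-1236 | ...`) and 1-based location numbering are assumed; and every function name here is made up for illustration. It also assumes the input is the decrypted, uncompressed book markup. As far as I know, the text inside a .mobi container is normally stored compressed, so slicing the file itself is a likely source of the garbled output.

```python
import re

# Assumption taken from the post above; not a confirmed constant.
BYTES_PER_LOCATION = 128


def parse_location_range(header: str) -> tuple[int, int]:
    """Pull 'Location 1234-1236' (or a single 'Location 1234') out of a header line."""
    match = re.search(r"Location\s+(\d+)(?:-(\d+))?", header)
    if match is None:
        raise ValueError(f"no location found in: {header!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return start, end


def location_to_byte_range(start: int, end: int, pad: int = 1) -> tuple[int, int]:
    """Convert 1-based locations to a byte slice, padded by `pad` locations per side."""
    first = max(start - 1 - pad, 0) * BYTES_PER_LOCATION
    last = (end + pad) * BYTES_PER_LOCATION
    return first, last


def extract_chunk(markup: bytes, start: int, end: int, encoding: str = "utf-8") -> str:
    """Slice the (already decrypted, uncompressed) markup and decode it leniently."""
    first, last = location_to_byte_range(start, end)
    # errors="replace" keeps decoding from failing when the slice
    # cuts a multi-byte character in half at either edge.
    return markup[first:last].decode(encoding, errors="replace")
```

The padding is there because location boundaries are only approximate, so the returned chunk will usually contain more than the highlight itself and may start or end in the middle of a tag.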
#2 |
mostly an observer
Posts: 1,519
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
When I want to retrieve my highlights (and comments, though I rarely comment), I just Google "kindle highlights amazon" and am wafted to the appropriate website. Then I copy and paste them into my word processor of choice, often NoteTab Pro.
Those highlights seem to stay there forever, even for library books whose loan period has expired. There's a lot of excess verbiage (Delete This Highlight / Read More At Location N) but that's easily ignored or stripped out. |
#3 |
Wizard
Posts: 2,251
Karma: 3720310
Join Date: Jan 2009
Location: USA
Device: Kindle, iPad (not used much for reading)
|
You don't have to extract the text from the mobi file; the text of each annotation is already in the "My Clippings" file. If you do want to go that route, you can look at the code in the DRM-stripping script to find out how to decrypt the DRM-protected file.
Notjohn is talking about kindle.amazon.com. |
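Reading the clippings file can be sketched roughly as follows. The layout assumed here (a title line, a header line with the location, a blank line, the text, then a line of ten `=` signs) is how "My Clippings" files commonly look, but treat it as an assumption; the `Clipping` class and `parse_clippings` function are hypothetical names for illustration.

```python
from dataclasses import dataclass

SEPARATOR = "=========="  # assumed entry delimiter


@dataclass
class Clipping:
    title: str   # book title and author line
    header: str  # "- Your Highlight on Location ... | Added on ..."
    text: str    # highlighted text; empty for bookmarks or missing text


def parse_clippings(raw: str) -> list[Clipping]:
    """Split the file contents into entries and pick out the three parts of each."""
    clippings = []
    for block in raw.split(SEPARATOR):
        # Strip stray BOMs, spaces and carriage returns from every line.
        lines = [line.strip("\ufeff \r") for line in block.strip().splitlines()]
        if len(lines) < 2:
            continue  # trailing fragment after the last separator
        title, header = lines[0], lines[1]
        text = "\n".join(line for line in lines[2:] if line)
        clippings.append(Clipping(title, header, text))
    return clippings
```

Entries that come back with an empty `text` would be the ones whose content still has to be recovered some other way.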
#4 |
Enthusiast
Posts: 33
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
|
@NotJohn, this is a different issue. |
#5 |
Enthusiast
Posts: 33
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
|
Update: Great success!
I used DRM Decryption for Calibre to create a raw HTML version of the book. Then I used the backed-up location data to extract approximate chunks, and used BeautifulSoup4 to clean them up and remove redundancy. |
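The cleanup step might look something like the sketch below. It is not the poster's actual script: to stay dependency-free it swaps in the standard library's `html.parser` where the post used BeautifulSoup4, and it reads "remove redundancy" as merging chunks whose text overlaps, which is an assumption. All function names are hypothetical.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def trim_partial_tags(fragment: str) -> str:
    """Drop a tag cut in half at either edge of a byte-sliced chunk.

    Heuristic: a literal '>' in the text before the first tag is also trimmed.
    """
    first_lt, first_gt = fragment.find("<"), fragment.find(">")
    if first_gt != -1 and (first_lt == -1 or first_gt < first_lt):
        fragment = fragment[first_gt + 1:]
    if fragment.rfind("<") > fragment.rfind(">"):
        fragment = fragment[:fragment.rfind("<")]
    return fragment


def strip_markup(fragment: str) -> str:
    """Return the visible text of a chunk with whitespace collapsed."""
    collector = _TextCollector()
    collector.feed(trim_partial_tags(fragment))
    collector.close()
    return " ".join("".join(collector.parts).split())


def merge_overlap(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two chunks, dropping text repeated at the seam between them."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return left + " " + right
```

The `min_overlap` threshold is there so that a coincidental one- or two-character match at the seam is not mistaken for real overlap.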