Old 06-19-2015, 10:04 PM   #1
kyzcreig
Enthusiast
Posts: 31
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
Getting around DRM, encoding?

DRM is a real nuisance for us paying customers. I like to curate my notes, and usually do so after reading a great book. So you can imagine my surprise when I realized 90% of my annotations had been ignored.

Fortunately the annotations are still visible on the Kindle and the location data is intact in my clippings.txt file. This gave me the idea of taking the location information for each annotation and then extracting the appropriate text from the original mobi file via a script. My understanding is that each location corresponds to 128 bytes of data, so it should be straightforward to put all this information into a file. But I'm not sure how the text is encoded, and when I decode it as something like UTF it's a half-garbled mess.

I'm a novice programmer, though, so I'm wondering:

A) if this is actually feasible
B) how hard it will be to decode mid-book excerpts

As for the DRM itself, I've found tools for stripping it but I'm not sure if that will corrupt the location information. From what I can tell it doesn't.
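
One possible contributor to the garbling, assuming the text is plain UTF-8 and neither compressed nor encrypted (which may well not hold for a real mobi file): a fixed-size byte slice can start or end in the middle of a multi-byte character. A minimal Python sketch that widens the slice to whole characters before decoding; `decode_slice` is a hypothetical helper name, not part of any library:

```python
def decode_slice(data: bytes, start: int, length: int) -> str:
    """Decode data[start:start+length] as UTF-8, widened to whole characters.

    Assumption: `data` is plain, uncompressed, DRM-free UTF-8 text.
    """
    end = min(start + length, len(data))

    def is_continuation(byte: int) -> bool:
        # UTF-8 continuation bytes look like 0b10xxxxxx.
        return (byte & 0xC0) == 0x80

    # Step left until the slice starts on the first byte of a character.
    while start > 0 and start < len(data) and is_continuation(data[start]):
        start -= 1
    # Step right until the slice ends on a character boundary.
    while end < len(data) and is_continuation(data[end]):
        end += 1
    return data[start:end].decode("utf-8")
```

Slicing the raw bytes without this adjustment can raise a decode error or produce replacement characters wherever a character is cut in half.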
Old 06-19-2015, 10:30 PM   #2
knc1
Helpdesk Junkie
Posts: 8,061
Karma: 7025886
Join Date: Feb 2012
Device: Too many.
I think that the subject of "getting around" or otherwise defeating DRM is against the site rules.

But here is a slide show of the various types of block encryption:
http://www.utdallas.edu/~muratk/cour...iles/modes.pdf

To answer your question, you have to know which of the above types is used by the DRM you are interested in making random access to.

For that information, you'll have to go to some other source of information than MobileRead.
Sorry, we don't disturb other people's I.P. here.

- - - -

You had better check your prior source(s) of information; that is most likely **bits**, not **bytes**. (Block sizes are usually referred to by their **bit length** in cryptology, but I don't have a clue what the common practice is in DRM methods.)

Last edited by knc1; 06-19-2015 at 10:40 PM.
Old 06-21-2015, 05:00 PM   #3
kyzcreig
Enthusiast
Posts: 31
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
Quote:
Originally Posted by knc1 View Post
I think that the subject of "getting around" or otherwise defeating DRM is against the site rules.

But here is a slide show of the various types of block encryption:
http://www.utdallas.edu/~muratk/cour...iles/modes.pdf

To answer your question, you have to know which of the above types is used by the DRM you are interested in making random access to.

For that information, you'll have to go to some other source of information than MobileRead.
Sorry, we don't disturb other people's I.P. here.

- - - -

You had better check your prior source(s) of information; that is most likely **bits**, not **bytes**. (Block sizes are usually referred to by their **bit length** in cryptology, but I don't have a clue what the common practice is in DRM methods.)

Interesting, so I don't necessarily need to decrypt anything. To put it more succinctly, I want to use Amazon's location values to extract passages from a .mobi, then decode them into legible text. The DRM is something slightly different, although I'm also interested in it.

Though maybe the existing anti-DRM solutions will render the above impossible due to loss of interstitial metadata.

Last edited by kyzcreig; 06-21-2015 at 07:58 PM.
Old 06-21-2015, 05:58 PM   #4
knc1
Helpdesk Junkie
Posts: 8,061
Karma: 7025886
Join Date: Feb 2012
Device: Too many.
Quote:
Originally Posted by kyzcreig View Post
So I actually don't want to decrypt anything, I believe what I would make would have general utility.

To put it more succinctly I want to use Amazon's location values to extract passages from text.

DRM is irrelevant here and of course removing it wouldn't solve my problems either.

The passages would be a bit hard to read if they were encrypted and you didn't decrypt them.

Plus, DRM was mentioned in the title of this thread.
Which is the reason I thought it would be relevant.

- - - -

Ah, which leaves only a question of the sort of measure that Amazon is using.

It might be bytes or characters.
For the starting location, either would be possible.
For the length, either would be possible.
("possible" because nothing is going to translate or convert the text encoding between making the notation and looking it up.)

My own first guess would be starting location in bytes and length in characters (remember, Kindles handle multi-byte character sets).

I don't know but a bit of experimenting (on a non-DRM protected document) should tell you what types of measurement units are being used.
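
A rough sketch of that experiment in Python, assuming you already have the full text of a non-DRM document as a string and know the location the Kindle reported for one passage. The function name, the candidate chunk sizes, and the 1-based location numbering are all assumptions for illustration:

```python
def candidate_locations(text, passage, chunk_sizes=(128, 150)):
    """Return {(unit, chunk_size): location} for the first occurrence of passage.

    Compare the results with the location the Kindle reports to see which
    unit ("chars" or "bytes") and chunk size fit best.
    """
    char_offset = text.find(passage)
    if char_offset < 0:
        raise ValueError("passage not found in text")
    # Byte offset = encoded size of everything before the passage.
    byte_offset = len(text[:char_offset].encode("utf-8"))
    offsets = {"chars": char_offset, "bytes": byte_offset}
    return {
        (unit, size): offset // size + 1  # assumption: locations are 1-based
        for unit, offset in offsets.items()
        for size in chunk_sizes
    }
```

Text with many multi-byte characters separates the byte and character hypotheses most clearly, since the two offsets drift apart quickly.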

- - - -

If the same code were to be used for both DRM and non-DRM protected documents - -
then the values would be two part values:
Block number and Displacement (in either bytes or characters) into the Block.

So a bit (no pun intended) of research into the block size that Amazon uses would still be required.
It should be easy to find experimentally, at least for a non-DRM protected document.

- - - -

Two-part position (and length) value systems are common in file systems,
i.e. first block number:depth in bytes of start,
and similarly for length and/or ending position.
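
As a small illustration of a two-part value, here is a Python sketch that converts between a flat offset and a (block, displacement) pair. The block size of 4096 used in the example is purely hypothetical and would need to be found experimentally:

```python
def to_block_position(offset: int, block_size: int) -> tuple[int, int]:
    """Split a flat offset into (block number, displacement into that block)."""
    return divmod(offset, block_size)


def from_block_position(block: int, displacement: int, block_size: int) -> int:
    """Recombine (block number, displacement) into a flat offset."""
    return block * block_size + displacement


# Hypothetical block size, for illustration only.
block, displacement = to_block_position(10000, 4096)
```

With the hypothetical 4096 block size, offset 10000 lands in block 2 at displacement 1808.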

Last edited by knc1; 06-21-2015 at 06:07 PM.
Old 06-21-2015, 08:40 PM   #5
eschwartz
Irrational Optimist
Posts: 13,823
Karma: 53676858
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
The necessary rules should be documented in calibre's code. The page number scheme has been cracked already, though it remains low-interest... however, the Kindle device driver for calibre includes a feature for calculating pseudorandom page numbers and generating a matching APNX file.
Old 06-22-2015, 07:52 AM   #6
knc1
Helpdesk Junkie
Posts: 8,061
Karma: 7025886
Join Date: Feb 2012
Device: Too many.
^^ Thanks ^^
Those were details about calibre that I had no idea about.
Old 06-25-2015, 02:22 AM   #7
kyzcreig
Enthusiast
Posts: 31
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
I can confirm the LOC data corresponds to 150-byte chunks, not 128 bytes as I previously thought. I've also managed to decrypt the book and convert it to raw HTML. But this leaves me with the pesky problem of cleaning the text up.

There's a lot of damaged markup in each of these chunks. Any suggestions on how to deal with this? Or perhaps there's a tool that would automatically scrape the appropriate text, given byte offsets?

Edit: BeautifulSoup saves the day!! Imprecision aside, I've got everything working and I think I might post this on the internet to help other people out.
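
For anyone curious, a rough sketch of the idea in Python. This is not the actual script: it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, assumes the book is already plain UTF-8 HTML, assumes locations are 1-based, and the function names are made up for illustration. It also assumes "<" and ">" only appear as tag delimiters:

```python
from html.parser import HTMLParser

CHUNK = 150  # bytes per location, per the observation above


class _TextOnly(HTMLParser):
    """Collect only the text between tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def _trim_partial_tags(fragment: str) -> str:
    """Drop a tag cut in half at either edge of the slice."""
    lt, gt = fragment.find("<"), fragment.find(">")
    if gt != -1 and (lt == -1 or gt < lt):
        fragment = fragment[gt + 1:]  # slice began inside a tag
    lt, gt = fragment.rfind("<"), fragment.rfind(">")
    if lt != -1 and lt > gt:
        fragment = fragment[:lt]  # slice ended inside a tag
    return fragment


def text_for_locations(html_bytes: bytes, first_loc: int, last_loc: int,
                       chunk: int = CHUNK) -> str:
    """Return the visible text covering locations first_loc..last_loc."""
    start = (first_loc - 1) * chunk  # assumption: locations are 1-based
    end = last_loc * chunk
    # errors="ignore" drops any character split at the slice edges.
    fragment = html_bytes[start:end].decode("utf-8", errors="ignore")
    parser = _TextOnly()
    parser.feed(_trim_partial_tags(fragment))
    parser.close()
    # Collapse runs of whitespace left behind by the removed markup.
    return " ".join("".join(parser.parts).split())
```

The result is only as precise as the chunk boundaries, so expect a few extra or missing words at either end of each passage.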

Last edited by kyzcreig; 06-25-2015 at 03:31 AM.
All times are GMT -4.