06-25-2015, 02:22 AM   #7
kyzcreig
Enthusiast
Posts: 33
Karma: 12694
Join Date: Aug 2014
Device: kindle paperwhite
I can confirm the LOC data corresponds to 150-byte chunks, not 128-byte chunks as I previously thought. I've also managed to decrypt the book and convert it to raw HTML. But this leaves me with the pesky problem of cleaning the text up.

There's a lot of damaged markup in each of these chunks. Any suggestions on how to deal with this? Or perhaps there's a tool that would automatically scrape the appropriate text, given byte offsets?

Edit: BeautifulSoup saves the day!! Imprecision aside, I've got everything working and I think I might post this on the internet to help other people out.
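
For anyone trying the same thing, here is a rough sketch of the idea, not the exact script used here. It assumes that location N covers bytes (N - 1) * 150 up to N * 150 of the raw HTML (the 1-based indexing is a guess), and it swaps in Python's standard-library html.parser for BeautifulSoup so it runs without extra installs. The names chunk_for_location and visible_text are made up for illustration.

```python
from html.parser import HTMLParser

CHUNK_SIZE = 150  # bytes per location, per the finding above


class _TextCollector(HTMLParser):
    """Collects only the text nodes, ignoring all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def chunk_for_location(raw: bytes, loc: int) -> bytes:
    """Return the raw bytes for a location (assumption: locations are 1-based)."""
    start = (loc - 1) * CHUNK_SIZE
    return raw[start:start + CHUNK_SIZE]


def visible_text(fragment: bytes) -> str:
    """Strip markup from a chunk whose tags may be cut off at either edge."""
    # errors="ignore" drops multi-byte characters split by the chunk boundary.
    html = fragment.decode("utf-8", errors="ignore")

    # Drop the tail of a tag that started in the previous chunk.
    first_lt, first_gt = html.find("<"), html.find(">")
    if first_gt != -1 and (first_lt == -1 or first_gt < first_lt):
        html = html[first_gt + 1:]

    # Drop the head of a tag that finishes in the next chunk.
    last_lt, last_gt = html.rfind("<"), html.rfind(">")
    if last_lt > last_gt:
        html = html[:last_lt]

    parser = _TextCollector()
    parser.feed(html)
    parser.close()  # flushes any buffered text
    # Collapse runs of whitespace so the result reads as plain text.
    return " ".join("".join(parser.parts).split())
```

Usage would be something like visible_text(chunk_for_location(raw_html_bytes, 42)). The result is still imprecise: words split at a chunk boundary stay split, and a chunk that falls entirely inside one long tag will not be cleaned properly.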

Last edited by kyzcreig; 06-25-2015 at 03:31 AM.