10-15-2015, 03:09 PM | #1 |
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Scrambling copyright ebooks to help troubleshoot problems ???
I have a question regarding trying to help users who post on these Forums about a problematic copyright ebook on their ereader of choice.
Does anyone think it would be useful to have a simple utility which scrambles the contents of a de-DRM'd copyright ebook sufficiently that the Moderators would be happy for that scrambled copy to be attached on MR? The aim is to scramble the text content whilst leaving the ebook structure and styling intact, to make troubleshooting the user's actual problem more straightforward for all who try to help.
Any thoughts? Especially regarding what would constitute "enough scrambling" for the Moderators to be happy. If it's thought to be worth pursuing I could attach a first attempt at a Scrambler utility - for epub, kepub, azw3.
Just FYI ... for my own private purposes I needed to practice using some of the Calibre ebook manipulation coding tools so I chose this as a little project. If it's not worth pursuing, that's OK, I needed the practice anyway.
In its current state it's not a Calibre plugin (although it easily could be). You drag-and-drop the ebook onto a Windows .bat file and a scrambled copy of the book is created on the PC. For the time being I decided to keep scrambled copies well away from the Calibre library.
Some past observations: When a user posts asking for help with a perceived problem with a copyrighted book there often follows a frustrating sequence of events: e.g.
Last edited by jackie_w; 10-15-2015 at 03:15 PM. |
10-15-2015, 03:11 PM | #2 |
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
ScrambleEbook commandline
Purpose:
N.B: The upgrade to calibre 4.0 required some technical changes to ScrambleEbook. As a result it is no longer available as a standalone python script and so the old attachment to this post has been removed. Instead, a commandline option has been added to the ScrambleEbook calibre plugin. Those who wish to scramble a book which is present on their OS disk but not in any of their calibre libraries can do the following:
Last edited by jackie_w; 12-21-2020 at 07:19 PM. Reason: New instructions for standalone users |
10-15-2015, 03:24 PM | #3 |
Grand Sorcerer
Posts: 12,119
Karma: 73448614
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
Sounds interesting. If possible leave items like ToC entries / Headings un-scrambled.
Also I presume that rather than scrambling, the text would be replaced by random text (possibly Lorem Ipsum based). Might also be an idea for a Sigil plugin? |
10-15-2015, 04:51 PM | #4 | ||
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Quote:
Re: the NCX TOC - to scramble or not to scramble is trivial from a technical POV. The question would be what the Moderators think is acceptable. At the moment I've played safe and scrambled.
Re: an inline TOC - this is more difficult as there is no 100% reliable programmatic way to determine whether the page with the most hyperlinks is an inline TOC or a page of Endnotes (or something else entirely). The Eyeball Method is very reliable but I'm not currently considering an interactive utility with endless user settings. I'd rather err on the safe side of scrambling an inline TOC than revealing the Endnotes.
Re: Headings - that might be possible to some extent if only text from <h1>, <h2> etc tags were revealed but, as we all know, many books are not created with logical tags. Once again the Moderators would need to have their say. Until then, when in doubt ... scramble.
Quote:
Perhaps, but only by someone other than me. I haven't used Sigil since the Calibre Editor was released. |
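On the Headings point above, here is a minimal sketch of what "reveal only text inside <h1>, <h2> etc" could look like. It is not the ScrambleEbook code and does not use the Calibre API. It assumes each content file is well-formed XHTML that the standard-library parser can read, and the names `PRESERVED_TAGS`, `scramble_xhtml` and the helper functions are made up for illustration. As noted above, it does nothing useful for books whose headings are styled <p> or <div> tags.

```python
import random
import string
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialise XHTML elements without an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)

# Hypothetical choice: headings are revealed; style/script are left alone
# so that CSS and JavaScript are not corrupted.
PRESERVED_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "style", "script"}


def _scramble(text, rng):
    """Swap every letter for a random ASCII letter of the same case."""
    out = []
    for ch in text:
        if not ch.isalpha():
            out.append(ch)  # digits, spaces and punctuation are kept
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(rng.choice(string.ascii_lowercase))
    return "".join(out)


def _local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1].lower()


def _walk(elem, rng, preserved):
    # Once inside a preserved tag, everything nested in it is preserved too.
    preserved = preserved or _local_name(elem.tag) in PRESERVED_TAGS
    if elem.text and not preserved:
        elem.text = _scramble(elem.text, rng)
    for child in elem:
        _walk(child, rng, preserved)
        # A child's tail text belongs to *this* element's content.
        if child.tail and not preserved:
            child.tail = _scramble(child.tail, rng)


def scramble_xhtml(source, rng=None):
    """Return the XHTML string with all non-heading text scrambled."""
    rng = rng or random.Random()
    root = ET.fromstring(source)
    _walk(root, rng, False)
    return ET.tostring(root, encoding="unicode")
```

Note that the standard parser drops comments and the DOCTYPE, so a real tool would need something more careful for round-tripping files.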
||
10-15-2015, 05:31 PM | #5 |
Grand Sorcerer
Posts: 12,119
Karma: 73448614
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
So images are mangled too it would seem?
What are you thinking about for embedded fonts? |
10-15-2015, 05:52 PM | #6 | |
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Quote:
Re: embedded fonts - I haven't done anything with this so far, mainly because I'm not clear what the best do-able solution would be. More thinking and Calibre code-digging required. I usually strip all embedded fonts myself so don't have a lot of experience.
Is it possible to de-obfuscate an obfuscated font? If not then I'm not sure why they can't be left as-is.
Non-obfuscated fonts are another matter. Some are OK to leave as-is (e.g. Charis), some are not. The safest legal thing to do would be to remove them all from the scrambled version. I'm sure this is do-able because Calibre Polish and Modify Epub already do it. Although I don't think there's an option to remove some fonts and not others.
Last edited by jackie_w; 10-15-2015 at 07:16 PM. Reason: incorrect info |
|
10-15-2015, 11:34 PM | #7 |
Grand Sorcerer
Posts: 24,908
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
I like the idea and that is a very good start. The things I can think of:
- I think I'd like to see the ToC left alone. As this might be where the problem is, regenerating or playing with it might fix the issue. Or maybe not touch it if it is just "Chapter x".
- The fonts are a problem and I haven't a clue on handling them.
- I think it would be safe if the only images are the cover and a publisher's logo. Both of these are easily obtainable and "fair use" probably applies. But a book with a lot of images becomes a problem. |
10-15-2015, 11:54 PM | #8 |
Guru
Posts: 917
Karma: 417282
Join Date: Jun 2015
Device: kobo aura h2o, kobo forma
|
If you're going to scramble a book, you should change its title too just to not be confusing. Maybe prepend "scrambled" to it.
|
10-16-2015, 12:02 AM | #9 | |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
ToC is probably fair use.
As is the semantic cover image (easy to pinpoint with a plugin).
Sounds like an interesting and useful plugin. Though I'd be happy too if moderators took the time to write a more descriptive warning.
My form response for oversized images:
Quote:
|
|
10-16-2015, 12:24 AM | #10 |
Wizard
Posts: 2,839
Karma: 22003124
Join Date: Aug 2014
Device: Kobo Forma, Kobo Sage, Kobo Libra 2
|
Only issue I can see with scrambling is someone reversing it and thus getting whatever books users post for free.
I don't know if it would really be worth their time and energy given the general ease of pirating digital content, but it's an objection I can see the moderators raising.
By contrast, replacing the text of a book with a static block would be impossible to reverse. If you're worried about that not representing the text of the book, maybe just generate word lists so there are a few words of each size to throw in the place of the text of the book. Even if the end user knows the entire word list, they'd have a monumental time trying to reconstruct the book from this since 'read' and 'said' could, in theory, both be replaced by 'boat'. |
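To make the word-list idea concrete, here is a rough sketch, not anything from an existing tool. The word lists, the `WORDS_BY_LENGTH` name and the `replace_words` function are all invented for illustration, and the sketch only treats runs of ASCII letters as words. It also ignores capitalisation, which a real utility would probably want to keep.

```python
import random
import re

# Hypothetical word lists keyed by word length (illustrative only).
WORDS_BY_LENGTH = {
    1: ["a", "o"],
    2: ["up", "in", "on"],
    3: ["cat", "sun", "box"],
    4: ["boat", "lamp", "tree"],
    5: ["stone", "chair", "river"],
}


def replace_words(text, rng=None):
    """Replace each word with a random list word of the same length.

    Many different source words map to the same replacement, so the
    original text cannot be recovered from the output.
    """
    rng = rng or random.Random()

    def pick(match):
        word = match.group(0)
        candidates = WORDS_BY_LENGTH.get(len(word))
        if candidates is None:
            # No list for this length: fall back to filler of equal length.
            return "x" * len(word)
        return rng.choice(candidates)

    # Only runs of ASCII letters are replaced; digits, spaces and
    # punctuation stay where they are so the layout is unchanged.
    return re.sub(r"[A-Za-z]+", pick, text)
```

Because both 'read' and 'said' are drawn from the same four-letter list, knowing the list tells you nothing about which word was originally there.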
10-16-2015, 01:13 AM | #11 |
Grand Sorcerer
Posts: 12,119
Karma: 73448614
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
From looking at the sample, I don't think the scrambling is based on the same letters being transposed; rather it looks like a random character is generated, with the only proviso being that words stay the same length.
|
10-16-2015, 04:51 AM | #12 |
Guru
Posts: 647
Karma: 4566069
Join Date: Jan 2010
Location: Sweden
Device: Kobo Forma
|
|
10-16-2015, 08:25 AM | #13 | ||
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Quote:
Quote:
At the moment punctuation, spaces and digits 0-9 are left unscrambled. In addition, every alpha-character retains its upper/lower case, so that paragraphs and word-wrapping look "realistic" ... but possibly Klingon-esque. |
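For anyone wondering what that rule looks like in code, here is a minimal sketch based only on the description above (random replacement letter, same case, everything else untouched). It is not the actual utility's code. The function name `scramble_text` is made up, and mapping accented letters onto plain ASCII letters is my own simplifying assumption.

```python
import random
import string


def scramble_text(text, rng=None):
    """Replace each letter with a random letter of the same case.

    Punctuation, whitespace and the digits 0-9 pass through unchanged,
    so word lengths, paragraphs and word-wrapping stay realistic.
    """
    rng = rng or random.Random()
    out = []
    for ch in text:
        if not ch.isalpha():
            out.append(ch)  # punctuation, spaces, digits
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        else:
            # Assumption: any other letter becomes a lower-case ASCII letter.
            out.append(rng.choice(string.ascii_lowercase))
    return "".join(out)


if __name__ == "__main__":
    print(scramble_text("Chapter 1: It was a dark and stormy night."))
```

Since each letter is drawn independently, the same source letter gets different replacements in different places, so there is no substitution table to reverse.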
||
10-16-2015, 08:38 AM | #14 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Technically, I'm pretty sure that this would still be copyright infringement. You'd be creating what copyright law calls a "derived work" - i.e. you're creating a new work from the copyrighted original - and you require the permission of the copyright holder to do that. The fact that it's not readable doesn't change that.
|
10-16-2015, 08:47 AM | #15 | |
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Quote:
IIRC Sigil used to make some fairly significant structural auto-changes during opening. Is this still the case? IMO this is not necessarily helpful when the aim is troubleshooting. Does anyone else have an opinion? |
|