MobileRead Forums - View Single Post - Scrambling copyright ebooks to help troubleshoot problems ???

jackie_w · 10-19-2015, 01:47 PM

Quote:

Originally Posted by HarryT

I was under the impression that the goal was to produce a book that's structurally similar to the original, but from which the original couldn't be identified.

Quote:

Originally Posted by eschwartz

I thought the goal was to produce a book that's structurally similar to the original, but doesn't run afoul of copyright law when uploaded to MobileRead for debugging purposes.

There's a fairly straightforward decision to be made, I think. Troubleshooting an ebook problem will almost certainly require the book to be exploded into its constituent parts for further investigation.

Which of the following are true from MR's POV?

There must be no visible sign of Author, Title, ISBN (and maybe other metadata yet to be identified):

1. when reading the book on a reading device or with a more generic book viewer, e.g. calibre Viewer.
2. when the troubleshooter views the exploded ebook contents, using their troubleshooting tool of choice.
Is Publisher's name also a 'must-not-see' item?

Quote:

The moderating team has discussed the question you raised, and we're happy to give you the go-ahead for this suggested work, provided that the following conditions are met:

- All text, table of contents, headings, textual metadata, etc, must be replaced with scrambled text.
- All images must be replaced with scrambled images or removed.

Provided that your suggested plug-in produces output which satisfies these conditions, there would be no problem in uploading such output to MobileRead's forums.

Based on HarryT's post #63 (part re-quoted above) I'm pretty sure item 1 is true.

What about item 2? This is a more challenging project (not quite ready to give up yet, though).

I've seen many books where the ISBN is hard-coded into the filename of nearly every constituent file in the book (HTML, CSS, NCX, images), including the OPF file itself.
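As a rough illustration of spotting such files, here is a minimal Python sketch that flags names containing an ISBN-13-like token. The pattern, the helper name `names_with_isbn` and the sample file names are my own assumptions for illustration. None of it is calibre code, and it ignores ISBN-10s entirely.

```python
import re

# Assumption: an "ISBN-like" token is 978/979 followed by 10 more digits,
# with optional hyphens between digits. ISBN-10 is not handled.
ISBN_RE = re.compile(r"97[89](?:-?\d){10}")

def names_with_isbn(names):
    """Return the file names that contain an ISBN-13-like token."""
    return [n for n in names if ISBN_RE.search(n)]

# Hypothetical contents of an exploded ebook.
sample = [
    "OEBPS/9781234567897_ch01.html",
    "OEBPS/styles.css",
    "OEBPS/images/978-1-234-56789-7_cover.jpg",
    "OEBPS/content.opf",
]
flagged = names_with_isbn(sample)
```

In a real book the list of names could come from `zipfile.ZipFile(path).namelist()`. Any long digit run that happens to start with 978 or 979 would also be flagged, which is probably acceptable for this purpose.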

I think the calibre utilities should be able to handle the issues detailed below (need to make sure, though).

If the HTML filenames contain the ISBN then any internal link anchor reference will also contain the ISBN e.g.
Code:
```
<a href="name_with_ISBN.html" id="id1">
```
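To show the kind of bookkeeping involved, here is a minimal sketch (plain Python, not the calibre utilities) that renames files to neutral names and rewrites matching `href`/`src` values. The names `build_rename_map` and `rewrite_links` are hypothetical. It assumes links are written exactly like the file names, with no URL-encoding or relative-path tricks, which real books won't always honour.

```python
import posixpath
import re

def build_rename_map(names):
    """Map each file name to a neutral one, keeping folder and extension."""
    mapping = {}
    for i, name in enumerate(sorted(names), start=1):
        folder, base = posixpath.split(name)
        ext = posixpath.splitext(base)[1]
        mapping[name] = posixpath.join(folder, f"file{i:04d}{ext}")
    return mapping

# Matches href="..." or src="...", splitting off an optional #fragment.
LINK_RE = re.compile(r'(\b(?:href|src)=")([^"#]*)(#[^"]*)?(")')

def rewrite_links(html, mapping):
    """Replace link targets found in mapping; leave everything else alone."""
    def repl(m):
        target = m.group(2)
        new_target = mapping.get(target, target)
        return m.group(1) + new_target + (m.group(3) or "") + m.group(4)
    return LINK_RE.sub(repl, html)

# Hypothetical file names and markup.
mapping = build_rename_map(["9781234567897_ch01.html", "9781234567897_ch02.html"])
html_in = '<a href="9781234567897_ch02.html#id1">next</a>'
html_out = rewrite_links(html_in, mapping)
```

The same mapping would also have to be applied to the OPF manifest and the NCX, otherwise the book breaks in a brand new way.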
If the image filenames contain the ISBN then the HTML image alt attributes may also contain the ISBN, because some books derive the alt from the filename (who knows why?) e.g.
Code:
```
<img alt="name_with_ISBN" src="name_with_ISBN.jpg">
```
There's also the HTML title attribute to consider for possible revealing info, e.g.
Code:
```
<h2 title="BookTitle AuthorName ISBN, Chap 1">Chapter 1</h2>
```
Comments in the CSS and/or HTML files may also contain revealing info.
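A minimal regex-based sketch of scrubbing these three things (alt attributes, title attributes, comments) might look like the following. The function names are hypothetical, it only handles double-quoted attributes, and a proper parser would be safer than regexes on messy markup.

```python
import re

HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
CSS_COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)
# Assumption: attribute values are double-quoted.
ALT_TITLE_RE = re.compile(r'\b(alt|title)="[^"]*"')

def scrub_html(text):
    """Drop HTML comments and blank out alt/title attribute values."""
    text = HTML_COMMENT_RE.sub("", text)
    return ALT_TITLE_RE.sub(lambda m: f'{m.group(1)}=""', text)

def scrub_css(text):
    """Drop CSS comments."""
    return CSS_COMMENT_RE.sub("", text)

# Hypothetical inputs modelled on the snippets above.
html_out = scrub_html(
    '<h2 title="BookTitle AuthorName ISBN, Chap 1">Chapter 1</h2><!-- ISBN here -->'
)
css_out = scrub_css("/* BookTitle stylesheet */p { margin: 0 }")
```

Blanking the values rather than deleting the attributes keeps the markup structurally the same, which matters if the attribute itself is part of the problem being investigated.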

Does there come a point where the troubleshooter doesn't have enough of the original to work with to solve the problem? Hard to say until you try it.

From my own experience, many of the 'this book doesn't work right on my Kobo/Sony' problems I've been involved with have come down to idiotic HTML/CSS decisions by the ebook creator. Once the book's in front of me the cause of the problem becomes clear quite quickly. Text content is usually irrelevant to the process. Others may have different experiences.