Old 09-27-2012, 05:29 PM   #23
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 74,028
Karma: 315160596
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Oasis
Quote:
Originally Posted by rkomar View Post
I have no idea what watermarking means in the context of an epub file. I assume it wouldn't produce a faint image over the text on every page, would it? If it just occurs in the cover image, or is just a comment somewhere in the HTML, it would be trivial to remove. So, what is epub watermarking?
It's a way of encoding identifying information into the ePub. While there may be some text in the frontmatter indicating that the ePub is watermarked, that's the tip of the iceberg.

There are an almost infinite number of ways to watermark a file.

Yes, HTML comments, but that's the elementary level. Similarly, metadata entries in the OPF.

How about file names? Entity IDs in the OPF file? CSS style names? The order of certain tags in the HTML when the order isn't significant? Extra spans in the text. Meta info in JPEGs. Multiple CSS styles that are defined the same (or trivially differently), where the order in which they're used encodes info. Unused id attributes in XHTML tags.
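To make the "order isn't significant" idea concrete, here is a minimal sketch of how a number (say, a customer ID) could be hidden purely in the order of interchangeable items such as equivalent CSS rules. This is only an illustration of the principle, not how any particular vendor does it; the function names encode_in_order and decode_from_order are made up for this example, and it assumes the items are all distinct.

```python
from math import factorial


def encode_in_order(items, number):
    """Return an ordering of `items` that encodes `number`.

    Hypothetical helper: n distinct items have n! possible orderings,
    so any number in range(n!) can be mapped to one of them.
    """
    pool = sorted(items)  # canonical starting order
    if not 0 <= number < factorial(len(pool)):
        raise ValueError("number does not fit in this many items")
    ordered = []
    for remaining in range(len(pool), 0, -1):
        # Pick which of the remaining items comes next.
        index, number = divmod(number, factorial(remaining - 1))
        ordered.append(pool.pop(index))
    return ordered


def decode_from_order(ordered):
    """Recover the number hidden by encode_in_order."""
    pool = sorted(ordered)
    number = 0
    for remaining, item in zip(range(len(pool), 0, -1), ordered):
        index = pool.index(item)
        number += index * factorial(remaining - 1)
        pool.pop(index)
    return number


# Four style names that render identically, so their order is free to use.
styles = [".calibre1", ".calibre2", ".calibre3", ".calibre4"]
marked = encode_in_order(styles, 17)
print(marked, decode_from_order(marked))
```

Four items only give 24 orderings, but twenty interchangeable items give more than 2 * 10^18, which is plenty for a purchase ID, and nothing about the rendered book changes.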

I think you now get the idea. The only way to strip out watermarking would be to rewrite the entire ePub: changing all names of styles and entities, changing the order of anything where the order is irrelevant, manually checking for CSS styles that should be merged, and removing empty or unused tags and ids.

And even then there might be something that would get missed.
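As a rough sketch of just one of those steps, the following lists id attributes that nothing ever links to. It assumes you have already unzipped the ePub and read the XHTML files into a dict of file name to text; unreferenced_ids is a made-up name, and the regular expressions are a shortcut that only handles double-quoted attributes, not a real XML parse. An unreferenced id is not necessarily a watermark, and this does nothing about any of the other channels listed above.

```python
import re

# id="..." attributes, but not data-id="..." or similar (assumption: double quotes).
ID_DEF = re.compile(r'(?<![\w:-])id\s*=\s*"([^"]+)"')
# Anything that looks like a #fragment reference in an href or elsewhere.
FRAGMENT_REF = re.compile(r'#([^\s"\'<>)]+)')


def unreferenced_ids(documents):
    """Map each file name to the ids it defines that no document references.

    `documents` is a dict of file name -> text content (hypothetical input
    shape; reading the files out of the ePub is left to the caller).
    """
    referenced = set()
    for text in documents.values():
        referenced.update(FRAGMENT_REF.findall(text))

    result = {}
    for name, text in documents.items():
        unused = sorted(set(ID_DEF.findall(text)) - referenced)
        if unused:
            result[name] = unused
    return result


docs = {
    "ch1.xhtml": '<p id="p1">Hi <a href="ch2.xhtml#note1">n</a></p>'
                 '<span id="x9f3"></span>',
    "ch2.xhtml": '<p id="note1">Note</p>',
}
print(unreferenced_ids(docs))
```

Here both p1 and x9f3 are reported for ch1.xhtml, while note1 is left alone because ch1.xhtml links to it. You would still have to decide by hand which of the reported ids are safe to drop.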

Good watermarking is very thorough. And completely pointless.