Quote:
Originally Posted by rkomar
I have no idea what watermarking means in the context of an epub file. I assume it wouldn't produce a faint image over the text on every page, would it? If it just occurs in the cover image, or is just a comment somewhere in the HTML, it would be trivial to remove. So, what is epub watermarking?
It's a way of encoding identifying information into the ePub. While there may be some text in the frontmatter indicating that the ePub is watermarked, that's the tip of the iceberg.
There are an almost infinite number of ways to watermark a file.
Yes, HTML comments, but that's the elementary level. Similarly, metadata entries in the OPF.
How about file names? Entity IDs in the OPF file? CSS style names? The order of certain tags in the HTML when the order isn't significant? Extra spans in the text. Meta info in JPEGs. Multiple CSS styles that are defined the same (or trivially different), where the order in which they're used encodes info. Unused id attributes in XHTML tags.
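To make the "identical CSS styles" trick concrete, here is a rough Python sketch. It is only an illustration, not how any particular retailer does it: the class names `body` and `text`, the 4-bit order number, and all function names are made up for the example. The stylesheet is assumed to define both classes identically, so the reader sees no difference, but the choice of class on each paragraph carries one bit.

```python
import re

# Two class names assumed to be styled identically in the CSS (hypothetical).
CLASSES = ("body", "text")


def int_to_bits(n, width):
    """Return n as a list of bits, most significant first."""
    return [(n >> i) & 1 for i in reversed(range(width))]


def bits_to_int(bits):
    """Inverse of int_to_bits."""
    n = 0
    for b in bits:
        n = n * 2 + b
    return n


def embed(paragraphs, bits):
    """Tag the first len(bits) paragraphs with the class chosen by each bit.

    Any remaining paragraphs simply get CLASSES[0].
    """
    out = []
    for i, para in enumerate(paragraphs):
        bit = bits[i] if i < len(bits) else 0
        out.append(f'<p class="{CLASSES[bit]}">{para}</p>')
    return out


def extract(html_lines, width):
    """Read the first `width` bits back from the class names."""
    bits = []
    for line in html_lines:
        m = re.match(r'<p class="([^"]+)">', line)
        if m and m.group(1) in CLASSES:
            bits.append(CLASSES.index(m.group(1)))
    return bits[:width]


order_id = 0b1011  # made-up purchase identifier
marked = embed(["One.", "Two.", "Three.", "Four.", "Five."],
               int_to_bits(order_id, 4))
recovered = bits_to_int(extract(marked, 4))
```

Renaming the classes doesn't help, since the pattern of which paragraphs share a class survives. You would have to notice the two styles are duplicates and merge them.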
I think you now get the idea. The only way to strip out watermarking would be to rewrite the entire ePub: changing all names of styles and entities, changing the order of anything where the order is irrelevant, manually checking for CSS styles that should be merged, and removing empty or unused tags and ids.
And even then there might be something that would get missed.
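For a sense of scale, here is a simplified Python sketch of just one of those clean-up steps, dropping id attributes that nothing links to. The function name and the dict-of-files input are assumptions made for the example. It only looks at `href="...#fragment"` links, so a real pass would also have to check the NCX, the OPF, and CSS `#id` selectors before deleting anything.

```python
import re

# An id attribute preceded by whitespace (so data-id="..." is left alone).
ID_ATTR = re.compile(r'\s+id="([^"]*)"')
# The fragment part of any href, e.g. href="ch1.xhtml#note3" -> "note3".
HREF_FRAGMENT = re.compile(r'href="[^"#]*#([^"]+)"')


def strip_unused_ids(files):
    """Remove id attributes that no href in any file points at.

    `files` maps a file name to its XHTML text; a new dict is returned.
    """
    referenced = set()
    for text in files.values():
        referenced.update(HREF_FRAGMENT.findall(text))

    def drop_if_unused(match):
        # Keep the attribute only when some link targets this id.
        return match.group(0) if match.group(1) in referenced else ""

    return {name: ID_ATTR.sub(drop_if_unused, text)
            for name, text in files.items()}


book = {
    "ch1.xhtml": '<p id="x9f3">Hello</p><p id="note1">World</p>',
    "toc.xhtml": '<a href="ch1.xhtml#note1">Note</a>',
}
cleaned = strip_unused_ids(book)
```

That handles one hiding place out of the whole list, and it says nothing about ids that are linked to but whose names themselves carry the code.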
Good watermarking is very thorough. And completely pointless.