From the FanFicFare plugin thread:
Quote:
Originally Posted by ownedbycats
Somebody made a tool for "anti-AI scraping" (or scrapping, lol) that messes up the HTML on stories.
I put the demo into fanficfare and it came back pretty much gibberish at the end.
Don't know if anything can be done, but thought you should be aware in case this becomes more widespread and you start getting "why is fff adding gibberish?"
Quote:
Originally Posted by JimmXinu
Setting use_workskin:true fixes it for me. Or rather, it applies the same 'de-obfuscation' CSS that is applied in the browser.
Considering that hiding additional text in HTML for purposes of shaping search results has been a thing for decades, I honestly don't expect this tool to help all that much with its stated purpose of AI scrape poisoning. I expect serious scrape tools will also apply CSS.
In case this tool somehow becomes more widespread (honestly, I'm against it, as it makes stories inaccessible to people who use screen readers), I'm wondering if there's a way to take this a step further and remove anything that's hidden by the CSS. File size bloat is my main concern, but I'm also not sure that works using it will display properly on my Kobo.
I'm providing my FFF-downloaded copy of the demo work.
WO3 Styling Debug - tbvns.epub
Code:
<p>Now<span class="c-ffffff fs-5">q</span>it's<span class="c-ffffff fs-5">q</span>time<span class="c-ffffff fs-5">q</span>to<span class="c-ffffff fs-5">q</span>write,<span class="c-ffffff fs-5">q</span>this<span class="c-ffffff fs-5">q</span>is<span class="c-ffffff fs-5">q</span>just<span class="c-ffffff fs-5">q</span>basic<span class="c-ffffff fs-5">q</span>text<span class="c-ffffff fs-5">q</span>because<span class="c-ffffff fs-5">q</span>I<span class="c-ffffff fs-5">q</span>need<span class="c-ffffff fs-5">q</span>some<span class="c-ffffff fs-5">q</span>to<span class="c-ffffff fs-5">q</span>test<span class="c-ffffff fs-5">q</span>lots<span class="c-ffffff fs-5">q</span>of<span class="c-ffffff fs-5">q</span>internal<span class="c-ffffff fs-5">q</span>system.</p>
<p class="c-ffffff fs-1">mental buys ebooks ii moore immediately una terror painful double beastality roy seed authority inf executive bay homes rep create</p>
<p class="c-000000 fs-0">uv apps mb beach salmon photography cams rather attached trial dragon sperm ski greece abraham scratch hung claimed cs thrown hc seeing small ak consortium mountain substantially england supply cherry</p>
<p class="c-ffffff fs-1">sucks realistic sector consider definitely lack units lead engage cancer inform duncan specific tradition extends vocals interactive opponent mighty cardiac willing macintosh camcorders highlights served enters chancellor boards everybody</p>
<p>I<span class="c-ffffff fs-5">q</span>know<span class="c-ffffff fs-5">q</span>that<span class="c-ffffff fs-5">q</span>it<span class="c-ffffff fs-5">q</span>probably<span class="c-ffffff fs-5">q</span>won't<span class="c-ffffff fs-5">q</span>do<span class="c-ffffff fs-5">q</span>anything<span class="c-ffffff fs-5">q</span>for<span class="c-ffffff fs-5">q</span>you,<span class="c-ffffff fs-5">q</span>but<span class="c-ffffff fs-5">q</span>I<span class="c-ffffff fs-5">q</span>swear<span class="c-ffffff fs-5">q</span>it<span class="c-ffffff fs-5">q</span>will<span class="c-ffffff fs-5">q</span>be<span class="c-ffffff fs-5">q</span>bad<span class="c-ffffff fs-5">q</span>for<span class="c-ffffff fs-5">q</span>AI<span class="c-ffffff fs-5">q</span>100%.</p>
<p class="c-ffffff fs-1">names marijuana conditioning hazard card helicopter microwave marco hart filtering dates seo ac trusts prospects floating motion tribune blink probe isaac members token oasis</p>
<p class="c-000000 fs-0">rfc proved parliament wired tommy nickname hockey lunch churches hostels analyzed subtle assets photos mating gotten geology rights concerning tradition utility aqua ls velvet vista casey brutal</p>
<p class="c-ffffff fs-1">reporters while calculation bukkake literacy familiar years please names population clothing yemen techrepublic individually wav scottish fe refuse favourite plugin with elsewhere disposition removal antique valves electro</p>
<p>I<span class="c-ffffff fs-5">q</span>want<span class="c-ffffff fs-5">q</span>to<span class="c-ffffff fs-5">q</span>make<span class="c-ffffff fs-5">q</span>things<span class="c-ffffff fs-5">q</span>so<span class="c-ffffff fs-5">q</span>bad<span class="c-ffffff fs-5">q</span>for<span class="c-ffffff fs-5">q</span>them<span class="c-ffffff fs-5">q</span>that<span class="c-ffffff fs-5">q</span>AO3<span class="c-ffffff fs-5">q</span>be<span class="c-ffffff fs-5">q</span>actually<span class="c-ffffff fs-5">q</span>unscrapable.</p>
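To make the "remove anything that's hidden" idea concrete, here is a rough Python sketch that filters markup like the sample above. It is not something FanFicFare does, and it rests on guesses: I'm assuming the work skin hides anything with the class c-ffffff or fs-0 (I haven't checked the actual CSS), and that the hidden "q" spans stand in for spaces between words. The names HIDDEN_CLASSES, HiddenTextStripper and strip_hidden are made up for this example.

```python
from html.parser import HTMLParser

# ASSUMPTION: these class names come from the demo work; which ones the
# work skin really hides is a guess and should be checked against its CSS.
HIDDEN_CLASSES = {"c-ffffff", "fs-0"}

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextStripper(HTMLParser):
    """Copies markup through unchanged, dropping elements assumed hidden."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a hidden element

    def _is_hidden(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return bool(HIDDEN_CLASSES.intersection(classes))

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self._is_hidden(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
            if tag == "span":
                # ASSUMPTION: hidden inline spans replace a space.
                self.out.append(" ")
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.skip_depth and not self._is_hidden(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_hidden(html):
    """Return html with the assumed-hidden elements removed."""
    parser = HiddenTextStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = (
    '<p>Now<span class="c-ffffff fs-5">q</span>it\'s'
    '<span class="c-ffffff fs-5">q</span>time</p>\n'
    '<p class="c-ffffff fs-1">mental buys ebooks</p>\n'
    '<p>I<span class="c-ffffff fs-5">q</span>know</p>'
)
print(strip_hidden(sample))
```

To use something like this on a real download, you'd have to run it over each XHTML file inside the EPUB and repack it, which this sketch doesn't attempt. It also only looks at class names, so it would miss text hidden through inline styles or other selectors.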