MobileRead Forums > E-Book Software > Calibre > Editor

Old Yesterday, 02:07 PM   #1
ownedbycats
Custom User Title
Posts: 11,082
Karma: 76037135
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Removing text based on CSS

From a thread in the FanFicFare plugin:

Quote:
Originally Posted by ownedbycats
Somebody made a tool for "anti-AI scraping" (or scrapping, lol) which messes up the HTML on stories.

I put the demo into FanFicFare and it came back pretty much gibberish at the end.

Don't know if anything can be done, but thought you should be aware just in case this becomes more widespread and you start getting "why is fff adding gibberish?"

Quote:
Originally Posted by JimmXinu
Setting use_workskin:true fixes it for me. Or rather, it applies the same 'de-obfuscation' CSS that is applied in the browser.

Considering that hiding additional text in HTML for purposes of shaping search results has been a thing for decades, I honestly don't expect this tool to help all that much with its stated purpose of AI scrape poisoning. I expect serious scrape tools will also apply CSS.

In case this tool somehow becomes more widespread (honestly I'm against it, as it renders stories inaccessible to people who use screenreaders), I'm wondering if there's a way to take this a step further and remove anything that's hidden by the CSS. File size bloat is my main concern, but I'm also not sure that works using it will display properly on my Kobo.


I'm providing my FFF-downloaded copy of the demo work.

WO3 Styling Debug - tbvns.epub

[Attached screenshots (2): WO3 Styling Debug - tbvns.epub open in the calibre Edit book window]

Code:
<p>Now<span class="c-ffffff fs-5">q</span>it's<span class="c-ffffff fs-5">q</span>time<span class="c-ffffff fs-5">q</span>to<span class="c-ffffff fs-5">q</span>write,<span class="c-ffffff fs-5">q</span>this<span class="c-ffffff fs-5">q</span>is<span class="c-ffffff fs-5">q</span>just<span class="c-ffffff fs-5">q</span>basic<span class="c-ffffff fs-5">q</span>text<span class="c-ffffff fs-5">q</span>because<span class="c-ffffff fs-5">q</span>I<span class="c-ffffff fs-5">q</span>need<span class="c-ffffff fs-5">q</span>some<span class="c-ffffff fs-5">q</span>to<span class="c-ffffff fs-5">q</span>test<span class="c-ffffff fs-5">q</span>lots<span class="c-ffffff fs-5">q</span>of<span class="c-ffffff fs-5">q</span>internal<span class="c-ffffff fs-5">q</span>system.</p>
<p class="c-ffffff fs-1">mental buys ebooks ii moore immediately una terror painful double beastality roy seed authority inf executive bay homes rep create</p>
<p class="c-000000 fs-0">uv apps mb beach salmon photography cams rather attached trial dragon sperm ski greece abraham scratch hung claimed cs thrown hc seeing small ak consortium mountain substantially england supply cherry</p>
<p class="c-ffffff fs-1">sucks realistic sector consider definitely lack units lead engage cancer inform duncan specific tradition extends vocals interactive opponent mighty cardiac willing macintosh camcorders highlights served enters chancellor boards everybody</p>
<p>I<span class="c-ffffff fs-5">q</span>know<span class="c-ffffff fs-5">q</span>that<span class="c-ffffff fs-5">q</span>it<span class="c-ffffff fs-5">q</span>probably<span class="c-ffffff fs-5">q</span>won't<span class="c-ffffff fs-5">q</span>do<span class="c-ffffff fs-5">q</span>anything<span class="c-ffffff fs-5">q</span>for<span class="c-ffffff fs-5">q</span>you,<span class="c-ffffff fs-5">q</span>but<span class="c-ffffff fs-5">q</span>I<span class="c-ffffff fs-5">q</span>swear<span class="c-ffffff fs-5">q</span>it<span class="c-ffffff fs-5">q</span>will<span class="c-ffffff fs-5">q</span>be<span class="c-ffffff fs-5">q</span>bad<span class="c-ffffff fs-5">q</span>for<span class="c-ffffff fs-5">q</span>AI<span class="c-ffffff fs-5">q</span>100%.</p>
<p class="c-ffffff fs-1">names marijuana conditioning hazard card helicopter microwave marco hart filtering dates seo ac trusts prospects floating motion tribune blink probe isaac members token oasis</p>
<p class="c-000000 fs-0">rfc proved parliament wired tommy nickname hockey lunch churches hostels analyzed subtle assets photos mating gotten geology rights concerning tradition utility aqua ls velvet vista casey brutal</p>
<p class="c-ffffff fs-1">reporters while calculation bukkake literacy familiar years please names population clothing yemen techrepublic individually wav scottish fe refuse favourite plugin with elsewhere disposition removal antique valves electro</p>
<p>I<span class="c-ffffff fs-5">q</span>want<span class="c-ffffff fs-5">q</span>to<span class="c-ffffff fs-5">q</span>make<span class="c-ffffff fs-5">q</span>things<span class="c-ffffff fs-5">q</span>so<span class="c-ffffff fs-5">q</span>bad<span class="c-ffffff fs-5">q</span>for<span class="c-ffffff fs-5">q</span>them<span class="c-ffffff fs-5">q</span>that<span class="c-ffffff fs-5">q</span>AO3<span class="c-ffffff fs-5">q</span>be<span class="c-ffffff fs-5">q</span>actually<span class="c-ffffff fs-5">q</span>unscrapable.</p>
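
For anyone wanting to strip this from a downloaded file, here is a rough Python sketch of the idea. It rests on two assumptions that the sample suggests but the thread does not confirm: that elements carrying the class tokens c-ffffff or fs-0 are the hidden ones, and that the hidden inline spans stand in for the spaces between words (the visible words have no spaces of their own). The names HIDDEN_TOKENS, is_hidden, strip_hidden and clean_fragment are made up for this example; it uses only the Python standard library and is not part of calibre or FanFicFare.

```python
import xml.etree.ElementTree as ET

# Assumption: these class tokens mean "white text" and "zero font size".
# The stylesheet is not shown in the thread, so adjust to the real CSS.
HIDDEN_TOKENS = {"c-ffffff", "fs-0"}


def is_hidden(el):
    """True if the element carries any class token assumed to hide it."""
    return bool(HIDDEN_TOKENS & set(el.get("class", "").split()))


def local_name(el):
    """Tag name without an XML namespace prefix such as {http://...}span."""
    return el.tag.rsplit("}", 1)[-1]


def strip_hidden(parent):
    """Remove hidden descendants in place.

    A hidden <span> is replaced by a single space (assumed to be a word
    separator); any other hidden element is dropped with its contents.
    Text that followed the removed element is kept.
    """
    prev = None
    for child in list(parent):
        if is_hidden(child):
            filler = " " if local_name(child) == "span" else ""
            keep = filler + (child.tail or "")
            if prev is None:
                parent.text = (parent.text or "") + keep
            else:
                prev.tail = (prev.tail or "") + keep
            parent.remove(child)
        else:
            strip_hidden(child)
            prev = child


def clean_fragment(xhtml):
    """Clean a well-formed XHTML fragment and return it as a string."""
    root = ET.fromstring(f"<div>{xhtml}</div>")
    strip_hidden(root)
    return "".join(ET.tostring(c, encoding="unicode") for c in root)


if __name__ == "__main__":
    sample = (
        '<p>Now<span class="c-ffffff fs-5">q</span>it\'s'
        '<span class="c-ffffff fs-5">q</span>time</p>'
        '<p class="c-ffffff fs-1">mental buys ebooks</p>'
    )
    print(clean_fragment(sample))
```

Real EPUB files declare the XHTML namespace, which is why the tag name is compared after stripping any namespace prefix. Serializing a whole namespaced file with ElementTree would need extra care, so treat this as a demonstration on fragments only.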
Old Yesterday, 02:14 PM   #2
JSWolf
Resident Curmudgeon
Posts: 79,961
Karma: 147448039
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Sounds awful!
Old Yesterday, 02:16 PM   #3
ownedbycats
Quote:
Originally Posted by JSWolf
Sounds awful!

Yeah, I'm against it. Several people have reported that it makes works unusable for people using screenreaders.
Old Yesterday, 04:20 PM   #4
JSWolf
Quote:
Originally Posted by ownedbycats
Yeah, I'm against it. Several people have reported that it makes works unusable for people using screenreaders.

It also looks like it makes it unsuitable for people reading the text.
Old Yesterday, 05:23 PM   #5
ownedbycats
Quote:
Originally Posted by JSWolf
It also looks like it makes it unsuitable for people reading the text.

It's readable in the Calibre viewer, as long as you're not using dark mode:

[Attached screenshots (2): WO3 Styling Debug in the calibre E-book viewer]
Old Yesterday, 10:00 PM   #6
kovidgoyal
creator of calibre
Posts: 45,434
Karma: 27757438
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You can use Transform HTML in the conversion settings to remove such things. But it won't happen automatically; you have to set up the rules for it.
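
Setting up rules means knowing what to match. A minimal Python sketch for that groundwork, assuming the hidden text can be identified by its class attribute as in the sample in the first post: it tallies each tag and class combination in a chunk of HTML so the candidates for rules are easy to spot. ClassTally and class_usage are hypothetical names, and nothing here uses calibre's own API.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassTally(HTMLParser):
    """Count how often each (tag, class list) pair opens in the HTML."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls:
            # Sort the tokens so "fs-5 c-ffffff" and "c-ffffff fs-5" match.
            key = (tag, " ".join(sorted(cls.split())))
            self.counts[key] += 1


def class_usage(html):
    """Return a Counter keyed by (tag, normalized class string)."""
    parser = ClassTally()
    parser.feed(html)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    sample = (
        '<p>I<span class="c-ffffff fs-5">q</span>know</p>'
        '<p class="c-000000 fs-0">rfc proved parliament</p>'
    )
    for (tag, classes), n in class_usage(sample).most_common():
        print(tag, classes, n)
```

In the sample, the hidden spans sit where the spaces between words belong, so a rule that simply deletes them would run the words together; the decoy paragraphs, by contrast, can be removed whole.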
All times are GMT -4. The time now is 01:06 PM.

