Old 09-03-2019, 12:35 AM   #1
jdanders
Member
jdanders began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
[Conversion Plugin] Language Cleaner (profanity filter)

Lengthy list of regexes to "clean up" language in books.

Github: https://github.com/jdanders/calibre-...nguage-cleaner
Latest release: https://github.com/jdanders/calibre-...eleases/latest

I wrote this plugin because I don't like reading vulgar language, but I do like reading books that happen to contain it. Personally, I find books much more enjoyable after they have been processed with this script. Obviously it is a personal set of filters, but I've done my best to make the changes sound as natural as possible, and after using it for years, I think it's pretty good.

If you'd like to customize it to meet your preferences, you just need to go through the lines of cleaner.py and add or remove filters as needed. You'll probably need a pretty good mastery of regular expressions to write new ones, unless a similar filter already exists that you can tweak. To remove a filter, just delete the lines that you don't want.
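As a rough illustration of what a regex filter of this kind can look like, here is a minimal sketch. It is not copied from cleaner.py: the `FILTERS` table, the `clean_text` function, and the two sample patterns are all hypothetical, and the real file may organize its filters differently.

```python
import re

# Hypothetical filter table: (compiled pattern, replacement) pairs.
# These entries are illustrative only and are not taken from cleaner.py.
FILTERS = [
    (re.compile(r"\bdamn\b", re.IGNORECASE), "darn"),
    (re.compile(r"\bhell of a\b", re.IGNORECASE), "heck of a"),
]


def clean_text(text, filters=FILTERS):
    """Apply each (pattern, replacement) pair in order and return the result."""
    for pattern, replacement in filters:
        text = pattern.sub(replacement, text)
    return text
```

In a layout like this, adding a filter means appending one more pair to the list, and removing one means deleting its line. Note that this simple version does not preserve the capitalization of the matched word.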

LIMITATIONS
I am no expert at calibre, and I could not drum up much help on the support forums, so the integration is pretty weak. It only works on books that are being converted from epub, and only works during the conversion process.

To install:

Create a zip file called Language_Cleaner.zip containing these three files:
  • cleaner.py
  • __init__.py
  • plugin-import-name-language_clean_plugin.txt

This command may help in Linux:

Code:
zip Language_Cleaner cleaner.py __init__.py plugin-import-name-language_clean_plugin.txt
In calibre choose Preferences -> Plugins -> Load plugin from file
Choose the zip you just created, and the plugin should show up under "File type plugins".
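If the zip command is not available (for example on Windows), the same archive can be built with Python's standard zipfile module. This is only a sketch: the function name `build_plugin_zip` is made up for illustration, and it assumes the three files sit together in one folder.

```python
import zipfile
from pathlib import Path

# The three files named in the install steps above.
PLUGIN_FILES = [
    "cleaner.py",
    "__init__.py",
    "plugin-import-name-language_clean_plugin.txt",
]


def build_plugin_zip(source_dir=".", zip_name="Language_Cleaner.zip"):
    """Pack the plugin files at the top level of a zip and return its path."""
    source = Path(source_dir)
    target = source / zip_name
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in PLUGIN_FILES:
            # arcname keeps the files at the root of the archive,
            # with no folder prefix.
            archive.write(source / name, arcname=name)
    return target
```

Calling `build_plugin_zip()` from the folder that holds the files should produce a Language_Cleaner.zip that can be loaded the same way as one made with the zip command.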
To use:
  • Choose the book you'd like and make sure you have an epub format (so convert to epub if you don't already have that format)
  • Now do "Convert book" and choose to convert from Epub to Epub (or whatever destination format you want)
  • Wait for the job to complete; it takes longer than usual because of the very inefficient way this plugin works

Secret debug tip: If there is a "c:/Scratch/calibre" folder on your Windows machine (change logdir in __init__.py if you want), the plugin will write before and after versions of the book as plain text files. Sometimes it writes two copies and only one has useful changes. If you'd like to see how the book was changed, compare the two files. I use WinMerge and that works well.
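For anyone without WinMerge, the before and after text files can also be compared with Python's standard difflib module. The sketch below is an assumption-laden example: the function name `changed_lines` is hypothetical, and the file names the plugin actually writes are not specified here, so pass in whichever two files you find in the log folder.

```python
import difflib
from pathlib import Path


def changed_lines(before_path, after_path):
    """Return only the removed ('-') and added ('+') lines between two text files."""
    before = Path(before_path).read_text(encoding="utf-8", errors="replace").splitlines()
    after = Path(after_path).read_text(encoding="utf-8", errors="replace").splitlines()
    # n=0 drops context lines, so only changed lines and headers remain.
    diff = list(difflib.unified_diff(before, after, lineterm="", n=0))
    # The first two entries are the ---/+++ file headers; hunk headers start with @@.
    return [line for line in diff[2:] if not line.startswith("@@")]
```

Printing the returned list shows each original line prefixed with "-" followed by its cleaned version prefixed with "+".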

By the way, there is a strong layer of irony here -- if vulgar language offends you, you'll probably want to avoid actually looking in the cleaner.py file, as it is chock full of it.

Here's a zip attachment, but it's really better to just go to the github page.
Attached Files
File Type: zip Language_Cleaner.zip (8.2 KB, 2750 views)

Last edited by jdanders; 02-28-2024 at 03:39 PM. Reason: New 2024 release
Old 09-03-2019, 06:53 AM   #2
Bookstooge
Guru
Bookstooge ought to be getting tired of karma fortunes by now.
Posts: 760
Karma: 2090886
Join Date: May 2019
Device: Kindle Oasis 1st Gen
If you are going to create and maintain a plugin, an up-to-date zip file is essential here.

Best of luck.
Old 09-03-2019, 11:32 AM   #3
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
Posts: 29,802
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by jdanders
Here's a zip attachment, but it's really better to just go to the github page.
Calibre scrapes the plugin Index HERE AT MR and serves the plugin from HERE AT MR for automatic Update notifications.

You are not prohibited from hosting elsewhere; just be aware that many folks won't grab remote sources due to security concerns. (If there IS a problem with one posted here at MR, we will hear about it loudly, take it down fast, and post warnings.)
Old 09-03-2019, 11:41 AM   #4
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 73,983
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
And the ZIP file needs to be attached to the first post in this thread, not the second.
Old 09-03-2019, 11:44 AM   #5
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
Posts: 29,802
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by JSWolf
And the ZIP file needs to be attached to the first post in this thread, not the second.
Moderator Notice
Fixed that for you
Old 09-04-2019, 12:16 AM   #6
jdanders
Member
jdanders began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
Thanks for the help!
Old 12-12-2020, 07:10 PM   #7
jdanders
Member
jdanders began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
I updated this a while ago, but it looks like the attachment is still out of date. I don't have enough rights to edit the original post, so I'm adding the latest zip here. Mods, could you fix it again? Maybe one day I'll be able to edit my own thread...

Last edited by BetterRed; 12-12-2020 at 09:59 PM. Reason: remove attachment
Old 12-12-2020, 10:05 PM   #8
BetterRed
null operator (he/him)
BetterRed ought to be getting tired of karma fortunes by now.
 
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by jdanders
I updated this a while ago, but it looks like the attachment is still out of date. I don't have enough rights to edit the original post, so I'm adding the latest zip here. Mods, could you fix it again? Maybe one day I'll be able to edit my own thread...
Fixed.

Post some replies in the Lounge until you get edit permission.

It's an anti-spam measure.

BR
Old 12-18-2021, 10:59 AM   #9
IreneFa
Junior Member
IreneFa began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Dec 2021
Device: Android Tablet
It's a nice plugin, but what's the issue with God being mentioned at all?
Old 12-19-2021, 12:29 AM   #10
jdanders
Member
jdanders began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
I'm not sure what you mean. All automated content editing is difficult to do well, and religious words are the hardest because they have "good" and "bad" meanings.

The filter has two modes for the word "God", and the modes are guessed at by the general vulgarity of the entire book. When a vulgar book is detected, all instances of "God" are replaced assuming no one is going to use the word well. Otherwise a more subtle scheme is used.
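To make the idea of two modes concrete, here is a minimal sketch of how a filter could pick a mode from the overall vulgarity of a book. Everything in it is an assumption for illustration: the word list, the threshold, the replacement word "gosh", and the function names are hypothetical and are not taken from cleaner.py.

```python
import re

# Hypothetical word list and cutoff, chosen only for illustration.
VULGAR_WORDS = re.compile(r"\b(damn|hell|crap)\b", re.IGNORECASE)
VULGAR_RATIO_THRESHOLD = 0.002  # assumed cutoff: 2 hits per 1000 words

ALL_USES = re.compile(r"\bGod\b")
# Subtle mode: only touch common exclamations, leave other uses alone.
EXCLAMATIONS = re.compile(r"\b(oh my|my|oh) God\b", re.IGNORECASE)


def is_vulgar_book(text):
    """Guess the mode from the share of words that match the vulgar list."""
    words = len(text.split())
    if words == 0:
        return False
    hits = len(VULGAR_WORDS.findall(text))
    return hits / words > VULGAR_RATIO_THRESHOLD


def clean_god(text):
    """Replace every use in a vulgar book, otherwise only exclamations."""
    if is_vulgar_book(text):
        return ALL_USES.sub("gosh", text)
    return EXCLAMATIONS.sub(lambda m: m.group(1) + " gosh", text)
```

With a scheme like this, a mostly clean book keeps sentences such as "They prayed to God." untouched, while a book that crosses the threshold has every instance replaced. Tuning the threshold or the exclamation pattern changes where that line falls.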

It's all very easy to change if you don't like the behavior. Instructions are included on the github page.
Old 12-19-2021, 12:33 AM   #11
jdanders
Member
jdanders began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
See github for instructions on changing filtering behavior if you don't like the default. That's the nice thing about open source!
Old 07-08-2023, 08:07 PM   #12
jdanders
Member
jdanders began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
Hi, I still don't have permission to edit my post. Here's an updated version of the plugin with one bug fix. Maybe a moderator will be nice enough to update the first post for me.
Attached Files
File Type: zip Language_Cleaner.zip (8.2 KB, 64 views)
Old 07-08-2023, 09:21 PM   #13
JimmXinu
Plugin Developer
JimmXinu ought to be getting tired of karma fortunes by now.
Posts: 6,318
Karma: 3966249
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
Moderator Notice
Zip updated as requested.