#1 |
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
[Conversion Plugin] Language Cleaner (profanity filter)
Lengthy list of regexes to "clean up" language in books.
Github: https://github.com/jdanders/calibre-...nguage-cleaner
Latest release: https://github.com/jdanders/calibre-...eleases/latest

I wrote this plugin because I don't like reading vulgar language, but I like reading books that have vulgar language in them.

If you'd like to customize it to meet your preferences, go through the lines of cleaner.py and add or remove filters as needed. You'll probably need a good mastery of regular expressions to write new ones, unless a similar one already exists that you can tweak. To remove a filter, just delete the lines that you don't want.

LIMITATIONS
I am no expert at calibre, and I could not drum up much help on the support forums, so the integration is pretty weak. It only works on books that are being converted from epub, and only during the conversion process.

To install:
Create a zip file called Language_Cleaner.zip containing the three files. This command may help in Linux:
Code:
zip Language_Cleaner cleaner.py __init__.py plugin-import-name-language_clean_plugin.txt
Choose the zip you just created, and the plugin should show up under "File type plugins".

To use:
Secret debug tip: If there is a "c:/Scratch/calibre" folder on your Windows machine (change logdir in __init__.py if you want a different location), the plugin will write before and after versions of the book as plain text files. Sometimes it writes two copies and only one has useful changes. To see what was changed, compare the two files. I use WinMerge and that works well.

By the way, there is a strong layer of irony here: if vulgar language offends you, you'll probably want to avoid actually looking in the cleaner.py file, as it is chock full of it.

Here's a zip attachment, but it's really better to just go to the github page.

Last edited by jdanders; 02-28-2024 at 03:39 PM. Reason: New 2024 release
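For anyone without WinMerge, the before/after comparison from the debug tip can be roughly approximated with Python's standard difflib module. This is only a sketch: the two strings below are made-up stand-ins for the contents of the plain text files (the actual file names the plugin writes are not given in this thread), and changed_lines is a hypothetical helper, not part of the plugin.

```python
import difflib

def changed_lines(before, after):
    """Return only the removed ('- ') and added ('+ ') lines between two texts."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits '  ' (unchanged) and '? ' (hint) lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]

# Made-up sample text standing in for the before/after debug files.
before_text = "It was a darn cold night.\nNobody spoke.\n"
after_text = "It was a very cold night.\nNobody spoke.\n"

for line in changed_lines(before_text, after_text):
    print(line)
```

To use it on real output, read the two debug files into strings first and pass them in the same way.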
#2 |
Member Retired
Posts: 805
Karma: 2091358
Join Date: May 2019
Device: Kindle Oasis 1st Gen, PB Era
|
If you are going to create and maintain a plugin, an up-to-date zip file attached here is essential.
Best of luck. |
#3 | |
Well trained by Cats
Posts: 30,761
Karma: 59473090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
You are not prohibited from hosting elsewhere. Just be aware that many folk won't grab remote sources due to security concerns. (If there IS a problem with one posted here at MR, we will hear about it loudly, take it down fast, and post warnings.)
|
#4 |
Resident Curmudgeon
Posts: 78,421
Karma: 142887248
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
And the ZIP file needs to be attached to the first post in this thread, not the second.
|
#5 |
Well trained by Cats
Posts: 30,761
Karma: 59473090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
#6 |
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
Thanks for the help!
|
#7 |
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
I updated this a while ago, but it looks like the attachment is still out of date. I don't have enough rights to edit the original post, so I'm adding the latest zip here. Mods, could you fix it again? Maybe one day I'll be able to edit my own thread...
Last edited by BetterRed; 12-12-2020 at 09:59 PM. Reason: remove attachment |
#8 | |
null operator (he/him)
Posts: 21,491
Karma: 29308976
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Post some replies in the Lounge until you get edit permission. It's an anti-spam measure. BR |
|
#9 |
Junior Member
Posts: 1
Karma: 10
Join Date: Dec 2021
Device: Android Tablet
|
It's a nice plugin, but what's the issue with God being mentioned at all?
|
#10 |
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
I'm not sure what you mean. All automated content editing is difficult to do well, and religious words are the hardest because they have "good" and "bad" meanings.
The filter has two modes for the word "God", and the mode is chosen by guessing at the general vulgarity of the entire book. When a vulgar book is detected, all instances of "God" are replaced, on the assumption that no one in that book is going to use the word well. Otherwise a more subtle scheme is used. It's all very easy to change if you don't like the behavior. Instructions are included on the github page.
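To make the two-mode idea concrete, here is a minimal sketch of how such a switch could work. It is not the code in cleaner.py: the mild word list, the 1% threshold, the "gosh" replacement, and the function names are all assumptions chosen for illustration.

```python
import re

# Hypothetical stand-ins: mild words used in place of the plugin's real list.
VULGAR_WORDS = re.compile(r"\b(darn|heck|crud)\b", re.IGNORECASE)
# Hypothetical threshold: fraction of words that marks a book as "vulgar".
VULGAR_RATIO = 0.01

def is_vulgar_book(text):
    """Guess whether the whole book is vulgar from the share of flagged words."""
    words = re.findall(r"\w+", text)
    if not words:
        return False
    hits = len(VULGAR_WORDS.findall(text))
    return hits / len(words) >= VULGAR_RATIO

def clean_god(text):
    """Pick a replacement mode for "God" based on the whole-book guess."""
    if is_vulgar_book(text):
        # Aggressive mode: replace every instance.
        return re.sub(r"\bGod\b", "Gosh", text)
    # Subtle mode (assumed): only touch exclamations such as "Oh God" / "My God".
    return re.sub(r"\b(oh|my)(,?) God\b", r"\1\2 gosh", text, flags=re.IGNORECASE)

print(clean_god("Oh God, it rained. She thanked God for the harvest."))
print(clean_god("Darn it, heck. God help us."))
```

The point of the sketch is only that the decision is made once per book and then selects which substitution runs; the real heuristics are in cleaner.py.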
#11 |
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
See github for instructions on changing filtering behavior if you don't like the default. That's the nice thing about open source!
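For readers who have not opened cleaner.py, changing the filtering behavior amounts to editing a list of regex substitutions. The sketch below shows the general shape under my own assumptions: FILTERS, apply_filters, and the mild example words are hypothetical and do not mirror the plugin's actual layout.

```python
import re

# Hypothetical filters: (pattern, replacement) pairs applied top to bottom.
# Longer phrases come before single words so they get a chance to match first.
FILTERS = [
    (re.compile(r"\bdarn(ed)?\b", re.IGNORECASE), "dang"),
    (re.compile(r"\bwhat the heck\b", re.IGNORECASE), "what on earth"),
    (re.compile(r"\bheck\b", re.IGNORECASE), "hey"),
]

def apply_filters(text, filters=FILTERS):
    """Run every (pattern, replacement) pair over the text in order."""
    for pattern, replacement in filters:
        text = pattern.sub(replacement, text)
    return text

print(apply_filters("Well, what the heck, that darned cat!"))
# Removing a filter is just leaving its entry out of the list.
print(apply_filters("Well, what the heck, that darned cat!", FILTERS[:1]))
```

Adding a filter means adding one more pair to the list; removing one means deleting its entry, which matches the "just delete the lines" advice in the first post.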
|
#12 |
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
Hi, I still don't have permission to edit my post. Here's an updated version of the plugin with one bug fix. Maybe a moderator will be nice enough to update it for me.
#13 |
Plugin Developer
Posts: 6,743
Karma: 4600429
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
Moderator Notice
Zip updated as requested. |