|
|
#1 |
|
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
[Conversion Plugin] Language Cleaner (profanity filter)
Lengthy list of regexes to "clean up" language in books. Works as a profanity filter for Kindle and other ebooks.
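For anyone who hasn't seen this kind of filter before, here is a minimal Python sketch of a regex replacement list. It is not the plugin's code: the `RULES` table, the `match_case` helper, and the two mild example words are placeholders made up for illustration, and the real list in cleaner.py is much longer.

```python
import re

# Hypothetical rule table: (compiled pattern, replacement).
# These two mild entries are placeholders, not the plugin's actual rules.
RULES = [
    (re.compile(r"\bdamn\b", re.IGNORECASE), "darn"),
    (re.compile(r"\bhell\b", re.IGNORECASE), "heck"),
]


def match_case(original, replacement):
    """Copy simple capitalisation (ALL CAPS or leading capital) from the original word."""
    if original.isupper():
        return replacement.upper()
    if original[:1].isupper():
        return replacement.capitalize()
    return replacement


def clean(text):
    """Apply every rule in order and return the filtered text."""
    for pattern, replacement in RULES:
        text = pattern.sub(
            lambda m, r=replacement: match_case(m.group(0), r), text
        )
    return text


if __name__ == "__main__":
    print(clean("Damn, what the hell was that?"))
    print(clean("Hello there"))        # untouched: \b keeps "hell" from matching inside "Hello"
    print(clean("<i>d</i>amn"))        # untouched: the tag splits the word
```

The `\b` word boundaries keep ordinary words like "Hello" from being touched. The last call shows why markup matters: once a formatting tag sits inside a word, a plain regex pass over the HTML no longer sees the word at all.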
Check out the README on GitHub for full details.
GitHub: https://github.com/jdanders/calibre-...nguage-cleaner
Latest release: https://github.com/jdanders/calibre-...eleases/latest

I wrote this plugin because I don't like reading vulgar language, but I like reading books that happen to have vulgar language in them. Personally, I find books much more enjoyable after being processed with this script. Obviously it is a personal set of filters, but I've done my best to make the changes sound as natural as possible, and after using it for years, I think it's pretty good.

The configuration dialog allows you to toggle specific word families (like the f-bomb or religious exclamations) and add your own custom replacements using a built-in table. There is even a live tester so you can check your rules before saving.

LIMITATIONS
Language filtering is never perfect. Words are used in boundless combinations, and sometimes formatting tags inside words can hide them from the filters. The plugin handles AZW3 and EPUB natively. For old-style MOBI files, you'll still want to convert to a newer format (like AZW3) first.

To install:
To use:
Secret debug tip: In the config dialog, you can enable "Write change logs to directory". This writes before and after versions of the book as plain text files, which is great for auditing exactly what changed using a tool like WinMerge.

By the way, there is a strong layer of irony here -- if vulgar language offends you, you'll probably want to avoid actually looking in the cleaner.py file, as it is chock full of it.

Here's a zip attachment, but it's really better to just go to the github page.

Last edited by jdanders; Today at 01:20 AM. Reason: New 2026 release
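If you don't have WinMerge handy, the before and after text files from the debug tip can also be compared with Python's standard difflib module. This is only a rough sketch: the file names the plugin writes are not given in this post, so `diff_files` takes the two paths as arguments, and both helper names are made up for illustration.

```python
import difflib
from pathlib import Path


def diff_texts(before, after):
    """Return unified-diff lines showing only what changed between two texts."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
            n=0,  # no context lines, just the changes
        )
    )


def diff_files(before_path, after_path, encoding="utf-8"):
    """Read two plain-text files (paths supplied by the caller) and diff them."""
    before = Path(before_path).read_text(encoding=encoding)
    after = Path(after_path).read_text(encoding=encoding)
    return diff_texts(before, after)


if __name__ == "__main__":
    sample_before = "Chapter 1\nDamn it, he said.\nThe end."
    sample_after = "Chapter 1\nDarn it, he said.\nThe end."
    for line in diff_texts(sample_before, sample_after):
        print(line)
```

Lines starting with `-` come from the original book text and lines starting with `+` from the cleaned version, so a quick scan shows every replacement the filter made.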
|
|
|
|
|
#2 |
|
Member Retired
Posts: 805
Karma: 2091358
Join Date: May 2019
Device: Kindle Oasis 1st Gen, PB Era
|
If you are going to create and maintain a plugin, an up-to-date zip file is essential here.
Best of luck.
|
|
|
|
|
|
|
#3 | |
|
Well trained by Cats
Posts: 31,749
Karma: 64144480
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
You are not prohibited from hosting elsewhere; just be aware that many folk won't grab remote sources due to security concerns. (If there IS a problem with one posted here at MR, we will hear about it loudly, take it down fast, and post warnings.)
|
|
|
|
|
|
#4 |
|
Resident Curmudgeon
Posts: 83,277
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
And the ZIP file needs to be attached to the first post in this thread, not the second.
|
|
|
|
|
|
#5 |
|
Well trained by Cats
Posts: 31,749
Karma: 64144480
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
|
|
|
|
|
|
|
#6 |
|
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
Thanks for the help!
|
|
|
|
|
|
#7 |
|
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
I updated this a while ago, but it looks like the attachment is still out of date. I don't have enough rights to edit the original post, so I'm adding the latest zip here. Mods, could you fix it again? Maybe one day I'll be able to edit my own thread...
Last edited by BetterRed; 12-12-2020 at 09:59 PM. Reason: remove attachment
|
|
|
|
|
#8 | |
|
null operator (he/him)
Posts: 22,659
Karma: 33011292
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Post some replies in the Lounge until you get edit permission. It's an anti-spam measure.
BR
|
|
|
|
|
|
#9 |
|
Junior Member
Posts: 1
Karma: 10
Join Date: Dec 2021
Device: Android Tablet
|
It's a nice plugin, but what's the issue with God being mentioned at all?
|
|
|
|
|
|
#10 |
|
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
I'm not sure what you mean. All automated content editing is difficult to do well, and religious words are the hardest because they have "good" and "bad" meanings.
The filter has two modes for the word "God", and it picks one based on the general vulgarity of the entire book. When a vulgar book is detected, all instances of "God" are replaced, on the assumption that no one is going to use the word well. Otherwise a more subtle scheme is used. It's all very easy to change if you don't like the behavior. Instructions are included on the github page.
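To make the two-mode idea concrete, here is a rough Python sketch of how such a switch could work. Everything specific in it is an assumption for illustration: the marker words, the threshold, the "gosh" replacement, and the exclamation pattern standing in for the "more subtle scheme" are not taken from the plugin, which has its own rules in cleaner.py.

```python
import re

# Hypothetical marker words and threshold used to guess whether a book is "vulgar".
VULGAR_MARKERS = re.compile(r"\b(?:damn|hell|crap)\b", re.IGNORECASE)
VULGAR_RATE_THRESHOLD = 0.01  # marker hits per word; an arbitrary illustrative value

# Aggressive mode: every standalone "God".
ALL_GOD = re.compile(r"\bGod\b")
# Subtle mode (assumed): only a few exclamation-style phrases.
EXCLAMATION_GOD = re.compile(r"\b(oh my|oh|my) God\b", re.IGNORECASE)


def is_vulgar(text):
    """Guess overall vulgarity from the rate of marker words in the whole text."""
    words = re.findall(r"\w+", text)
    if not words:
        return False
    hits = len(VULGAR_MARKERS.findall(text))
    return hits / len(words) >= VULGAR_RATE_THRESHOLD


def clean_god(text):
    """Pick a mode for the whole book, then apply the matching replacement."""
    if is_vulgar(text):
        return ALL_GOD.sub("Gosh", text)
    return EXCLAMATION_GOD.sub(lambda m: m.group(1) + " gosh", text)


if __name__ == "__main__":
    print(clean_god("Oh my God, look at that. She prayed to God every night."))
    print(clean_god("Damn it, God, what the hell."))
```

In the first sample no marker words are found, so only the exclamation is changed and "prayed to God" is left alone. The second sample crosses the threshold, so every "God" is replaced.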
|
|
|
|
|
#11 |
|
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
See github for instructions on changing filtering behavior if you don't like the default. That's the nice thing about open source!
|
|
|
|
|
|
#12 |
|
Member
Posts: 11
Karma: 10
Join Date: Sep 2019
Device: none
|
Hi, I still don't have permission to edit my post. Here's an updated version of the plugin with one bug fix. Maybe a moderator will be nice enough to update it for me.
|
|
|
|
|
|
#13 |
|
Plugin Developer
Posts: 7,414
Karma: 5007213
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
|
Moderator Notice
Zip updated as requested. |
|
|
|