12-21-2015, 06:20 PM | #151 | |
Connoisseur
Posts: 57
Karma: 10
Join Date: Dec 2011
Device: Samsung Tablet
|
Quote:
[ ]?‘(?i)(\d\d|ad|at|appen[a-z]*|ard|ave|bout|bye|cause|cept|cos|cuz|couse|ere|eard|em|er|e|ee|ell|f|fraid|fore|id|ighness|im|is|isself|gainst|kay|less|mongst|n|nd|neath|nough|nother|nuff|ood|ome|ow|ope|oney|orse[a-z]*|puter[a-z]*|round|scuse|spect[a-z]*|scaped|sides|tween|specially|t|taint|til|tis|twas|twere|twould|twill|ud|un|urt)([\p{P}|\s])
Been playing a little more. I've added a couple of others and added [a-z]* to some to catch things like ‘orse, ‘orses, ‘orseflesh; ‘appen, ‘appened; ‘57, ‘88.
Last edited by Steadyhands; 12-21-2015 at 07:20 PM. |
|
12-22-2015, 02:59 AM | #152 |
Connoisseur
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
|
Hi,
Can someone help me refine FixP, FixE, FixW, FixO, and FixF for Greek letters? As they are now, they make some unnecessary changes. I want them to search and make changes anywhere in a whole word (start, middle, end); at the moment they sometimes make changes even when the word is in the dictionary, and sometimes in hyphenated words. I noticed the unnecessary changes after I added a (\w*|\s) at the start and end of the pattern. Code:
CorrectText("ώ fixes",r"(\w*|\s)(οί\)|νο'\)|α\)|οδ|οό|ιυ|άί|ο5|ο'\)|ιίι|\(ό|ο\)|ίό|ο>|ο'ι|ιό|οί|ιο|οι|<ο|οϊ)(\w*\s)(?![^<>]*>)(?!.*<body[^>]*>)", IsFixO)
Is there any other way to get CorrectText to search at the start, middle, and end of a word? Thanks
Last edited by gipsy; 12-22-2015 at 03:03 AM. |
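One possible approach to the question above, sketched in plain Python and independent of the plugin's CorrectText helper (whose internals are not shown in this thread): wrap the fragment alternation in \w* on both sides so the match always covers the whole word, wherever the fragment sits. The short FRAGMENTS list holds a few entries taken from the posted pattern, and the function name is hypothetical. The NFC normalisation step is an added assumption so that accented Greek letters are single code points that \w can match.

```python
import re
import unicodedata

# A few entries from the posted FixO pattern; the real list is longer.
FRAGMENTS = ["οί", "ιό", "οι"]


def _nfc(s):
    # Precomposed form, so an accented letter is one character for \w.
    return unicodedata.normalize("NFC", s)


# \w* on each side pulls in the rest of the word, so a fragment is found
# at the start, middle, or end. The lookahead (copied from the post)
# rejects matches that sit inside an HTML tag.
WORD_WITH_FRAGMENT = re.compile(
    r"\w*(?:" + "|".join(re.escape(_nfc(f)) for f in FRAGMENTS) + r")\w*(?![^<>]*>)"
)


def words_containing_fragments(text):
    """Return every whole word (outside tags) that contains a fragment."""
    return [m.group(0) for m in WORD_WITH_FRAGMENT.finditer(_nfc(text))]
```

Each returned word could then be checked against the dictionary before any replacement is made, which is where the unnecessary changes would be filtered out.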
12-22-2015, 09:57 AM | #153 | |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
Plugin updated to version 2.0.0.5
The plugin has been updated in the first post in this thread as follows:
@Steadyhands: Quote:
"We’ll be seeing ‘em, I said to ‘er"
I have replaced ([\p{P}|\s]) in your code with (\W?). |
|
12-23-2015, 05:24 PM | #154 | |
Connoisseur
Posts: 57
Karma: 10
Join Date: Dec 2011
Device: Samsung Tablet
|
I've been working on the word list and the regex associated with it. The range [a-z]* gives too broad a match and will result in too many false positives. I've converted these to bounded repetition matches in greedy mode, i.e. app[yines]{0,5} will match appy and appiness, and e[emr]{0,1} will match e, ee, em, er. Still not 100% perfect, but much better than before. There are still false positives for text like ‘At the sound of’ and ‘Is it real’, where you need lookaround on the surrounding text; that is the next improvement.
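To illustrate the difference between the open-ended and the bounded forms, here is a minimal Python sketch; it is not the plugin's code. The two-entry word list, the \b terminator (standing in for the punctuation-or-space group), and the helper name elided_words are assumptions made for the example.

```python
import re

LEFT_QUOTE = "\u2018"  # the curly quote that stands in for the dropped letter

# Open-ended form: any run of letters may follow the stem.
broad = re.compile(LEFT_QUOTE + r"(app[a-z]*|e[a-z]*)\b", re.IGNORECASE)

# Bounded form from the post: only the listed letters, limited repeats.
bounded = re.compile(
    LEFT_QUOTE + r"(app[yines]{0,5}|e[emr]{0,1})\b", re.IGNORECASE
)


def elided_words(pattern, text):
    """Return the word part of every quote-plus-word match in text."""
    return [m.group(1) for m in pattern.finditer(text)]


sample = "\u2018appy \u2018appiness \u2018er \u2018ee \u2018Everything \u2018apple"
print(elided_words(broad, sample))
print(elided_words(bounded, sample))
```

On this sample the open-ended pattern also picks up ‘Everything and ‘apple, while the bounded one keeps only appy, appiness, er and ee.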
Probably should be stated that this regex works for double curly quote formatted books but not single - and I prefer to step through the file rather than a replace all. Quote:
|
|
01-03-2016, 10:20 AM | #155 | |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
Updates to the plugin include:
The plugin will never be 100% perfect - unfortunately sometimes judgement is needed to correct the final text; the intention of the plugin was to automatically tidy up common errors in ePub files that had been OCR'd to speed up the process of correcting text and then use manual tweaks to correct any remaining errors. Quote:
Some users may wish to copy and paste the code by Steadyhands into the "Find" box in Sigil and do a manual search and replace where the quotes used for speech are single curly quotes. The plugin will do a 'replace all' for ePub books in which quotes used for speech are not single curly quotes. |
|
12-23-2016, 11:37 AM | #156 |
Enthusiast
Posts: 25
Karma: 10
Join Date: Mar 2014
Device: Pocket Book Touch Lux 4
|
Manual and other question
Hi folks
Sorry for the possibly silly questions! Where can I find the manual that the plugin refers to? Is there a way to process all html|xhtml files in an epub at once? Thanks in advance! |
09-01-2017, 02:16 PM | #159 |
Enthusiast
Posts: 25
Karma: 10
Join Date: Mar 2014
Device: Pocket Book Touch Lux 4
|
Cannot import the PIL library
Hi
I had to install a fresh copy of Win 10 after one of my HDs crashed. Since then, whenever I run the tidy tool I get this message:
Status: success
Cannot import PIL library
Python Laucher Version: 20170227
Published version 2.0.0.5A
Installed version 2.0.0.5A
You have the latest version of this plugin
===========================================
Cannot find the dictionary for your language. Dictionaries for English (UK), English (USA), French, German and Spanish are installed with Sigil. If you need a dictionary for a different language please install an appropriate Hunspell dictionary in the folder hunspell_dictionaries in the Sigil program folder.
===========================================
My language is German and the dictionaries are installed... Does anyone have an idea what to do? Many thanks in advance. |
09-01-2017, 02:32 PM | #160 |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
As for the PIL import error: make sure the "Use Bundled Python" checkbox is checked in Preferences->Manage Plugins.
|
09-01-2017, 03:06 PM | #161 |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Apparently the plugin doesn't check the supported spell-check languages. For example, it offers Greek as a language option, even though no Greek dictionary is installed on my machine.
I.e., even though you can select German from the drop-down list, the plugin might not have found the German dictionaries. (It looks for de_DE.aff and de_DE.dic in the hunspell_dictionaries folder, which exists in both the main Sigil program folder and the Sigil preferences folder.)
a) To display the main Sigil folder, right-click the Sigil desktop icon and select Dateipfad öffnen (Open file location).
b) To display the Sigil preferences folder, open Sigil, press F5 and click Ordner Einstellungen öffnen (Open Preferences Location).
Double-check that de_DE.aff and de_DE.dic exist in at least one of the hunspell_dictionaries folders.
Last edited by Doitsu; 09-01-2017 at 03:09 PM. |
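The manual check described above can also be scripted. This is a small sketch, not part of the plugin: the function name is hypothetical, and the folder paths are left for the reader to supply, since they differ per installation.

```python
from pathlib import Path


def find_hunspell_dictionary(lang_code, folders):
    """Return the first folder that holds both <lang_code>.aff and
    <lang_code>.dic, or None if no folder has the complete pair."""
    for folder in folders:
        folder = Path(folder)
        has_aff = (folder / f"{lang_code}.aff").is_file()
        has_dic = (folder / f"{lang_code}.dic").is_file()
        if has_aff and has_dic:
            return folder
    return None
```

Called with "de_DE" and the two hunspell_dictionaries folders (program folder and preferences folder), a None result would mean that neither folder has the complete pair of files.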
06-05-2018, 12:59 AM | #162 |
Hedge Wizard
Posts: 800
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Problem
Hi
I use ePubtidy a lot, but sometimes I get the problem illustrated by the attached image. If you have time available, could you have a look at it please? |
06-05-2018, 01:27 PM | #163 |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
@Thasaidon
I am not sure what the problem is - please describe the issue. Is it that the names of the included tags go off the scroll box? |
06-05-2018, 08:05 PM | #164 |
Hedge Wizard
Posts: 800
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Sorry. I should have given more of an explanation.
Basically, it is not possible to access all the tags on the right because they go off the screen, and I cannot access the 'OK' button. This would not matter if I could access the "OK" button at the bottom of the screen, as I could then deal with those tags that are visible and click "OK". I could then run the plugin again, and hopefully the number of tags would have been reduced enough for me to access the rest. |
06-06-2018, 11:51 AM | #165 |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
@Thasaidon:
I will try to find some time this coming weekend to look at this issue. |
|