11-08-2018, 02:10 PM | #181 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Thousands of thanks!
Proceeding to download and try--I am grateful for the sheer notion of this plugin. |
11-08-2018, 05:01 PM | #182 | ||||
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
A couple snags
Installed and tested, with a couple of snags:
1. First run produced the following error message:
Code:
Incorrect XHTML: OEBPS/Text/Chap_09.htm Line/Col 17,30 @16:30: Tokenizer error with an unimplemented error message.
Incorrect XHTML: OEBPS/Text/Chap_21.htm Line/Col 19,30 @18:30: Tokenizer error with an unimplemented error message.
Quote:
2. The text contains
Quote:
Quote:
Quote:
Thus, two questions:
1. How do I enter multi-part expressions like s-p-a-t in KeepHyphen.txt? Like this: s-p p-a a-t, or is there some shortcut?
2. What about plurals? To be on the safe side, I entered both cow-puncher and cow-punchers, but I have a hazy notion that this may be redundant, i.e. that a search for the first may sometimes include the second...?
Thanks! |
||||
11-09-2018, 01:38 PM | #183 |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
@carmenchu: Thank you for alerting me to the error in my plugin.
I have made a correction and placed the updated plugin in the first post in this thread. |
11-10-2018, 06:37 AM | #184 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
More about hyphens
Thanks! No more issues with alt=""...
Now, I have another suggestion about hyphens: in some books one finds several instances of words spelled out (maybe for added emphasis) like "r-a-t-s", or stuttered, like "p-p-please". The plugin will check "r-a-t-s"
Code:
HyphenRemoved=m.group(1)+m.group(2)
Now, I have never found a publication with a hyphen directly after or before a single letter, like w-ord or wor-d. Even were it grammatically sound, which I doubt, there seems to be some styling rule against it. Thus, my suggestion: is it possible to check those m.group() values for their number of characters, and keep the hyphen if either is a single character? I don't know Python; is it difficult to do? It makes sense to me and, by the way, it would take care of the "I-I" special case...
Thanks again! |
11-10-2018, 02:08 PM | #185 |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
@carmenchu: Following your suggestion, the plugin will no longer remove the hyphen between two words when either of them is a single character.
The updated plugin is in the first post in this thread. |
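For readers who want to see the single-character rule spelled out, here is a minimal Python sketch. It is not the plugin's actual code: the pattern `HYPHENATED` and both function names are made up for illustration, and the sketch joins every other hyphenated pair unconditionally, whereas the real plugin presumably applies further checks before removing a hyphen.

```python
import re

# Hypothetical pattern: two runs of letters joined by one hyphen.
# The plugin's real regular expression is not shown in this thread.
HYPHENATED = re.compile(r"([A-Za-z]+)-([A-Za-z]+)")


def join_unless_single_letter(m):
    """Return the pair joined, or unchanged if either side is one letter."""
    first, second = m.group(1), m.group(2)
    if len(first) == 1 or len(second) == 1:
        # Spelled-out or stuttered text such as "r-a-t-s" or "I-I":
        # keep the hyphen by returning the matched text as it was.
        return m.group(0)
    return first + second


def remove_hyphens(text):
    """Apply the rule to every hyphenated pair found in text."""
    return HYPHENATED.sub(join_unless_single_letter, text)
```

With this rule, "to-day" becomes "today", while "r-a-t-s", "p-p-please", "w-ord" and "I-I" are left alone.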
11-10-2018, 03:57 PM | #186 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Thanks again!
I only hope I'm not being a bore...
|
11-11-2018, 05:14 AM | #187 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
The attached image is the icon I have made for showing this plugin in Sigil's toolbar.
It's not a work of art, but anything more sophisticated didn't show well in the little button--this at least can be identified at a glance among my other installed plugins. For what it's worth...
Last edited by carmenchu; 11-11-2018 at 05:15 AM. Reason: correct word |
11-11-2018, 05:17 AM | #188 |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
|
11-11-2018, 05:22 AM | #189 |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
|
01-09-2020, 03:34 PM | #190 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jan 2020
Device: samsung s2
|
ePub Tidy Tool v3
Hi, sorry to ask about the ePub Tidy Tool plugin here - the original thread is closed.
Today I used the "chapter titles" button to change the format of the chapter titles, but somehow every occurrence of "chapter" was altered, not only the chapter titles. The book I was working on has quite a few occurrences of "chapter" in its body text, so quite a few paragraphs were changed.
Is there a way to add more restrictions to this function? E.g., change only occurrences of all-caps "CHAPTER", or only those in the first few lines of a file, and not a long paragraph with "chapter" tucked inside.
Thanks, Sean
Last edited by DiapDealer; 01-12-2020 at 06:50 AM. Reason: The correct thread is not "closed" |
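The two restrictions asked for above could look roughly like this in Python. This is only a sketch under assumptions: it is not how ePub Tidy Tool works internally, the names `TITLE` and `find_chapter_titles` are hypothetical, and it assumes the file has already been split into a list of plain-text paragraphs.

```python
import re

# Hypothetical rule: a chapter title is a paragraph whose entire text is
# the all-caps word CHAPTER followed by digits or a Roman numeral.
TITLE = re.compile(r"^\s*CHAPTER\s+(\d+|[IVXLCDM]+)\s*$")


def find_chapter_titles(paragraphs, first_n=5):
    """Return the indexes of title-like paragraphs among the first first_n."""
    return [
        i
        for i, text in enumerate(paragraphs[:first_n])
        if TITLE.match(text)
    ]
```

Because the pattern is anchored at both ends and is case-sensitive, a long paragraph that merely mentions "chapter" is never selected, and anything after the first few paragraphs of the file is ignored.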
01-12-2020, 01:06 PM | #191 | |
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
Quote:
|
|
07-15-2020, 03:55 AM | #192 |
Evangelist
Posts: 425
Karma: 77256
Join Date: Sep 2011
Device: none
|
Hello,
Many thanks for this terrific plugin. Incredibly handy.
Would it be possible to add some options so that anything the plugin does, such as fixing common OCR, PDF export, or HTML errors, can be selectively enabled, giving precise control over every fix? In the case of vector-quality commercial PDF exports, and OCR as well, I prefer to use my own regexes and find errors as they occur. Some may be OCR errors, some may be a common PDF export error, some may be a typo in the original source, etc.; in each case, I'd prefer to find them myself on the off chance the common fix isn't correct. In the meantime, I removed those lines of code for my own use, and I think I got them all, as the log didn't report any changes except the ones I wanted.
I recently found and used this plugin solely for hyphenation. On that note, calibre uses the eBook itself, scanning for words and compiling a dictionary. Would you someday consider such a feature? Many works (academic, scientific, and so forth) may have unique terms, either from the field itself, transliterated from another language, Latin terms, etc., that are not in any dictionary. It would be nice to have. I had first tried calibre but prefer not to convert. In the meantime, I converted the EPUB to text, created a word list, and used that.
Last edited by democrite; 07-15-2020 at 03:59 AM. |
07-15-2020, 06:31 AM | #193 | |||
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
Quote:
Quote:
The code for this plugin has many different search/replace terms for correcting errors. It would require a really large number of checkboxes to implement your suggestion, and then the code would need to examine each checkbox to determine which corrections to apply. Regrettably, this would take me too long to implement and test, so I cannot add this feature to the plugin.
Quote:
What does Calibre do with the dictionary it has compiled? Does this dictionary consist only of hyphenated words? On several occasions I have thought that it would be useful to have a dictionary of hyphenated words that need to be kept in the ePub, as sometimes the hyphen is removed from words where I want to keep it. This is why the plugin gives a list of all the words that have had the hyphen removed - I copy these words to Notepad and then do a search/replace to put the hyphen back into the one or two words where I want to keep it. Fortunately this does not happen for too many words in a given ePub. I did consider producing a dictionary of hyphenated words that were not to be replaced, but then I found that I had an ever-growing list of words to put in the dictionary and decided that it would be too time-consuming to finalise a dictionary for this purpose. |
|||
07-15-2020, 04:42 PM | #194 | |
Evangelist
Posts: 425
Karma: 77256
Join Date: Sep 2011
Device: none
|
Quote:
As I haven't looked at the code, I'm not sure exactly what calibre does. It certainly compiles a word list to fix words that are line-break hyphenated in the PDF, e.g. "read- ing", as your plugin does. That is invaluable for certain works, such as the one I made recently of a scientific work containing countless Latin terms and specialized vocabulary. Perhaps it also fixes hyphenated words such as "yellow- green". I would guess it could be a fair amount of work, but simpler than what you suggest. It might also be useful to keep track of the number of occurrences of each word, so that in the case of a possible source typo the more common spelling is picked. |
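The idea discussed above, using the book's own vocabulary to decide how to repair a line-break hyphen, can be sketched in Python as follows. This is an illustration only: it is neither calibre's implementation nor this plugin's, and the names `WORD`, `BROKEN` and `fix_broken_words` are hypothetical. It assumes the book has already been converted to plain text.

```python
import re
from collections import Counter

# Whole words, including ones that legitimately contain hyphens.
WORD = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")
# A word broken at a line end: hyphen, then whitespace, then the rest.
BROKEN = re.compile(r"([A-Za-z]+)-\s+([A-Za-z]+)")


def fix_broken_words(text):
    """Repair 'read- ing' style breaks using word counts from text itself."""
    # Count every intact word, leaving the broken fragments out of the tally.
    counts = Counter(w.lower() for w in WORD.findall(BROKEN.sub(" ", text)))

    def repair(m):
        joined = m.group(1) + m.group(2)
        hyphenated = m.group(1) + "-" + m.group(2)
        n_joined = counts[joined.lower()]
        n_hyphenated = counts[hyphenated.lower()]
        if n_joined == 0 and n_hyphenated == 0:
            # Neither spelling occurs elsewhere: leave it for manual review.
            return m.group(0)
        # Pick whichever spelling is more common in the book.
        return joined if n_joined >= n_hyphenated else hyphenated

    return BROKEN.sub(repair, text)
```

A break is only repaired when one of the two candidate spellings appears intact elsewhere in the same book, so specialized or Latin terms need no external dictionary, and a fragment such as "pre- and" is left untouched.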
|
07-16-2020, 05:22 AM | #195 | ||
Addict
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
|
Quote:
Quote:
|
||
|