Old 09-23-2015, 01:22 PM   #61
CalibUser
Addict
CalibUser goes to eleven.
 
Posts: 201
Karma: 62362
Join Date: Jul 2015
Device: Sony
@gipsy: I will look at the code needed to cover <span style="font-variant:small-caps;"> when I get time.

I am not sure about the hyphen problem, as I don't understand Greek characters. The code takes each hyphenated word from the ePub, removes the hyphen, and checks whether the word without the hyphen exists in the dictionary. If it does, the method returns the non-hyphenated word. I don't know why this is not working for Greek words. The plugin only reads one dictionary; otherwise I would suspect a conflict between the English and Greek dictionaries.
Old 09-23-2015, 03:26 PM   #62
gipsy
Connoisseur
gipsy began at the beginning.
 
Posts: 81
Karma: 10
Join Date: Nov 2013
Device: Kobo Aura HD
@CalibUser
My aim is to minimize the spelling errors... Forget the hyphens... Let's say that the epub has "acknovvledge" instead of "acknowledge"...

I looked at the code, and you have
Code:
HyphenRemoved=m.group(1)+m.group(2)
If we want to fix acknovvledge automatically... is it possible to have something like...
Code:
FixWord=m.group(1)+'w'+m.group(2)
Can we search for a word of the form (group1)vv(group2) and, if the corrected word is in the dictionary, change it automatically?
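The idea in the question above can be sketched roughly as follows. This is only an illustration, not the plugin's code: `spell()` is a hypothetical stand-in for the plugin's dictionary lookup, and `WORDS`, `fix_vv` and `VV_PATTERN` are made-up names.

```python
import re

# Hypothetical stand-in for the plugin's dictionary lookup (assumption:
# the real plugin exposes something like spell(word) -> bool).
WORDS = {"acknowledge", "savvy"}

def spell(word):
    return word.lower() in WORDS

def fix_vv(m):
    """Replace 'vv' with 'w' only if the result is a dictionary word."""
    candidate = m.group(1) + "w" + m.group(2)
    if spell(candidate):
        return candidate
    return m.group(0)  # not a dictionary word: leave the original text alone

# (\w*?) matches lazily, so the first 'vv' in the word is the one tested.
VV_PATTERN = re.compile(r"\b(\w*?)vv(\w*)\b")

text = "We acknovvledge the savvy reader."
fixed = VV_PATTERN.sub(fix_vv, text)
print(fixed)
```

Because the replacement is only made when the candidate is in the dictionary, a legitimate word such as "savvy" is left as it is.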

Thanks
Old 09-23-2015, 04:04 PM   #63
CalibUser
The style "font-variant:small-caps" is not recognised by all ePub readers. This style produces capitalised text that is slightly smaller than the main text. The plugin has been updated with an option for processing span tags, "Change to UPPER", which changes text that has the style "font-variant:small-caps" to upper case. There is also another option for processing this style, "Change to small UPPER". This is described in more detail in the updated ePub manual for this plugin. The updated plugin and manual are in the first post in this thread.

@gipsy: You suggested correcting errors such as "acknovvledge" to "acknowledge" by splitting the misspelt word into two groups; it is more straightforward to use a regular expression to replace "acknovvledge" with "acknowledge". You could correct this error using this plugin by adding the code:

CorrectText("Changed acknovvledge to acknowledge", "acknovvledge", "acknowledge")
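The body of CorrectText() is not shown in this thread. Judging only from the calls that appear here (a message, a search pattern, and either a replacement string or a function), a helper of that shape could look something like the sketch below. The module-level `html` variable and the counting are assumptions, not the plugin's actual implementation.

```python
import re

html = "<p>We acknovvledge the debt.</p>"  # made-up sample text

def CorrectText(message, pattern, replacement):
    """Apply one substitution to the module-level `html` and report the count.

    `replacement` may be a string or a function taking a match object,
    exactly as re.sub allows.
    """
    global html
    html, count = re.subn(pattern, replacement, html)
    if count:
        print(message + ":", count)
    return count

n = CorrectText("Changed acknovvledge to acknowledge", "acknovvledge", "acknowledge")
```

Passing a function instead of a string is what makes dictionary-checked fixes possible, since the function can decide per match whether to change anything.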
Old 09-23-2015, 04:17 PM   #64
gipsy
Not only with acknovvledge, Calib :P That was an example in English.
In Greek there are a few errors like that, where the word has a "ύ" instead of "έ" or a "ο" instead of "σ". In my last epub I had 60+ errors like that when I spellchecked it.
Is this the code that searches for the hyphens?
Code:
			CorrectText("Hyphens removed",r"(?s)(\w+)[ ]?-[ ]?(\w+)(?![^<>]*>)(?!.*<body[^>]*>)", IsHyphenated)
Where can I find what each item of the r"..." does, so I can test it with the Greek characters?
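One way to see what each item of that pattern does is to rewrite it with Python's re.VERBOSE flag, which allows a comment next to every piece. The sketch below is the same pattern with the (?s) flag passed as re.DOTALL; the sample string and the name `HYPHEN_PATTERN` are made up for illustration.

```python
import re

HYPHEN_PATTERN = re.compile(r"""
    (\w+)               # group 1: the word characters before the hyphen
    [ ]?                # an optional space
    -                   # the hyphen itself
    [ ]?                # an optional space
    (\w+)               # group 2: the word characters after the hyphen
    (?![^<>]*>)         # not followed by '>' before any '<': skips text inside a tag
    (?!.*<body[^>]*>)   # no <body ...> tag later in the file: skips the <head> section
""", re.VERBOSE | re.DOTALL)  # re.DOTALL is the (?s) flag: '.' also matches newlines

sample = ('<html><head><title>e-book</title></head><body>'
          '<p class="pre-set">A hyphen- ated Ελλη-νικά word.</p></body></html>')
matches = [m.group(0) for m in HYPHEN_PATTERN.finditer(sample)]
print(matches)
```

In Python 3, \w matches Greek letters in ordinary string patterns by default. Under Python 2 that needs the re.UNICODE flag, which may be worth checking if Greek words are not being matched.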

Thanks again

EDIT: Another example in Greek...
I had the following regex to fix some errors.
Code:
Find:(ΓΙ|Γΐ|ΙΙ|II|I\ I|I\ Ι\ΓΤ|ΙΊ|Ιί)
Replace:Π
But the correct replacement Greek character could also be "H". If it were possible to use the dictionary in the regex Find/Replace, like... (group1)(ΓΙ|Γΐ|ΙΙ|II|I\ I|I\ Ι\ΓΤ|ΙΊ|Ιί)(group2), it would be incredibly useful.

Last edited by gipsy; 09-23-2015 at 04:23 PM.
Old 09-24-2015, 01:47 AM   #65
Doitsu
Grand Sorcerer
Doitsu ought to be getting tired of karma fortunes by now.
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
@gipsy: Since you rely so much on the PCRE regex flavor used by Sigil, you may want to look into Saved Searches (Tools > Saved Searches), if you can live with the fact that Saved Searches won't give you detailed feedback.

Create a group, e.g. Greek and add all the regexes that you need. Later on simply open the Saved Searches dialog box, select the group heading and click Replace All.

@CalibUser: You may want to look into porting the regular expressions in your plugin to the new Python regex library, which offers some features that even PCRE lacks, among them Levenshtein-distance-based fuzzy matches and case-insensitive Unicode matches. (The Python regex library will be included in the upcoming Sigil 0.8.9 release.)
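For readers unfamiliar with the term, the Levenshtein distance is the number of single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below does not use the regex library; it is a plain-Python illustration of the idea, with a hypothetical word list and made-up function names.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        previous = current
    return previous[-1]

def closest_word(word, dictionary, max_distance=2):
    """Return the nearest dictionary word within max_distance, else None."""
    best = min(dictionary, key=lambda w: levenshtein(word, w))
    return best if levenshtein(word, best) <= max_distance else None

WORDS = ["acknowledge", "knowledge", "acknowledged"]  # hypothetical word list
print(closest_word("acknovvledge", WORDS))
```

Here "acknovvledge" is two edits away from "acknowledge" (one substitution and one deletion), so a tolerance of two edits is enough to find it.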

You may want to also consider loading all regex expressions from a Saved Search group. This'll allow users to easily customize your plugin. (Saved Searches are stored in sigil_searches.ini.)

Last edited by Doitsu; 09-24-2015 at 01:49 AM.
Old 09-24-2015, 01:56 AM   #66
gipsy
@Doitsu I have the saved searches. But you must not replace them all, because you end up with more spelling errors instead of fewer! You must process them one by one.
But if the fixes include the dictionary check, you can replace them all, and it saves you time.
Old 09-24-2015, 04:42 PM   #67
gipsy
I think I figured out how to make the regex work with the WordDictionary...
I tested it on an epub of 4-5 words that I intentionally misspelled, and it worked. But I'm going to test it some more on a full epub first!


@CalibUser is the code correct?

Code:
def FixP(m):
	"""
	This function examines a word to see whether its misread leading Π needs fixing.
	It is called by a regular expression function (re.sub) in FixCommonErrors()
	It returns the original expression if the checked word is not in the dictionary,
	otherwise it returns the word with the Π fixed
	"""
	# Candidate word: Π in place of the misread characters in group 1
	fixed = 'Π' + m.group(2)
	if spell(fixed):
		print("Π fixed in: ", fixed)
		return fixed
	else:
		return m.group(0)



		#Fixes Π in words that are misspelled
		if dictExists == True:
			CorrectText("Π fixes",r"(ΓΙ|Γΐ|ΙΙ|II|I\ I|I\ Ι|ΓΤ|ΙΊ|Ιί)[ ]?(\w+)(?![^<>]*>)(?!.*<body[^>]*>)", FixP)
Thanks
Old 09-25-2015, 11:47 AM   #68
CalibUser
@gipsy: If I have understood your code correctly, the search expression will find a word that is preceded by the character(s) in the search pattern and pass the match to the function FixP(). Let's call the word the test word.
FixP() puts 'Π' in front of the test word to form a new word (call it the check word) and then checks whether or not the check word is in the dictionary.
If the check word is in the dictionary then the function returns the check word; otherwise it returns the original matched text unchanged.

E.g. if the text being examined is IITest, your code will send IITest to FixP(). If ΠTest is in the dictionary then your code will return ΠTest; otherwise it will return IITest.

If this is what you wanted the code to do then it is correct.
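The IITest walk-through can be tried outside the plugin with a small stand-in. In this sketch `spell()` and `DICTIONARY` are hypothetical replacements for the plugin's dictionary check, and the pattern is shortened to two of the misreadings (Latin II and Greek ΓΙ) to keep it readable.

```python
import re

DICTIONARY = {"ΠTest"}  # hypothetical dictionary containing the check word

def spell(word):
    # Stand-in for the plugin's dictionary lookup (assumed to return True/False)
    return word in DICTIONARY

def FixP(m):
    check_word = 'Π' + m.group(2)
    # Return the check word if it is known, otherwise the original match
    return check_word if spell(check_word) else m.group(0)

# Shortened version of the search pattern from the thread
PATTERN = r"(II|ΓΙ)[ ]?(\w+)(?![^<>]*>)(?!.*<body[^>]*>)"

in_dict = re.sub(PATTERN, FixP, "<p>IITest</p>")
not_in_dict = re.sub(PATTERN, FixP, "<p>IIOther</p>")
print(in_dict, not_in_dict)
```

The first call changes IITest because ΠTest is in the stand-in dictionary; the second leaves IIOther alone because ΠOther is not.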

@Doitsu: Thanks for the link relating to the new regex library for Python. I like the idea of importing regex expressions from a saved search group into the plugin; I originally developed this plugin because I had several different groups of saved searches and I wanted to run them all together. I will be looking into all your suggestions, time permitting.
Old 10-01-2015, 03:06 AM   #69
gipsy
I was trying to correct the Π fixes from my last post here, because I noticed that in Greek we have words that are correct both with and without the correction (for example, "ΓΙΟΥ" and "ΠΟΥ" are both correct words in Greek).
The code from my last post changes the word ΓΙΟΥ to ΠΟΥ.

I changed the code to this:
Code:
 
############FIXES Π###########
def FixP(m):
	FixP=m.group(1)+m.group(2)
def FixP2(m):
	FixP2='Π'+m.group(2)
	"""
	This function examines a word to see whether is required to fix the Π character that is misspelled.
	It is called by a regular expression function (re.sub) in FixCommonErrors()
	It returns the original expression if the checked word is not in the dictionary,
	otherwise it returns the word without the Π fixed
	"""

	if spell(FixP):
		return(FixP)
	else:
		print("FixP removed from: ", FixP2)
		return(FixP2)
Code:
		#Fixes Π in words that are misspelled
		if dictExists == True:
			CorrectText("Π fixes",r"(ΓΙ|Γΐ|ΙΙ|II|I\ I|I\ Ι|ΓΤ|ΙΊ|Ιί|ΓΊ)(\w+)[ ]?(\w+)(?![^<>]*>)(?!.*<body[^>]*>)", FixP)
			CorrectText("Π fixes",r"(ΓΙ|Γΐ|ΙΙ|II|I\ I|I\ Ι|ΓΤ|ΙΊ|Ιί|ΓΊ)(\w+)[ ]?(\w+)(?![^<>]*>)(?!.*<body[^>]*>)", FixP2)

		
		if not html == html_orig: bk.writefile(id, html)	#If the text has changed then write the amended text to the book
I have FixP for the word that is spelled correctly without the fix, and FixP2 for the word that needs the fix.
But when I try the plugin with the above code it doesn't return FixP or FixP2; it leaves the text blank (and it says that 3 Π's are corrected).
Can you help me solve it, please? I have about 4-5 fixes like this if I manage to find a solution.

I attach an epub and a WordDictionary with test material :P

Thanks
Attached Files
File Type: zip FixP.zip (1.8 KB, 375 views)
Old 10-01-2015, 01:15 PM   #70
CalibUser
@gipsy: I will try to find some time this weekend to look at your file.
Old 10-01-2015, 01:20 PM   #71
CalibUser
I have made three more updates to the ePub Tidy tool. The latest version will:

1. Allow the user to use a customised list of words that need to be corrected.
2. Allow the user to rename <h...> tags in selected HTML sections and strip out all other tags.
3. Change the HTML codes for the following characters to the single characters themselves: ‘ ’ “ ” —

The update is in the first post in this thread with an updated instruction manual to explain how to use the new version.
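As a rough illustration of the third item, the sketch below replaces a fixed set of named entities with their characters. The post lists the characters but not the exact codes, so the entity names in `ENTITY_TO_CHAR` are an assumption, and `replace_entities` is a made-up name rather than the plugin's function.

```python
# Assumed mapping: the thread lists the characters but not the exact codes,
# so the entity names below are a guess at what the plugin converts.
ENTITY_TO_CHAR = {
    "&lsquo;": "\u2018",  # left single quotation mark
    "&rsquo;": "\u2019",  # right single quotation mark
    "&ldquo;": "\u201c",  # left double quotation mark
    "&rdquo;": "\u201d",  # right double quotation mark
    "&mdash;": "\u2014",  # em dash
}

def replace_entities(html):
    """Replace only the listed entities, leaving &amp;, &lt; etc. untouched."""
    for entity, char in ENTITY_TO_CHAR.items():
        html = html.replace(entity, char)
    return html

result = replace_entities("<p>&ldquo;Tom &amp; Jerry&rdquo; &mdash; a classic</p>")
print(result)
```

An explicit mapping is used here instead of a blanket unescape so that entities such as &amp;amp; and &amp;lt;, which XHTML needs, stay escaped.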
Old 10-01-2015, 02:51 PM   #72
gipsy
@CalibUser
When I try to add it from Manage Plugins in Sigil I get "Error: Plugin not a valid Sigil plugin."

I simply extracted it to the plugin folder to check it :P
Old 10-01-2015, 02:58 PM   #73
CalibUser
@gipsy: Odd - it worked on my Windows 7 PC. I will check it again. I spotted an error in your code:

def FixP(m):
	FixP=m.group(1)+m.group(2)

You cannot use an equals sign to return a value from a function; I think your function will return 'None'. To return the groups in your expression from a function, you need to use:

return( m.group(1)+m.group(2) )
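The difference between assigning and returning can be seen in a small standalone test. The function names and sample text below are made up for illustration. In CPython, re.sub treats a replacement function that returns None as an empty replacement, which would be consistent with the blank result reported earlier in the thread.

```python
import re

def assigns_only(m):
    result = m.group(1) + m.group(2)  # value is computed but never returned

def returns_value(m):
    return m.group(1) + m.group(2)

pattern = r"(\w+)-(\w+)"
blank = re.sub(pattern, assigns_only, "hy-phen")    # function returns None
joined = re.sub(pattern, returns_value, "hy-phen")  # function returns the word
print(repr(blank), repr(joined))
```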
Old 10-01-2015, 03:10 PM   #74
gipsy
I was trying to do something like...

if m.group(1)+m.group(2) is spell
then return m.group(1)+m.group(2)
else
if 'Π'+m.group(2) is spell
then return Π+m.group(2)
else return m.group(0)
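The steps above could be written in Python roughly as follows. This is a sketch under stated assumptions: `spell()` and `WORDS` are hypothetical stand-ins for the plugin's dictionary check, `fix_pi` is a made-up name, and the pattern is shortened to a single misreading.

```python
import re

WORDS = {"ΓΙΟΥ", "ΠΟΥ", "Πέτρα"}  # hypothetical dictionary contents

def spell(word):
    # Stand-in for the plugin's dictionary lookup
    return word in WORDS

def fix_pi(m):
    as_written = m.group(1) + m.group(2)
    if spell(as_written):      # already a real word: keep it
        return as_written
    with_pi = 'Π' + m.group(2)
    if spell(with_pi):         # a real word once Π is restored: fix it
        return with_pi
    return m.group(0)          # unknown either way: leave it unchanged

PATTERN = re.compile(r"(ΓΙ)(\w+)")  # shortened list of misreadings

words_in = ["ΓΙΟΥ", "ΓΙέτρα", "ΓΙξξ"]
results = [PATTERN.sub(fix_pi, w) for w in words_in]
print(results)
```

Checking the word as written first is what keeps ΓΙΟΥ from being turned into ΠΟΥ, even though both are in the dictionary.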

But I can only copy and paste code; I don't have any programming knowledge :P

I will try your suggestion.
Old 10-01-2015, 03:11 PM   #75
CalibUser
The plugin should work now - there was an error in the filename, which did not match the XML file in the plugin.
All times are GMT -4.