Old 07-22-2015, 09:16 AM   #1
CalibUser
Addict
Posts: 203
Karma: 62362
Join Date: Jul 2015
Device: Sony
Spelling dictionary and plugins

I need to access the spelling dictionaries (default and user ones) in a plugin that I intend to produce.

How can I access the words in these dictionaries in a plugin?

Thanks.
Old 07-22-2015, 09:36 AM   #2
KevinH
Sigil Developer
Posts: 9,070
Karma: 6361556
Join Date: Nov 2009
Device: many
Hi,

Is it just the user words and the Hunspell dictionary files (.aff, .dic) that you want?
If so, you could read and parse them with a Python script from the user's Preferences location (or the shared dictionary location on Linux).
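
As a rough illustration of the "read and parse" route, here is a minimal sketch that pulls the stem words out of a Hunspell .dic file. It assumes the usual layout (an entry count on the first line, then one word per line with optional "/FLAGS"), and the function name `read_dic_stems` is made up for this example. It does not expand the affix rules from the .aff file, so inflected forms that exist only through those rules will not be found.

```python
import io


def read_dic_stems(stream):
    """Return the set of stem words listed in a Hunspell .dic stream."""
    stems = set()
    first = True
    for raw in stream:
        line = raw.strip()
        if not line:
            continue
        if first:
            first = False
            if line.isdigit():
                # The first line is normally just the entry count.
                continue
        # Keep only the word: drop tab-separated fields and "/FLAGS".
        word = line.split("\t", 1)[0].split("/", 1)[0].strip()
        if word:
            stems.add(word)
    return stems


# Small in-memory sample standing in for a real .dic file.
sample = io.StringIO("3\nhello\nwork/DGS\nco-operate\n")
print(sorted(read_dic_stems(sample)))
```

For a real file you would pass an open file object instead of the in-memory sample, using the encoding named on the SET line of the matching .aff file.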

Or do you want to do actual spell checking?

If you want to do actual spell-checking, that would be much more difficult.
The plugin interface simply passes information about the currently open book location to a python environment and so is not a bi-directional call interface.

The python environment can manipulate the files and then creates an XML response telling the Sigil C++ environment which files it needs to copy or change.
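
To make that flow concrete, here is a minimal sketch of what an edit plugin's entry point might look like. It assumes the plugin wrapper object (`bk`) offers `text_iter()`, `readfile()` and `writefile()` as described in the plugin framework guide; the soft-hyphen removal is only a placeholder edit chosen for illustration.

```python
def run(bk):
    """Entry point that Sigil calls in plugin.py (sketch for an edit plugin)."""
    for file_id, href in bk.text_iter():
        html = bk.readfile(file_id)
        # Placeholder edit: drop soft hyphens (U+00AD) from the markup.
        updated = html.replace("\u00ad", "")
        if updated != html:
            # Only files written back are reported to Sigil as changed.
            bk.writefile(file_id, updated)
    # Returning 0 signals success to the launcher.
    return 0
```

The plugin never calls into Sigil directly. It only edits its copies of the files, and the launcher reports the changes back once `run` returns.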

Spell checking inside the Sigil app is done via a Hunspell interface. The easiest way for you to use Hunspell inside a Python plugin would be to either find a Hunspell Python interface package (see the Python Package Index) or a pure-Python spell checker, and then get permission to include it in your plugin, OR use Python "ctypes" calls to access the Hunspell dynamic library version that comes with Sigil.

KevinH

Old 07-22-2015, 12:38 PM   #3
CalibUser
Thanks for your response. I want to write a plugin that will remove hyphens from words that should not be hyphenated. To do this, the plugin will scan through the epub, and when it finds a hyphenated word it will check whether the word without the hyphen exists in the dictionary. If it does, the plugin will remove the hyphen from the word in the ePub.
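
The idea described above could be sketched roughly as follows. This is only an outline under some assumptions: `known_words` is a plain Python set of lowercase words that has already been loaded from somewhere, the names `HYPHENATED` and `dehyphenate` are invented for this example, and the function works on plain text rather than on XHTML markup.

```python
import re

# Two runs of ASCII letters joined by a single hyphen.
HYPHENATED = re.compile(r"\b([A-Za-z]+)-([A-Za-z]+)\b")


def dehyphenate(text, known_words):
    """Join hyphenated pairs whose joined form is in known_words."""

    def join_if_known(match):
        joined = match.group(1) + match.group(2)
        if joined.lower() in known_words:
            return joined
        # Unknown when joined, so leave the hyphenated form alone.
        return match.group(0)

    return HYPHENATED.sub(join_if_known, text)


words = {"together", "without"}
print(dehyphenate("They went to-gether with-out a well-known guide.", words))
```

In a real plugin the text would need to be separated from the tags first, so that hyphens inside attribute values or class names are not touched.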

I've decided to use the python "ctypes" calls to do this job and I found a helpful site at http://thispageintentionally.blogspo...-hunspell.html.

This site uses the following code to load the library:

<code>
import os
# set up path strings to a dictionary
dpath = '/Users/dcortes1/Desktop/scratch'
daff = os.path.join(dpath, 'en_US.aff')
ddic = os.path.join(dpath, 'en_US.dic')
print( os.access(daff,os.R_OK), os.access(ddic,os.R_OK) )
# Find the library -- I know it is in /usr/local/lib but let's use
# the platform-independent way.
import ctypes.util as CU
libpath = CU.find_library( 'hunspell-1.3.0' )
# Get an object that represents the library
import ctypes as C
hunlib = C.CDLL( libpath )
</code>

Unfortunately, this produces the following error when I run it in a Sigil plugin:

Error: bad argument type for built-in operation

Where is the error in this code?

Thanks
Old 07-22-2015, 01:44 PM   #4
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
FWIW, calibre's plugin environment is a lot more closely bound to the editor, since they both share the same python environment. In calibre, editor plugins can directly call the spellchecker.

In fact, you don't even need a plugin -- see: The power of function mode - using a spelling dictionary to fix mis-hyphenated words.
Old 07-22-2015, 02:04 PM   #5
KevinH
Hi,
Is this with python 2.7 or python 3.4? There are changes to ctypes code needed to make it work with python 3.4 strings. See further along in your reference link as an example.
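
As a small illustration of the Python 3 string issue: a C `char *` parameter needs bytes, not str, so text has to be encoded before it is handed to ctypes. The helper name `to_c_string` is made up for this sketch, and UTF-8 is only an assumed encoding; for Hunspell the encoding should match the one declared in the dictionary's .aff file.

```python
import ctypes


def to_c_string(text, encoding="utf-8"):
    """Encode a Python str to bytes and wrap it for a C `char *` parameter."""
    return ctypes.c_char_p(text.encode(encoding))


arg = to_c_string("hyphen")
print(arg.value)  # b'hyphen'

# Under Python 3, passing a str directly is rejected with a TypeError.
try:
    ctypes.c_char_p("hyphen")
except TypeError as err:
    print("str rejected:", err)
```

Under Python 2.7 the plain `str` type is already a byte string, which is why code written for 2.7 can fail when moved to 3.4 unchanged.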

I would debug this code (the full code from your example) outside of the plugin environment by getting it to work in a straight python program first, using the exact same python version 2.7 or 3.4 you want the plugin to work under. Be careful, as some Linux systems now install python3 as just python and have renamed the old python to python2.

Running python at the terminal prompt should tell you which is found first in your path and what version it is.

Post your full standalone example here and I will test it on my MacOS machine to see exactly what error you are getting.

Just make sure you are using the correct python version your code expects.

Last edited by KevinH; 07-22-2015 at 06:09 PM.
Old 07-22-2015, 03:45 PM   #6
DiapDealer
Grand Sorcerer
Posts: 28,869
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
In addition to KevinH's advice, keep in mind that there are some platform-specific differences to the find_library and CDLL functions.

On OSX and Linux, find_library(name) expects the name parameter to be without a 'lib' prefix and without any suffixes like '.so' or '.dylib', or any appended version numbers. Windows has no shared library prefix, so if I recall, you'd need to use the whole filename minus the extension there.

In addition, find_library will return the full path to the shared library (if found) on Windows and OSX, while Linux will only return the file name portion.

So in your *nix example, "find_library('hunspell-1.3.0')" will likely be looking for a shared library with the name 'libhunspell-1.3.0.so.x'. If there is no such library on your system (where your system keeps its shared libraries), it's going to return None, and your code will then pass None to CDLL. Are you sure that's the exact version of hunspell's shared library that you have installed on your system?
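
One way to guard against that, sketched here with a made-up helper name (`load_shared_library`), is to check the result of find_library before handing it to CDLL. The library name 'hunspell-1.3' below is only a guess at what might be installed; the right name depends on the system.

```python
import ctypes
import ctypes.util


def load_shared_library(name):
    """Return a CDLL handle for `name`, or None if no such library is found."""
    path = ctypes.util.find_library(name)
    if path is None:
        # Nothing matched, so do not pass None on to CDLL.
        return None
    return ctypes.CDLL(path)


# 'hunspell-1.3' is an assumed name, not necessarily what is installed.
lib = load_shared_library("hunspell-1.3")
if lib is None:
    print("Hunspell shared library not found")
else:
    print("Loaded:", lib)
```

Checking for None separately makes it clear whether the problem is a library that cannot be found or a library that was found but could not be loaded.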

Last edited by DiapDealer; 07-22-2015 at 06:34 PM.
Old 07-22-2015, 05:58 PM   #7
Doitsu
Grand Sorcerer
Posts: 5,763
Karma: 24088559
Join Date: Dec 2010
Device: Kindle PW2
@DiapDealer: This may be a stupid question: where does the Sigil Windows installer install the Windows Hunspell dll that Sigil uses? (It's not in the same folder as the other .dll files.)
Old 07-22-2015, 06:07 PM   #8
KevinH
Hi Doitsu,
It may be statically linked into the Sigil executable. I have not actually looked to check.
KevinH

Old 07-22-2015, 06:20 PM   #9
BetterRed
null operator (he/him)
Posts: 22,008
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
whoops - wrong thread

Last edited by BetterRed; 07-22-2015 at 06:22 PM.
Old 07-22-2015, 06:38 PM   #10
DiapDealer
Hunspell is built and statically linked into the Sigil executable on Windows. It's the same on Linux and OSX unless a "use local libs" switch is used at build time. Same with PCRE.
Old 07-23-2015, 02:00 AM   #11
Doitsu
Quote:
Originally Posted by DiapDealer View Post
Hunspell is built and statically linked into the Sigil executable on Windows. It's the same on Linux and OSX unless a "use local libs" switch is used at build time. Same with PCRE.
This means that the OP would either need to bundle the Hunspell dll with the plugin or only use the custom dictionary.

@CalibUser: The user dictionary is a simple text file. For more information on its usage also see this related thread.
You may want to check out Beautiful Soup, which you can embed in a plugin. For a simple example, see this throwaway plugin.
For algorithm ideas also check out this Epub spell checker, which appears to be using lexicostatistics to identify OCR errors.
Old 07-23-2015, 07:00 AM   #12
CalibUser
Thanks for all your replies.

It was the function in calibre for removing unwanted hyphens that made me decide to write one as a plugin for Sigil, partly as an exercise for learning Python and partly because I thought it would be a useful plugin for others to download from this site.

However, I'm wondering if I am being too ambitious as I have only just started to learn Python!

Following on from DiapDealer's first post, I tried to find the library for Hunspell using Windows search and it could not find it. When I saw the second post stating that 'Hunspell is built and statically linked into the Sigil executable on Windows' I realised that my code was looking for a library that doesn't exist, so I won't post my non-functional code here. I will need to explore the suggestions made by Doitsu, i.e. to bundle the Hunspell dll with the plugin or only use the custom dictionary. Thanks for the links, Doitsu. I will follow them up.

BTW - What is OP?
Old 07-23-2015, 07:36 AM   #13
Doitsu
Quote:
Originally Posted by CalibUser View Post
BTW - What is OP?
OP = Original Poster i.e. you.

Quote:
Originally Posted by CalibUser View Post
However, I'm wondering if I am being too ambitious as I have only just started to learn Python!
There is no such thing as being too ambitious. The good thing about Python is that there are a gazillion ready-made libraries that you only need to import. (It's almost like using Lego bricks.)

And in the rare case that no ready-made library exists, usually DiapDealer and KevinH will come up with some helpful ideas. (I couldn't have finished my very simple plugins without their help.)
Old 07-23-2015, 07:40 AM   #14
CalibUser
Thanks for your encouragement, Doitsu. I will definitely explore the suggestions posted here.
Old 07-23-2015, 08:03 AM   #15
KevinH
Hi,
That said ... if spell checking inside a plugin is important, we can change the Sigil build to use a dynamic hunspell lib instead of a static one. Alternatively we could include our own dll with hooks back into needed routines such as spellchecking.

I will look into doing this for a future release.

KevinH
All times are GMT -4.