Also see Sigil/src/Misc/HTMLSpellCheck.cpp
In particular, look at the GetMisspelledWords method, which walks an HTML file identifying text and uses whitespace and boundary chars to split it into words to check. That code should be modified to also parse tags and attributes, and to use them to build a stack that represents the current language. The closest code to use as a model is written in Python, in quickparser.py;
see Sigil/src/Resource_Files/plugin_launchers/python/quickparser.py.
But any routine would need to be written in C or C++ for speed, given that spellchecking can be done on the fly.
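
To make the language stack idea concrete, here is a rough C++ sketch. It is not Sigil code and it does not mirror what GetMisspelledWords or quickparser.py actually do; the names LangWord, FindLangAttr and WordsWithLanguage are made up for illustration. It assumes well-formed XHTML (every non-self-closing tag has a matching close tag), looks only at lang and xml:lang attributes, and uses a much cruder word-boundary rule than the real spellcheck code.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical result type: one word plus the language in effect where it appears.
struct LangWord {
    std::string word;
    std::string lang;
};

// Hypothetical helper: returns the value of lang="..." or xml:lang="..."
// found in the text between '<' and '>', or "" if neither is present.
static std::string FindLangAttr(const std::string &tag)
{
    std::size_t pos = 0;
    while ((pos = tag.find("lang=", pos)) != std::string::npos) {
        // Accept " lang=" and "xml:lang=", but not e.g. "data-lang=".
        bool plain = pos > 0 && std::isspace(static_cast<unsigned char>(tag[pos - 1]));
        bool xml = pos >= 4 && tag.compare(pos - 4, 4, "xml:") == 0;
        std::size_t q = pos + 5;
        if ((plain || xml) && q < tag.size() && (tag[q] == '"' || tag[q] == '\'')) {
            std::size_t end = tag.find(tag[q], q + 1);
            if (end != std::string::npos) {
                return tag.substr(q + 1, end - q - 1);
            }
        }
        pos += 5;
    }
    return "";
}

// Walks the markup once. Each open tag pushes the language it declares
// (or inherits the current one), each close tag pops it again.
std::vector<LangWord> WordsWithLanguage(const std::string &html,
                                        const std::string &default_lang)
{
    std::vector<LangWord> out;
    std::vector<std::string> lang_stack{default_lang};
    std::string word;
    auto flush = [&]() {
        if (!word.empty()) {
            out.push_back({word, lang_stack.back()});
            word.clear();
        }
    };
    std::size_t i = 0;
    while (i < html.size()) {
        if (html[i] == '<') {
            flush();
            std::size_t close = html.find('>', i);
            if (close == std::string::npos) break;   // truncated tag, stop
            std::string tag = html.substr(i + 1, close - i - 1);
            i = close + 1;
            // Simplification: comments, doctype and processing instructions
            // are skipped up to the first '>'.
            if (tag.empty() || tag[0] == '!' || tag[0] == '?') continue;
            if (tag[0] == '/') {
                if (lang_stack.size() > 1) lang_stack.pop_back();
            } else if (tag.back() != '/') {          // self-closing tags hold no text
                std::string lang = FindLangAttr(tag);
                lang_stack.push_back(lang.empty() ? lang_stack.back() : lang);
            }
        } else {
            unsigned char c = static_cast<unsigned char>(html[i]);
            // Crude boundary rule: letters, digits, apostrophes and any
            // non-ASCII byte (so UTF-8 sequences stay together) form words.
            if (std::isalnum(c) || c == '\'' || c >= 0x80) word += html[i];
            else flush();
            ++i;
        }
    }
    flush();
    return out;
}
```

Calling WordsWithLanguage on a paragraph inside an html element marked lang="en" that contains a span marked xml:lang="fr" tags the words inside the span as "fr" and the words around it as "en", so each word could be handed to the dictionary for its own language. A real version would also have to deal with entities, script and style content, and tags that are not balanced.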
Hope this helps.