varlog · 07-31-2016, 05:51 AM

Quote:

Originally Posted by KevinH

...
For each starting tag, you use the tag attributes and look for xml:lang or lang and push a tuple of start tag name and current language into the end of a list. When a closing tag happens, you pop off the last tag name and the language. (Start the list with the metadata language).

When text comes you split it at word boundaries, as is done now, and you simply look at the bottom of that list to determine the current language associated with that text, passing the word and language to the spellcheck engine.
...

I'm doing something like this: at the moment SetNewBook initializes the parser with a parentTag whose lang attribute is taken from dc:language.
The parser is used by HtmlSpellCheck::GetMultilanguageMisspelledWords, which is my version of GetMisspelledWords, to build a tag stack with attributes.
XHTMLHighlighter::highlightBlock calls XHTMLHighlighter::CheckSpelling, which, through HtmlSpellCheck::GetMisspelledWords, calls HtmlSpellCheck::Get(Multilanguage)MisspelledWords, giving it chunks (lines) of text. All is well as long as the highlighter provides the whole text: the parser, being serial, manages the chunks. If a chunk does not contain a complete tag, the parser waits for the next one (and the next, and the next...). It builds the stack and sets the appropriate language.
But when you start to edit something in CodeView, XHTMLHighlighter delivers only the line being changed, which could be something like this: " \t\t\t\t</a> Merici! ".
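
To make the two cases concrete, here is a minimal Python sketch of the idea, not the actual C++ code: it uses the standard html.parser module in place of my parser, and the LangTracker class, the "#root" entry and the "en" default are names made up for the illustration. Ignoring a closing tag that has no matching opening tag on the stack is also just an assumption of this sketch, not a decision I have made.

```python
import re
from html.parser import HTMLParser


class LangTracker(HTMLParser):
    """Hypothetical stand-in for the serial parser: tracks language per tag."""

    def __init__(self, default_lang):
        super().__init__(convert_charrefs=True)
        # The stack starts with the metadata language (dc:language).
        self.stack = [("#root", default_lang)]
        self.words = []  # (word, language) pairs for the spellcheck engine

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Use xml:lang or lang if present, else inherit from the enclosing tag.
        lang = attrs.get("xml:lang") or attrs.get("lang") or self.stack[-1][1]
        self.stack.append((tag, lang))

    def handle_endtag(self, tag):
        # Assumption: a closing tag with no matching open tag is ignored.
        if len(self.stack) > 1 and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        lang = self.stack[-1][1]  # language of the innermost open tag
        for word in re.findall(r"\w+", data):
            self.words.append((word, lang))


# Case 1: whole text arrives in chunks, one tag is split across two chunks.
full = LangTracker("en")
for chunk in ['<p>Hello <span xml:la', 'ng="fr">Merci</span> world</p>']:
    full.feed(chunk)
full.close()
print(full.words)  # [('Hello', 'en'), ('Merci', 'fr'), ('world', 'en')]

# Case 2: only the edited line arrives, with no context for the stack.
alone = LangTracker("en")
alone.feed(" \t\t\t\t</a> Merici! ")
alone.close()
print(alone.words)  # [('Merici', 'en')]
```

In the second case the word is checked against the metadata language no matter what language the enclosing element really has, because the stack was never built for that line.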

So what to do? Silently abort the parser and wait for better times? Temporarily disable highlighting? Any other ideas?