#31 |
actually it is /var/log
Posts: 341
Karma: 2994236
Join Date: Sep 2012
Location: usually Europa
Device: prs t1
|
In The Source: Obvious Is Obvious When It's Obvious
[REALITY] Had a conversation with my colleague after a nasty showdown with a client: it turned out that what was obvious to him was not at all obvious to me, and vice versa [/REALITY]
The old question is answered. When you invoke the SpellcheckEditor (by clicking its icon or via Tools->Spellcheck->Spellcheck), it calls MainWindow::SpellcheckEditorDialog(), which calls m_SpellcheckEditor->show(). The SpellcheckEditor has a showEvent(QShowEvent *event) function, which apparently catches this show event and calls Refresh(), which calls CreateModel(sort_column, sort_order), which uses
Code:
QHash<QString, int> unique_words = m_Book->GetUniqueWordsInHTMLFiles()
The SpellcheckEditor's m_Book variable, which was "0" at initialization, has by this time already been changed by MainWindow, which, in its initialization body, calls either LoadInitialFile(openfilepath, is_internal) or CreateNewBook(). Both of them eventually call SetNewBook(QSharedPointer<Book> new_book), which, in its initialization, uses
Code:
m_SpellcheckEditor->SetBook(m_Book)
Thus the SpellcheckEditor gets its first book. No time travel, just some Qt event magic. Obvious.
tbc...?
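The call order described above can be modelled in a few lines of plain Python. This is only a toy sketch, not Sigil or Qt code: the class and method names merely mirror the ones in the post, the word counts are made up, and show() calls refresh() directly where Qt would deliver a show event.

```python
# Toy model in plain Python (not Qt): the book is injected while the main
# window is constructed; the editor only reads it later, when it is shown.


class Book:
    def get_unique_words(self):
        # Made-up data standing in for GetUniqueWordsInHTMLFiles()
        return {"magic": 2, "structure": 1}


class SpellcheckEditor:
    def __init__(self):
        self.book = None   # plays the role of m_Book being "0"
        self.model = None

    def set_book(self, book):   # plays the role of SetBook(m_Book)
        self.book = book

    def show(self):             # stands in for show() -> showEvent()
        self.refresh()

    def refresh(self):          # stands in for Refresh() -> CreateModel(...)
        self.model = sorted(self.book.get_unique_words().items())


class MainWindow:
    def __init__(self):
        self.editor = SpellcheckEditor()
        # LoadInitialFile() / CreateNewBook() both end up here
        self.set_new_book(Book())

    def set_new_book(self, book):
        self.book = book
        self.editor.set_book(book)

    def spellcheck_editor_dialog(self):
        self.editor.show()


window = MainWindow()               # the editor already has its book here
window.spellcheck_editor_dialog()   # only now is the book actually read
print(window.editor.model)          # prints [('magic', 2), ('structure', 1)]
```

The point of the sketch is only the ordering: by the time the dialog is shown, construction has long since handed the book over.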
#32 |
actually it is /var/log
|
In The Source: Misery
yeah, yeah... the Stephen...
wrote a book today... not a great read... plagiarism and Google mostly... and the orthography is a disaster... attached, you can see for yourself. Not quite finished yet - have to add some void tags. Got dictionaries as I wanted them... but not as "The User" would probably want them... Attached is a picture of the latest cockpit... spell check is knocked out, of course. Decided how to carry the language with the word... won't tell how, because Kevin could protest and I want to see for myself... Nothing final, more difficult things ahead... still have to check it on a slower system... misery... But I have my dictionaries now... time to use them. So let's tackle the pair HTMLSpellCheck - SpellCheck now. tbc...?
#33 |
actually it is /var/log
|
In The Source: The Colour of Magic
Terry... will have more of him...
Moments like this keep me coding. I can actually spell check again... multi spell check! Well, sometimes... when I get the word, the language and the dictionary right at the same time. I get the word as right as Sigil has ever done. I don't get the language right because, as Kevin has nicely put it, my parser is a "non-starter" at the moment. I don't get the dictionary right because of all those multiple dialect dictionaries. But nevertheless... Soul Music. tbc...?
#34 |
null operator (he/him)
Posts: 21,638
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
In Sigil you need to put some Structure into the Magic
BR
#35 |
actually it is /var/log
|
NLP...? no, not in this context, surely... ehm... ok, it's only a title... nevertheless...
I'm well aware of it. This blog is supposed to document the process (one of the many possible) of developing the Structure. The magic sustains me.
#36 |
actually it is /var/log
|
In The Source: Witches Abroad
[REALITY]projects... the deadlines of the [PAST_IN_THEORY]current one[/PAST_IN_THEORY] are being (no time travel, alas) dealt with... they gave me a new one... starting now!... good for me...? not so for spellchecker...?[/REALITY]
Did some research on parsing HTML bodies. This quite well-known item, apparently, made me smile... and sad... Short version: everybody wants it, nobody has it, really, as I see it... The private Qt qtexthtmlparser parsing class has 2054 lines of C++ code. Kevin's quickparser.py has 202 lines of Python code. Added an (empty!) class QuickHtmlParser to Sigil. This will be a ride... tbc...?
#37 |
Sigil Developer
Posts: 8,495
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Note, both gumbo and Qt have XHTML parsers, but these typically build DOM trees, which we don't need. For speed reasons, all we need is a serial parser that gives you the sequence of text and tags (including tag type) by repeated calls, so you can extract the text while keeping track of all open tags and the current language. This is why quickparser.py is so much simpler.
Again, for speed reasons, we need to parse the QString representing the file contents using QChars and pointers. Please don't convert it to UTF-8; instead, work with QChars and pointers into QChar vectors/arrays to process everything on a const, read-only basis. You should be able to map the needed pieces of quickparser.py to Qt QString and QChar functions on an almost line-by-line (one-to-one) basis. If you run into trouble, just ask. Happy to help.
KevinH
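A minimal sketch of the serial idea in plain Python, under stated assumptions: it is not the quickparser.py code, the names serial_parse, parse_tag and words_with_language are hypothetical, and it ignores comments, CDATA, doctype declarations and ">" inside attribute values. It only shows that a flat sequence of text and tag items is enough to attach a current language to every word without building a DOM tree.

```python
import re

# Hypothetical helper: matches name="value" pairs inside a tag body.
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')


def parse_tag(body):
    """Turn the text between '<' and '>' into ("tag", name, kind, attrs)."""
    body = body.strip()
    if body.startswith("/"):
        return ("tag", body[1:].strip(), "end", {})
    kind = "begin"
    if body.endswith("/"):
        kind = "single"
        body = body[:-1].rstrip()
    parts = body.split(None, 1)
    name = parts[0] if parts else ""
    attrs = dict(ATTR_RE.findall(parts[1])) if len(parts) > 1 else {}
    return ("tag", name, kind, attrs)


def serial_parse(text):
    """Yield ("text", data) and ("tag", name, kind, attrs) in document order."""
    pos = 0
    while pos < len(text):
        lt = text.find("<", pos)
        if lt == -1:
            yield ("text", text[pos:])
            return
        if lt > pos:
            yield ("text", text[pos:lt])
        gt = text.find(">", lt)
        if gt == -1:
            # Unterminated tag: hand the remainder back as text.
            yield ("text", text[lt:])
            return
        yield parse_tag(text[lt + 1:gt])
        pos = gt + 1


def words_with_language(text, default_lang="en"):
    """Yield (word, language) pairs, tracking language via open tags."""
    lang_stack = [default_lang]
    for item in serial_parse(text):
        if item[0] == "text":
            for word in item[1].split():
                yield (word, lang_stack[-1])
            continue
        _, _name, kind, attrs = item
        if kind == "begin":
            inherited = lang_stack[-1]
            lang_stack.append(attrs.get("xml:lang") or attrs.get("lang") or inherited)
        elif kind == "end" and len(lang_stack) > 1:
            lang_stack.pop()


sample = '<p lang="en">Hello <span xml:lang="de">Welt</span> again<br/>!</p>'
print(list(words_with_language(sample)))
```

Only begin tags push onto the language stack, so the sketch relies on void elements being self-closed (`<br/>`), as XHTML requires. A C++ port along the lines suggested above would walk QChars instead of calling find(), but the control flow could stay the same.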
#38 |
actually it is /var/log
|
Was considering QXmlStreamReader. Too slow? Too strict?
#39 |
actually it is /var/log
|
Out of the Source: The Structure of Magic
...actually I've read this one, when I was a teenager... now I think you can only learn magic if you are a magician... met too many corporate guys who were not...
Put some structure into my book because I was getting lost... not quite finished yet... And the spelling seems to be out of control. My multiple-dictionaries problem brought me to this. Shortcut, for now... But now [REALITY] calls ... tbc...?
#40 |
Sigil Developer
|
#41 |
actually it is /var/log
|
...
...[/REALITY]
umm, where was I...? Ah, QuickHtmlParser... Hmmm... Will take some time... for now... [REALITY]... tbc...?
#42 |
actually it is /var/log
|
In The Source: Interesting Times
The first one for me was "Going Postal", as an audiobook. I was immediately hooked.
But I agree with JSWolf: the proper reading order is this. I'm on the QuickHtmlParser... should be QuickSerialHtmlParser... but I did not have much time and, when I had, I got distracted. Did some code refactoring because I wanted the dictionary part ready for the main event. And had some new ideas about it, of course. Reading Python code is a no-event for me (yes, I start with quickparser.py), have to ask Google WTF it means... exaggeration, of course... And speaking of distractions: I've managed to shoot myself in the foot with my book, I think. tbc...?
#43 |
actually it is /var/log
|
In The Source: Thud!
I'm having my fun with Kevin's Python quickparser.py and C++. He is using slice[i:i+1] where every other normal programming language would use [i]. Python? ...or am I missing something obvious, as usual?
But it's not the real problem. I was avoiding this topic till now. One of the main clients of the HTMLSpellCheck class is the XHTMLHighlighter class. It renders the content you see in code view. It uses the function void highlightBlock(const QString &text), which
Code:
// Overrides the function from QSyntaxHighlighter;
// gets called by QTextEditor whenever
// a block (line of text) needs to be repainted
The god-like status of QTextEditor is of course a matter for investigation... which is not the aim of this exercise... But what really happens is: it could be that your HTML tag is not finished yet but the chunk of text provided is! HTMLSpellCheck is called twice when Sigil starts, my debugger tells me. This will need some investigation... tbc...?
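The "tag not finished, chunk finished" problem can be sketched as a scanner that carries one bit of state from block to block. This is a plain-Python illustration under assumptions, not Sigil code: text_spans is a hypothetical name, and the sketch ignores comments and ">" inside attribute values. Qt's QSyntaxHighlighter does provide previousBlockState() and setCurrentBlockState() for this kind of carry-over; whether XHTMLHighlighter or HTMLSpellCheck should use them this way is left open here.

```python
def text_spans(block, inside_tag=False):
    """Find the text (non-tag) ranges of one block.

    Returns (spans, inside_tag_after): spans is a list of (start, end)
    index pairs into block, and inside_tag_after tells the caller whether
    the block ended in the middle of a tag.
    """
    spans = []
    pos = 0
    n = len(block)
    while pos < n:
        if inside_tag:
            gt = block.find(">", pos)
            if gt == -1:
                return spans, True   # tag continues in the next block
            pos = gt + 1
            inside_tag = False
        else:
            lt = block.find("<", pos)
            if lt == -1:
                spans.append((pos, n))
                break
            if lt > pos:
                spans.append((pos, lt))
            pos = lt
            inside_tag = True
    return spans, inside_tag


# A tag split across two blocks (lines), as a highlighter might receive it.
blocks = ['<p>Hello <span', '  lang="de">Welt</span>', 'bye</p>']
state = False
words = []
for block in blocks:
    spans, state = text_spans(block, state)
    words.extend(block[a:b] for a, b in spans)
print(words)   # prints ['Hello ', 'Welt', 'bye']
```

Without the carried state, the second block would be scanned as if `lang="de"` were ordinary text and handed to the spell checker.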
#44 |
Sigil Developer
|
Python 3 uses a single-character "slice" to render that char as a character and not an integer representing the Unicode code value. Python 2 does not need this but can live with it.
This is just something to make Python code work on both Python 2.7 and Python 3.
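A small illustration of the difference, assuming the data being indexed is a bytes object (that is an assumption for this sketch; the post does not say which type quickparser.py works on). For a Python 3 str, indexing and slicing a single position give the same one-character string.

```python
data = b"<p>"   # a bytes object, not a str

# Python 3: indexing bytes gives an integer (60 is the code of "<").
# Python 2: the same expression gives the one-character string "<".
first_by_index = data[0]

# Both versions: a one-element slice stays a length-1 bytes object.
first_by_slice = data[0:1]

print(first_by_slice == b"<")   # prints True

text = "<p>"
print(text[0] == text[0:1])     # prints True: for str, both forms agree
```

So the slice form costs nothing on Python 2 and keeps comparisons against one-character literals working on Python 3.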
#45 |
actually it is /var/log
|
In The Source: The Texas Chain Saw Massacre
actually, I've not seen the film, the title was enough...
well, it's not so bad, really... I tend to exaggerate... Still in QuickSerialHtmlParser... had my fun with git, which was obnoxiously protesting... trying to get it used to wild chunks of input... but the parsing of tags still craps out on Multilanguage.epub, so more work is due before I start to do some real work... Changed its status from initially static to singleton. The copyright note on quickparser.py is:
Code:
# Copyright (c) 2014 Kevin B. Hendricks, John Schember, and Doug Massay
tbc...?