Go Back   MobileRead Forums > E-Book Software > Sigil

Old 06-14-2016, 06:13 PM   #16
varlog
actually it is /var/log
Posts: 341
Karma: 2994236
Join Date: Sep 2012
Location: usually Europa
Device: prs t1
In The Source: something wicked this way comes...

My real professional life caught up with me, so this week and the next one will be even more barren than usual: top priority, deadlines, stress, etc.

Been there, done that. I fake-implemented an unknown language for SpellcheckEditor.

Decided to face the dragons.

To get its words, SpellcheckEditor uses the function Book::GetUniqueWordsInHTMLFiles(), which uses time travel and parallel universes (QFuture, QtConcurrent - scary!) to invoke Book::GetWordsInHTMLFileMapped(HTMLResource *html_resource), which in turn calls HTMLSpellCheck::GetAllWords(html_resource->GetText()).

Thus we land in HTMLSpellCheck.



tbc...?
Old 06-16-2016, 05:46 PM   #17
varlog
In The Source: The Player of the Games

... yes, your suspicion is right... at the moment I'm reading Iain.

HTMLSpellCheck:
Not so scary after all. Just a collection of some static functions. The heart of it is:

static QList<MisspelledWord> GetMisspelledWords(const QString &text,
                                                int start_offset,
                                                int end_offset,
                                                const QString &search_regex,
                                                bool first_only = false,
                                                bool include_all_words = false);

It has a fine word seeking loop... which is tag aware...

...must update my regex apparently...

...wish I had more time... it got so interesting...

...added "language" to struct MisspelledWord... let's play this one...
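
For illustration only, a tag-aware word loop that stamps each hit with a language could look roughly like the sketch below. It is plain standard C++, not Sigil's Qt code: WordHit and scan_words are hypothetical names, the language is simply passed in by the caller, and a '>' inside an attribute value is not handled.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical plain-C++ mirror of a MisspelledWord-like record.
struct WordHit {
    std::string text;
    std::size_t offset;    // position of the word in the source text
    std::string language;  // the extra field discussed above
};

// Collects runs of ASCII letters that lie outside <...> tags and stamps
// each one with `language`.
std::vector<WordHit> scan_words(const std::string &html,
                                const std::string &language) {
    std::vector<WordHit> hits;
    bool in_tag = false;
    std::size_t i = 0;
    while (i < html.size()) {
        const unsigned char c = static_cast<unsigned char>(html[i]);
        if (in_tag) {
            in_tag = (c != '>');  // leave tag mode at the closing bracket
            ++i;
        } else if (c == '<') {
            in_tag = true;
            ++i;
        } else if (std::isalpha(c)) {
            const std::size_t start = i;
            while (i < html.size() &&
                   std::isalpha(static_cast<unsigned char>(html[i]))) {
                ++i;
            }
            hits.push_back({html.substr(start, i - start), start, language});
        } else {
            ++i;  // whitespace, punctuation, digits
        }
    }
    return hits;
}
```

Tag names and attribute values never show up as words here, because everything between '<' and '>' is skipped.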


tbc...?

Last edited by varlog; 06-16-2016 at 05:59 PM.
Old 06-21-2016, 05:37 PM   #18
varlog
In The Source: Use of Weapons

... you've surely seen it coming...

no time, no time... but:
I've spent a little time in HTMLSpellCheck. The choice of weapons is:
- full quick HTML parser (as suggested by Kevin) versus shortcut (I only need the "lang" attribute for this!).
- recursion versus some logic loop (to get the language right when leaving language sick tags).

Minimalist as I am, I'll go for the shortcut (for now).
My modest experience with parsing HTML bodies tells me that recursion is the answer. But... recursion is something you do not do at home... I will try the logic loop for now.

But I have to go back to SpellCheck and SpellCheckEditor....

tbc...?

Last edited by varlog; 07-15-2016 at 08:24 PM. Reason: wrong word
Old 06-22-2016, 01:11 PM   #19
KevinH
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
You realize that lang attributes can be on tags with nested contents and even nested themselves. So a tag name stack must be built to properly handle all of these cases and to properly unwind nested tags with language attributes. So the shortcut approach is a non-starter.
KevinH
Old 06-22-2016, 03:04 PM   #20
varlog
It would be nice to have a sample ebook or, better, just a (long) snippet with some non-trivial usage of tags with a language attribute. Can you spare a little time for it, Kevin? Anybody?
Old 06-22-2016, 04:03 PM   #21
KevinH
The xml:lang attribute is allowed virtually anywhere. So I strongly recommend the approach of using a tag parser based on quickparser and keeping a stack (LIFO) of tag and lang.
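
A rough sketch of such a tag/lang stack, in plain standard C++ rather than Sigil's own classes. LangStack and its methods are hypothetical names; the sketch assumes well-formed input and that the caller never pushes void or self-closing tags.

```cpp
#include <string>
#include <utility>
#include <vector>

// Tracks the language in effect while walking start and end tags.
class LangStack {
public:
    explicit LangStack(std::string default_lang)
        : default_lang_(std::move(default_lang)) {}

    // `has_lang` says whether the tag carried lang/xml:lang at all;
    // lang="" is a valid, explicit "unknown". A tag without the attribute
    // inherits whatever language is current.
    void open_tag(const std::string &tag, bool has_lang,
                  const std::string &lang = "") {
        std::string effective = has_lang ? lang : current();
        stack_.emplace_back(tag, std::move(effective));
    }

    // Pops the innermost tag if it matches; returns false on a mismatch.
    bool close_tag(const std::string &tag) {
        if (stack_.empty() || stack_.back().first != tag) return false;
        stack_.pop_back();
        return true;
    }

    // Language of the innermost open tag, or the default outside all tags.
    const std::string &current() const {
        return stack_.empty() ? default_lang_ : stack_.back().second;
    }

private:
    std::string default_lang_;
    std::vector<std::pair<std::string, std::string>> stack_;
};
```

Because every entry stores the effective language, closing a nested tag restores the enclosing language with a single pop, however deep the nesting goes.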
Old 06-22-2016, 05:59 PM   #22
varlog
I'm using, among others, something like this:

Code:
<html xmlns="http://www.w3.org/1999/xhtml" lang="zombie">
<head>
  <title>Spell Checking Languages</title>
</head>
<body>
<p lang="fr">
<img alt="sigil" src="../Images/sigil.png" xml:lang="en"/>This is 
<span xml:lang="">Sigil </span>icone
</p>
</body>
</html>
I wanted to use:
Code:
<p lang="fr">
<img alt="sigil" src="../Images/sigil.png" xml:lang="en">This is 
<span xml:lang="">Sigil </span>icone</image> Merci!
</p>
but Sigil (0.9.5) doesn't like it.

Something more complicated, anybody?
Old 06-22-2016, 06:51 PM   #23
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by varlog View Post
I wanted to use:
Code:
<p lang="fr">
<img alt="sigil" src="../Images/sigil.png" xml:lang="en">This is 
<span xml:lang="">Sigil </span>icone</image> Merci!
</p>
but Sigil (0.9.5) doesn't like it.
For once, this isn't Sigil's fault.

Your code is invalid. You'll need to use:

Code:
<p lang="fr">
<img alt="sigil" src="../Images/sigil.png" xml:lang="en" />This is 
<span xml:lang="en">Sigil </span>icone Merci!
</p>

Quote:
Originally Posted by varlog View Post
Something more complicated, anybody?
If you're looking for a mini test case how about this old multilingual joke:

Code:
  <p>Four linguists were sharing a compartment on a train on their way to an international conference on sound symbolism. One was English, one Spanish, one French and the fourth German. They got into a discussion on whose language was the most eloquent and euphonious.</p>

  <p>The English linguist said: "Why, English is the most eloquent language. Take for instance the word "butterfly". Butterfly, butterfly... doesn't that word so beautifully express the way this delicate insect flies. It's like flutter-by, flutter-by."</p>

  <p>"Oh, no!" said the Spanish linguist, "the word for "butterfly" in Spanish is "<span lang="es" xml:lang="es">mariposa</span>". Now, this word expresses so beautifully the vibrant <span xml:lang="en-GB" lang="en-GB">colours</span> on the butterfly's wings. What could be a more apt name for such a brilliant creature? Spanish is the most eloquent language!"</p>

  <p>"<span xml:lang="fr" lang="fr">Papillon</span>!" says the French linguist, "<span xml:lang="fr" lang="fr">papillon</span>! This word expresses the fragility of the butterfly's wings and body. This is the most fitting name for such a delicate and ethereal insect. French is the most eloquent language!"</p>

  <p>At this the German linguist stands up, and demands: "<span xml:lang="de" lang="de">Und</span> <span xml:lang="und" lang="und">vot is rongk</span> <span xml:lang="de" lang="de">mit</span> '<span xml:lang="de" lang="de">Schmetterling</span>'?"</p>
In case you're wondering why I used xml:lang and lang, it's an IDPF recommendation and und is an undetermined language.

If you ever get your code to work, everything tagged as "und" or "zxx" (=no linguistic content) shouldn't be spell-checked.
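
As a minimal sketch of that rule in standard C++: should_spellcheck is a hypothetical helper, not an existing Sigil function, and it assumes that only the primary subtag matters, so "und-Latn" is treated like "und". "zxx" is the ISO 639-2 code for "no linguistic content".

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns false for language tags that carry nothing to spell-check:
// "und" (undetermined) and "zxx" (no linguistic content).
// The comparison is case-insensitive and ignores subtags.
bool should_spellcheck(const std::string &lang_tag) {
    std::string primary = lang_tag.substr(0, lang_tag.find('-'));
    std::transform(primary.begin(), primary.end(), primary.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return primary != "und" && primary != "zxx";
}
```

An empty tag still returns true here; whether lang="" should fall back to a default dictionary or be skipped is a separate decision.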
Old 06-23-2016, 03:13 PM   #24
varlog
Quote:
Originally Posted by Doitsu View Post

...

Code:
 ...
  <p>"Oh, no!" said the Spanish linguist, "the word for "butterfly" in Spanish is "<span lang="es" xml:lang="es">mariposa</span>". Now, this word expresses so beautifully the vibrant <span xml:lang="en-GB" lang="en-GB">colours</span> on the butterfly's wings. What could be a more apt name for such a brilliant creature? Spanish is the most eloquent language!"</p>
  <p>"<span xml:lang="fr" lang="fr">Papillon</span>!" says the French linguist, "<span xml:lang="fr" lang="fr">papillon</span>! This word expresses the fragility of the butterfly's wings and body. This is the most fitting name for such a delicate and ethereal insect. French is the most eloquent language!"</p>
  <p>At this the German linguist stands up, and demands: "<span xml:lang="de" lang="de">Und</span> <span xml:lang="und" lang="und">vot is rongk</span> <span xml:lang="de" lang="de">mit</span> '<span xml:lang="de" lang="de">Schmetterling</span>'?"</p>
Not complicated but good enough to land in my book. I laughed a bit, too...

Quote:
Originally Posted by Doitsu View Post
... everything tagged as "und" or "zxx" (=no linguistic content) shouldn't be spell-checked.
I didn't know that. Thanks.
Old 06-23-2016, 04:33 PM   #25
varlog
just a note for quick reference:
void elements (HTML4): area, base, basefont(d), br, col, frame, hr, img, input, isindex(d), link, meta, param
void elements (HTML5): area, base, br, col, command, embed, hr, img, input, keygen, link, meta, param, source, track, wbr(d?)
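
A tag stack has to skip these elements, since they never get a closing tag. A minimal lookup in standard C++, assuming lowercase tag names as in XHTML (is_void_element is a hypothetical helper, and the set is simply the union of the two lists above):

```cpp
#include <string>
#include <unordered_set>

// True if `tag` is one of the HTML4 or HTML5 void elements listed above.
// Expects a lowercase tag name.
bool is_void_element(const std::string &tag) {
    static const std::unordered_set<std::string> kVoid = {
        "area",   "base",  "basefont", "br",      "col",    "command", "embed",
        "frame",  "hr",    "img",      "input",   "isindex", "keygen", "link",
        "meta",   "param", "source",   "track",   "wbr"};
    return kVoid.count(tag) > 0;
}
```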

Last edited by varlog; 06-23-2016 at 04:56 PM.
Old 06-29-2016, 05:57 PM   #26
varlog
In The Source: ...

It is not "The State of the Art" phase for sure... and "Excession" is not happening anytime soon, either.
I was out for a few days.
I'm back in SpellCheck and SpellcheckEditor now.
It seems that the SpellCheck instance is first created by MainWindow initializing SpellcheckEditor which, in its initialization, calls, well, SpellCheck. I will use this, I think.

But what I'm doing now is making SpellCheck language aware. That means it has to be able, singleton as it is (for now), to hold more than one Hunspell object. What happens? At the moment I have two Hunspell instances loaded and my system (SSD, 4 cores, 8 GB) doesn't seem to notice... I will have to load more...
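
The general shape of "one checker, several dictionaries" can be sketched in plain standard C++ as below. This is not the actual SpellCheck class: Dictionary is a hypothetical stand-in for a loaded Hunspell object (just a word set), and DictionaryPool is an invented name.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded Hunspell object: just a word set.
struct Dictionary {
    std::set<std::string> words;
    bool spell(const std::string &word) const { return words.count(word) > 0; }
};

// Holds one dictionary per language code; load and unload at will.
class DictionaryPool {
public:
    void load(const std::string &lang, Dictionary dict) {
        pool_[lang] = std::move(dict);
    }
    void unload(const std::string &lang) { pool_.erase(lang); }
    bool is_loaded(const std::string &lang) const {
        return pool_.count(lang) > 0;
    }

    // Words in a language with no loaded dictionary are reported as correct,
    // so unknown languages do not flood the list of misspellings.
    bool spell(const std::string &lang, const std::string &word) const {
        const auto it = pool_.find(lang);
        return it == pool_.end() || it->second.spell(word);
    }

private:
    std::map<std::string, Dictionary> pool_;
};
```

Treating words without a dictionary as correct is only one possible policy; flagging them or falling back to a default dictionary would work just as well.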


tbc?

Last edited by varlog; 06-30-2016 at 06:27 PM.
Old 06-30-2016, 03:14 AM   #27
Hitch
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
I just want to say...this is WAAAAAAAAAAAAAAY better than those telenovelas...
Old 06-30-2016, 02:25 PM   #28
varlog
Why, thank you!
Also please note, I try to keep it suitable for all ages: no explicit crime or sex scenes.
Old 06-30-2016, 03:09 PM   #29
Hitch
Quote:
Originally Posted by varlog View Post
Why, thank you!
Also please note, I try to keep it suitable for all ages: no explicit crime or sex scenes.
Dang! And here I was hopin' that I was getting the BBC version, not the "bleached for the USA" version. :-)

I'm sure that my fellow addicts in watching this are appreciative of the G-rated effort. Don't know how you manage, myself. :-)

Hitch
Old 07-01-2016, 07:12 PM   #30
varlog
In The Source: The Labyrinth

[REALITY] Now we are a week past the deadline and we are still not done... There will be blood...[/REALITY]

oops... blood...? well, still not explicit... is it?

Still in SpellCheck/SpellcheckEditor.
I can load as many Hunspell instances as I wish... and unload them, too. The funny thing is that, even though the average Hunspell dictionary file is about 0.5 to 2.1 MB (for the ones I have), the real memory footprint is more like 5 MB to 7 MB (so my ad hoc htop check tells me; there are better tools, I'm sure). It's irrelevant, I was just curious. It could be the debugger or anything.
The load times (SSD) are not noticeable. I will have to check on my laptop eventually (it still has an HDD, which is why I'm not using it anymore).
The maze I'm actually in is the "User Experience" (G-rated) thing. For instance, I have something like twenty Spanish dictionaries to choose from on my system. And the Language class doesn't know them all... and prefers "-" to "_".
Some "default language dictionary" is due. What a mess...
My instance of Sigil, due to my meddling, has lost its ability (temporarily?) to spell check. What a mess...
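
One way out of the "-" versus "_" part of the maze, sketched in standard C++ with invented names (normalize_lang and pick_dictionary are hypothetical, not part of Sigil's Language class): normalize the separator first, then look for an exact match, then the bare primary language, then a configured default.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Converts "es_ES" style names to "es-ES" so both spellings compare equal.
std::string normalize_lang(std::string code) {
    std::replace(code.begin(), code.end(), '_', '-');
    return code;
}

// Picks a dictionary for `lang`: exact match first, then the bare primary
// language ("es" for "es-MX"), then `fallback`. `available` is assumed to
// hold names that are already normalized.
std::string pick_dictionary(const std::string &lang,
                            const std::set<std::string> &available,
                            const std::string &fallback) {
    const std::string wanted = normalize_lang(lang);
    if (available.count(wanted) > 0) return wanted;
    const std::string primary = wanted.substr(0, wanted.find('-'));
    if (available.count(primary) > 0) return primary;
    return fallback;
}
```

With twenty regional dictionaries installed, the middle step only helps if a bare "es" dictionary exists; choosing among the regional variants is still a user preference.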

tbc?
All times are GMT -4. The time now is 05:26 AM.


MobileRead.com is a privately owned, operated and funded community.