#46 |
actually it is /var/log
Posts: 341
Karma: 2994236
Join Date: Sep 2012
Location: usually Europa
Device: prs t1
|
In The Source: The Texas Chain Saw Massacre 2
All would be well on its way, if only I were not applying the chain saw to my foot... <script>!... <![CDATA[ ]]>!... if (i < codeArray.length) {...} indeed!
But we live in the XXI century now and there is no escaping epub3. And the effects are so cool... Just click on the [clicko me!] thing... the moment I saw it, I just had to implement it in my book! Those tags will be used. I created my very first C++ lambda. Here it is, in all its glory:
Code:
auto skip_whiteys=[&pos,&text]()->void {while(text.at(pos).isSpace()) ++pos;};
Hope that Mac/Win compilers are up to it... we speak (Linux) gcc 4.8.4 here.
tbc...?
#47 |
Sigil Developer
Posts: 8,499
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Be careful: .isSpace() matches a different set of characters than what the xhtml spec and html5 spec clearly define as whitespace.
|
#48 |
actually it is /var/log
|
Code:
static Q_DECL_CONSTEXPR inline bool isSpace(uint ucs4) Q_DECL_NOTHROW Q_DECL_CONST_FUNCTION
{
    // note that [0x09..0x0d] + 0x85 are exceptional Cc-s and must be handled explicitly
    return ucs4 == 0x20 || (ucs4 <= 0x0d && ucs4 >= 0x09)
            || (ucs4 > 127 && (ucs4 == 0x85 || ucs4 == 0xa0 || QChar::isSpace_helper(ucs4)));
}
Code:
bool QT_FASTCALL QChar::isSpace_helper(uint ucs4) Q_DECL_NOTHROW
{
    if (ucs4 > LastValidCodePoint)
        return false;
    const int test = FLAG(Separator_Space) |
                     FLAG(Separator_Line) |
                     FLAG(Separator_Paragraph);
    return FLAG(qGetProp(ucs4)->category) & test;
}
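To make the difference concrete, here is a rough sketch that compares the explicitly listed code points from the isSpace() source quoted above with the HTML Standard's ASCII whitespace set (tab, LF, FF, CR, space). The function names are made up for illustration, the Unicode category lookup done by isSpace_helper is deliberately left out, and plain C++ is used instead of Qt so it compiles anywhere.

```cpp
#include <cstdint>

// Mirrors only the code points spelled out in the quoted isSpace();
// the isSpace_helper() category lookup is intentionally omitted (assumption).
constexpr bool qtStyleIsSpaceLatin1(std::uint32_t c) {
    return c == 0x20 || (c >= 0x09 && c <= 0x0d) || c == 0x85 || c == 0xa0;
}

// ASCII whitespace as listed in the HTML Standard: TAB, LF, FF, CR, SPACE.
constexpr bool isHtmlWhitespace(std::uint32_t c) {
    return c == 0x09 || c == 0x0a || c == 0x0c || c == 0x0d || c == 0x20;
}

// Counts the code points in U+0000..U+00FF on which the two predicates disagree.
inline int countMismatchesLatin1() {
    int mismatches = 0;
    for (std::uint32_t c = 0; c <= 0xff; ++c) {
        if (qtStyleIsSpaceLatin1(c) != isHtmlWhitespace(c)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Under these assumptions the two predicates disagree on U+000B (vertical tab), U+0085, and U+00A0 within the first 256 code points, so a loop driven by isSpace() could skip characters that the markup specs do not treat as whitespace.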
#49 |
actually it is /var/log
|
In The Source: The Lambda
This one works, too:
Code:
auto skip_whiteys=[&text](int i) mutable ->void {while(text.at(i).isSpace()) ++i;};
tbc...?
#50 |
Sigil Developer
|
The only valid whitespace chars inside a tag are space, tab, newline, carriage return, vertical tab, and formfeed (not 100% sure of this but it is close). Simply set up a constant QString with the real whitespace chars according to the spec and use QString::contains to check for exactly the spec whitespace quite easily anyplace. There are many other "whitespace" chars that should never appear inside a tag and if they do they are part of an attribute value or something.
Hope this helps.
KevinH
Last edited by KevinH; 07-21-2016 at 06:22 PM.
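A minimal sketch of the constant-string idea, for illustration only. It uses std::string as a stand-in for a constant QString (with Qt the check would presumably be WHITESPACE.contains(ch)), the helper name is hypothetical, and the set is assumed to be the HTML Standard's ASCII whitespace: tab, LF, FF, CR, and space. As far as I can tell that list does not include vertical tab, so check the spec before settling on the exact characters.

```cpp
#include <string>

// Stand-in for a constant QString (assumption: TAB, LF, FF, CR, SPACE).
static const std::string WHITESPACE = "\t\n\f\r ";

// Hypothetical helper: true only for characters listed in WHITESPACE.
inline bool isSpecWhitespace(char c) {
    return WHITESPACE.find(c) != std::string::npos;
}
```

Keeping the set in one named constant means a correction to the list is a one-line change, and every caller picks it up.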
#51 |
Sigil Developer
|
Plus you have no bounds checking in your loop, so it may walk off the end and segfault.
And lambda functions do not make code easily readable or supportable, whether or not they are all the rage. So stick with a simple WHITESPACE.contains(), an increment, and a bounds check to make it clear what is going on. The Python code does this for exactly this reason.
Last edited by KevinH; 07-21-2016 at 06:22 PM.
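For what it is worth, the plain-function version described here could look roughly like the sketch below. It is not the Sigil or Python quickparser code: the function name is hypothetical, std::string stands in for QString, and the whitespace set (tab, LF, FF, CR, space) is an assumption to be checked against the spec.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first non-whitespace
// character at or after pos, and never reads past the end of text.
inline std::size_t skipWhitespace(const std::string& text, std::size_t pos) {
    static const std::string WHITESPACE = "\t\n\f\r ";  // assumed spec set
    // The bounds check comes first, so text[pos] is only read when pos is valid.
    while (pos < text.size() && WHITESPACE.find(text[pos]) != std::string::npos) {
        ++pos;
    }
    return pos;
}
```

Returning the new position instead of mutating a captured variable keeps the data flow visible at the call site, e.g. pos = skipWhitespace(text, pos);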
#52 | |
actually it is /var/log
|
Quote:
WHITESPACE.contains was my first take... I think isSpace is better optimized than this. But now [REALITY]...
|
#53 |
Sigil Developer
|
Yes, if you want to optimize: since the allowed xhtml whitespace chars are single-byte, low code point values, you could check whether the char code is below some cutoff value and, if so, use it as an offset into a fixed array/vector to check for whitespace. I am sure there are other tricks as well.
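A small sketch of that cutoff-plus-table idea, under stated assumptions: the names are hypothetical, the cutoff is U+0020 because that is the highest code point in the assumed whitespace set (tab, LF, FF, CR, space), and nothing here has been timed against isSpace() or contains().

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Table covers U+0000..U+0020; anything above the cutoff is never whitespace.
constexpr std::size_t kTableSize = 0x21;

// Builds the lookup table once (assumed set: TAB, LF, FF, CR, SPACE).
inline std::array<bool, kTableSize> makeWhitespaceTable() {
    std::array<bool, kTableSize> table{};  // all entries start as false
    table[0x09] = table[0x0a] = table[0x0c] = table[0x0d] = table[0x20] = true;
    return table;
}

// Hypothetical helper: one comparison plus one array read per character.
inline bool isTagWhitespace(std::uint32_t codePoint) {
    static const std::array<bool, kTableSize> table = makeWhitespaceTable();
    return codePoint < kTableSize && table[codePoint];
}
```

The cutoff test doubles as the bounds check on the table, so large code points such as U+2028 are rejected without touching the array.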
Fwiw, string search for a single char is one of the most heavily optimized glibc routines for just this reason: for short strings, repeat the byte to look for n times and do a 64-bit xor against the const string to simultaneously test for that value in multiple chars at once. There are many other approaches as well.
Last edited by KevinH; 07-21-2016 at 07:01 PM.
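The xor trick can be sketched as follows. This is only an illustration of the general idea using the well-known "has a zero byte" bit trick, not glibc's actual code, and both function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// True if any of the eight bytes in v is zero.
constexpr bool hasZeroByte(std::uint64_t v) {
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

// Hypothetical helper: does the 8-byte block starting at p contain byte c?
inline bool blockContainsByte(const unsigned char* p, unsigned char c) {
    std::uint64_t word;
    std::memcpy(&word, p, sizeof word);                       // safe unaligned load
    const std::uint64_t pattern = 0x0101010101010101ULL * c;  // c repeated 8 times
    return hasZeroByte(word ^ pattern);                       // equal bytes xor to zero
}
```

A set of five whitespace characters would need one such test per character (or a padded constant holding the set, tested against each input byte), so whether this beats a small lookup table for tag parsing is something only a measurement can settle.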
#54 |
actually it is /var/log
|
In The Source: The Lambda, Cet obscur objet du désir
[/REALITY].
Now it's
Code:
auto skip_whiteys=[]()->void {while(text.at(pos).isSpace() && pos<len) ++pos;};
Code:
auto skip_whiteys=[] {while(text.at(pos).isSpace() && pos<len) ++pos;};
I have not checked yet whether it works (it should, theoretically: global, in-context variables should be available), because I'm in the middle of a rewrite and I'm not even trying to compile... so don't think it's final...
But, eventually, I'll time the lambda versus WHITESPACE.contains(). Let the better man win.
I'm using named lambdas to make the code more readable. Precisely that brought me to isSpace. I decided to trust the Qt guys: they did some real work on what white space in a QString is.
EDIT: no, those lambdas are a no-go... I forgot..
tbc...?
Last edited by varlog; 07-25-2016 at 06:18 PM.
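A guess at two likely reasons those lambdas are a no-go, assuming text, pos, and len are local variables: an empty capture list [] cannot see locals at all, and the condition calls at(pos) before testing pos<len, so the bounds test arrives too late. The sketch below shows one possible fix. It uses std::string and std::isspace as stand-ins for QString and QChar::isSpace(), and the wrapper function exists only to give the lambda something to capture.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical wrapper so the lambda has local variables to capture.
inline std::size_t skipFrom(const std::string& text, std::size_t pos) {
    const std::size_t len = text.size();
    // [&] captures text, pos, and len by reference; with [] this would not compile.
    auto skip_whiteys = [&] {
        // Bounds test first: && short-circuits, so at() never sees pos == len.
        while (pos < len && std::isspace(static_cast<unsigned char>(text.at(pos)))) {
            ++pos;
        }
    };
    skip_whiteys();
    return pos;
}
```

With the operands in the original order, std::string::at would throw std::out_of_range on an all-whitespace input; swapping them makes the loop stop cleanly at the end of the text.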
#55 | |
Resident Curmudgeon
Posts: 79,235
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
#56 | |
Sigil Developer
|
If you are just playing around for yourself, please do whatever you want. But if you hope to contribute a pull request or patch set to Sigil, please do not use lambda functions when no callback or functor is needed. Please make the code simple, readable, and easy to support. Please check the spec on whitespace allowed inside html5 tags (it is not isSpace()) when creating your version of the quickparser code. (FYI, the Python version of quickparser identifies the set of allowed whitespace chars inside tags.)
Code density and art-like design are not measures of success. Decades ago, I used to write numerical integration routines in APL, and you could easily do it in one line, but the result was completely unreadable and unsupportable once written. APL was so cryptic it deserved to perish as a supported language for research.
Hope this better explains what Sigil is looking for if you decide to contribute.
KevinH
Quote:
|
|
#57 | |
Grand Sorcerer
Posts: 13,319
Karma: 78876004
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
Quote:
|
|
#58 |
null operator (he/him)
Posts: 21,638
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
As I recall, much beloved by actuaries, modellers, and quants before they were called that.
BR
Last edited by BetterRed; 07-26-2016 at 01:58 AM.
#59 |
Connoisseur
Posts: 64
Karma: 10
Join Date: Sep 2015
Location: Yerevan, Armenia
Device: none
|
- There are no sub-tags xml:lang="und" or xml:lang="zxx" in your “multilanguage.epub” file. (https://www.w3.org/International/que...e#undetermined)
- Also, unfortunately, Sigil does not work at all with soft hyphens, unlike OpenOffice. As I understand it, the soft hyphen cannot be declared in an aff-file as an ignored char, so programs have to handle chars like the soft hyphen themselves...
#60 | |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Quote:
|
|