MobileRead Forums > E-Book Software > Sigil

Old 07-20-2016, 04:55 PM   #46
varlog
actually it is /var/log
Posts: 341
Karma: 2994236
Join Date: Sep 2012
Location: usually Europa
Device: prs t1
In The Source: The Texas Chain Saw Massacre 2

All would be well on its way, if only I were not applying the chain saw to my foot... <script>!... <![CDATA[ ]]>!... if (i < codeArray.length) {...} indeed!

But we live in the 21st century now, and there is no escaping epub3.
And the effects are so cool... Just click on the [clicko me!] thing... the moment I saw it, I just had to implement it in my book! Those tags will be used.

I created my very first C++ lambda. Here it is, in all its glory:
Code:
    auto skip_whiteys=[&pos,&text]()->void
        {while(text.at(pos).isSpace()) ++pos;};
It must even be taking care of locale...?
I hope the Mac/Win compilers are up to it... we are talking (Linux) gcc 4.8.4 here.


tbc...?
Old 07-20-2016, 07:33 PM   #47
KevinH
Sigil Developer
Posts: 7,650
Karma: 5433388
Join Date: Nov 2009
Device: many
Be careful: .isSpace() matches a different set than what the XHTML and HTML5 specs clearly define as whitespace.
Old 07-21-2016, 04:38 PM   #48
varlog
Code:
    static Q_DECL_CONSTEXPR inline bool isSpace(uint ucs4) Q_DECL_NOTHROW Q_DECL_CONST_FUNCTION
    {
        // note that [0x09..0x0d] + 0x85 are exceptional Cc-s and must be handled explicitly
        return ucs4 == 0x20 || (ucs4 <= 0x0d && ucs4 >= 0x09)
                || (ucs4 > 127 && (ucs4 == 0x85 || ucs4 == 0xa0 || QChar::isSpace_helper(ucs4)));
    }
and
Code:
    bool QT_FASTCALL QChar::isSpace_helper(uint ucs4) Q_DECL_NOTHROW
    {
        if (ucs4 > LastValidCodePoint)
            return false;
        const int test = FLAG(Separator_Space) |
                         FLAG(Separator_Line) |
                         FLAG(Separator_Paragraph);
        return FLAG(qGetProp(ucs4)->category) & test;
    }
The ones I want are there, and I would not expect the rest of them inside a < tag >, which is where I use it. Where is the catch?
Old 07-21-2016, 06:13 PM   #49
varlog
In The Source: The Lambda

This one works, too:
Code:
    auto skip_whiteys=[&text](int i) mutable ->void
        {while(text.at(i).isSpace()) ++i;};
Looks more impressive... I'll stay with it (for the moment)...
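
A caveat on the variant above: the parameter `i` is passed by value, so `++i` advances only the lambda's own copy and the caller's position stays where it was. `mutable` changes nothing here, since it only affects by-value captures. A minimal sketch of the difference, assuming `std::string` and `std::isspace` as stand-ins for `QString` and `QChar::isSpace()`; the function names are hypothetical:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Index taken by value: the lambda advances its private copy only.
std::size_t skip_by_value(const std::string& text, std::size_t pos) {
    auto skip_whiteys = [&text](std::size_t i) {
        while (i < text.size() && std::isspace(static_cast<unsigned char>(text[i]))) ++i;
    };
    skip_whiteys(pos);
    return pos;  // unchanged by the call above
}

// Index taken by reference: the caller's position really moves.
std::size_t skip_by_reference(const std::string& text, std::size_t pos) {
    auto skip_whiteys = [&text](std::size_t& i) {
        while (i < text.size() && std::isspace(static_cast<unsigned char>(text[i]))) ++i;
    };
    skip_whiteys(pos);
    return pos;  // now points at the first non-whitespace character
}
```

Capturing `pos` by reference, as in the first version of the lambda, has the same effect as the by-reference parameter.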


tbc...?
Old 07-21-2016, 06:17 PM   #50
KevinH
The only valid whitespace chars inside a tag are space, tab, newline, carriage return, vertical tab, and formfeed (not 100% sure of this but it is close). Simply set up a constant QString with the real whitespace chars according to the spec and use QString::contains to check for exactly the spec whitespace quite easily anyplace. There are many other "whitespace" chars that should never appear inside a tag and if they do they are part of an attribute value or something.
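
A minimal sketch of the constant-string approach, written with `std::u16string` as a stand-in for `QString` so that it compiles without Qt (with Qt the check would be along the lines of `WHITESPACE.contains(ch)`). It assumes the HTML Standard's "ASCII whitespace" set: tab, line feed, form feed, carriage return, and space. That set does not include vertical tab, and XML's own whitespace production is narrower still (space, tab, carriage return, line feed). The names are hypothetical.

```cpp
#include <string>

// Assumed set: HTML "ASCII whitespace" (TAB, LF, FF, CR, SPACE).
static const std::u16string HTML_WHITESPACE = u"\t\n\f\r ";

// True only for the spec whitespace characters, unlike a Unicode-wide isSpace(),
// which also accepts e.g. U+00A0 (no-break space) and the Unicode separators.
bool is_html_whitespace(char16_t ch) {
    return HTML_WHITESPACE.find(ch) != std::u16string::npos;
}
```

With this predicate a no-break space inside a tag is treated as ordinary content rather than skipped.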

Hope this helps.

KevinH

Last edited by KevinH; 07-21-2016 at 06:22 PM.
Old 07-21-2016, 06:18 PM   #51
KevinH
Plus, you have no bounds checking in your loop, so it may walk off the end and segfault.
And lambda functions do not make code more readable or supportable, whether or not they are all the rage. So stick with a simple WHITESPACE.contains(), an increment, and a bounds check to make it clear what is going on. The Python code does this for exactly this reason.
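
To illustrate the bounds problem, here is a sketch using `std::string` in place of `QString` (an assumption made so the example runs without Qt). `std::string::at()` throws `std::out_of_range` when the index reaches the end; `QString::at()` instead requires a valid index, so the same mistake there is not guaranteed to fail cleanly. The whitespace set and all names are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Assumed whitespace set: TAB, LF, FF, CR, SPACE.
const std::string WHITESPACE = "\t\n\f\r ";

// Mirrors the unchecked loop: runs past the end if the text ends in whitespace.
std::size_t skip_unchecked(const std::string& text, std::size_t pos) {
    while (WHITESPACE.find(text.at(pos)) != std::string::npos) ++pos;
    return pos;
}

// Bounds check first, then the whitespace test.
std::size_t skip_checked(const std::string& text, std::size_t pos) {
    while (pos < text.size() && WHITESPACE.find(text[pos]) != std::string::npos) ++pos;
    return pos;
}

// Reports whether the unchecked loop walks off the end of the given text.
bool unchecked_runs_off_end(const std::string& text) {
    try {
        skip_unchecked(text, 0);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For input that ends in whitespace, `skip_checked` simply returns `text.size()`.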

Last edited by KevinH; 07-21-2016 at 06:22 PM.
Old 07-21-2016, 06:39 PM   #52
varlog
Quote:
... bounds checking...
... now, that helps...
WHITESPACE.contains was my first take... I think isSpace is optimized better than this.
but now [REALITY]...
Old 07-21-2016, 06:56 PM   #53
KevinH
Yes, if you want to optimize: since the allowed XHTML whitespace chars are single-byte, low code point values, you could check whether the char code is below some cutoff value and, if so, use it as an offset into a fixed array/vector to check for whitespace. I am sure there are other tricks as well.
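
A rough sketch of that cutoff-plus-table idea. The cutoff value, the table contents (HTML "ASCII whitespace": tab, line feed, form feed, carriage return, space), and the names are assumptions for illustration, not anything taken from Sigil:

```cpp
#include <array>
#include <cstddef>

// Assumed cutoff: one past U+0020 SPACE, the highest whitespace code point in the set.
constexpr std::size_t kCutoff = 0x21;

// Builds the table once: true at the indices of TAB, LF, FF, CR and SPACE.
std::array<bool, kCutoff> build_whitespace_table() {
    std::array<bool, kCutoff> table{};  // value-initialized, so all false
    table[0x09] = true;  // TAB
    table[0x0A] = true;  // LF
    table[0x0C] = true;  // FF
    table[0x0D] = true;  // CR
    table[0x20] = true;  // SPACE
    return table;
}

static const std::array<bool, kCutoff> kIsWhitespace = build_whitespace_table();

// One comparison rejects every code point at or above the cutoff;
// everything below it is a single array lookup.
bool is_tag_whitespace(char32_t code_point) {
    return code_point < kCutoff && kIsWhitespace[code_point];
}
```

Whether this beats a search in a five-character constant string would have to be measured; for a set this small the difference may well be negligible.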

FWIW, string search for a single char is one of the most heavily optimized glibc routines for just this reason: for short strings, repeat the byte to look for n times and do a 64-bit XOR against the const string to simultaneously test for that value in multiple chars at once. There are many other approaches as well.
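
The repeat-and-XOR trick can be sketched as follows. This shows the general bit-twiddling idea only and is not glibc's actual code; the function names are hypothetical, and the caller is assumed to pass a pointer to at least 8 readable bytes.

```cpp
#include <cstdint>
#include <cstring>

// True if any of the 8 bytes in v is zero (the classic "has zero byte" test).
bool has_zero_byte(std::uint64_t v) {
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

// Repeat c into all 8 byte lanes; after the XOR, a lane is zero exactly
// where the word held c.
bool word_contains_byte(std::uint64_t word, unsigned char c) {
    const std::uint64_t pattern = 0x0101010101010101ULL * c;
    return has_zero_byte(word ^ pattern);
}

// Tests 8 characters at once. Precondition: data points to at least 8 bytes.
bool chunk_contains_byte(const char* data, unsigned char c) {
    std::uint64_t word;
    std::memcpy(&word, data, sizeof word);  // avoids unaligned or aliasing reads
    return word_contains_byte(word, c);
}
```

A search loop would call this per 8-byte chunk and fall back to a byte-by-byte scan only for the chunk that reports a hit and for the tail.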

Last edited by KevinH; 07-21-2016 at 07:01 PM.
Old 07-25-2016, 05:53 PM   #54
varlog
In The Source: The Lambda, Cet obscur objet du désir

[/REALITY].

now it's
Code:
    auto skip_whiteys=[]()->void
        {while(text.at(pos).isSpace() && pos<len) ++pos;};
and will be:
Code:
   auto skip_whiteys=[]
        {while(text.at(pos).isSpace() && pos<len) ++pos;};
if all the compilers agree on it.
I have not checked yet whether it works (theoretically it should: global, in-context variables should be available), because I'm in the middle of a rewrite and am not even trying to compile... so don't think it's final...
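
Two points about the snippets above, sketched under the assumption that `std::string` stands in for `QString` (all names are hypothetical). First, a lambda with an empty capture list `[]` can use variables with static storage duration, such as globals, but not locals of the enclosing function; those have to be captured. Second, the bounds test has to come before the character access, because `&&` evaluates left to right: `text.at(pos).isSpace() && pos<len` reads the character before checking `pos`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical file-scope state, usable from a captureless lambda.
static std::string g_text;
static std::size_t g_pos = 0;

std::size_t skip_with_globals(const std::string& input) {
    g_text = input;
    g_pos = 0;
    auto skip_whiteys = [] {
        // Bounds check first, so the character is read only when g_pos is valid.
        while (g_pos < g_text.size() &&
               std::isspace(static_cast<unsigned char>(g_text[g_pos]))) ++g_pos;
    };
    skip_whiteys();
    return g_pos;
}

std::size_t skip_with_locals(const std::string& text) {
    std::size_t pos = 0;
    // Locals are not visible to []; they must be captured explicitly.
    auto skip_whiteys = [&text, &pos] {
        while (pos < text.size() &&
               std::isspace(static_cast<unsigned char>(text[pos]))) ++pos;
    };
    skip_whiteys();
    return pos;
}
```

If `text`, `pos`, and `len` are class members rather than globals, the lambda would need to capture `this` instead.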

But, eventually, I'll time lambda versus WHITESPACE.contains(). Let the better man win.
I'm using named lambdas to make code more readable.

Quote:
Originally Posted by KevinH View Post
...(not 100% sure of this but it is close)...
Precisely that brought me to isSpace. I decided to trust the Qt guys: they did some real work about what the white space in QString is.

EDIT: no, those lambdas are no go...

I forgot..
tbc...?

Last edited by varlog; 07-25-2016 at 06:18 PM.
Old 07-25-2016, 07:36 PM   #55
JSWolf
Resident Curmudgeon
Posts: 74,027
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by varlog View Post
The first one for me was the "Going Postal", as an audio book. I was immediately hooked.
But I agree with JSWolf: the proper reading order is this.
Thank you for that link. It's perfect!
Old 07-25-2016, 09:38 PM   #56
KevinH
If you are just playing around for yourself, please do whatever you want. But if you hope to contribute a pull request or patch set to Sigil, please do not use lambda functions when no callback or functor is needed. Please make the code simple, readable, and easy to support. Please check the spec on the whitespace allowed inside HTML5 tags (it is not isSpace()) when creating your version of the quickparser code. (FYI, the Python version of the quickparser code identifies the set of whitespace chars allowed inside tags.)

Code density and art-like design are not measures of success. Decades ago, I used to write numerical integration routines in APL, and you can do it easily in one line, but it was completely unreadable and unsupportable once written. APL was so cryptic it deserved to perish as a supported language for research.

Hope this better explains what Sigil is looking for if you decide to contribute.

KevinH



Old 07-25-2016, 10:33 PM   #57
PeterT
Grand Sorcerer
Posts: 12,170
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
Quote:
Originally Posted by KevinH View Post

Code density and art-like design are not measures of success. Decades ago, I used to write numerical integration routines in APL, and you can do it easily in one line, but it was completely unreadable and unsupportable once written. APL was so cryptic it deserved to perish as a supported language for research.
I think I was warped for life by learning APL as my first programming language in Grade 8, and only using APL until the start of Grade 13!
Old 07-26-2016, 01:56 AM   #58
BetterRed
null operator (he/him)
Posts: 20,579
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none


As I recall, much beloved by actuaries, modellers, and quants, before they were called that.

BR

Last edited by BetterRed; 07-26-2016 at 01:58 AM.
Old 07-26-2016, 02:59 AM   #59
brolny
Connoisseur
Posts: 64
Karma: 10
Join Date: Sep 2015
Location: Yerevan, Armenia
Device: none
- There are no sub-tags xml:lang="und" and xml:lang="zxx" in your “multilanguage.epub” file. (https://www.w3.org/International/que...e#undetermined)

- Also, unfortunately, Sigil does not work at all with soft hyphens, unlike OpenOffice. As I understand it, a “soft hyphen” cannot be inserted into an aff-file as an ignored char, so programs have to handle chars like the soft hyphen by themselves...
Old 07-26-2016, 07:31 AM   #60
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by brolny View Post
- There are no sub-tags xml:lang="und" and xml:lang="zxx" in your “multilanguage.epub” file. (https://www.w3.org/International/que...e#undetermined)

- Also, unfortunately, Sigil does not work at all with soft hyphens, unlike OpenOffice. As I understand it, a “soft hyphen” cannot be inserted into an aff-file as an ignored char, so programs have to handle chars like the soft hyphen by themselves...
This is actually a different discussion, but most ePUB readers are not able to handle soft hyphens; at the very least, the search function no longer works properly. Also, Sigil itself works fine with soft hyphens; only the spell check doesn't. That is a difference.

All times are GMT -4.