Old 06-16-2012, 05:14 PM   #76
capidamonte
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
It should, of course, be optional -- and consistent, no mixing. Often, in the past, I have had the impression that Sigil is abstracting extended characters somehow, which helped make regex unstable.

I prefer the named entities, myself -- I like to easily distinguish between ' and ‘/’, for instance, or hyphen-–-—, •-·, etc. Of course, these all exist as characters, but they are visually difficult to tell apart. Regex is no more difficult for these than for characters; don't have to open Character Map or type ALT-NUMPAD codes, so it might be easier.

I guess my take is that, for me, if it's not on the keyboard, it should be an entity. And there are even a few on the keyboard that make life easier for me. (>, <, ', ˜, & # 96 ; [the forum is eating the numeric entity for the grave accent (backtick), which sadly has no named entity], etc.)

This comes in large part from dealing with badly-formed source files, and slowly working via regex to get them consistent throughout. The named entities are emphatically expressive of content, not leaving it up to visual interpretation on my part.

Also, generally, the ereaders have a method of expressing most entities -- but the characters are more problematic, leading to ugly replacements or errors.

My 2 ¢

Aloha,

Last edited by capidamonte; 06-16-2012 at 05:24 PM. Reason: grave accent eaten by forum
Old 06-16-2012, 05:49 PM   #77
DiapDealer
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by capidamonte View Post
Regex is no more difficult for these than for characters; don't have to open Character Map or type ALT-NUMPAD codes, so it might be easier.
It's possible you misunderstand me when it comes to "making my regex more difficult." Mainly, I mean that I already have quite an extensive personal collection of specific regexps that I don't want to have to overhaul. And besides, my fingers have the ALT-NUMPAD codes down cold. Second nature. I want to use them.

I also use a lot of the Unicode regex classes: \p{P} doesn't know what HTML entities are and won't match them. Neither will \p{Pd} or my favorite... \p{Po}. My custom-tailored regexps are polluted with Unicode classes like that.
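A minimal sketch of that point, in Python rather than in Sigil's own regex engine. Python's built-in re module does not support \p{...}, so the hypothetical helper below uses unicodedata.category to test for the same general category that \p{Pd} selects. The sample strings are made up for illustration.

```python
import html
import unicodedata

def has_dash_punctuation(text):
    # True if any character is in Unicode general category Pd,
    # the same set of characters a \p{Pd} regex class matches.
    return any(unicodedata.category(ch) == "Pd" for ch in text)

as_character = "wait\u2014what"    # literal em dash (U+2014)
as_entity = "wait&mdash;what"      # same text, written with the named entity

print(has_dash_punctuation(as_character))               # True
print(has_dash_punctuation(as_entity))                  # False
print(has_dash_punctuation(html.unescape(as_entity)))   # True
```

Note that "&" and ";" are themselves in category Po, so a broad punctuation class can still hit the pieces of an entity; it just will not see the dash that the entity stands for.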

I guess I don't understand why this even has to be an issue. People should be able to make their own decision with regard to entity vs character. That's the way 0.5.3 works for me: if I enter the mdash entity it stays an entity... if I enter the mdash character it stays a character. Beautiful.

Last edited by DiapDealer; 06-16-2012 at 09:30 PM.
Old 06-16-2012, 08:32 PM   #78
capidamonte
Not who you think I am...
Posts: 374
Karma: 30283
Join Date: Jan 2010
Location: Honolulu
Device: PocketBook 360 -- Ivory
More or less agreed, pal. I have a full set, myself. I could probably stand to learn more unicode regex, honestly.

I think I jumped in here because I'm always afraid that things are going to go away that I use. I find that a lot of folks prefer to think about stuff that I prefer to just perceive, like named entities. I suspect that you perceive the characters themselves more clearly than I do.

Back to regularly scheduled discussion.

Aloha,
Old 06-16-2012, 09:40 PM   #79
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
In regard to entities: since the 0.4 series, Sigil has replaced the em dash, en dash, and soft hyphen (shy) characters with entities. All other entities would be replaced with Unicode characters due to how BV (Book View) worked. Now the only automatic replacement is em, en, and shy. Everything else is left as is.

The above three were chosen for replacement for key reasons. The em and en dashes look so similar that writing them as entities makes them easier to differentiate. As for shy, well, you can't see it, so you don't know whether it's there or not.
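To illustrate why these three are hard to judge by eye, here is a small Python sketch (not part of Sigil; the helper name and sample text are made up) that lists the non-ASCII characters in a string by code point, Unicode name, and general category.

```python
import unicodedata

def describe(text):
    # Return (code point, Unicode name, category) for each non-ASCII character.
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), unicodedata.category(ch))
        for ch in text
        if ord(ch) > 127
    ]

# En dash, em dash, and an invisible soft hyphen inside "cooperate".
sample = "pages 10\u201320 \u2014 co\u00adoperate"

for entry in describe(sample):
    print(entry)
```

The two dashes share category Pd and differ only by name and code point, while the soft hyphen is a format character (Cf) that normally renders as nothing at all.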

Also, a new beta will be available once I get the unicode filename saving ironed out. Minizip is not very easy to understand.

Last edited by user_none; 06-16-2012 at 09:43 PM.
Old 06-17-2012, 03:21 AM   #80
Ahmad Samir
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
Quote:
Originally Posted by DiapDealer View Post
Is it just me, or did 0.5.901 just silently replace all of my em-dash characters with its html-entity equivalent?

EDIT: Yes... yes, it surely did.
I've been using this patch for some time now:
Code:
--- sigil-0.5.0/src/Sigil/ResourceObjects/HTMLResource.cpp.orig	2012-02-02 04:00:34.000000000 +0200
+++ sigil-0.5.0/src/Sigil/ResourceObjects/HTMLResource.cpp	2012-02-02 06:43:11.293174051 +0200
@@ -473,8 +473,8 @@
     QString newsource = source;
 
     newsource = newsource.replace( QString::fromUtf8( "\u00ad" ), "&shy;" );
-    newsource = newsource.replace( QString::fromUtf8( "\u2014" ), "&mdash;" );
-    newsource = newsource.replace( QString::fromUtf8( "\u2013" ), "&ndash;" );
+    newsource = newsource.replace( "&mdash;", QString::fromUtf8( "\u2014" ) );
+    newsource = newsource.replace( "&ndash;", QString::fromUtf8( "\u2013" ) );
 
     return newsource;
 }
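For readers who don't want to parse the diff, here is a rough Python sketch of what the unpatched and patched lines appear to do. It assumes the quoted strings in the patch were the named entities for the soft hyphen, em dash, and en dash before the forum rendered them. The function names are hypothetical and this is not Sigil's actual API.

```python
SHY, EM_DASH, EN_DASH = "\u00ad", "\u2014", "\u2013"

def stock_rules(source):
    # Unpatched lines: all three characters become named entities.
    return (source.replace(SHY, "&shy;")
                  .replace(EM_DASH, "&mdash;")
                  .replace(EN_DASH, "&ndash;"))

def patched_rules(source):
    # Patched lines: the soft hyphen still becomes an entity,
    # but dash entities are turned back into literal characters.
    return (source.replace(SHY, "&shy;")
                  .replace("&mdash;", EM_DASH)
                  .replace("&ndash;", EN_DASH))

print(stock_rules("a\u2014b"))         # a&mdash;b
print(patched_rules("a&ndash;b"))
```

Under the patched rules a literal dash is never rewritten, and a dash entity is converted to the character, so the source ends up with dash characters throughout.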
Old 06-17-2012, 04:09 AM   #81
Jellby
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I had created an issue (316) which is now reported as "fixed", but maybe it can be reopened.

In any case, &nbsp; is also something you probably want as an entity.
Old 06-17-2012, 05:34 AM   #82
DiapDealer
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by user_none View Post
In regard to entities. Sigil has since the 0.4 series replaced em, en, and shy with entities. All other entities would be replaced with unicode characters due to how BV worked.
I don't quite understand... in a stock 0.5.3 binary install (Windows and Linux), my EMs don't get replaced at all. Entity stays entity, character stays character.

Either way, I still think entity vs character should ultimately be an end user decision. Just two cents worth of whatever.

Quote:
Originally Posted by Ahmad Samir
I've been using this patch for some time now:
Thanks for the patch, Ahmad! That still looks applicable to the beta code. I'll play with that a bit.
EDIT: Works a treat! I chose to leave the source "as is" with the exception of the shy, zwsp, zwnj, zwj, and thinsp characters. I make sure those are all converted to some sort of visible entity.

Last edited by DiapDealer; 06-17-2012 at 09:05 AM. Reason: typo
Old 06-17-2012, 07:41 AM   #83
Zeypxi
Member
Posts: 23
Karma: 10
Join Date: Apr 2011
Device: none
Hello user_none and meme, do you think that something can be done about the spell-checking problem in French for version 0.6? If one uses ' (straight apostrophes), spell check works properly, but as soon as one uses ’ (curly apostrophes), spell check reports false positives. Cheers.
Old 06-17-2012, 09:06 AM   #84
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by DiapDealer View Post
Either way, I still think entity vs character should ultimately be an end user decision. Just two cents worth of whatever.
+1

IMHO, entities for otherwise invisible characters are fine, but mandatory entities for dashes are not, since those who use em dashes and en dashes usually can tell them apart from each other and hyphens.
Old 06-17-2012, 09:21 AM   #85
crutledge
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
Sigil 4.0 beta

I have now tried the 4.0 beta on three machines running Win 7. This version does not load HTML files. The OS reports that the program has failed.

Has anyone else tried Win 7? Have I done something dumb?
Old 06-17-2012, 10:14 AM   #86
meme
Sigil developer
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
Quote:
Originally Posted by crutledge View Post
I have now tried the 4.0 beta on three machines running Win 7. This version does not load HTML files. The OS reports that the program has failed.

Has anyone else tried Win 7? Have I done something dumb?
When you say load HTML files - how are you loading them - Open, Add Existing, drag and drop, etc.?
Old 06-17-2012, 10:18 AM   #87
meme
Sigil developer
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
Quote:
Originally Posted by Zeypxi View Post
Hello user_none and meme, do you think that something can be done about the spell-checking problem in French for version 0.6? If one uses ' (straight apostrophes), spell check works properly, but as soon as one uses ’ (curly apostrophes), spell check reports false positives. Cheers.
Not in 0.6.0. There is a ticket open about handling word boundaries for non-English languages. It's something that I want to look into for a future version since we rewrote the spell check code, but I would need a clear list of items for various languages, French, German, and Spanish in particular. I also need to take a closer look at the English parsing, so if anyone is using the spell check and you see it picking up words it shouldn't because of word boundary issues, let me know. (I know there are a few oddities with upper/lower case and apostrophes that need to be looked at eventually.)
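As a rough illustration of the word-boundary problem (not Sigil's spell check code, whose internals are not shown in this thread), here is one possible approach in Python. It treats both straight and curly apostrophes as word-internal characters, then normalises the curly one before lookup. That normalisation assumes dictionary entries use the straight apostrophe, which is only a guess; the names below are hypothetical.

```python
import re

# A word is a run of letters, optionally joined by a straight (U+0027)
# or curly (U+2019) apostrophe to further runs of letters.
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

def words_for_lookup(text):
    # Extract words and normalise curly apostrophes to straight ones,
    # so both spellings are looked up as the same dictionary key.
    return [m.group(0).replace("\u2019", "'") for m in WORD_RE.finditer(text)]

curly = "Aujourd\u2019hui, l\u2019homme dort."
straight = "Aujourd'hui, l'homme dort."

print(words_for_lookup(curly))
print(words_for_lookup(curly) == words_for_lookup(straight))
```

With this tokenisation the curly and straight versions of a French sentence produce identical lookup words, so a checker built this way would not flag one and accept the other.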
Old 06-17-2012, 10:21 AM   #88
crutledge
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
Quote:
Originally Posted by meme View Post
When you say load HTML files - how are you loading them - Open, Add Existing, drag and drop, etc.?
All of the above.
Old 06-17-2012, 10:51 AM   #89
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
The beta works fine on x32 XP machines. Were your Windows 7 machines all 64 bit systems?
Old 06-17-2012, 11:15 AM   #90
crutledge
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
Quote:
Originally Posted by Doitsu View Post
The beta works fine on x32 XP machines. Were your Windows 7 machines all 64 bit systems?
64 bit
All times are GMT -4. The time now is 06:22 PM.

