MobileRead Forums > E-Book Software > Sigil

Old 10-18-2011, 11:33 PM   #1
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Automatic entity conversion screwing up search and replacerch and replace

The older versions of Sigil didn't automatically convert entities. I'm not saying I'm against the idea of automatically converting entities, but the way it's currently implemented makes things very confusing when using search and replace.

I had a document that started as markdown, and converted -- to —. The em-dash was directly encoded, not an entity in the actual ePub.

I just went into Sigil to give all the paragraphs starting with em-dash negative indents. In Sigil code view the em-dash was automatically displayed as an entity. Search and Replace for the entity and surrounding pattern worked fine for the displayed html flow, but when I expanded the search to all html files, the search failed to find any more occurrences.

It was only when I changed the search expression to an actual em-dash that it then found all the instances in the document and replaced them - interestingly enough the actual em-dash search expression also found/replaced the entity em-dash text in the actively open xhtml file.

I did like that searching for the actual Unicode character caught both encoded and non-encoded characters. However, it's extremely unintuitive that searching for the entity didn't catch both versions, seeing as Sigil will never show you the Unicode version.
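As a workaround outside Sigil, a search pattern can be built to accept either encoding. This is only a minimal sketch, not how Sigil's search works: the `EQUIVALENTS` table, the `either_form` helper and the `dialogue` class name are all made up for illustration.

```python
import re

# Hypothetical table: each raw character with the entity spellings that can stand for it.
EQUIVALENTS = {
    "\u2014": ["&mdash;", "&#8212;", "&#x2014;"],  # em-dash
    "\u2013": ["&ndash;", "&#8211;", "&#x2013;"],  # en-dash
    "\u00ad": ["&shy;", "&#173;", "&#xad;"],       # soft hyphen
}

def either_form(char):
    """Return a regex fragment matching the raw character or any listed entity for it."""
    forms = [char] + EQUIVALENTS[char]
    return "(?:" + "|".join(re.escape(form) for form in forms) + ")"

# Paragraphs that open with an em-dash, however the dash is encoded.
DASH_PARAGRAPH = re.compile("<p>(" + either_form("\u2014") + ")")

def mark_dash_paragraphs(xhtml):
    """Add a class (hypothetical name) to matching paragraphs, leaving the dash encoding untouched."""
    return DASH_PARAGRAPH.sub(r'<p class="dialogue">\1', xhtml)
```

Running `mark_dash_paragraphs` over every XHTML file would then catch the dash in both forms, and the negative indent could be attached to the class in the stylesheet.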

Is this a bug or a known issue?

Edit - appreciate if a forum mod can fix the typo in my title, not sure how the extra text got there
Old 10-19-2011, 05:05 AM   #2
cuthbert19
Enthusiast
Posts: 48
Karma: 1916
Join Date: Sep 2010
Device: Cybook Opus
See here and here
Old 10-19-2011, 01:18 PM   #3
ldolse
Wizard
Thanks for the links - I did vaguely recall those threads but the search I ran didn't pull them up.

That said, neither link says whether this should be considered a bug, so I'm curious for John's or Charles' input. Like I said, I've got no problem with the decision for code view to translate the Unicode character to an entity, but IMHO search and replace is broken since you can't search/replace that entity across all files (and unless you know about this issue and know what the actual Unicode character is, it's non-trivial to work around).
Old 10-19-2011, 01:55 PM   #4
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
This is going to depend on how the character is in the files themselves. S&R should never make a change itself behind your back.

When saving, shy (the soft hyphen) and the dashes are saved as entities. Unfortunately, your file is a mix of both, so you're having issues across multiple files. This is not a bug.
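The save-time behaviour described here can be sketched roughly as follows. This is an illustration only, not Sigil's actual code; it assumes that just the soft hyphen and the en/em dashes are rewritten, and the `SAVE_AS_ENTITY` table and function name are hypothetical.

```python
# Assumption: only these three characters are written out as entities on save.
SAVE_AS_ENTITY = {
    0x00AD: "&shy;",    # soft hyphen
    0x2013: "&ndash;",  # en-dash
    0x2014: "&mdash;",  # em-dash
}

def entities_on_save(text):
    """Replace the listed raw characters with named entities; existing entities pass through unchanged."""
    return text.translate(SAVE_AS_ENTITY)
```

A file that has been through this step holds only the entity form, while a file that has never been saved by the editor may still hold raw characters, which is how a book can end up with a mix.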
Old 10-19-2011, 03:43 PM   #5
ldolse
Wizard
From my perspective the issue is that I didn't mix them. My source document was all Unicode. Sigil decided to convert them to entities on a flow-by-flow basis, and created the mix. I don't think it's unreasonable for end users to write search and replace expressions against what Sigil shows them and expect them to work. With the current functionality you'll never find a Unicode em/en dash, because the act of rendering it in code view changes it to an entity. IMHO if Sigil is going to change stuff it should do it throughout the document so that it's uniform. (Based on your response it does make this uniform on file save, but what I'm saying is that it should be made uniform on file open.)

I personally won't run into the issue again because I'm now fully versed on the quirk, but this seems like a support issue to me, which is why I highlighted it.

Last edited by ldolse; 10-19-2011 at 03:51 PM.
Old 10-20-2011, 06:49 AM   #6
user_none
Sigil & calibre developer
Unfortunately this is a limitation of having the WYSIWYG editor. It converts all entities to Unicode and does not retain the necessary info on whether each one started as an entity or as the character.
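The loss of information can be seen in a small sketch. It uses Python's standard `html.unescape` as a stand-in for whatever the WYSIWYG component really does, so treat it as an analogy rather than Sigil's implementation; the function name is made up.

```python
import html

def load_for_book_view(source):
    """Decode every character reference, roughly as a rendered view would (simplified stand-in)."""
    return html.unescape(source)

as_entity = "<p>one&mdash;two</p>"
as_char = "<p>one\u2014two</p>"

# Both spellings decode to the same string, so the original form cannot be recovered afterwards.
same_after_load = load_for_book_view(as_entity) == load_for_book_view(as_char)
```

Once both inputs have collapsed to the same text, the only consistent choice on the way back out is to pick one form for everything.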

The original issue was that shy is a hidden character, and the dashes look similar, so people couldn't tell them apart. The solution was to always convert these to entities when saving changed pages or in the code view.
Old 01-04-2012, 03:07 AM   #7
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Hmm, I still have an issue with the &shy; characters. They don't appear in 0.4.0, 0.4.1 and the betas as visible characters, in either CV or BV. I do see them when I open the document in, for example, Notepad++.
Since they cause issues on my reader (the famous ?), I want to remove them. Since there are a lot of pages, I would rather use Sigil for this. Am I missing something here?

I can do an S&R by selecting the hidden character, but that only solves half my issue. It removes them, but I would like to know that the characters are in the documents and not find out on my reader.
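For the "know they are there" half, a small script outside Sigil could report soft hyphens per file. This is a rough sketch under assumptions: it treats the EPUB as a plain zip, only looks at files ending in .xhtml/.html/.htm, assumes UTF-8, and the function name is hypothetical.

```python
import re
import zipfile

# Raw soft hyphen (U+00AD) or one of its entity spellings.
SOFT_HYPHEN = re.compile(r"\u00ad|&shy;|&#0*173;|&#x0*ad;", re.IGNORECASE)

def count_soft_hyphens(epub_path):
    """Return {file name: number of soft hyphens} for content files that contain any."""
    counts = {}
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            # Assumption: content files are UTF-8; undecodable bytes are replaced, not fatal.
            text = book.read(name).decode("utf-8", errors="replace")
            hits = len(SOFT_HYPHEN.findall(text))
            if hits:
                counts[name] = hits
    return counts
```

An empty result would mean no soft hyphens were found in either form, so there is nothing left to discover on the reader.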

Last edited by Toxaris; 01-04-2012 at 03:09 AM.
Old 01-04-2012, 09:46 AM   #8
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
I second that. Trying to remove non-displayed hidden characters is a nightmare. Some sort of third view which displays raw codes (a Unicode view?) would help.

Also, the — does not seem to display correctly in code view (in the latest full release; I have not tested any betas). In book view it looks like ALT 0151 should look, but in code view it looks like a normal dash?
Old 01-04-2012, 10:20 AM   #9
Rand Brittain
Bookmaker
Posts: 416
Karma: 2143650
Join Date: Sep 2010
Device: Cybook Opus
I agree that it's an issue. I frequently want to run search/replace to alter dashes—I much prefer "emdash" to "space-endash-space", but Sigil's searches often don't get them all.

Will the plan to eventually make Code View the primary mode of operation, with Book View as a kind of preview mode, one day mend this trouble? That will actually be a pretty sweet feature.
Old 01-04-2012, 11:02 AM   #10
theducks
Well trained by Cats
Posts: 29,659
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
I might as well kick in my 2 cents.

I would love to have a tool that shows the code for a single (first) character of the selected text (on the status line).
The problem was that the display font did not support the character (in BV or CV).
So which character do I replace, or do I embed a font that supports it?
I had to resort to an old DOS hex file editor to learn the code (in this case the UTF code).
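Something like that status-line readout can be approximated outside Sigil with a few lines of Python instead of a hex editor. This is just a sketch; `describe_char` is a made-up name and the output format is arbitrary.

```python
import unicodedata

def describe_char(selection):
    """Describe the first character of the selected text: code point, Unicode name, numeric entity."""
    if not selection:
        return ""
    ch = selection[0]
    code_point = ord(ch)
    name = unicodedata.name(ch, "<unnamed>")  # control characters and the like have no name
    return f"U+{code_point:04X} {name} (&#{code_point};)"
```

Pasting the mystery character into `describe_char` gives the code point to search for, or to check a candidate font against.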

I would not be unhappy if we could set a preferred Open view.
Old 01-05-2012, 01:46 AM   #11
Toxaris
Wizard
In older versions of Sigil (0.3), code view just showed &shy;. If I open my file in Notepad++ it shows the special character. Is it possible to view or convert the entities in Code View? Of all places, I expect to see it in CV.
All times are GMT -4.