Old 11-07-2012, 04:35 PM   #1
TheRealSteve
Member
Posts: 17
Karma: 10
Join Date: Sep 2012
Device: sony prs-t2
Next Misspelled Word

I have discovered a problem with the Next Misspelled Word function of Sigil, but I don't know what's causing it.

I am trying to clean up the Google-Books epub of Henry Adams's History of the United States to eliminate some distractions. Mainly these are spelling mistakes, scan errors, or words split in two. The split is only visible in Sigil in Code View, or in my Reader when the first part of the split comes at the end of a line. For example, here the word "learning" is split by code whose function, as a beginner, I don't understand:

On learn<a id="GBS.PA58.w.1.0.0"></a><span class="gtxt_body" id="para.70.1.0.box.248.234.1006.310.q.60">ing the sale of Louisiana [etc.]

To find these errors, I have been using Sigil's "Next Misspelled Word" function, which generally works as I would expect. However, a few times now I have found that it skips from one section to the next even though there are still misspelled words farther down in the first section. For instance, while I am checking for errors in content-0020.xml, it skips to content-0021.xml even though plenty of words are still underlined in red in content-0020.xml. Here's an example of where that happens:

<p class="gtxt_footnote" id="para.367.2.0.box.242.1770.999.79.q.40" style="text-indent:1em;"><sup>1</sup> Mémoire, etc., lu à l'Institut National le 15 Germinal, An v. (April 4, 1797).</p>


In section content-0020.xml the words "Mémoire, etc., lu à l'Institut" and "le" are all underlined in red. Clicking on the Next Misspelled Word button highlights each of them, one after the other, in blue - up to and including the word "à". If I click on the Next Misspelled Word button again, the spelling check skips to section content-0021.xml, ignoring "l'Institut" and "le", and a bunch of other red-underlined words farther down the page. If I insert the cursor after "l'Institut," the spelling check continues in content-0020.xml, instead of skipping to the next section - until, that is, I reach this line:

<p class="gtxt_footnote" id="para.384.2.0.box.226.1767.1003.72.q.50" style="text-indent:1em;"><sup>1</sup> Rapport à l'Empereur, 28 Brumaire, An xiii. (Nov. 19,1804); Archives des Aff. Étr. MSS.</p>

The words "à l'Empereur" are underlined in red. As before, "à" gets highlighted in blue, but when I click on the Next Misspelled Word button again, it skips to content-0021.xml, even though it hasn't reached the bottom of content-0020.xml yet. Again, if I insert the cursor after "l'Empereur," the spelling check continues in content-0020.xml, instead of skipping to the next section.
Old 11-07-2012, 05:03 PM   #2
theducks
Grand Sorcerer
Posts: 15,258
Karma: 6020309
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
<a id="GBS.PA58.w.1.0.0"></a><span class="gtxt_body" id="para.70.1.0.box.248.234.1006.310.q.60"> (and it is missing the closing </span> )

That is an anchor(point) FROM the footnote/Index to allow a return to the middle of the word?
While the HTML is visually correct, I would never expect a spell check to wade through such odd usage.
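
A rough sketch of why the markup still looks fine on screen: the tags contribute no visible text, so the two halves render as one word. This uses Python's standard html.parser and has nothing to do with how Sigil itself reads the file. The closing </span> and the helper names (TextOnly, visible_text) are my own assumptions for the illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects only the character data, ignoring every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(markup):
    """Return the text a reader would see, with all tags dropped."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


# The quoted fragment, with a closing </span> added (assumption).
snippet = (
    'On learn<a id="GBS.PA58.w.1.0.0"></a>'
    '<span class="gtxt_body" id="para.70.1.0.box.248.234.1006.310.q.60">'
    'ing the sale of Louisiana</span>'
)

print(visible_text(snippet))  # On learning the sale of Louisiana
```

A checker that looks at the raw source instead of the extracted text sees "learn" and "ing" as two separate tokens.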
Old 11-07-2012, 05:33 PM   #3
meme
Sigil developer
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
Well that is pretty heavily tagged text, but the issue is also shown by just

<p>test à this wrd</p>

It finds à as misspelled but then won't move forward.
Old 11-07-2012, 06:23 PM   #4
kiwidude
calibre/Sigil Developer
Posts: 4,230
Karma: 1345754
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
Fixed for 0.6.1.

The issue is that any time it reaches a 1-character misspelled word, it will not jump to any other misspelled words further down the page on the next check. As stated, the workaround for now is to place the cursor several characters after that 1-character misspelled word.
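
For anyone curious what "move on to the next misspelled word" involves, here is a minimal sketch in Python. It is not Sigil's code and does not reproduce the actual bug; it only illustrates the general idea of resuming the search from the end of the last hit, so that a 1-character word is treated like any other. KNOWN_WORDS, is_misspelled and next_misspelled are hypothetical names, and the tiny word list stands in for a real dictionary.

```python
import re

# Hypothetical stand-in for a real dictionary lookup.
KNOWN_WORDS = {"test", "this"}


def is_misspelled(word):
    return word.lower() not in KNOWN_WORDS


def next_misspelled(text, cursor):
    """Return (start, end) of the first misspelled word that starts
    at or after cursor, or None if there is none."""
    for match in re.finditer(r"[^\W\d_]+", text):  # runs of letters
        if match.start() >= cursor and is_misspelled(match.group()):
            return match.start(), match.end()
    return None


# The test line from post #3; \u00e0 is the letter "à".
text = "test \u00e0 this wrd"
cursor = 0
found = []
while True:
    hit = next_misspelled(text, cursor)
    if hit is None:
        break
    start, end = hit
    found.append(text[start:end])
    cursor = end  # resume after the word's end, however short the word is

print(found)  # ['à', 'wrd']
```

The point of the sketch is the last line of the loop: the next search starts from the end of the word just found, so the length of that word makes no difference.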
Old 11-07-2012, 07:58 PM   #5
TheRealSteve
Member
Posts: 17
Karma: 10
Join Date: Sep 2012
Device: sony prs-t2
Quote:
Originally Posted by theducks View Post
<a id="GBS.PA58.w.1.0.0"></a><span class="gtxt_body" id="para.70.1.0.box.248.234.1006.310.q.60"> (and it is missing the closing </span> )
I just didn't quote enough to include it.

Quote:
That is an anchor(point) FROM the footnote/Index to allow a return to the middle of the word?
Thanks for the explanation.

Quote:
While the HTML is visually correct, I would never expect a spell check to wade through such odd usage.
Here's a simpler example of this word-splitting. The word "strip" is split:

forest covered every portion, except here and there a str<a id="GBS.PA1.w.0.1.0.1"></a>ip of cultivated soil
Old 11-07-2012, 07:59 PM   #6
TheRealSteve
Member
Posts: 17
Karma: 10
Join Date: Sep 2012
Device: sony prs-t2
Thanks meme and kiwidude.
Old 11-07-2012, 08:17 PM   #7
theducks
Grand Sorcerer
Posts: 15,258
Karma: 6020309
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by TheRealSteve View Post


Here's a simpler example of this word-splitting. The word "strip" is split:

forest covered every portion, except here and there a str<a id="GBS.PA1.w.0.1.0.1"></a>ip of cultivated soil
That is one INSANE book designer
Old 11-07-2012, 09:27 PM   #8
DaleDe
Grand Sorcerer
Posts: 9,781
Karma: 5072196
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2
Quote:
Originally Posted by TheRealSteve View Post
I just didn't quote enough to include it.



Thanks for the explanation.



Here's a simpler example of this word-splitting. The word "strip" is split:

forest covered every portion, except here and there a str<a id="GBS.PA1.w.0.1.0.1"></a>ip of cultivated soil
If the eBook reader supports searching or dictionaries, they will never work on these kinds of entries. The id should never split a word.
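
One possible way to clean this up in bulk, offered only as a sketch: a regular expression that moves an empty anchor sitting inside a word to just before that word, so the id survives for any links that point at it. The names SPLIT_WORD and move_anchor_before_word are made up for this example. It only handles the simple case of an empty <a id="..."></a> directly between letters; the "learn...ing" case, where a <span> opens in the middle of the word, would need more care. Try it on a copy of the book first.

```python
import re

# letters, then an empty anchor, then more letters: an anchor inside a word
SPLIT_WORD = re.compile(r'([^\W\d_]+)(<a id="[^"]*"></a>)([^\W\d_]+)')


def move_anchor_before_word(markup):
    """Put the anchor in front of the word and rejoin the two halves."""
    return SPLIT_WORD.sub(r"\2\1\3", markup)


sample = 'here and there a str<a id="GBS.PA1.w.0.1.0.1"></a>ip of cultivated soil'
print(move_anchor_before_word(sample))
# here and there a <a id="GBS.PA1.w.0.1.0.1"></a>strip of cultivated soil
```

Anchors that already sit between words are left untouched, because the pattern requires letters on both sides.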

Dale


All times are GMT -4. The time now is 08:34 PM.

