MobileRead Forums

MobileRead Forums (https://www.mobileread.com/forums/index.php)
-   Sigil (https://www.mobileread.com/forums/forumdisplay.php?f=203)
-   -   Search and Replace (https://www.mobileread.com/forums/showthread.php?t=336350)

Ashjuk 01-09-2021 05:35 AM

Search and Replace
 
I am having a few issues with search and replace that I can't seem to find an answer to.

How do I make S&R case sensitive?

How do I make S&R find and replace whole words only?

I have done a number of S&Rs on a whole book and have found that it appears to not only replace words with the same case but those of the opposite case too - i.e. a search for color to replace with colour will also replace Color with colour.

Another issue I have come across is that if the search term I am looking for as a whole word is embedded in another word, that occurrence appears to get changed too.

I have looked in preferences and can't seem to see anything that relates to this.

Tex2002ans 01-09-2021 08:17 AM

Quote:

Originally Posted by Ashjuk (Post 4080013)
How do I make S&R case sensitive?

After you press Ctrl+F, you should be able to change Mode to "Case Sensitive".

Quote:

Originally Posted by Ashjuk (Post 4080013)
How do I make S&R find and replace whole words only?

If you change Mode to "Regex", you can use Regular Expressions.

In Regular Expressions, \b = a word boundary... so if you wanted to search for a whole word:

Search: \bcolor\b
Replace: colour

Note: Regular Expressions are already case sensitive, so that won't match "Color" or "Colors". And since it's whole words, it won't match "colorimeter" or "technicolor".
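
For anyone who wants to see the difference outside Sigil, here is a minimal sketch in Python. It assumes Python's `re` module treats `\b` and letter case the same way Sigil's Regex mode does for this simple pattern (Sigil uses PCRE, so more exotic patterns may behave differently). The sample sentence is made up.

```python
import re

# Made-up sample text covering each case discussed above.
text = "The color of Color TV: colors, colorimeter, technicolor."

# \b marks a word boundary, so only the standalone lowercase word matches.
whole_word = re.sub(r"\bcolor\b", "colour", text)

# Without \b, every lowercase "color" is replaced, even inside other words.
substring = re.sub(r"color", "colour", text)

print(whole_word)
print(substring)
```

With the word boundaries, only the standalone "color" changes; without them, "colors", "colorimeter" and "technicolor" are changed as well. "Color" is left alone in both runs because the match is case sensitive.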

Alternate Method

Your best bet will probably be Sigil's fantastic Tools > Spellcheck > Spellcheck.

This allows you to search through all words in a book in an easy-to-read list.

In there, you're also able to select a word, then "Change Selected Word to":

Attachment 184638

Ashjuk 01-09-2021 08:50 AM

Thanks Tex2002ans - that seems to be just what I was looking for.

Ashjuk 01-09-2021 09:27 AM

I have tried the Spellcheck option as suggested by Tex2002ans and I get the following result -
https://www.mobileread.com/forums/at...1&d=1610198353

The language has defaulted to US English (the book was published in the US) so it is now picking up words that I have already corrected to UK spelling as being misspelled.
I have tried clicking on the Language column to see if I can change it to UK English so that it spellchecks against the default UK dictionary, but there appears to be no way to do that.

What am I missing?

BeckyEbook 01-09-2021 09:47 AM

How is Sigil to know a word is in British and not American?
If the main language is set to US, individual phrases in British must be clearly marked:
Code:

<span xml:lang="en-GB">colour</span>
It may seem difficult, but if I have a whole book in Polish with only a few dialogues in French, Italian or Spanish, it is obvious to me that I should mark the correct language in those specific places. Then not only does the spell check in Sigil work correctly, but, equally important, suitable readers can distinguish the language: for visually impaired people such expressions are read in the correct language by the correct voice, and words in foreign languages are hyphenated according to the dictionaries assigned to those languages, not the main language of the e-book.

Ashjuk 01-09-2021 10:16 AM

Sorry Becky, I'm not sure I understand you.

The book was written with US spellings and all I want to do is spellcheck it against the default UK dictionary.

The spellcheck pop-up has spotted that it's a US book - as the language column has been set to that - so it's telling me that UK spellings are wrong.

All it appears to be doing at the moment is highlighting words that it considers misspelt in American English and NOT those that are misspelt in UK English, because it's checking against a US dictionary - which I don't have set as my default.

Oh, and the language in the opf is - <dc:language>en</dc:language>

KevinH 01-09-2021 11:30 AM

In Sigil Preferences: set your Primary Spellcheck dictionary to English-Great Britain.

Then try spellchecking again. If there are still issues, then look for lang or xml:lang attributes set to en-US someplace in the html.
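
A minimal sketch of that search in Python, assuming the chapter markup is available as a string. The helper name `find_lang_attrs` and the sample tag are made up for illustration. Inside Sigil, a Regex-mode search for something like `(xml:)?lang="en-US"` should find the same attributes.

```python
import re

# Matches lang="..." and xml:lang="..." and captures the name and the value.
LANG_ATTR = re.compile(r'\b((?:xml:)?lang)="([^"]*)"')

def find_lang_attrs(xhtml: str) -> list[tuple[str, str]]:
    """Return (attribute name, value) pairs in document order."""
    return LANG_ATTR.findall(xhtml)

# Hypothetical opening tag of a chapter file.
sample = '<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">'
found = find_lang_attrs(sample)
print(found)
```

Any pair whose value is `en-US` is a place where the spellchecker would be told to use the US dictionary regardless of the primary dictionary set in Preferences.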

Ashjuk 01-09-2021 11:50 AM

Thanks Kevin,

My default dictionary was already set to English - Great Britain, but your second suggestion was the problem, as the headers all had lang="en-US" xml:lang="en-US" in them.

I have removed the -US from them and Spellcheck now works as it should and only picks up UK misspelled words.

I will remember to look for that in the future.

isaacbh 02-25-2021 04:39 AM

Quote:

Originally Posted by BeckyEbook (Post 4080082)
How is Sigil to know a word is in British and not American?
If the main language is set to US, individual phrases in British must be clearly marked:
Code:

<span xml:lang="en-GB">colour</span>

Should I set both lang and xml:lang in a span? This is for epub 2.

BeckyEbook 02-25-2021 05:47 AM

Although it does not make sense, entering both will not hurt, and may help the disabled. It is important to correctly declare the correct language. Why? Well, the disabled use various software to read the text. The software is different – one program recognizes "lang", the next "xml:lang", and another recognizes both.

So typing both IMHO is not a bad idea.

But I know that there are people here who have the opposite opinion.

http://kb.daisy.org/publishing/docs/html/lang.html

Tex2002ans 02-25-2021 12:17 PM

Quote:

Originally Posted by Ashjuk (Post 4080124)
I have removed the -US from them and Spellcheck now works as it should and only picks up UK misspelled words.

I will remember to look for that in the future.

:thumbsup:

Quote:

Originally Posted by isaacbh (Post 4096956)
Should I set both lang and xml:lang in a span? This is for epub 2.

Yes.

If you only have one set, you can add the other using regex.

1. If you only have lang:

Code:

<span lang="en-GB">colour</span>
Search: <span lang="([^"]+)">
Replace: <span lang="\1" xml:lang="\1">

2. If you only have xml:lang:

Code:

<span xml:lang="en-GB">colour</span>
Search: <span xml:lang="([^"]+)">
Replace: <span lang="\1" xml:lang="\1">

Both will give you:

Code:

<span lang="en-GB" xml:lang="en-GB">colour</span>
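
The two replacements can also be tried outside the editor. Below is a short Python sketch, assuming Python's `re` module handles these patterns the way Sigil's Regex mode does (the `\1` backreference is written the same way in the replacement). The function name `add_missing_lang` is hypothetical. Like the searches above, it only touches a `<span>` whose sole attribute is `lang` or `xml:lang`.

```python
import re

def add_missing_lang(xhtml: str) -> str:
    """Give spans that carry only lang or only xml:lang both attributes."""
    # Case 1: only lang is present.
    xhtml = re.sub(r'<span lang="([^"]+)">',
                   r'<span lang="\1" xml:lang="\1">', xhtml)
    # Case 2: only xml:lang is present.
    xhtml = re.sub(r'<span xml:lang="([^"]+)">',
                   r'<span lang="\1" xml:lang="\1">', xhtml)
    return xhtml

print(add_missing_lang('<span lang="en-GB">colour</span>'))
print(add_missing_lang('<span xml:lang="en-GB">colour</span>'))
```

A span that already has both attributes is left unchanged, because neither pattern matches once a second attribute follows the first.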
Side Note: As BeckyEbook said, lang is extremely helpful when dealing with completely different languages (Italian, Spanish) within an English book.

In your case, it would probably be better to mark the entire book + each chapter as "en" (English).

This would handle both American + British spellings, and then you can set your dictionary to handle which red squigglies you want to see.

(Obviously, you'd go with the superior American dictionary/spellings!!! :D)

Related Side Note: And IF you're going around marking all "foreign words" within a book, back in 2019 I wrote:

"Is there a way to use the selection in a Saved Search?" (Post #29)

All you'd have to do is change my <i> into <span>. :)

The instructions were for Calibre, but the same exact steps should apply in Sigil too.

Quote:

Originally Posted by BeckyEbook (Post 4096984)
Although it does not make sense, entering both will not hurt, and may help the disabled.

Entering both is good practice.

lang = HTML
xml:lang = XML

Most tools I've seen handle both, but there could be tools that only parse one or the other. (For example, a purely XML program might only understand xml:lang.)
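
To illustrate the XML side, here is a small sketch using Python's standard `xml.etree.ElementTree` (chosen only for this example; it is not a tool named in this thread). An XML parser sees `xml:lang` as the `lang` attribute in the reserved XML namespace, which is a different attribute from plain `lang`.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace; no declaration is needed.
XML_NS = "http://www.w3.org/XML/1998/namespace"

span = ET.fromstring('<span lang="en-GB" xml:lang="en-GB">colour</span>')

html_lang = span.get("lang")               # the HTML-style attribute
xml_lang = span.get(f"{{{XML_NS}}}lang")   # the XML-style attribute

# A span with only lang: an XML-only lookup finds nothing.
only_html = ET.fromstring('<span lang="en-GB">colour</span>')
missing = only_html.get(f"{{{XML_NS}}}lang")

print(html_lang, xml_lang, missing)
```

A tool that only looks up the namespaced attribute gets nothing back from a span that carries `lang` alone, which is one reason to set both.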


All times are GMT -4.