Thread: search question
Old 04-03-2013, 01:23 AM   #9
Tex2002ans
Quote:
Originally Posted by Gregg Bell View Post
Tex, Thanks for all the links and explanations and the warning about deleting data (I can sense how powerful Regex is).
But it is great for catching errors that are impossible to catch using a normal search (like missing closing quotation marks), or for cleaning up code cruft (whenever I run into Calibre code I go into a Regex rage).
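
To give a rough idea of the missing-closing-quote case, here is a minimal Python sketch. It is not a Sigil search and has not been tested on real books. It assumes curly quotes (“ ”) and one paragraph per line, and the function name is made up for illustration. Expect false positives: dialogue that runs across several paragraphs legitimately leaves the closing quote off until the last one.

```python
import re

# An opening curly quote, then text containing no closing quote,
# followed by either another opening quote or the end of the paragraph.
UNCLOSED_QUOTE = re.compile(r"“[^”]*(?=“|$)")

def paragraphs_with_unclosed_quotes(text):
    """Return (paragraph_number, snippet) pairs for suspicious paragraphs."""
    hits = []
    # Assumption: each line of the input is one paragraph.
    for number, paragraph in enumerate(text.split("\n"), start=1):
        match = UNCLOSED_QUOTE.search(paragraph)
        if match:
            hits.append((number, match.group(0)))
    return hits
```

Each hit still needs a human look before anything gets changed.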

Quote:
Originally Posted by Gregg Bell View Post
A couple of quick questions about the Sigil spell check. I haven't been able to add contractions, and I have a lot of them, to the default dictionary. Know of a way?
There was a topic a few weeks ago about contractions:

https://www.mobileread.com/forums/sho...d.php?t=209174

Someone needs to be kind enough to create a Sigil group that everyone can use to easily convert them back and forth, and catch any common missing apostrophes. That would be extremely helpful.

Also, I believe that in 0.7.1 (?) the spellcheck's handling of words containing apostrophes broke again. If you "Ignore" the word, it is still flagged as misspelled. I would just wait until this bug is fixed again, as it was in 0.7.0.

I forget the explanation that was given for the change to "fix apostrophes" (I believe it was to fix a certain foreign language). If I remember correctly, the explanation was given in one of the Sigil release topics.

Quote:
Originally Posted by Gregg Bell View Post
And how do you find words with accents (not necessarily in Regex)?
There are lots of ways to do this. I can give you the Regex method if you want, although Unicode Regex can get a little ugly, and I don't mess around with it much, so the Regexes would not be time-tested by me.
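
For a rough idea of what a Regex approach can look like, here is a small Python sketch. It uses Python's `re` module, not Sigil's Regex engine, so the pattern may need adjusting there. The function name is made up for illustration, and "accented" is approximated as "contains a non-ASCII word character", which also picks up non-Latin scripts.

```python
import re
import unicodedata

# A word containing at least one word character outside the ASCII range
# (excluding decimal digits and the underscore).
ACCENTED_WORD = re.compile(r"\b\w*[^\W\d_\x00-\x7F]\w*\b")

def accented_words(text):
    """Return the distinct words that contain a non-ASCII character."""
    # Compose sequences like "e" + combining acute into a single "é",
    # because a bare combining mark does not count as a word character.
    composed = unicodedata.normalize("NFC", text)
    return sorted(set(ACCENTED_WORD.findall(composed)))

print(accented_words("The café had élan, a façade, and a plain cafe."))
```

The plain "cafe" is skipped, so this only finds accents that are already there. It does not find words that are missing one.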

I personally use two methods. First, I just use Sigil's spellcheck to catch some of the funky characters.

OR

I use Tools - Reports - "Characters in HTML Files". This report shows you every single character that is actually used in the files. I then skim through it quickly for any funky ones. You can double-click a character to search through your EPUB for all instances of it. For example, OCR quite often adds double angle quotation marks « ». I can easily spot these in the character list and fix them up.
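
The same idea can be approximated outside Sigil. This is only a sketch of the concept, not how Sigil builds its report. It assumes the book text is already loaded into a string, and the function name is hypothetical.

```python
from collections import Counter

def unusual_characters(text):
    """Count every non-ASCII character, most frequent first."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    # Report each character with its code point and number of occurrences.
    return [(ch, f"U+{ord(ch):04X}", n) for ch, n in counts.most_common()]

for char, codepoint, count in unusual_characters("«Bonjour», said the café owner. «Oui»."):
    print(char, codepoint, count)
```

Stray OCR characters like « » show up in a list like this right away, even when they are buried in a long book.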

Quote:
Originally Posted by Gregg Bell View Post
Words like cafe or elan or facade (with the funky thing on the bottom--I don't even know what's called.).
Since most of my work is done from the actual PDF scans, I am usually the one inserting all the foreign characters as I find them in ABBYY Finereader. Most of the time I just look them up and copy/paste from Wikipedia/Fileformat.info.

Allow me to quote myself from that previous topic I helped you in:

Quote:
Originally Posted by Tex2002ans View Post
There are also very nice lists of characters with accents. I constantly keep tabs open in Firefox for the Wikipedia pages for Macron, Grave accent, Acute accent, Diaeresis, Circumflex, Caron, Dagger.


The 'c' with a funky squiggle below it, 'ç', is a c with a cedilla (one of the great things about fileformat.info is that you can type in any character and get almost every derivation/symbol of it). For example, I just searched the letter 'c', and figured out what the squiggle was:

http://www.fileformat.info/info/unic...preview=entity
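
If you have the character but not its name, Python's standard `unicodedata` module can do a similar lookup offline. This is a minimal sketch, and the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(char):
    """Return the code point, official Unicode name, and decomposition."""
    return (
        f"U+{ord(char):04X}",
        unicodedata.name(char, "<unnamed>"),
        unicodedata.decomposition(char),
    )

print(describe("ç"))
# prints ('U+00E7', 'LATIN SMALL LETTER C WITH CEDILLA', '0063 0327')
```

The decomposition shows that 'ç' is a plain 'c' (U+0063) plus a combining cedilla (U+0327).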

Quote:
Originally Posted by Gregg Bell View Post
Thanks for sharing all this great stuff.
No problem. The goal is to make everyone better at making higher-quality books more quickly, and to give everyone the skills for more thorough error checking.

Last edited by Tex2002ans; 04-03-2013 at 02:03 AM.