04-01-2013, 11:18 PM | #1 |
Gregg Bell
Posts: 2,264
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
search question
Hey. I wanted to search for "ID" (as in 'the cop flashed him his ID') in Sigil. (Other "Find" dialogs have something like 'find whole word only.') Anyway, I couldn't find anything like that in Sigil, so I got every "id" in every word, e.g. braid, staid, etc. Any way to find the exact word? Thanks!
|
04-02-2013, 12:34 AM | #2 |
A Hairy Wizard
Posts: 3,094
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Try caps only "ID" and/or put a space before: " ID"
|
04-02-2013, 12:39 AM | #3 |
Obsessively Dedicated...
Posts: 3,200
Karma: 34977896
Join Date: May 2011
Location: JAPAN (US expatriate)
Device: Sony PRS-T2, ADE on PC
|
I know the regex gurus will have some wonderful black magic to do this, but the simplest way I've found is to include the space before ID. As in, when you are in the find box, hit spacebar, then type ID. If it is always in caps, type it that way, and on the drop-down list, choose "case sensitive."
Now if you have several hundred words beginning with the letters "ID", you might have to wait for some regex magic. But meanwhile, you could use the "replace/find" to go through your document one word at a time, and if you strike one that isn't the ID you want to change, just click the "find" button to jump ahead.
EDIT --- I see the Mighty Turtle beat me, I am a slow typist.... The Tortoise and the Hare?
Last edited by GrannyGrump; 04-02-2013 at 01:01 AM. |
04-02-2013, 03:23 AM | #4 | |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
http://www.regular-expressions.info/wordboundaries.html
This regex will match all cases of "ID":
Code:
\bID\b
Tools - Spellcheck - Spellcheck (Alt+Q)
Once at the Spellcheck screen, you can tick the "Show All Words" checkbox. Then find whatever word you are looking for in the list, OR search for it by using the "Filter" box at the top. You can then double-click the word in the list to jump to each place it appears in the book. |
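For anyone who wants to see the difference outside Sigil, here is a minimal sketch in Python. It assumes Python's `re` module treats `\b` the same way Sigil's regex engine does for plain ASCII words like these; the sample sentence is made up for illustration.

```python
import re

# Made-up sample text using the words mentioned earlier in the thread.
text = "The cop flashed him his ID. Her braid was staid; he did not hide."

# A plain, case-insensitive search hits every "id", including those inside words.
plain_hits = re.findall(r"id", text, flags=re.IGNORECASE)

# \b marks a word boundary, so only the standalone word "ID" matches.
whole_word_hits = re.findall(r"\bID\b", text)

print(len(plain_hits))
print(whole_word_hits)
```

The plain search finds the "id" in "braid", "staid", "did" and "hide" as well as the real "ID", while the `\b` version finds only the standalone word.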
|
04-02-2013, 02:38 PM | #5 | ||
Gregg Bell
Posts: 2,264
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
thank you!
Thanks Dion. It worked great.
|
||
04-02-2013, 07:50 PM | #6 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I have zero idea why your typical word processor does not have spellcheck functionality anywhere near what is in Sigil currently. Yeah, I learned about \b from someone on these forums (I forget the user), but once I saw it in usage, it was genius. (Perhaps it was in the sticky: Regex Examples.) Quote:
I learned most of my stuff from the Regex Tutorial: http://www.regular-expressions.info/tutorial.html
I have a big Regex collection that I use all the time, most notably:
- A Sigil "group" to clean up all the ABBYY FineReader cruft
- Swapping footnotes from superscript footnotes -> [#] format
- Combining broken paragraphs (happens VERY often in OCR)
- "Fixing" the TOC code from Sigil (auto changing the Sigil format to match my "toc" classes in my CSS).
This is one that I use quite often to fix "en dashes" (See https://en.wikipedia.org/wiki/Dash#En_dash):
Search:
Code:
([0-9])-([0-9])
Replace:
Code:
\1–\2 |
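The same search/replace pair can be tried out in Python before running it on a book. This is only a sketch: the function name `fix_number_ranges` and the sample sentence are invented for illustration, and Python's `re.sub` stands in for Sigil's Replace box.

```python
import re

EN_DASH = "\u2013"

def fix_number_ranges(text: str) -> str:
    """Swap a hyphen sitting between two digits for an en dash."""
    # ([0-9])-([0-9]) captures the digit on each side of the hyphen;
    # \1 and \2 put those digits back around the en dash.
    return re.sub(r"([0-9])-([0-9])", r"\1" + EN_DASH + r"\2", text)

sample = "See pages 12-15 and the years 1914-1918, but not well-known."
fixed = fix_number_ranges(sample)
print(fixed)
```

Hyphens between letters ("well-known") are left alone because both sides must be digits. Note that anything with digits around a hyphen, such as a date written 2013-04-02, would be changed too, so it is worth stepping through the matches rather than using Replace All blindly.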
||
04-02-2013, 11:10 PM | #7 |
Obsessively Dedicated...
Posts: 3,200
Karma: 34977896
Join Date: May 2011
Location: JAPAN (US expatriate)
Device: Sony PRS-T2, ADE on PC
|
Tex2002ans, thank you for links and samples. Most helpful.
|
04-02-2013, 11:34 PM | #8 | |
Gregg Bell
Posts: 2,264
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
thanks Tex
Quote:
A couple of quick questions about the Sigil spell check. I haven't been able to add contractions, and I have a lot of them, to the default dictionary. Know of a way? And how do you find words with accents (not necessarily in Regex)? Words like cafe or elan or facade (with the funky thing on the bottom--I don't even know what it's called). Thanks for sharing all this great stuff. |
|
04-03-2013, 01:23 AM | #9 | |||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Quote:
https://www.mobileread.com/forums/sho...d.php?t=209174 Someone needs to be kind enough to create a Sigil group that everyone can use to easily convert them back and forth, and catch any common missing apostrophes. That would be extremely helpful. Also, I believe in 0.7.1 (?) the spellcheck dealing with words including apostrophes became broken again. If you "Ignore" the word it still says it is spelled wrong. I would just wait until this bug is fixed again like it was in 0.7.0. I forget the explanation that was given for the change to "fix apostrophes" (I believe it was to fix a certain foreign language). (If I remember correctly the explanation was given in one of those Sigil release topics). Quote:
I personally use two ways. I just use the Sigil spellcheck system to get some funky characters. OR I use Tools - Reports - "Characters in HTML Files". The "Characters in HTML Files" will show you every single character that is actually used in the files. I then search through this quickly for any funky ones. You can then double click and search through your EPUB to find all instances of it. For example, quite often OCR adds in double angle quotation marks « ». I can easily spot these in the HTML character list, and fix them up. Quote:
Allow me to Quote myself from that previous topic I helped you in: Quote:
The 'c' with a funky squiggly below it 'ç' is a cedilla (one of the great things of fileformat.info is you can type in any character, and get almost every derivation/symbol of it). For example, I just searched the letter 'c', and figured out what the squiggly was: http://www.fileformat.info/info/unic...preview=entity No problem, the goal is to make everyone better at making higher quality books more quickly, and giving everyone the skills for more thorough error checking. Last edited by Tex2002ans; 04-03-2013 at 02:03 AM. |
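The idea behind the "Characters in HTML Files" report can be sketched in a few lines of Python. This is not how Sigil implements it; the function name `non_ascii_report` and the sample text are assumptions made for illustration. It lists each non-ASCII character with its official Unicode name and a count, which also answers "what is that squiggle called".

```python
from collections import Counter
import unicodedata

def non_ascii_report(text):
    """Return {character: (unicode name, count)} for every non-ASCII character."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return {ch: (unicodedata.name(ch, "UNKNOWN"), n) for ch, n in counts.items()}

# Made-up sample: accented words plus the angle quotes OCR often adds.
sample = "The caf\u00e9 fa\u00e7ade had \u00e9lan. \u00abReally?\u00bb \u00abYes.\u00bb"

report = non_ascii_report(sample)
for ch, (name, n) in report.items():
    print(ch, name, n)
```

Running it on the text of a chapter gives a short list to scan for anything unexpected, much like scanning the report in Sigil.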
|||||
04-03-2013, 02:31 PM | #10 | |
Gregg Bell
Posts: 2,264
Karma: 3917588
Join Date: Jan 2013
Location: Itasca, Illinois
Device: Kindle Touch 7, Sony PRS300, Fire HD8 Tablet
|
thanks
Quote:
A couple of questions though. I'm in the final stages of proofreading a novel. I was about one third through it. Then, having fallen totally in love with your \btext\b Regex search tool, I started experimenting with it and looking for various things, trying to really get a feel for how it might help me. One of the things I did was put in various punctuation marks, including a straight quotation mark ("), which I often inadvertently put in when editing.
Well, I finished up last night and when I came to the mss. in the morning I saw one straight quotation mark right in the very beginning of the book. (As I recall, in the code it was not bracketed. It was just plopped down next to the end bracket for Chapter One.) Since I was doing a final proofread, this threw me. I of course wondered if the regex searching had added any other little things. I know you warned about regex deleting things, but can it add things as well? (I really can't think of any other possible way that quotation mark could have got there. And I have started proofreading again from the beginning and I'd say I'm about one-sixth through now and I have not seen any additional things that shouldn't be there.)
And a follow-up question: Perhaps (if indeed Regex can add things) it would be wise to only use Regex in the beginning phases of cleaning a document up? And I'm also a little concerned about how and what it might delete. (Yes, it seems great but scary! And remember I'm just doing my own books--and really they're pretty clean to begin with. Maybe I should leave Regex to pros like you?) Thanks! |
|
04-03-2013, 03:43 PM | #11 | ||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
\btext\b should only be used if you want to find a SPECIFIC WORD. That regex tutorial I linked above uses this example sentence:
Code:
This island is beautiful
If you use the regex "\bis\b", you ONLY match the EXACT WORD "is" (the third word of the sentence), not the "is" inside "This" or "island". In English, the "\b" in a regex means: at this location there is a word boundary, i.e. a word character on one side and a space OR punctuation mark (!?,."'<> ......) OR pretty much any NON-WORD character (or the start/end of the text) on the other side.
Case 1:
Code:
\bis\b
(Matches only the word "is" in the sentence above.)
Case 2:
Code:
\bis
Code:
This island is beautiful
(Matches the "is" at the start of "island" and the word "is".)
Case 3:
Code:
is\b
Code:
This island is beautiful
(Matches the "is" at the end of "This" and the word "is".)
I can go through explaining a straight quote regex for you if you want. But I don't want you running around ruining your book! Quote:
Punctuation in Regex gets much uglier (you have to be very careful because many punctuation marks MEAN something in regex). Examples of the most common ones:
. = any single character
+ = one or more of the preceding item
* = zero or more of the preceding item
What most likely happened was that by inserting a punctuation mark, you completely changed the meaning of the regex, which began messing some things up. You'd better be saving lots of backups before running these regexes; you don't want to accidentally delete sections and not be able to get them back. ALWAYS save an alternate copy before messing with things. Quote:
It is sort of like when you copy/paste commands that you find online to run things on the commandline. You should really KNOW EXACTLY what the command is telling your computer to do BEFORE you run the command. The command CAN be powerful enough to erase every single directory, but since you don't understand it at all, you just copy/paste and run it!!! As you can see, in Case 1, I ONLY get the exact word "is", in Case 2, I can get every single word that begins with "is", in Case 3, I can get every single word that ends with "is". The Regexes almost look exactly the same but they are wildly different. Quote:
If you need someone else to take a look at your book for you (I might be able to catch a few mistakes), feel free to send your book my way. Feel free to email me at (my username) @gmail.com |
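The three cases on "This island is beautiful" can be checked with a short Python sketch. It assumes Python's `re` module behaves like Sigil's engine for these simple patterns; the helper `spans` and the "Mr." sample are invented for illustration. Each result is the (start, end) position of a match, counting from 0.

```python
import re

sentence = "This island is beautiful"

def spans(pattern, text):
    """Return the (start, end) position of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

case1 = spans(r"\bis\b", sentence)  # the whole word "is" only
case2 = spans(r"\bis", sentence)    # "is" at the start of a word
case3 = spans(r"is\b", sentence)    # "is" at the end of a word

print(case1)  # the word "is"
print(case2)  # "is" in "island", plus the word "is"
print(case3)  # "is" in "This", plus the word "is"

# Punctuation is not always literal: an unescaped "." matches any character.
titles = "Mr. Mrs Mr5"
unescaped = re.findall(r"Mr.", titles)
# re.escape() adds the backslashes needed to search for the text literally.
escaped = re.findall(re.escape("Mr."), titles)

print(unescaped)
print(escaped)
```

The last two lines show why typing punctuation straight into a regex search can surprise you: the unescaped pattern also matches "Mrs" and "Mr5", while the escaped one matches only "Mr.".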
||||
04-03-2013, 05:17 PM | #12 |
Sigil developer
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
|
Always remember that replacing is done in Code View and you can easily delete HTML code tags or attributes if you aren't careful.
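A small Python sketch of why this matters for something like straight quotation marks: in Code View the quotes around HTML attribute values are straight quotes too. The sample markup, the `TAG` pattern, and the function name are assumptions for illustration, and the simple tag pattern is not a full HTML parser (it would be fooled by a ">" inside an attribute or a comment).

```python
import re

# Made-up line of chapter markup with an attribute and a quoted word.
html = '<p class="intro">He said "stop" and left.</p>'

# Naive replace: changes every straight quote, including the attribute's.
naive = html.replace('"', "\u201d")

# Rough pattern for a tag: "<", anything that is not ">", then ">".
TAG = re.compile(r"(<[^>]*>)")

def straight_quotes_in_text(markup):
    """Count straight quotes that sit in the text, not inside tags."""
    parts = TAG.split(markup)
    # With one capturing group, split() leaves text at even indexes
    # and the tags themselves at odd indexes.
    return sum(part.count('"') for part in parts[0::2])

print(html.count('"'))                 # all straight quotes in the markup
print(straight_quotes_in_text(html))   # only those in the visible text
print(naive)                           # the class attribute is now broken
```

Of the four straight quotes in the sample, only two belong to the story text; a blanket replace would also rewrite the two that hold the `class` attribute together.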
|
|