#1
Enthusiast
Posts: 45
Karma: 10
Join Date: Sep 2011
Location: Barrhead, Scotland
Device: Kindle Paperwhite (2)
Custom Search Parameters
OK folks, a question for the devs or the super users of the application.

Is it possible to set up a custom search? To outline: when reading, I absolutely hate punctuation marks that are italicised (this is a personal quirk, so no discussion or comment required, please). When editing my books, is there a way to have Sigil search for:

<span class="italic"> any punctuation mark </span>

My expertise with Sigil is such that I only ever use the 'Find' box, and I've just started reading a title that has quite a few whole chapters (letters written by a character) that are fully in italics. It would save me an enormous amount of time if I could search in the manner I have described. Thanks in advance.
#2
Sigil Developer
Posts: 8,439
Karma: 5702578
Join Date: Nov 2009
Device: many
It is called regular expression find and replace. And Sigil can do that already. Check out the sticky thread about regular expressions and of course there is help in the Sigil User's guide as well.
#3
Enthusiast
Posts: 45
Karma: 10
Join Date: Sep 2011
Location: Barrhead, Scotland
Device: Kindle Paperwhite (2)
Much obliged, Kevin.
I think I found one that suits my purpose:
<span class="italics">[^<]*\s.*</span>
But... how do I use this? I've never done anything like it before. Can you point me in the right direction, please?
Last edited by Woodssi; 09-05-2023 at 03:34 PM.
#4
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Find: ([;’”,\)\]\.—])</span>
Replace: </span>\1

What it does is:

So it would take something like this:

Code:
This is a <span class="italics">A Book Title.</span>
“What did you say to me you piece of <span class="italics">crap?”</span>

Code:
This is a <span class="italics">A Book Title</span>.
“What did you say to me you piece of <span class="italics">crap</span>?”

Find all Italics Ending Punctuation
Find: ([;’”,\)\]\.—])</i>
Replace: </i>\1

or:

Find all Italics Beginning Punctuation
Find: <i>([‘“\(—])
Replace: \1<i>

This will help correct things like:

Code:
<i>“Example Book</i> by First Last was the greatest book <em>ever!”</em>

Code:
“<i>Example Book</i> by First Last was the greatest book <em>ever</em>!”

Side Note: For more info, see my posts in:

If you need even more regex cleanup tips, see my posts in:

and if you need even more, type this into your favorite search engine:

Code:
Tex2002ans regex site:mobileread.com
Tex2002ans regular expression site:mobileread.com

Last edited by Tex2002ans; 09-05-2023 at 07:56 PM.
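For anyone who wants to try the two patterns above outside Sigil first, here is a minimal Python sketch. The patterns are copied from the post; the function names are made up for illustration, and Python's `re.sub` is standing in for Sigil's Find & Replace, so treat it as an approximation rather than what Sigil does internally.

```python
import re

# Patterns copied from the post above. The \1 backreference in the
# replacement works the same way in Python's re.sub.
TRAILING = re.compile(r'([;’”,\)\]\.—])</span>')
LEADING = re.compile(r'<i>([‘“\(—])')

def move_trailing_out(html: str) -> str:
    """Move ONE punctuation mark from just inside </span> to just outside."""
    return TRAILING.sub(r'</span>\1', html)

def move_leading_out(html: str) -> str:
    """Move ONE opening punctuation mark from just inside <i> to just outside."""
    return LEADING.sub(r'\1<i>', html)

if __name__ == "__main__":
    print(move_trailing_out('This is a <span class="italics">A Book Title.</span>'))
    print(move_leading_out('<i>“Example Book</i> by First Last'))
```

Note that each pass moves only a single character, and the character class as posted does not include ? or !, so a case like crap?”</span> needs those added plus more than one pass to reach the "after" sample shown.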
#5
Sigil Developer
Posts: 8,439
Karma: 5702578
Join Date: Nov 2009
Device: many
Check out the Find and Replace chapter in the Sigil User's Guide. There is also a Tutorial chapter on Advanced Find there as well.
You can download the Sigil User's guide as an epub from: https://github.com/Sigil-Ebook/sigil...guide/releases
#6
Grand Sorcerer
Posts: 28,341
Karma: 203719142
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I find using the Unicode categories much easier/cleaner.
Any punctuation character can be matched with \p{P}

\p{P} or \p{Punctuation}: any kind of punctuation character.
\p{Pd} or \p{Dash_Punctuation}: any kind of hyphen or dash.
\p{Ps} or \p{Open_Punctuation}: any kind of opening bracket.
\p{Pe} or \p{Close_Punctuation}: any kind of closing bracket.
\p{Pi} or \p{Initial_Punctuation}: any kind of opening quote.
\p{Pf} or \p{Final_Punctuation}: any kind of closing quote.
\p{Pc} or \p{Connector_Punctuation}: a punctuation character such as an underscore that connects words.
\p{Po} or \p{Other_Punctuation}: any kind of punctuation character that is not a dash, bracket, quote or connector.
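To see which category a given character falls into, a small Python sketch can help. Python's built-in `re` module does not understand `\p{...}`, so this uses the standard `unicodedata` module instead; the helper name `punctuation_class` is made up for illustration.

```python
import unicodedata

def punctuation_class(ch: str) -> str:
    """Return the two-letter Unicode general category (Pd, Ps, Pe, Pi, Pf,
    Pc or Po) if ch is punctuation, otherwise an empty string."""
    cat = unicodedata.category(ch)
    return cat if cat.startswith('P') else ''

if __name__ == "__main__":
    # A few characters that commonly end up inside italic tags.
    for ch in ['-', '—', '(', ')', '“', '”', '_', '!', 'a']:
        print(repr(ch), punctuation_class(ch) or 'not punctuation')
```

This is only a lookup aid for deciding which \p{...} class to put in the Find box. It does not change how Sigil itself matches.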
#7
Resident Curmudgeon
Posts: 78,953
Karma: 144284074
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
#8
Grand Sorcerer
Posts: 28,341
Karma: 203719142
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Regex ain't magic.
#9
Fanatic
Posts: 517
Karma: 8500000
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe, Kindle 4 Touch
An iterative application of a regex search and replace could pull all punctuation outside of a certain tag, but not like his example, since there are commas both inside and outside of the tag in the end result.
That said, tags tend to have semantic value, so any fully automated system will definitely get it wrong at times. <em></em> used like quotes around a thought should follow the same rules as quotation marks, so punctuation goes inside. Emphasizing just a word or phrase shouldn't have the tags include starting and ending punctuation. Getting these right is by far the most time-consuming part of my fix of CSS from commercial eBooks.
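The "iterative application" idea can be sketched in Python as a loop that repeats the replacement until nothing changes. Assumptions here: the character class is the one posted earlier in the thread with ? and ! added, the tag list (i, em, span) and the names `pull_all_trailing` and `max_passes` are invented for the example, and Python's `re` stands in for Sigil's Replace All.

```python
import re

# Assumption: the class from earlier in the thread, extended with ? and !.
# Group 1 is one punctuation mark, group 2 is the closing tag after it.
TRAILING = re.compile(r'([;’”,\)\]\.—?!])(</(?:i|em|span)>)')

def pull_all_trailing(html: str, max_passes: int = 20) -> str:
    """Repeat the swap until a pass changes nothing (or max_passes is hit).
    Each pass moves one punctuation mark per closing tag to the outside."""
    for _ in range(max_passes):
        new_html = TRAILING.sub(r'\2\1', html)
        if new_html == html:
            break
        html = new_html
    return html

if __name__ == "__main__":
    print(pull_all_trailing('piece of <span class="italics">crap?”</span>'))
    print(pull_all_trailing('the greatest book <em>ever!”</em>'))
```

As noted above, this blindly empties every tag of trailing punctuation, including emphasis that wraps a whole thought where the punctuation belongs inside, so the results still need a manual review.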
#10
A Hairy Wizard
Posts: 3,295
Karma: 20171067
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
#11
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Then you can use the fantastic functionality added in recent Sigil versions to get "Italic Lists", which can list everything between two:
From there, you can do whatever extra tweaks are needed. For more info on that, see my fantastic descriptions of the workflow back in:
That allows you to quickly list all HTML that matches your regex into a simple to understand/search/sort list. I described how that can be used to quickly map all:
or many other helpful "mass editing" workflows.

The second you sort into a list, the huge ones with lots of punctuation will instantly stand out like a sore thumb:

Code:
<i>Enciclopedia Italiana</i>
<i>New York Times</i>
<i>This sentence is very long? And has lots, and lots, and lots of punctuation inside?</i>
<i>Wall Street Journal</i>
<i>Washington Post</i>
<i>individual</i>
<i>laissez-faire</i>
<i>negative</i>

Last edited by Tex2002ans; 09-07-2023 at 08:10 PM.
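The same "list it, then sort it" idea can be approximated outside Sigil with a short Python sketch. This is not Sigil's own report feature; the function names and the choice to rank by punctuation count are assumptions made for the example.

```python
import re
import unicodedata

# Non-greedy match of everything between <i> and the next </i>.
ITALIC = re.compile(r'<i>.*?</i>', re.DOTALL)

def punctuation_count(fragment: str) -> int:
    """Count Unicode punctuation characters in the text of a fragment,
    ignoring the tags themselves."""
    inner = re.sub(r'<[^>]+>', '', fragment)
    return sum(unicodedata.category(ch).startswith('P') for ch in inner)

def italic_report(html: str) -> list[tuple[int, str]]:
    """Return unique <i>...</i> matches, heaviest punctuation first."""
    unique = set(ITALIC.findall(html))
    return sorted(((punctuation_count(m), m) for m in unique), reverse=True)

if __name__ == "__main__":
    sample = ('<i>New York Times</i> said <i>This is long? And has lots, '
              'and lots.</i> and <i>laissez-faire</i>')
    for count, match in italic_report(sample):
        print(count, match)
```

Ranking by punctuation count pushes the whole-sentence italics to the top, which are the ones most likely to need a decision by hand.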