Old 03-17-2017, 08:48 PM   #1
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
A new feature proposal: report of group saved searches

Hi

I just proposed this feature for the Calibre editor, to no avail.

https://www.mobileread.com/forums/sh...d.php?t=284450

When Sigil runs a regex search, it reports the result in a single line of this kind:
Code:
Matches found 51
or
Code:
Replacements made 51
My question is: would it be possible to run a group search in Sigil and get a report for each regex that is a member of the group? The user would read something like:

Code:
1. Matches found 51
2. Matches found 6215
3. Matches found 324
This kind of report would be far more intuitive than a cumulative figure like:
Code:
Matches found: 6590
which, from the user's point of view at least, means nothing because it adds up apples and oranges.

I attach the regex group I run on every EPUB after each ODTImport conversion. Up to now, I have run the searches one by one for lack of a precise report.
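To make the requested report concrete, here is a minimal Python sketch of a per-search count, run outside Sigil on a plain string. The group contents, the function names and the sample text are all hypothetical (they are not the regexes in the attached file), and Sigil's own regex engine may behave differently from Python's `re` on more complex patterns.

```python
import re

# Hypothetical group of (name, pattern) pairs; NOT the regexes from the attachment.
GROUP = [
    ("Superscript ordinals", r"<sup>\w+</sup>"),
    ("Degree signs", r"\d+°"),
    ("Empty paragraphs", r"<p>\s*</p>"),
]


def count_matches(group, text):
    """Return one (name, count) pair per search, in group order."""
    return [(name, len(re.findall(pattern, text))) for name, pattern in group]


def format_report(counts):
    """Number each search on its own line and put the cumulative figure last."""
    lines = [
        f"{i}. {name}: matches found {n}"
        for i, (name, n) in enumerate(counts, 1)
    ]
    lines.append(f"Total matches found: {sum(n for _, n in counts)}")
    return "\n".join(lines)


# Hypothetical sample text, only to exercise the functions.
sample = "<p>1<sup>er</sup> mai, 20° C</p><p> </p><p>2<sup>e</sup></p>"
print(format_report(count_matches(GROUP, sample)))
```

The cumulative figure is kept as the last line, so nothing is lost compared with the current one-line report.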
Attached Files
File Type: zip 15mars.json.zip (1.2 KB, 107 views)
Old 03-18-2017, 12:14 PM   #2
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Patches are always welcome, but I'm afraid I'm on the same page as Kovid on this. If you want individual results, use individual searches. You (should) only group searches/replaces that you've tested extensively and trust to perform correctly.

Last edited by DiapDealer; 03-18-2017 at 09:12 PM.
Old 03-18-2017, 12:24 PM   #3
theducks
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by DiapDealer View Post
Patches are always welcome, but I'm afraid I'm on the same page as Kovid on this. If you want individual results, use individual searches. You (should) only group searches/replaces that you've tested extensively and trust to perform correctly.
Plus 1


That does not stop you from grouping searches. Just don't execute the group unless you are sure the group is fully safe.
Old 03-18-2017, 07:57 PM   #4
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Originally Posted by DiapDealer View Post
You (should) only group searches/replaces that you've tested extensively and trust to perform correctly
Quote:
Originally Posted by theducks View Post
Just don't execute the group unless you are sure the group is fully safe
These statements seem obvious but are only partly sound. I have used these regexes for quite a long time, and I am confident they work well for my usual workflow.

But this is not enough: each book is a world of its own. Sometimes a subtle defect in the book (nobody is perfect but my mother) may trip up one of these regexes. By performing a blind* group search, I would not know it, would blissfully carry on, and would fail to implement some feature...

This means that I can never be sure a group search will work 100% on a new book, even if I trust every single component of the group. That's why I avoid all group searches, which for me defeats the purpose of this nice feature.

* By "blind" I mean without a detailed report.

Last edited by roger64; 03-18-2017 at 08:16 PM. Reason: plural
Old 03-18-2017, 08:05 PM   #5
Turtle91
A Hairy Wizard
Posts: 3,069
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
My method:

With the "Saved Searches" window open, I "Count All" for each of the sub-parts of my group. When I am satisfied, I "Replace All" with the group selected. It's only a few extra button clicks. Of course the bestest solution would be a breakdown of the counts as you suggested, but unless/until that gets implemented this is a fairly simple method.
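For anyone who wants to try this count-first, replace-second routine outside Sigil, here is a rough Python sketch. The function name, the `approve` callback and the tuple layout are my own assumptions for illustration; nothing here is a Sigil API.

```python
import re


def count_then_replace(group, text, approve):
    """Count every search first; replace only if approve(counts) returns True.

    group: list of (name, pattern, replacement) tuples (hypothetical layout).
    approve: callable that receives {name: count} and returns True or False.
    Returns (new_text, counts); the text comes back unchanged when not approved.
    """
    # Phase 1: the "Count All" step, one count per search, nothing modified.
    counts = {name: len(re.findall(pattern, text)) for name, pattern, _ in group}
    if not approve(counts):
        return text, counts
    # Phase 2: the "Replace All" step, applied in group order.
    for _, pattern, replacement in group:
        text = re.sub(pattern, replacement, text)
    return text, counts
```

One caveat with this sketch: the counts are taken before any replacement, so if an early replacement creates or destroys matches for a later search, the real number of replacements can differ from the count.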

Cheers,
Old 03-18-2017, 08:25 PM   #6
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Originally Posted by Turtle91 View Post
.../... Of course the bestest solution would be for a breakdown of the counts as you suggested, but unless/until that gets implemented this is a fairly simple method.

Cheers,
Thanks for the tip and appreciated moral support. I was beginning to feel lonely.
Old 03-18-2017, 08:49 PM   #7
theducks
Well trained by Cats
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
I don't worry about fixing 100% of something. I will find those when I proof my work.
I worry about BREAKING anything.

Some of my saved searches are really templates. The replace is fixed, but I tweak the search. E.g., I have a Baen-deDiv search (it removes all those pesky divs that get nested 1 MORE level for each chapter file). I tune the class for the current book, then repeat (all) n times (usually the number of chapters) until the count is 0.
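The repeat-until-zero routine can be sketched in Python like this. The pattern below is NOT the actual Baen-deDiv search, only a hypothetical stand-in that unwraps the innermost div of an assumed class "calibre1"; the function name and the `max_passes` guard are likewise assumptions.

```python
import re

# Hypothetical stand-in for a de-div search: matches a div of class "calibre1"
# whose content holds no other div tag, i.e. the innermost one.
UNWRAP_DIV = r'(?s)<div class="calibre1">((?:(?!</?div\b).)*)</div>'


def replace_until_stable(pattern, replacement, text, max_passes=50):
    """Repeat one search/replace until a pass makes no replacements.

    Returns (new_text, per_pass_counts). max_passes guards against a
    replacement that keeps re-creating its own match.
    """
    counts = []
    for _ in range(max_passes):
        text, n = re.subn(pattern, replacement, text)
        if n == 0:
            break
        counts.append(n)
    return text, counts
```

Each pass peels off one nesting level, so the number of passes recorded in the returned list is also the nesting depth that was removed.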
Old 03-18-2017, 09:26 PM   #8
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Originally Posted by theducks View Post

Some of my saved searches are really templates. The replace is fixed, but I tweak the search. E.g., I have a Baen-deDiv search (it removes all those pesky divs that get nested 1 MORE level for each chapter file). I tune the class for the current book, then repeat (all) n times (usually the number of chapters) until the count is 0.
That is a different matter. Indeed, when I "clean" commercial books of their useless nested divs, I use DiapDealer's excellent TagMechanic tool, which provides me with these templates.

The group search I would like to use is tuned for a specific conversion process; it helps eradicate minor code defects and applies typographical and other changes.

Last edited by roger64; 03-18-2017 at 09:30 PM.
Old 03-18-2017, 09:46 PM   #9
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by roger64 View Post
But this is not enough: each book is a world of its own. Sometimes a subtle defect in the book (nobody is perfect but my mother) may trip up one of these regexes. By performing a blind* group search, I would not know it, would blissfully carry on, and would fail to implement some feature...
Which is exactly why I fail to grasp how seeing counts (either aggregate or individual) will help you know that something failed (or succeeded). They just tell you how many times something occurred. If, as you say, each book is a world of its own, then how can you possibly know what numbers (aggregate or individual) will look "right" (or "wrong" for that matter) for that particular book?

I guess a big part of the problem is that I simply don't do blind, book-wide Replace Alls. They're too dangerous with one search, let alone with a bunch of them stacked. I step through Replaces one at a time. If there are too many to make that feasible, then my source sucks and I need to start with something closer to my end goal.

Life's too short to completely overhaul the source-code of entire books. Start from better source code, I say.

But as I said initially ... if someone wants to submit a patch or pull-request to implement this, I wouldn't be opposed. I'm just not willing to use the limited time I have to tinker with Sigil on this idea. Maybe someone else will.
Old 03-18-2017, 10:38 PM   #10
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Originally Posted by DiapDealer View Post
.../..Life's too short .../...
I agree. A group search with report would save time.

I will try to see if a coder friend can help.
Old 03-18-2017, 11:09 PM   #11
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by roger64 View Post
A group search with report would save time.
You still haven't explained how. Unless you already know what the numbers are "supposed" to be, how will seeing them (broken down or lumped together) speed things up?
Old 03-19-2017, 05:42 AM   #12
roger64
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Originally Posted by DiapDealer View Post
You still haven't explained how. Unless you already know what the numbers are "supposed" to be, how will seeing them (broken down or lumped together) speed things up?
Thank you for taking some interest in this question.

First, please have a look at all my regexes (see above). I run them on each of my books, one after another, once I have exported an ODT file to EPUB3 with the Sigil plugin ODTImport.

They are tuned to this specific workflow. I know beforehand that the book to be processed has, depending on the case, about ten images, 100 endnotes, some tables, some superscripts, some degree signs, not to mention five or six regexes that add nnbsp (narrow no-break spaces) everywhere French rules require them, another regex for the stylesheet, and so on. So I know in advance which ones should yield a positive result, while for some others a zero can be correct. A look at the information line is enough to know whether a regex has taken effect. This holds for a single regex.

When you have 15 regexes, the task simply takes 15 times longer. Is that complicated? A computer can process the whole group in no time because it's all elementary computing, but for the reasons given above, I unfortunately can't rely on the cumulative result, and I lack a breakdown of the counts.
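As an illustration of how a breakdown could be checked against what one already knows about a book, here is a small Python sketch. The search names, the bounds and the function name are hypothetical; the figures merely echo the "about ten images, 100 endnotes" example, and the per-search counts are assumed to come from some report that Sigil does not currently produce.

```python
def check_counts(counts, expected):
    """Compare per-search counts with what the user expects for this book.

    counts: {name: count} from a per-search report (assumed to exist).
    expected: {name: (low, high)} inclusive bounds. Searches absent from
    expected are not checked, so a zero there is accepted silently.
    Returns a list of readable warnings, empty when everything is in range.
    """
    warnings = []
    for name, (low, high) in expected.items():
        n = counts.get(name, 0)
        if not low <= n <= high:
            warnings.append(f"{name}: found {n}, expected {low} to {high}")
    return warnings


# Hypothetical figures for one book.
counts = {"Images": 10, "Endnotes": 0, "Tables": 0}
expected = {"Images": (8, 12), "Endnotes": (90, 110)}
for warning in check_counts(counts, expected):
    print(warning)
```

Here the zero for "Tables" passes unremarked because no expectation was declared for it, while the zero for "Endnotes" is flagged.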


Last edited by roger64; 03-19-2017 at 11:38 AM. Reason: miss
Old 03-20-2017, 08:10 PM   #13
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by DiapDealer View Post
You still haven't explained how. Unless you already know what the numbers are "supposed" to be, how will seeing them (broken down or lumped together) speed things up?
I can see how roger64's suggestion would be helpful.

Here are a few use cases I can think of where it would help.

In almost every EPUB I mass convert footnotes from <sup>##</sup> form into [##] form. It would be nice to see something like:

Code:
Fix Footnote <sup>##</sup> -> [##]    102
Fix Endnote <sup>##</sup> -> [##]     100
Currently, if I ran the entire group, I would just get a "Replacements made: 202". If there were a mismatch between the two, I would know there is an issue I need to look into. Maybe there was a footnote 99a, or an OCR error somewhere along the line.
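When the two counts disagree, the next question is which note is the odd one out. A rough Python sketch of that lookup follows; it assumes the body text and the notes section are available as two separate strings and that both sides mark notes as `<sup>label</sup>`, which will not hold for every book. The names are hypothetical.

```python
import re
from collections import Counter

# Assumed markup on both sides: <sup>label</sup>
NOTE_LABEL = re.compile(r"<sup>([^<]+)</sup>")


def unmatched_notes(body, notes):
    """Return (missing_in_notes, missing_in_body) as sorted lists of labels."""
    refs = Counter(NOTE_LABEL.findall(body))
    marks = Counter(NOTE_LABEL.findall(notes))
    # Counter subtraction keeps only labels that occur more often on the left.
    missing_in_notes = sorted((refs - marks).elements())
    missing_in_body = sorted((marks - refs).elements())
    return missing_in_notes, missing_in_body
```

With a stray "99a" reference in the body and no matching note, the first list would contain "99a" and the second would be empty.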

I also have a "Finereader Cleanup" group of saved searches to clean up some of the cruft Finereader produces. Here are a few:

Split Double Footnote

Search: <sup>([0-9]+), ([0-9]+)</sup>
Replace: <sup>\1</sup><sup>,</sup><sup>\2</sup>

Fix Bold Smallcaps

Search: <span style="font-weight:bold;font-variant:small-caps;">
Replace: <span class="smallcaps">

Clean Italic &

Search: <span class="italics">&amp;</span>
Replace: &amp;
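For illustration, the three searches above can be run from Python with a per-search replacement count, using `re.subn`. The search and replace strings are copied from this post; the list layout and the `run_group` name are my own assumptions, and Python's `re` is not the engine Sigil uses, so more complex patterns may not carry over unchanged.

```python
import re

# The three searches quoted above, as (name, search, replace) tuples.
FINEREADER_CLEANUP = [
    ("Split Double Footnote",
     r"<sup>([0-9]+), ([0-9]+)</sup>",
     r"<sup>\1</sup><sup>,</sup><sup>\2</sup>"),
    ("Fix Bold Smallcaps",
     r'<span style="font-weight:bold;font-variant:small-caps;">',
     r'<span class="smallcaps">'),
    ("Clean Italic &",
     r'<span class="italics">&amp;</span>',
     r"&amp;"),
]


def run_group(group, text):
    """Apply each search in order; return (new_text, [(name, replacements)])."""
    report = []
    for name, search, replace in group:
        # subn returns the new string and the number of replacements made.
        text, n = re.subn(search, replace, text)
        report.append((name, n))
    return text, report
```

The returned list keeps the group order, so it can be printed directly as a breakdown next to the cumulative total.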

On the last book I worked on, if I run the entire group, it says "Replacements made: 1072". But if I run each regex individually with "Count All", I get a helpful breakdown like this:

Code:
Fix Italics                          403
Fix Bold                             6
Fix Bold/Italics                     110
Fix Smallcaps                        25
Fix Bold/Smallcaps                   1
Clean Italic &                       0
Split Double Footnote                0
Fix Finereader 12 Table Alignment    198
Clean Bold td                        0
Clean Italics td                     29
Clean td                             298
Clean Table Headers                  2
This could let me know of a potential issue to look out for in this specific EPUB.

For example, if there were 1 "Double Footnote", I would know that I have to look more closely when creating footnote links back and forth, OR that it could have been an OCR error.

Or if I get a hit on "Clean Italic &" I know that I have to go looking more closely. 99% of the time an italic ampersand is either NOT italic OR Finereader just didn't like the specific font used OR it was an actual OCR error. In the very rare case though, the ampersand might have been smack dab in the middle of a book title and the italic spaces around it were missed:

Code:
<i>Hansel</i> <i>&amp;</i> <i>Gretel</i>
would accidentally be corrected to this:

Code:
<i>Hansel</i> &amp; <i>Gretel</i>
If I saw 1 hit, I would then know to go searching for it and change it to this:

Code:
<i>Hansel &amp; Gretel</i>
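That manual correction could itself be sketched as one more search/replace, here in Python. The pattern is hypothetical: it only covers the exact `<i>` shape shown in this example, with single spaces and no nested tags, and it would have to run before the "Clean Italic &" search strips the italic ampersand.

```python
import re

# Hypothetical pattern: two italic runs joined by an italic ampersand.
SPLIT_TITLE = re.compile(r"<i>([^<]+)</i> <i>&amp;</i> <i>([^<]+)</i>")


def merge_italic_ampersand(text):
    """Merge '<i>A</i> <i>&amp;</i> <i>B</i>' into '<i>A &amp; B</i>'.

    Returns (new_text, number_of_merges).
    """
    return SPLIT_TITLE.subn(r"<i>\1 &amp; \2</i>", text)
```

A nonzero merge count here is exactly the kind of per-search figure that would signal a title worth checking by eye.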
With one journal I worked on, I came up with a group of 25 Regexes (cleaning up stuff like dropcaps, normalizing code for figures/images/captions, converting the occasional theta image -> Θ. [...]).

Having a breakdown of the number of each fix would have also been helpful way back when:

"I know there are 10 articles and 10 dropcaps? 25 figures and 25 corrections? Good, now I don't have to look at it."

Last edited by Tex2002ans; 03-20-2017 at 08:31 PM.