MobileRead Forums > E-Book Software > Sigil

05-10-2025, 06:52 PM   #16
KevinH
Sigil Developer
Posts: 9,062
Karma: 6361556
Join Date: Nov 2009
Device: many
So run two passes for two variants and only check the other replacements you want.
05-10-2025, 09:07 PM   #17
ElMiko
Fanatic
Posts: 500
Karma: 65460
Join Date: Jun 2011
Device: Kindle
Sorry, KevinH, not sure what you mean. Why two variants? The patterns I'm referring to are not nearly so limited in their variance.

For example, a common OCR error is to insert a capitalized letter in the middle of a word. This error does not have 1, 2 or 3 consistent replacement values. Not only are there multiple Capital Letter variances, but there are multiple Replacement variances for any given error. For example, a capital "I" might actually represent a "t", or an "i", or an "l", or an "h" (when adjacent to a lower case "i") or... or... or... And for a capital "T", well, the candidate replacement values might be... and so on and so forth.

Running a separate search for each conceivable variant is simply unsustainable. Sadly, there are still some things that can only be partially automated. Human eyeballs are still a necessary part of the process.
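
To make the point concrete, here is a minimal Python sketch (not a Sigil feature) of how a regex can flag mid-word capitals while still leaving the choice of replacement to a human. The names `MID_WORD_CAPITAL`, `CANDIDATES`, `find_suspects` and `candidate_fixes` are made up for illustration, and the candidate table holds only some of the "I" examples mentioned above; a real table would be much larger and still incomplete.

```python
import re

# A lowercase run, one capital letter, then another lowercase run:
# a capital stranded in the middle of an otherwise lowercase word.
MID_WORD_CAPITAL = re.compile(r"\b([a-z]+)([A-Z])([a-z]+)\b")

# Illustrative assumption: a tiny candidate table for a stray "I" only.
CANDIDATES = {"I": ["t", "i", "l"]}


def find_suspects(text):
    """Return every word that contains a single mid-word capital."""
    return [m.group(0) for m in MID_WORD_CAPITAL.finditer(text)]


def candidate_fixes(word):
    """List possible corrections; picking the right one is left to a person."""
    m = MID_WORD_CAPITAL.fullmatch(word)
    if m is None:
        return []
    head, cap, tail = m.groups()
    return [head + repl + tail for repl in CANDIDATES.get(cap, [])]


print(find_suspects("He went wiIh the McDonald group"))  # ['wiIh']
print(candidate_fixes("wiIh"))  # ['with', 'wiih', 'wilh']
```

Identification is one pattern, but the correction is a list of guesses per capital letter, which is why each match still needs a look.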
05-11-2025, 12:07 AM   #18
DNSB
Bibliophagist
Posts: 47,869
Karma: 174315098
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
I suspect that Sigil is not the best tool for locating and correcting what appear to be OCR errors.

I also tested your complaint by searching for single-digit chapter numbers, which occur in chapters 1 to 9. I then changed the search to look for double-digit chapter numbers by modifying the regex search term. I tested ~15 epubs and found no issues.

Chapter [0-9]" found chapters 1 to 9; changing the search term to Chapter [0-9][0-9]" found chapters 10 through the last chapter in the book. I then switched to a normal search for a single word. Count showed multiple results and the search stepped through all the files that had the search term.
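
For anyone who wants to check the two patterns outside Sigil, here is a small Python sketch. It assumes the trailing `"` is part of each search term and that the chapter label sits directly before a closing quote; the sample markup is made up for illustration. These simple character classes should behave the same way in Python's `re` module as in Sigil's regex search.

```python
import re

# Made-up sample: chapter labels that sit directly before a closing quote.
sample = 'title="Chapter 3" title="Chapter 12" title="Chapter 7"'

# The trailing quote keeps the first pattern from matching "Chapter 1" in "Chapter 12".
single = re.findall(r'Chapter [0-9]"', sample)
double = re.findall(r'Chapter [0-9][0-9]"', sample)

# Without the trailing quote, the single-digit pattern also hits chapter 12.
unanchored = re.findall(r'Chapter [0-9]', sample)

print(single)      # ['Chapter 3"', 'Chapter 7"']
print(double)      # ['Chapter 12"']
print(unanchored)  # ['Chapter 3', 'Chapter 1', 'Chapter 7']
```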

Using a normal search, I found the same results: changing the search term effectively hit the restart search button.

I could not reproduce your issue of the search not covering the entire document after the search term is changed. Oddly, unchecking the Wrap box didn't stop the search at the last file either in most cases.

This is using a compile of the latest Sigil code from the repository but I went back through the Pull Requests and didn't find anything that I could see as causing the effect you are seeing.

Last edited by DNSB; 05-11-2025 at 12:18 AM.
05-11-2025, 12:50 AM   #19
ElMiko
Fanatic
Posts: 500
Karma: 65460
Join Date: Jun 2011
Device: Kindle
I am also struggling to reproduce it with a custom test file. All I can say is that I've run into the issue multiple times on live files. Indeed, on EVERY file since I started this little experimental dive into 7.4.2.

I can also say that opening the first file in the document, closing all other files, and hitting Restart on the search eliminates the issue, hence my original question about a keyboard shortcut. While I don't know the exact conditions for replicating the problem, I do know the sufficient condition for resolving it.

EDIT: RE: Sigil's utility as an OCR correction application, I certainly haven't found a better one. OCR errors are highly compatible with RegEx for identification and, in many instances, correction.

EDIT 2: Figured out how to replicate. See post #21 below.

Last edited by ElMiko; 05-11-2025 at 04:35 AM.
05-11-2025, 03:26 AM   #20
philja
Addict
Posts: 304
Karma: 516
Join Date: Nov 2015
Location: Europe EEC
Device: Kindle Fire HD6 & HD8
Quote:
Originally Posted by DNSB
Oddly, unchecking the wrap box didn't stop the search at the last file either in most cases.
As per the User Guide, "Wrap does not apply when searching multiple files."

Last edited by philja; 05-11-2025 at 03:28 AM.
05-11-2025, 04:27 AM   #21
ElMiko
Fanatic
Posts: 500
Karma: 65460
Join Date: Jun 2011
Device: Kindle
I can't begin to tell you what a nightmare it was reproducing it. Replicating bugs is an exercise in gaslighting... but I finally got it.

This is the order of operations:
  1. Open attached epub file (opens automatically to "01.xhtml")
  2. Open in another tab "02.xhtml"
  3. In "02.xhtml" type "hello" in Search field and Count All (I use the keyboard shortcut); you should get 12 matches
  4. In same tab type "bye" in Search field and Count All (again, using keyboard shortcut); you should get 8 matches
  5. Save (Ctrl+S)
  6. Switch tabs to the already open "01.xhtml" and close all other tabs (Ctrl+Alt+W)
  7. Now, find next instance of "bye" (Find Next keyboard shortcut or just click the Search Icon).

It will say "End of search" despite Count All having previously reported 8 instances of "bye", none of which you have actually cycled through.

It seems that the "Save" step is what stops the search from jumping to results after that last opened file despite counting them... but I don't know.
Attached Files
File Type: epub test.epub (7.8 KB, 52 views)

Last edited by ElMiko; 05-11-2025 at 04:49 AM.
05-11-2025, 09:15 AM   #22
KevinH
Sigil Developer
Posts: 9,062
Karma: 6361556
Join Date: Nov 2009
Device: many
You started a search across all files in the second file, so it will end only when it has finished the first file. That is the search state information that was saved when you started the search for "bye". This state is what allows whole groups of Saved Searches to operate one after the other without revisiting earlier, already found locations.

In your case, you closed that chapter and moved to the spot where that search's state information indicates it would be complete once that file was done, and that is exactly what it reports.

The search has no way to know that you started it but decided not to continue it, manually moving to that earlier file by closing the current tab and effectively telling the search to skip all the intervening matches. Normally, when manually hopping around, you do need to restart a search if nothing in the search terms changes (find, replace, target, conditions).

The file you start a search in matters, as it determines the ending state (file). Because you did not change the search terms at all, you would need to use the Restart Search button to begin a new search from this earlier position so that a new search state is set. When stepping through a search sequence one by one, where you start the search effectively determines the ending point.
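
As a rough illustration only, the behaviour described above can be modelled with a toy Python class. This is not Sigil's actual code: `SearchState`, `restart` and `remaining_from` are hypothetical names, and "03.xhtml" is a made-up third file added to make the ordering visible.

```python
class SearchState:
    """Toy model: the file where a search starts fixes the file where it ends."""

    def __init__(self, files, start_file):
        self.files = list(files)
        self.restart(start_file)

    def restart(self, start_file):
        # Rotate the file list so the search begins at start_file
        # and finishes with the file just before it.
        i = self.files.index(start_file)
        self.order = self.files[i:] + self.files[:i]
        self.end_file = self.order[-1]

    def remaining_from(self, current_file):
        # Files still to be searched if the user is now in current_file.
        i = self.order.index(current_file)
        return self.order[i:]


files = ["01.xhtml", "02.xhtml", "03.xhtml"]  # "03.xhtml" is hypothetical

state = SearchState(files, "02.xhtml")    # search begun in the second file
print(state.end_file)                     # 01.xhtml
print(state.remaining_from("01.xhtml"))   # ['01.xhtml']

state.restart("01.xhtml")                 # what a restart would do
print(state.remaining_from("01.xhtml"))   # ['01.xhtml', '02.xhtml', '03.xhtml']
```

In this model, jumping by hand to "01.xhtml" lands on the final file of a search begun in "02.xhtml", so only that file is left and the intervening matches are skipped until the search is restarted.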

So neither Count nor Dry-run is the issue, which is why I could not see what you are seeing.

Still not sure what the "save as" is contributing to this.

So if you want to visit every instance of the search in an epub exactly once, on a one-by-one basis, start a new search anywhere you want and follow the sequence of matches until it tells you it is done.

If you want to manually hop around from file to file reusing previously started searches with no changes, then set your search target to the current file, or hit the Restart Search button each time so that the starting point no longer matters ... just like old Sigil, which drove me crazy by revisiting matches I had already decided not to change and forcing me to try to remember my starting point. Old Sigil also made running sequences of saved searches in a search group almost worthless, which is why it was changed a while back.

I will look into adding a shortcut for restarting a search in the next release to make that easier.

But learning to start a search any place you want and follow it one by one until it tells you it is done is a much more efficient way to perform search and replace: it helps prevent omissions and makes Saved Search groups more useful.
05-11-2025, 01:29 PM   #23
ElMiko
Fanatic
Posts: 500
Karma: 65460
Join Date: Jun 2011
Device: Kindle
Quote:
Originally Posted by KevinH
Still not sure what the "save as" is contributing to this.
Honestly, I've been doing this for so long that it actually took me a moment to remember/reconstruct why I do this in that order...

Basically, where there are only a few (fewer than 10) matches for a given search, or where it's a simple bulk replace, I'll cycle through the document without caring where in the document the search starts. These searches (with limited matches) tend to be pretty precise and have minimal input/output variance. However, as soon as I load a search that has a lot of matches (verified by Count All) and a non-trivial chance of false positives, I'll save my work and restart at the beginning of the doc because a) it'll have been a minute since my last save, given that I was cycling through limited or simple searches, and b) I'm going to be dealing with a more complex pattern of matches, and more of them, so it's easier for me to make a mistake and need to backtrack. Regarding (b): if the search hasn't been cycling through files in order and strictly as needed (but rather from a random starting point, with multiple irrelevant files open) and I erroneously replace a value, it'll be very difficult to go back and find the error once the search has moved to a new file. (The error will have occurred in the last opened file, but that file wouldn't necessarily be the preceding tab if I already had a bunch of files open before I started the search.)

Anyway, I appreciate your looking into adding the Restart shortcut functionality. I'm also going to work on really availing myself of the new Sigil's sequence-tracking. Frankly, it's already a time-saver on the aforementioned "simple" searches, as it prevents my matching the same strings multiple times after cycling through the document.
05-11-2025, 08:08 PM   #24
KevinH
Sigil Developer
Posts: 9,062
Karma: 6361556
Join Date: Nov 2009
Device: many
Okay, I just pushed to master a new Menu item called "Restart Current Search" that can be assigned a keyboard shortcut by the user.

This will appear in the next release of Sigil.
All times are GMT -4.

