06-27-2024, 09:20 AM   #19
KevinH
Sigil Developer
You seem to misunderstand.

The epub3 spec did not remove it; it narrowed it to URLs and file paths, where it matters most, because string comparison and matching have to work there for the epub links to function at all.

In order for search to work in general, all of the text being searched and all of the search strings need to build their characters from the same sequence of unicode code points. If more than one sequence creates the same character, then mixing them in the same text causes problems, as does searching with one sequence in find when the text uses the other sequence for the same word or words.
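To make that concrete, here is a minimal Python sketch using the standard library's unicodedata module. The sample word and the variable names are just illustrative assumptions; the point is that the two spellings look identical on screen but do not match until both sides are put into the same form.

```python
import unicodedata

# Two ways to spell the same visible word "café":
composed = "caf\u00e9"      # é as a single code point (the NFC spelling)
decomposed = "cafe\u0301"   # plain e followed by a combining acute accent (the NFD spelling)

text = f"un {decomposed} noir"  # text to search, written in the decomposed form

# Raw comparison and raw search both fail, even though the words look the same.
same_when_raw = composed == decomposed
found_raw = composed in text

def to_nfc(s: str) -> str:
    """Return s converted to normalization form NFC."""
    return unicodedata.normalize("NFC", s)

# Once the search string and the text are in the same form, both succeed.
same_when_normalized = to_nfc(composed) == to_nfc(decomposed)
found_normalized = to_nfc(composed) in to_nfc(text)
```

Normalizing both sides to NFD instead would work just as well for matching; what matters is that the search string and the text agree.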

And in either form, *no* text is ever lost or unreadable. It will be 100% searchable only when the search string and all the text to be searched are in the same form.

As for Sigil and Calibre going away: the code to convert a file between normalization forms is trivial in python (less than 5 lines), the same is true in Qt, and it is available in all good string libraries.
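As a rough illustration of how small that conversion is, here is a sketch in Python. This is not Sigil's or Calibre's code; the function name normalize_file and its parameters are hypothetical, and it assumes the file is UTF-8 text that fits in memory.

```python
import unicodedata

def normalize_file(path, form="NFC", encoding="utf-8"):
    """Rewrite a text file in place in the given normalization form.

    Hypothetical helper for illustration. Returns True if the file changed.
    """
    # newline="" keeps the file's existing line endings untouched.
    with open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()
    normalized = unicodedata.normalize(form, text)
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(normalized)
    return normalized != text
```

Calling normalize_file("chapter1.xhtml") would convert that file to NFC, and passing form="NFD" converts it back the other way. The file name here is only an example.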

And finally, the world seems to be standardizing on the NFC form for just the reasons above.

So no worries.

Last edited by KevinH; 06-27-2024 at 09:25 AM.